[CRIU] [PATCH v2] early_init: Force to unlimit files on early init

Mike Rapoport rppt at linux.ibm.com
Wed Mar 13 08:49:48 MSK 2019


On Wed, Mar 13, 2019 at 12:28:51AM +0300, Cyrill Gorcunov wrote:
> It is so because when we checkpoint containers with huge number
> of files opened we get
> 
>  | (00.659733) Error (criu/util.c:403): Can't open 57950: Too many open files
>  | (00.659741) Error (criu/proc_parse.c:2399): Can't open 57950/task on procfs: Too many open files
>  | (00.675635) Error (criu/cr-dump.c:2111): Dumping FAILED.
>  | Failed to checkpoint the Container
> 
> simply because we're opening more files for criu itself. This mostly
> reverts a148e87469a36dc75e263259e39ed30336343fc8
> 
> v2:
>  - fix typo and naming
> 
> Signed-off-by: Cyrill Gorcunov <gorcunov at virtuozzo.com>

Reviewed-by: Mike Rapoport <rppt at linux.ibm.com>

> ---
>  criu/cr-restore.c | 32 --------------------------------
>  criu/crtools.c    | 33 +++++++++++++++++++++++++++++++++
>  2 files changed, 33 insertions(+), 32 deletions(-)
> 
> diff --git a/criu/cr-restore.c b/criu/cr-restore.c
> index af2ca2921d00..a6cb8f1b3e07 100644
> --- a/criu/cr-restore.c
> +++ b/criu/cr-restore.c
> @@ -2327,42 +2327,10 @@ int prepare_dummy_task_state(struct pstree_item *pi)
>  	return 0;
>  }
>  
> -static void rlimit_unlimit_nofile_self(void)
> -{
> -	struct rlimit new;
> -
> -	new.rlim_cur = kdat.sysctl_nr_open;
> -	new.rlim_max = kdat.sysctl_nr_open;
> -
> -	if (prlimit(getpid(), RLIMIT_NOFILE, &new, NULL)) {
> -		pr_perror("rlimir: Can't setup RLIMIT_NOFILE for self");
> -		return;
> -	} else
> -		pr_debug("rlimit: RLIMIT_NOFILE unlimited for self\n");
> -	service_fd_rlim_cur = kdat.sysctl_nr_open;
> -}
> -
>  int cr_restore_tasks(void)
>  {
>  	int ret = -1;
>  
> -	/*
> -	 * Service fd engine implies that file descriptors
> -	 * used won't be borrowed by the rest of the code
> -	 * and default 1024 limit is not enough for high
> -	 * loaded test/containers. Thus use kdat engine
> -	 * to fetch current system level limit for numbers
> -	 * of files allowed to open up and lift up own
> -	 * limits.
> -	 *
> -	 * Note we have to do it before the service fd
> -	 * get inited and we dont exit with errors here
> -	 * because in worst scenario where clash of fd
> -	 * happen we simply exit with explicit error
> -	 * during real action stage.
> -	 */
> -	rlimit_unlimit_nofile_self();
> -
>  	if (cr_plugin_init(CR_PLUGIN_STAGE__RESTORE))
>  		return -1;
>  
> diff --git a/criu/crtools.c b/criu/crtools.c
> index 3f64000c1e87..dcfd619ae3c1 100644
> --- a/criu/crtools.c
> +++ b/criu/crtools.c
> @@ -51,8 +51,41 @@
>  #include "sysctl.h"
>  #include "img-remote.h"
>  
> +static void rlimit_unlimit_nofile(void)
> +{
> +	struct rlimit new;
> +
> +	new.rlim_cur = kdat.sysctl_nr_open;
> +	new.rlim_max = kdat.sysctl_nr_open;
> +
> +	if (prlimit(getpid(), RLIMIT_NOFILE, &new, NULL)) {
> +		pr_perror("rlimit: Can't setup RLIMIT_NOFILE for self");
> +		return;
> +	} else
> +		pr_debug("rlimit: RLIMIT_NOFILE unlimited for self\n");
> +	service_fd_rlim_cur = kdat.sysctl_nr_open;
> +}
> +
>  static int early_init(void)
>  {
> +	/*
> +	 * Service fd engine implies that file descritprs
> +	 * used won't be borrowed by the rest of the code
> +	 * and default 1024 limit is not enough for high
> +	 * loaded test/containers. Thus use kdat engine
> +	 * to fetch current system level limit for numbers
> +	 * of files allowed to open up and lift up own
> +	 * limits.
> +	 *
> +	 * Note we have to do it before the service fd
> +	 * get inited and we dont exit with errors here
> +	 * because in worst scenario where clash of fd
> +	 * happen we simply exit with explicit error
> +	 * during real action stage.
> +	 */
> +	if (!kerndat_files_stat(true))
> +		rlimit_unlimit_nofile();
> +
>  	if (init_service_fd())
>  		return 1;
>  
> -- 
> 2.20.1
> 

-- 
Sincerely yours,
Mike.



More information about the CRIU mailing list