[CRIU] [PATCH 2/2] restore: correctly restore cgroup mounts inside a container

Pavel Emelyanov xemul at virtuozzo.com
Thu Mar 24 00:16:30 PDT 2016


On 03/24/2016 12:41 AM, Tycho Andersen wrote:
> Before the nsroot= mount option, we were just getting lucky because the
> cgroup superblocks "matched" when inspecting them from userspace, so we
> were actually getting a bind mount from the host when migrating from within
> cgroup namespaces.
> 
> Instead, let's actually do a new (i.e. not a bind mount) for cgroup
> namespaces. For this, we need two things:
> 
> 1. to prepare the cgroup namespace (and thus the cgroups) before the mount
>    ns, so when the mount() occurrs it is relative to the right cgroup path.
> 
> 2. not reject cgroup filesystems with no root. A cgroup ns mount looks
>    like:
> 
> 	 223 222 0:22 /lxc/unpriv /sys/fs/cgroup/systemd rw,nosuid,nodev,noexec,relatime - cgroup cgroup rw,xattr,release_agent=/lib/systemd/systemd-cgroups-agent,name=systemd,nsroot=/lxc/unpriv
> 
>    i.e. it has /lxc/unpriv as its root, and thus doesn't look rooted to CRIU.
>    Let's allow cgroup mounts to be unrooted so we can deal with this.

Am I right that the only problem here is that criu doesn't support mounting
of anything that doesn't have root mount? If so, the correct fix would be
to support _this_ by creating a temporary root mount, then umounting it at
the very end.

> Signed-off-by: Tycho Andersen <tycho.andersen at canonical.com>
> ---
>  criu/cr-restore.c | 18 +++++++++---------
>  criu/mount.c      |  5 ++++-
>  2 files changed, 13 insertions(+), 10 deletions(-)
> 
> diff --git a/criu/cr-restore.c b/criu/cr-restore.c
> index ffd2f01..967d5fa 100644
> --- a/criu/cr-restore.c
> +++ b/criu/cr-restore.c
> @@ -1608,6 +1608,15 @@ static int restore_task_with_children(void *_arg)
>  			goto err;
>  	}
>  
> +	/*
> +	 * Call this _before_ forking to optimize cgroups
> +	 * restore -- if all tasks live in one set of cgroups
> +	 * we will only move the root one there, others will
> +	 * just have it inherited.
> +	 */
> +	if (prepare_task_cgroup(current) < 0)
> +		goto err;
> +
>  	/* Restore root task */
>  	if (current->parent == NULL) {
>  		if (restore_finish_stage(CR_STATE_RESTORE_NS) < 0)
> @@ -1640,15 +1649,6 @@ static int restore_task_with_children(void *_arg)
>  	if (prepare_mappings())
>  		goto err;
>  
> -	/*
> -	 * Call this _before_ forking to optimize cgroups
> -	 * restore -- if all tasks live in one set of cgroups
> -	 * we will only move the root one there, others will
> -	 * just have it inherited.
> -	 */
> -	if (prepare_task_cgroup(current) < 0)
> -		goto err;
> -
>  	if (prepare_sigactions() < 0)
>  		goto err;
>  
> diff --git a/criu/mount.c b/criu/mount.c
> index be79999..614a89a 100644
> --- a/criu/mount.c
> +++ b/criu/mount.c
> @@ -734,7 +734,7 @@ static int validate_mounts(struct mount_info *info, bool for_dump)
>  					 */
>  					ret = m->need_plugin ? 0 : -ENOTSUP;
>  
> -				if (ret < 0) {
> +				if (ret < 0 && m->fstype->code != FSTYPE__CGROUP) {
>  					if (ret == -ENOTSUP)
>  						pr_err("%d:%s doesn't have a proper root mount\n",
>  								m->mnt_id, m->mountpoint);
> @@ -2453,6 +2453,9 @@ static bool can_mount_now(struct mount_info *mi)
>  	if (mi->external)
>  		goto shared;
>  
> +	if (mi->fstype->code == FSTYPE__CGROUP)
> +		return true;
> +
>  	/*
>  	 * We're the slave peer:
>  	 *   - Make sure the master peer is already mounted
> 



More information about the CRIU mailing list