[CRIU] [PATCH v2 2/2] restore: correctly restore cgroup mounts inside a container
Tycho Andersen
tycho.andersen at canonical.com
Fri Mar 25 10:46:55 PDT 2016
On Fri, Mar 25, 2016 at 08:37:48PM +0300, Pavel Emelyanov wrote:
> On 03/24/2016 08:09 PM, Tycho Andersen wrote:
> > Before the nsroot= mount option, we were just getting lucky because the
> > cgroup superblocks "matched" when inspecting them from userspace, so we
> > were actually getting a bind mount from the host when migrating from within
> > cgroup namespaces.
> >
> > Instead, let's actually do a new mount (i.e. not a bind mount) for cgroup
> > namespaces. For this, we need two things:
> >
> > 1. to prepare the cgroup namespace (and thus the cgroups) before the mount
> > ns, so when the mount() occurs it is relative to the right cgroup path.
> >
> > 2. not reject cgroup filesystems with no root. A cgroup ns mount looks
> > like:
> >
> > 223 222 0:22 /lxc/unpriv /sys/fs/cgroup/systemd rw,nosuid,nodev,noexec,relatime - cgroup cgroup rw,xattr,release_agent=/lib/systemd/systemd-cgroups-agent,name=systemd,nsroot=/lxc/unpriv
> >
> > i.e. it has /lxc/unpriv as its root, and thus doesn't look rooted to CRIU.
> > Let's allow cgroup mounts to be unrooted so we can deal with this.
>
> I have a suggestion for how to avoid the hackish checks in validate and
> can_mount_now().
>
> 1. Add a ->read_img callback to fstype, called when reading mount point
> images (collect_mnt_from_image)
> 2. For cgroupfs, check whether root_ns_mask contains CLONE_NEWCGROUPNS and,
> if so, cut mi->root down to "/", effectively turning the mount point into
> an fsroot one
>
> and leave the hunk that moves tasks into cgroups earlier. Hopefully doing
> that before setting up namespaces will work, all the more so since we
> configure the namespaces at the very end.
>
> What do you think?
Sounds good to me, I'll rework the patch and resend.
Thanks!
Tycho