[CRIU] [PATCH 1/2] Re-create cgroups if necessary

Saied Kazemi saied at google.com
Tue Jun 24 12:34:25 PDT 2014


On Tue, Jun 24, 2014 at 10:05 AM, Pavel Emelyanov <xemul at parallels.com>
wrote:

> On 06/24/2014 09:01 PM, Saied Kazemi wrote:
> >
> >
> >
> > On Tue, Jun 24, 2014 at 9:26 AM, Pavel Emelyanov <xemul at parallels.com
> <mailto:xemul at parallels.com>> wrote:
> >
> >     On 06/24/2014 06:12 PM, Serge Hallyn wrote:
> >
> >     >> Yes. Empty cgroups cannot be discovered through the
> >     >> /proc/pid/cgroup file; we have to walk a live cgroup mount.
> >     >> But the problem is -- we cannot just take the system
> >     >> /sys/fs/cgroup/ directories, since there will be cgroups from
> >     >> other containers as well. We should find the root subdir of
> >     >> the container we dump and walk _this_ subtree.
> >     >
> >     > I volunteer to work on a proper cgroup c/r implementation, once
> >     > Tycho gets the very basics done.
> >
> >     Serge, Tycho, I think I need to clarify one more thing.
> >
> >     I believe that once we do a full cgroups hierarchy restore, all
> >     the mkdirs will go away from the move_in_cgroup() routine.
> >     Instead, we will have some code that constructs the whole cgroup
> >     subtree before criu starts forking tasks. Once we have it,
> >     move_in_cgroup() would (should) never fail, and this patch would
> >     effectively be reverted.
> >
> >     Thanks,
> >     Pavel
> >
> >
> > I agree.  Creation of the cgroup and its subtree should be done in
> > one place as opposed to being split apart (i.e., between
> > prepare_cgroup_sfd() and move_in_cgroup() as is done currently).
> >
> > Regarding the 4 items to do for cgroups in your earlier email, I
> > believe that we should have CLI options to tell CRIU what cgroups it
> > needs to restore (almost like the way we tell it about external bind
> > mounts).
>
> I was thinking that if we take the root task, check the cgroups it
> lives in, and dump the whole subtree starting from there, this would
> work properly and would not require any CLI hints.
>
> Do you mean that we need to tell criu where in the cgroup hierarchy to
> start recreating the subtree it dumped?
>
> > This way we can handle the empty cgroups as well as dumping and
> > restoring on the same machine versus on a different machine (i.e.,
> > migration).  For migration, CRIU definitely needs to be told how to
> > handle cgroup name collisions.
>
> But if we ask criu to restore tasks in a fresh new sub-cgroup, why
> would this collision happen?
>
> > This is not something that it can handle at dump time.
> >
> > --Saied
>

I am not sure I understand what is meant by "fresh new sub-cgroup".
Since the process has to be restored in the same cgroup, I assume you
mean a new mountpoint.  But if the cgroup hierarchy already exists,
giving it a new private mountpoint doesn't mean that it will set up a
new hierarchy.  Consider the following example:

# cat /sys/fs/cgroup/hugetlb/notify_on_release
0
# mkdir /mnt/foo
# mount -t cgroup -o hugetlb cgroup /mnt/foo
# cat /mnt/foo/notify_on_release
0
# echo 1 > /sys/fs/cgroup/hugetlb/notify_on_release
# cat /mnt/foo/notify_on_release
1
# echo 0 > /mnt/foo/notify_on_release
# cat /sys/fs/cgroup/hugetlb/notify_on_release
0
#

Both mountpoints are just views of the same hierarchy.  So I think we
need a mechanism to tell CRIU whether it should expect the cgroup to
already exist (e.g., restore on the same machine) or not (e.g., restore
after reboot or on a different machine).
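
In code, the "re-create if necessary" behavior could be as simple as
tolerating EEXIST when creating the cgroup directory.  This is only a
minimal sketch of the idea, not the actual patch code; the helper name
is made up:

#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: create the cgroup directory if it is missing
 * (restore after reboot or on another machine) and silently reuse it
 * if it is already there (restore on the same machine).
 */
static int recreate_cgroup_if_needed(const char *path)
{
	if (mkdir(path, 0755) == 0)
		return 0;		/* did not exist -- created it */

	if (errno == EEXIST)
		return 0;		/* already there -- reuse it */

	fprintf(stderr, "mkdir(%s): %s\n", path, strerror(errno));
	return -1;
}

int main(int argc, char **argv)
{
	return argc > 1 ? recreate_cgroup_if_needed(argv[1]) : 1;
}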

I am not a cgroups expert, but I hope it's clearer now.

--Saied