[CRIU] Missing Container Subdir in Cgroups
Pavel Emelyanov
xemul at parallels.com
Mon Jun 23 23:46:43 PDT 2014
On 06/24/2014 04:47 AM, Saied Kazemi wrote:
>
>
>
> On Mon, Jun 23, 2014 at 1:47 AM, Pavel Emelyanov <xemul at parallels.com> wrote:
>
> On 06/21/2014 04:21 AM, Saied Kazemi wrote:
> > CRIU fails to restore a process running inside a Docker container with the error message:
> >
> > (00.035110) 1: Error (cgroup.c:278): cg: Can't move into blkio//docker/2fda692b0fd31c20197b84b8ca5e172679dfaf9028c7322b7bb43acf061626cf/tasks (-1/-1)
> >
> > This is because the container subdirectory (i.e., the 64-character ID above) is not created under docker.
> >
> > I applied the following quick patch as a workaround and was able to successfully restore and resume a process
> > running inside a Docker container. But the issue requires more study and changes as simply recreating the
> > directories on demand may not be enough.
>
> Yes, the problem with the cgroup FS tree is a bit deeper. Not only should we create the
> directories of the cgroups that tasks live in, but we should also create the sub-directories
> with no tasks, e.g. to handle the case when a task does
>
> mkdir "cg/subcg"
> echo $pid > "cg/subcg/tasks"
>
> and we dump this app in between these two calls.
>
>
> I did some experiments and below is what I found.
>
> As long as the cgroup is not deleted, we should be ok because the process will enter the
> cgroup after restore when echo is done. Isn't this like the case where "cg/subcg" is just
> a normal directory and we're trying to create a file in it? If the normal directory is
> deleted after dump and before restore, the echo will fail.
Sure, but your patch creates directories, which makes me think that some of them
are removed. Are they?
> Other than this, we should probably take care of the configuration of these subcgroups,
> e.g. mem.* files in memcg and other stuff.
>
>
> The configuration information is not lost after dump as long as the cgroups (or subcgroups)
> are not deleted, so restore shouldn't have to worry about it. It's just like the case above.
> Am I missing something obvious here?
Well, for a simple dump-restore loop that's true. But when live migration is used, the
destination node may lack both the directories themselves and the configuration.
Thus we should handle them somehow.
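
To make the live-migration case concrete, here is a minimal sketch (not CRIU code) of
what the destination node would have to do: recreate every saved cgroup directory,
including those with no tasks, and write the saved settings back. The function name and
the layout of the "saved" dictionary are assumptions made for illustration only.

```python
import os

def restore_cgroup_dirs(root, saved):
    """Recreate cgroup directories under one hierarchy and write back settings.

    root  -- mount point of a single cgroup hierarchy (assumed to be mounted already)
    saved -- hypothetical dump format: {cgroup_path: {control_file: value}}
    """
    # Sorted order visits parents before children, so a parent's settings
    # are written before any of its sub-cgroups are configured.
    for cg_path in sorted(saved):
        path = os.path.join(root, cg_path.lstrip("/"))
        # Also covers directories that held no tasks at dump time.
        os.makedirs(path, exist_ok=True)
        for name, value in saved[cg_path].items():
            with open(os.path.join(path, name), "w") as f:
                f.write(value)
```

An empty settings dictionary recreates the directory only, which covers the
"mkdir done, echo not yet done" case discussed above.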
>
> And the third thing -- there may be cgroup FS mountpoints inside the mount namespace
> we dump. These should be dumped as well.
>
>
> Yes, I created a cgroup mountpoint inside my mount namespace (at /cgroup_dir) which caused criu dump to fail with the error message:
>
> Error (mount.c:414): FS mnt ./cgroup_dir dev 0x22 root / unsupported id 4f
Yes. This would be the first thing to fix. Other than this, a subcgroup can be bind-mounted
into a container, so we would have to not just mount it, but mount it with the proper root offset.
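
As a rough illustration of where the root offset comes from, the sketch below (again
not CRIU code) picks cgroup mounts out of /proc/<pid>/mountinfo text and reports the
root field of each one. The function name and the returned dictionary keys are
assumptions; the field positions follow the mountinfo format described in proc(5).

```python
def parse_cgroup_mounts(mountinfo_text):
    """Return root and mount point for every cgroup mount in mountinfo text."""
    mounts = []
    for line in mountinfo_text.splitlines():
        # Fields before " - " are per-mount; fields after it start with the fs type.
        left, sep, right = line.partition(" - ")
        if not sep:
            continue
        fields = left.split()
        tail = right.split()
        if len(fields) < 5 or not tail or tail[0] != "cgroup":
            continue
        # fields[3] is the root of the mount within the filesystem,
        # fields[4] is the mount point as seen by the process.
        mounts.append({"root": fields[3], "mountpoint": fields[4]})
    return mounts
```

A root of "/" means the whole hierarchy is mounted; anything else is a sub-cgroup that
would have to be mounted at the same offset on restore.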
> Fortunately, Docker containers I've been using so far do not create such internal cgroup mountpoints.
> Also, I don't think LXC does it either. So support for this case can have low priority.
I agree.
> At this point, we need to converge on a solution for creating cgroups subdirs. I am still using
> the patch that I sent you before because I tried the other patch I saw in the mailing list this
> morning but it didn't work.
Thanks,
Pavel