[CRIU] Missing Container Subdir in Cgroups

Mon Jun 23 23:46:43 PDT 2014

On 06/24/2014 04:47 AM, Saied Kazemi wrote:
> 
> 
> 
> On Mon, Jun 23, 2014 at 1:47 AM, Pavel Emelyanov <xemul at parallels.com> wrote:
> 
>     On 06/21/2014 04:21 AM, Saied Kazemi wrote:
>     > CRIU fails to restore a process running inside a Docker container with the error message:
>     >
>     > (00.035110)      1: Error (cgroup.c:278): cg: Can't move into blkio//docker/2fda692b0fd31c20197b84b8ca5e172679dfaf9028c7322b7bb43acf061626cf/tasks (-1/-1)
>     >
>     > This is because the container subdirectory (i.e., 64-character ID above) is not created under docker.
>     >
>     > I applied the following quick patch as a workaround and was able to successfully restore and resume a process
>     > running inside a Docker container.  But the issue requires more study and changes as simply recreating the
>     > directories on demand may not be enough.
> 
>     Yes, the problem with the cgroup FS tree is a bit deeper. Not only should we create the
>     directories of the cgroups that tasks live in, but we should also create the
>     sub-directories with no tasks, e.g. to handle the case when a task does
> 
>     mkdir "cg/subcg"
>     echo $pid > "cg/subcg/tasks"
> 
>     and we dump this app in between these two calls.
> 
> 
> I did some experiments and below is what I found.
> 
> As long as the cgroup is not deleted, we should be ok because the process will enter the
> cgroup after restore when echo is done.  Isn't this like the case where "cg/subcg" is just
> a normal directory and we're trying to create a file in it?  If the normal directory is
> deleted after dump and before restore, the echo will fail.

Sure, but your patch creates directories, which makes me think that some of them
are removed. Are they?
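
One way to check which directories are actually gone is sketched below.
It is only an illustration, not CRIU code: it parses the contents of
/proc/<pid>/cgroup (lines of the form "hierarchy-id:controllers:path") and
reports the paths that do not exist. The helper names and the assumption
that every hierarchy is mounted at <cgroup_root>/<controller-list> are
hypothetical and may not match a given system.

```python
import os


def parse_proc_cgroup(text):
    """Parse the contents of /proc/<pid>/cgroup into (controllers, path) pairs."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Each line is "hierarchy-id:controller-list:cgroup-path".
        _hier_id, controllers, path = line.split(":", 2)
        entries.append((controllers, path))
    return entries


def missing_cgroup_dirs(entries, cgroup_root="/sys/fs/cgroup"):
    """Return the cgroup directories that do not exist under cgroup_root.

    Assumption: each hierarchy is mounted at cgroup_root/<controller-list>,
    and named hierarchies ("name=foo") are mounted at cgroup_root/foo.
    """
    missing = []
    for controllers, path in entries:
        if not controllers:
            continue  # entry without a controller list, nothing to map
        if controllers.startswith("name="):
            controllers = controllers[len("name="):]
        full = os.path.join(cgroup_root, controllers, path.lstrip("/"))
        if not os.path.isdir(full):
            missing.append(full)
    return missing
```

Running parse_proc_cgroup() on the text saved at dump time and
missing_cgroup_dirs() before restore would list exactly the directories
that a patch like yours has to recreate.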

>     Other than this, we probably should take care of the configuration of these subcgroups,
>     e.g. mem.* files in memcg and other stuff.
> 
> 
> The configuration information is not lost after dump as long as the cgroups (or subcgroups)
> are not deleted, so restore shouldn't have to worry about it.  It's just like the case above.  
> Am I missing something obvious here?

Well, for a simple dump-restore loop that's true. But when live migration is used, the
destination node may lack both the directories themselves and the configuration.
Thus we should somehow handle them.
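
To make the idea concrete, here is a minimal sketch (not what CRIU does
today) of carrying both the directories and their configuration to another
node. It records every directory under a hierarchy root, including those
with no tasks, together with the values of a caller-chosen list of
configuration files, and then recreates them under another root. The
function names are hypothetical, and writing saved values back verbatim is
an assumption: some cgroup files are read-only or need a particular write
order.

```python
import os


def snapshot_cgroup_tree(root, config_files=()):
    """Record every directory under root (even empty ones) with selected settings."""
    tree = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        rel = os.path.relpath(dirpath, root)
        settings = {}
        for name in config_files:
            if name in filenames:
                with open(os.path.join(dirpath, name)) as f:
                    settings[name] = f.read()
        tree[rel] = settings
    return tree


def recreate_cgroup_tree(tree, root):
    """Recreate the recorded directories under root and write the saved settings."""
    # Sorting puts a parent path before its children.
    for rel in sorted(tree):
        target = os.path.normpath(os.path.join(root, rel))
        os.makedirs(target, exist_ok=True)
        for name, value in tree[rel].items():
            with open(os.path.join(target, name), "w") as f:
                f.write(value)
```

The snapshot is a plain dictionary, so it could be stored next to the
other image files and shipped to the destination node with them.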

> 
>     And the third thing -- there may be cgroup FS mountpoints inside the mount namespace
>     we dump. These should be dumped as well.
> 
> 
> Yes, I created a cgroup mountpoint inside my mount namespace (at /cgroup_dir) which caused criu dump to fail with the error message:
> 
>     Error (mount.c:414): FS mnt ./cgroup_dir dev 0x22 root / unsupported id 4f

Yes. This would be the first thing to fix. Other than this, a subcgroup can be bind-mounted
into the container, so we'd have to not just mount it, but mount it with the proper root offset.
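
The root offset is visible in /proc/<pid>/mountinfo: the fourth field of
each line is the root of the mount within its filesystem, and the
filesystem type follows the "-" separator. A rough sketch of extracting it
for cgroup mounts follows; the function name and the returned dictionary
layout are made up for illustration and are not CRIU's.

```python
def cgroup_mounts(mountinfo_text):
    """Return root, mount point and super options for each cgroup mount."""
    mounts = []
    for line in mountinfo_text.splitlines():
        fields = line.split()
        try:
            # Optional fields end at a lone "-", which cannot appear
            # before index 6 in a well-formed line.
            sep = fields.index("-", 6)
        except ValueError:
            continue  # malformed or empty line
        if len(fields) <= sep + 3 or fields[sep + 1] != "cgroup":
            continue
        mounts.append({
            "root": fields[3],         # path inside the cgroup hierarchy
            "mount_point": fields[4],  # where it is mounted in the namespace
            "super_options": fields[sep + 3].split(","),
        })
    return mounts
```

A "root" other than "/" means the mount exposes only a subtree of the
hierarchy, which is the bind-mount case above.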

> Fortunately, the Docker containers I've been using so far do not create such internal cgroup mountpoints.
> Also, I don't think LXC does it either.  So support for this case can have low priority.

I agree.

> At this point, we need to converge on a solution for creating cgroups subdirs.  I am still using
> the patch that I sent you before because I tried the other patch I saw in the mailing list this 
> morning but it didn't work.

Thanks,
Pavel