[CRIU] Problem in Seizing Open File Descriptors?

Pavel Emelyanov xemul at parallels.com
Tue Jul 15 05:24:27 PDT 2014


On 07/15/2014 03:51 PM, Pavel Emelyanov wrote:
> On 07/15/2014 10:30 AM, Saied Kazemi wrote:
>> Hi Pavel,
>>
>> There seems to be a problem in or below parasite_drain_fds_seized() when seizing a process's open file descriptors.  Here is the problem I ran into:
>>
>> When a Docker container is started in the detached mode (-d flag), its stdin inside its own mount
>> namespace is set to its /dev/null as you can see below:

Actually we do this regularly in our zdtm tests. If you start the ns/static/env00 one you'd see

# ps
 2843 ?        Ss     0:00 ./env00 --pidfile=env00.pid --outfile=env00.out --envname=ENV_00_TEST
 2846 ?        Ss     0:00  \_ ./env00 --pidfile=env00.pid --outfile=env00.out --envname=ENV_00_TEST

These are the container's init (2843) and the test itself (2846).
If we compare the namespaces

[root@localhost test]# ls -l /proc/self/ns/mnt
lrwxrwxrwx 1 root root 0 Jul 15 16:21 /proc/self/ns/mnt -> mnt:[4026531840]
[root@localhost test]# ls -l /proc/2846/ns/mnt
lrwxrwxrwx 1 root root 0 Jul 15 16:21 /proc/2846/ns/mnt -> mnt:[4026532201]

we see they live in different ones. And the test does open /dev/null

[root@localhost test]# ls -l /proc/2846/fd
total 0
lrwx------ 1 root root 64 Jul 15 16:21 0 -> /dev/null
l-wx------ 1 root root 64 Jul 15 16:21 1 -> /zdtm/live/static/env00.out.inprogress
l-wx------ 1 root root 64 Jul 15 16:21 2 -> /zdtm/live/static/env00.out.inprogress

which is

[root@localhost test]# stat -L /proc/2846/fd/0
  File: ‘/proc/2846/fd/0’
  Size: 0         	Blocks: 0          IO Block: 4096   character special file
Device: fd01h/64769d	Inode: 40940       Links: 1     Device type: 1,3
...

And the host's /dev/null is

[root@localhost test]# stat /dev/null
  File: ‘/dev/null’
  Size: 0         	Blocks: 0          IO Block: 4096   character special file
Device: 5h/5d	Inode: 6073        Links: 1     Device type: 1,3
...

And this test gets dumped successfully. It looks like docker opens /dev/null
from the host before diving into namespaces.
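
The comparison done by hand above can be sketched in Python. This is only an illustration, not CRIU code: the function names are made up, and it assumes a Linux /proc and enough privileges to look into the target process. It compares the device:inode pair of the file a descriptor actually holds with the pair that the same path name resolves to inside the process's own root.

```python
import os

def same_mount_namespace(pid_a, pid_b):
    """Compare the mnt namespace links, e.g. 'mnt:[4026531840]'."""
    return (os.readlink(f"/proc/{pid_a}/ns/mnt")
            == os.readlink(f"/proc/{pid_b}/ns/mnt"))

def dev_ino(st):
    """Format a stat result as 'dev:ino' with the raw st_dev number."""
    return f"{st.st_dev}:{st.st_ino}"

def fd_matches_path(pid, fd):
    """Check that the file held by fd is the one its path name points to.

    Returns None for descriptors without a path (pipes, sockets, ...).
    Hypothetical helper for illustration only.
    """
    link = f"/proc/{pid}/fd/{fd}"
    path = os.readlink(link)
    if not path.startswith("/"):
        return None
    # stat() on the fd link follows it to the file that is really open.
    held = os.stat(link)
    # Resolve the same name through the process's root, so the lookup
    # happens in that process's view of the filesystem.
    by_path = os.stat(f"/proc/{pid}/root{path}")
    return dev_ino(held) == dev_ino(by_path)

if __name__ == "__main__":
    fd = os.open("/dev/null", os.O_RDONLY)
    print(fd_matches_path(os.getpid(), fd))
    os.close(fd)
```

For the env00 test both lookups should give the same pair. For a descriptor opened on the host before entering the namespaces they would differ.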

>> $ docker run -d ubuntu:latest /bin/sh -c 'ls -l /proc/self/fd >> /LOG; stat /dev/null >> /LOG; sleep 3000'
>> 64bb55e56db391c11d3d8442fdb2f960252ce4c8edc6349d59d73b692d1b0b6c
>> $
>>
>> $ sudo cat /var/lib/docker/vfs/dir/64bb55e56db391c11d3d8442fdb2f960252ce4c8edc6349d59d73b692d1b0b6c/LOG
>> total 0
>> lr-x------ 1 root root 64 Jul 15 05:59 0 -> /dev/null
>> l-wx------ 1 root root 64 Jul 15 05:59 1 -> /LOG
>> l-wx------ 1 root root 64 Jul 15 05:59 2 -> pipe:[47269]
>> lr-x------ 1 root root 64 Jul 15 05:59 3 -> /proc/9/fd
>>   File: '/dev/null'
>>   Size: 0         Blocks: 0          IO Block: 4096   character special file
>> Device: 2ah/42d     Inode: 47496       Links: 1     Device type: 1,3
>> Access: (0666/crw-rw-rw-)  Uid: (    0/    root)   Gid: (    0/    root)
>> Access: 2014-07-15 05:59:48.235291004 +0000
>> Modify: 2014-07-15 05:59:48.235291004 +0000
>> Change: 2014-07-15 05:59:48.235291004 +0000
>>  Birth: -
>> $
>>
>> Apparently, what is recorded as the open file descriptor 0 during dump is the system's /dev/null in the global mount namespace, not the /dev/null in the container's mount namespace.  As a result, we get the following error in check_map_remap():
>>
>> (00.061198) Error (files-reg.c:605): Unaccessible path ./dev/null opened 42:47496, need 5:5294
> 
> OK, so this means that the path refers to file 42:47496 while the descriptor refers to 5:5294. What version of criu do you use?
> Does your kernel expose the mnt_id in /proc/pid/fdinfo/fd files?
> 
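
A quick way to check the mnt_id question is to read the fdinfo file and look for that field. The sketch below is an illustration only; the helper name is made up, and it assumes the usual "key:<tab>value" layout of /proc/pid/fdinfo/fd. It returns None on kernels that do not print the field.

```python
import os

def fd_mnt_id(pid, fd):
    """Return the mnt_id reported for a descriptor, or None if absent.

    Hypothetical helper for illustration only.
    """
    with open(f"/proc/{pid}/fdinfo/{fd}") as f:
        for line in f:
            key, _, value = line.partition(":")
            if key == "mnt_id":
                return int(value)
    return None

if __name__ == "__main__":
    fd = os.open("/dev/null", os.O_RDONLY)
    print(fd_mnt_id(os.getpid(), fd))
    os.close(fd)
```

The returned number can be matched against the first column of /proc/pid/mountinfo to see which mount the descriptor was opened through.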
>> Notice that 5:5294 is system's /dev/null in the global mount namespace (see the stat command below) whereas 42:47496 is the container's /dev/null.
>>
>> $ stat /dev/null
>>   File: ‘/dev/null’
>>   Size: 0         Blocks: 0          IO Block: 4096   character special file
>> Device: 5h/5d     Inode: 5294        Links: 1     Device type: 1,3
>> Access: (0666/crw-rw-rw-)  Uid: (    0/    root)   Gid: (    0/    root)
>> Access: 2014-07-14 11:20:13.847273000 -0700
>> Modify: 2014-07-14 11:20:13.847273000 -0700
>> Change: 2014-07-14 11:20:13.847273000 -0700
>>  Birth: -
>> $
>>
>> Attached is dump.log.  Does this analysis make sense or am I missing something?
>>
>> --Saied
> 


