[CRIU] Problem in Seizing Open File Descriptors?

Saied Kazemi saied at google.com
Tue Jul 15 07:34:50 PDT 2014


Thanks for the quick feedback.  I am using CRIU 1.3-rc2 (at commit
e1b56c8fa) with Docker version 1.1.0 on Ubuntu 14.04, whose kernel does not
provide mnt_id in /proc/pid/fdinfo/fd files.
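
For anyone who wants to check their own kernel, here is a minimal sketch of
how to look for the mnt_id field.  The helper names parse_fdinfo and
read_mnt_id are made up for illustration; this is not how CRIU itself reads
the file, only a quick way to see whether the field is present.

```python
def parse_fdinfo(text):
    """Turn the text of a /proc/<pid>/fdinfo/<fd> file into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip any line that is not of the form "key: value"
            fields[key.strip()] = value.strip()
    return fields


def read_mnt_id(pid, fd):
    """Return the mnt_id of an open descriptor, or None if the kernel omits it."""
    with open(f"/proc/{pid}/fdinfo/{fd}") as f:
        fields = parse_fdinfo(f.read())
    value = fields.get("mnt_id")
    return int(value) if value is not None else None
```

Calling read_mnt_id("self", 0) on a kernel like mine should return None,
since only fields such as pos and flags are listed there.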

I will look into Docker source today.  Assuming that it does open /dev/null
before moving into the namespaces, can CRIU handle it?

--Saied



On Tue, Jul 15, 2014 at 5:24 AM, Pavel Emelyanov <xemul at parallels.com>
wrote:

> On 07/15/2014 03:51 PM, Pavel Emelyanov wrote:
> > On 07/15/2014 10:30 AM, Saied Kazemi wrote:
> >> Hi Pavel,
> >>
> >> There seems to be a problem in or below parasite_drain_fds_seized()
> >> when seizing a process's open file descriptors.  Here is the problem I
> >> ran into:
> >>
> >> When a Docker container is started in detached mode (-d flag), its
> >> stdin inside its own mount namespace is set to its /dev/null, as you
> >> can see below:
>
> Actually we do this regularly in our zdtm tests. If you start the
> ns/static/env00 one you'd see
>
> # ps
>  2843 ?        Ss     0:00 ./env00 --pidfile=env00.pid --outfile=env00.out --envname=ENV_00_TEST
>  2846 ?        Ss     0:00  \_ ./env00 --pidfile=env00.pid --outfile=env00.out --envname=ENV_00_TEST
>
> These are the container's init (2843) and the test itself (2846).
> If we compare the namespaces
>
> [root at localhost test]# ls -l /proc/self/ns/mnt
> lrwxrwxrwx 1 root root 0 Jul 15 16:21 /proc/self/ns/mnt -> mnt:[4026531840]
> [root at localhost test]# ls -l /proc/2846/ns/mnt
> lrwxrwxrwx 1 root root 0 Jul 15 16:21 /proc/2846/ns/mnt -> mnt:[4026532201]
>
> we see they live in different ones. And the test does open /dev/null
>
> [root at localhost test]# ls -l /proc/2846/fd
> total 0
> lrwx------ 1 root root 64 Jul 15 16:21 0 -> /dev/null
> l-wx------ 1 root root 64 Jul 15 16:21 1 -> /zdtm/live/static/env00.out.inprogress
> l-wx------ 1 root root 64 Jul 15 16:21 2 -> /zdtm/live/static/env00.out.inprogress
>
> which is
>
> [root at localhost test]# stat -L /proc/2846/fd/0
>   File: ‘/proc/2846/fd/0’
>   Size: 0               Blocks: 0          IO Block: 4096   character special file
> Device: fd01h/64769d    Inode: 40940       Links: 1     Device type: 1,3
> ...
>
> And the host's /dev/null is
>
> [root at localhost test]# stat /dev/null
>   File: ‘/dev/null’
>   Size: 0               Blocks: 0          IO Block: 4096   character special file
> Device: 5h/5d   Inode: 6073        Links: 1     Device type: 1,3
> ...
>
> And this test gets dumped successfully. It looks like Docker opens the
> host's /dev/null before diving into the namespaces.
>
> >> $ docker run -d ubuntu:latest /bin/sh -c 'ls -l /proc/self/fd >> /LOG; stat /dev/null >> /LOG; sleep 3000'
> >> 64bb55e56db391c11d3d8442fdb2f960252ce4c8edc6349d59d73b692d1b0b6c
> >> $
> >>
> >> $ sudo cat /var/lib/docker/vfs/dir/64bb55e56db391c11d3d8442fdb2f960252ce4c8edc6349d59d73b692d1b0b6c/LOG
> >> total 0
> >> lr-x------ 1 root root 64 Jul 15 05:59 0 -> /dev/null
> >> l-wx------ 1 root root 64 Jul 15 05:59 1 -> /LOG
> >> l-wx------ 1 root root 64 Jul 15 05:59 2 -> pipe:[47269]
> >> lr-x------ 1 root root 64 Jul 15 05:59 3 -> /proc/9/fd
> >>   File: '/dev/null'
> >>   Size: 0         Blocks: 0          IO Block: 4096   character special file
> >> Device: 2ah/42d   Inode: 47496       Links: 1     Device type: 1,3
> >> Access: (0666/crw-rw-rw-)  Uid: (    0/    root)   Gid: (    0/    root)
> >> Access: 2014-07-15 05:59:48.235291004 +0000
> >> Modify: 2014-07-15 05:59:48.235291004 +0000
> >> Change: 2014-07-15 05:59:48.235291004 +0000
> >>  Birth: -
> >> $
> >>
> >> Apparently, what is recorded as the open file descriptor 0 during dump
> >> is the system's /dev/null in the global mount namespace, not the
> >> /dev/null in the container's mount namespace.  As a result, we get the
> >> following error in check_map_remap():
> >>
> >> (00.061198) Error (files-reg.c:605): Unaccessible path ./dev/null opened 42:47496, need 5:5294
> >
> > OK, so this means that the path refers to file 42:47496 while the
> > descriptor refers to 5:5294. What version of criu do you use?
> > Does your kernel expose the mnt_id in /proc/pid/fdinfo/fd files?
> >
> >> Notice that 5:5294 is the system's /dev/null in the global mount
> >> namespace (see the stat command below) whereas 42:47496 is the
> >> container's /dev/null.
> >>
> >> $ stat /dev/null
> >>   File: ‘/dev/null’
> >>   Size: 0         Blocks: 0          IO Block: 4096   character special file
> >> Device: 5h/5d     Inode: 5294        Links: 1     Device type: 1,3
> >> Access: (0666/crw-rw-rw-)  Uid: (    0/    root)   Gid: (    0/    root)
> >> Access: 2014-07-14 11:20:13.847273000 -0700
> >> Modify: 2014-07-14 11:20:13.847273000 -0700
> >> Change: 2014-07-14 11:20:13.847273000 -0700
> >>  Birth: -
> >> $
> >>
> >> Attached is dump.log.  Does this analysis make sense or am I missing
> >> something?
> >>
> >> --Saied
> >
>
>
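
To make the mismatch in the quoted error concrete, here is a rough sketch of
the kind of comparison it describes: stat the path, fstat the descriptor, and
compare the (device, inode) pairs.  The function names are hypothetical and
the message only imitates the shape of the log line; this is not CRIU's
actual code.

```python
import os


def dev_ino(st):
    """Format a stat result as 'dev:ino', the shape seen in the quoted error."""
    return f"{st.st_dev}:{st.st_ino}"


def check_fd_against_path(fd, path):
    """Return None if fd and path name the same file, else a mismatch message."""
    fd_st = os.fstat(fd)      # the file the descriptor really refers to
    path_st = os.stat(path)   # the file the path resolves to from here
    if (fd_st.st_dev, fd_st.st_ino) == (path_st.st_dev, path_st.st_ino):
        return None
    return f"path {path} opened {dev_ino(path_st)}, need {dev_ino(fd_st)}"
```

If the descriptor was opened on the host's /dev/null and the path is resolved
inside the container's mount namespace, the two pairs differ, which would
match the "opened 42:47496, need 5:5294" report quoted above.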