[CRIU] Problem in Seizing Open File Descriptors?
Saied Kazemi
saied at google.com
Tue Jul 15 07:34:50 PDT 2014
Thanks for the quick feedback. I am using CRIU 1.3-rc2 (at commit
e1b56c8fa) with Docker version 1.1.0 on Ubuntu 14.04, whose kernel does not
provide mnt_id in /proc/pid/fdinfo/fd files.
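
A minimal sketch of how one might check whether a kernel exposes mnt_id for a given descriptor, assuming the usual "key:<tab>value" layout of fdinfo files. The helper names (parse_fdinfo, mnt_id_from_fdinfo, fd_mnt_id) are hypothetical and are not part of CRIU.

```python
def parse_fdinfo(text):
    """Parse the 'key: value' lines of a /proc/<pid>/fdinfo/<fd> file."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            # Keep the first occurrence if a key is repeated.
            fields.setdefault(key.strip(), value.strip())
    return fields


def mnt_id_from_fdinfo(text):
    """Return mnt_id as an int, or None when the field is absent."""
    value = parse_fdinfo(text).get("mnt_id")
    return int(value) if value is not None else None


def fd_mnt_id(pid="self", fd=0):
    """Read mnt_id for one descriptor of one process (Linux only)."""
    with open("/proc/%s/fdinfo/%s" % (pid, fd)) as f:
        return mnt_id_from_fdinfo(f.read())
```

Calling fd_mnt_id() returns None on a kernel that does not expose the field and an integer mount ID otherwise.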
I will look into the Docker source today. Assuming that it does open
/dev/null before moving into the namespaces, can CRIU handle it?
--Saied
On Tue, Jul 15, 2014 at 5:24 AM, Pavel Emelyanov <xemul at parallels.com>
wrote:
> On 07/15/2014 03:51 PM, Pavel Emelyanov wrote:
> > On 07/15/2014 10:30 AM, Saied Kazemi wrote:
> >> Hi Pavel,
> >>
> >> There seems to be a problem in or below parasite_drain_fds_seized() when
> >> seizing a process's open file descriptors. Here is the problem I ran into:
> >>
> >> When a Docker container is started in the detached mode (-d flag), its
> >> stdin inside its own mount namespace is set to its /dev/null, as you can
> >> see below:
>
> Actually we do this regularly in our zdtm tests. If you start the
> ns/static/env00 one you'd see
>
> # ps
> 2843 ? Ss 0:00 ./env00 --pidfile=env00.pid --outfile=env00.out --envname=ENV_00_TEST
> 2846 ? Ss 0:00 \_ ./env00 --pidfile=env00.pid --outfile=env00.out --envname=ENV_00_TEST
>
> These are the container's init 2843 and the test itself 2846.
> If we compare the namespaces
>
> [root at localhost test]# ls -l /proc/self/ns/mnt
> lrwxrwxrwx 1 root root 0 Jul 15 16:21 /proc/self/ns/mnt -> mnt:[4026531840]
> [root at localhost test]# ls -l /proc/2846/ns/mnt
> lrwxrwxrwx 1 root root 0 Jul 15 16:21 /proc/2846/ns/mnt -> mnt:[4026532201]
>
> we see they live in different ones. And the test does open /dev/null
>
> [root at localhost test]# ls -l /proc/2846/fd
> total 0
> lrwx------ 1 root root 64 Jul 15 16:21 0 -> /dev/null
> l-wx------ 1 root root 64 Jul 15 16:21 1 -> /zdtm/live/static/env00.out.inprogress
> l-wx------ 1 root root 64 Jul 15 16:21 2 -> /zdtm/live/static/env00.out.inprogress
>
> which is
>
> [root at localhost test]# stat -L /proc/2846/fd/0
> File: ‘/proc/2846/fd/0’
> Size: 0 Blocks: 0 IO Block: 4096 character special file
> Device: fd01h/64769d Inode: 40940 Links: 1 Device type: 1,3
> ...
>
> And the host's /dev/null is
>
> [root at localhost test]# stat /dev/null
> File: ‘/dev/null’
> Size: 0 Blocks: 0 IO Block: 4096 character special file
> Device: 5h/5d Inode: 6073 Links: 1 Device type: 1,3
> ...
>
> And this test gets dumped successfully. It looks like docker does open
> the /dev/null from the host before diving into namespaces.
>
> >> $ docker run -d ubuntu:latest /bin/sh -c 'ls -l /proc/self/fd >> /LOG; stat /dev/null >> /LOG; sleep 3000'
> >> 64bb55e56db391c11d3d8442fdb2f960252ce4c8edc6349d59d73b692d1b0b6c
> >> $
> >>
> >> $ sudo cat /var/lib/docker/vfs/dir/64bb55e56db391c11d3d8442fdb2f960252ce4c8edc6349d59d73b692d1b0b6c/LOG
> >> total 0
> >> lr-x------ 1 root root 64 Jul 15 05:59 0 -> /dev/null
> >> l-wx------ 1 root root 64 Jul 15 05:59 1 -> /LOG
> >> l-wx------ 1 root root 64 Jul 15 05:59 2 -> pipe:[47269]
> >> lr-x------ 1 root root 64 Jul 15 05:59 3 -> /proc/9/fd
> >> File: '/dev/null'
> >> Size: 0 Blocks: 0 IO Block: 4096 character special file
> >> Device: 2ah/42d Inode: 47496 Links: 1 Device type: 1,3
> >> Access: (0666/crw-rw-rw-) Uid: ( 0/ root) Gid: ( 0/ root)
> >> Access: 2014-07-15 05:59:48.235291004 +0000
> >> Modify: 2014-07-15 05:59:48.235291004 +0000
> >> Change: 2014-07-15 05:59:48.235291004 +0000
> >> Birth: -
> >> $
> >>
> >> Apparently, what is recorded as the open file descriptor 0 during dump
> >> is the system's /dev/null in the global mount namespace, not the /dev/null
> >> in the container's mount namespace. As a result, we get the following
> >> error in check_map_remap():
> >>
> >> (00.061198) Error (files-reg.c:605): Unaccessible path ./dev/null opened 42:47496, need 5:5294
> >
> > OK, so this means that the path refers to the 42:47496 file while the
> > descriptor refers to 5:5294. What version of criu do you use?
> > Does your kernel expose the mnt_id in /proc/pid/fdinfo/fd files?
> >
> >> Notice that 5:5294 is the system's /dev/null in the global mount namespace
> >> (see the stat command below) whereas 42:47496 is the container's /dev/null.
> >>
> >> $ stat /dev/null
> >> File: ‘/dev/null’
> >> Size: 0 Blocks: 0 IO Block: 4096 character special file
> >> Device: 5h/5d Inode: 5294 Links: 1 Device type: 1,3
> >> Access: (0666/crw-rw-rw-) Uid: ( 0/ root) Gid: ( 0/ root)
> >> Access: 2014-07-14 11:20:13.847273000 -0700
> >> Modify: 2014-07-14 11:20:13.847273000 -0700
> >> Change: 2014-07-14 11:20:13.847273000 -0700
> >> Birth: -
> >> $
> >>
> >> Attached is dump.log. Does this analysis make sense or am I missing
> >> something?
> >>
> >> --Saied
> >
>
>
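
The mismatch reported in the quoted error comes down to comparing two (device, inode) pairs: the file reached by re-resolving the recorded path, and the file the descriptor actually refers to. The following is a rough sketch of that comparison, not CRIU's code. The helper names (file_id, describe_mismatch, check_fd) are hypothetical, and the message format only imitates the "opened ..., need ..." wording of the log line. Note that check_fd resolves the path in the caller's mount namespace, which is exactly the situation in which the two pairs can differ.

```python
import os


def file_id(st):
    """Return the (st_dev, st_ino) pair that identifies a file."""
    return (st.st_dev, st.st_ino)


def describe_mismatch(path, path_st, fd_st):
    """Return None if both stats name the same file, else a message."""
    opened, need = file_id(path_st), file_id(fd_st)
    if opened == need:
        return None
    return "path %s opened %d:%d, need %d:%d" % (
        path, opened[0], opened[1], need[0], need[1])


def check_fd(pid, fd):
    """Compare an open descriptor with the path recorded for it (Linux only)."""
    link = "/proc/%s/fd/%s" % (pid, fd)
    fd_st = os.stat(link)      # follows the link to the file actually open
    path = os.readlink(link)   # path recorded for the descriptor
    # Raises an OSError for targets that are not paths, such as pipe:[...].
    path_st = os.stat(path)    # resolved in the caller's mount namespace
    return describe_mismatch(path, path_st, fd_st)
```

Run from the host against the container's process and descriptor 0, check_fd would show whether the host's /dev/null and the descriptor's /dev/null are the same file.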