[CRIU] failure dumping nginx in docker container

Andrew Vagin avagin at odin.com
Thu Jul 2 02:52:31 PDT 2015


On Tue, Jun 30, 2015 at 03:39:28PM -0700, Ross Boucher wrote:
> Just following up from our irc conversation, here's the daemon strace from the
> docker run -d nginx:
> 
> https://gist.github.com/boucher/07af3dd6faa480323698

Thanks. There is nothing wrong. The problem is that we found one
inherited pipe a few times. My patch fixes this case too.
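
To illustrate the failure mode seen in the quoted log (fcntl(5, F_GETFD)
returning 0 right before "fd 5 already in use"), here is a minimal sketch
of that kind of check. It is not CRIU's actual code; the helper names
fd_in_use() and move_fd_checked() are made up for this example, and it only
assumes the usual fcntl/dup2 semantics.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if fd is currently open, 0 if it is free. */
int fd_in_use(int fd)
{
	/* F_GETFD fails with EBADF only when the descriptor is not open. */
	return fcntl(fd, F_GETFD) != -1 || errno != EBADF;
}

/*
 * Hypothetical helper: move src into the slot target, but refuse to
 * overwrite a descriptor that is already open there.
 * Returns target on success, -1 if the slot is occupied or dup2() fails.
 */
int move_fd_checked(int src, int target)
{
	if (src == target)
		return target;
	if (fd_in_use(target))
		return -1;	/* the "fd N already in use" case */
	if (dup2(src, target) < 0)
		return -1;
	close(src);
	return target;
}
```

With a check like this, the first request for a given target slot succeeds
and a second request for the same slot is refused, which is roughly what
happens when the same inherited pipe is processed more than once.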

You forgot to say that my patch works with criu 1.6. The issue with
cgroup is a new one and Cyrill is going to investigate it.

Thanks,
Andrew

> 
> 
> On Tue, Jun 30, 2015 at 11:12 AM, Ross Boucher <rboucher at gmail.com> wrote:
> 
>     Attached the failure log. It's quite short, though. I don't think I've
>     changed anything else, but if this makes no sense then I can try rebuilding
>     everything to ensure I've got the right setup.
> 
>     https://gist.github.com/boucher/74805891264e042881aa
> 
>     On Tue, Jun 30, 2015 at 8:29 AM, Andrew Vagin <avagin at odin.com> wrote:
> 
>         Hi Boucher,
> 
>         Could you try out the attached patch?
> 
>         On Tue, Jun 30, 2015 at 05:00:36PM +0300, Andrew Vagin wrote:
>         > On Wed, Jun 24, 2015 at 12:34:51PM -0700, Ross Boucher wrote:
>         > > Here's the output of the same procedure without the restore-sibling option:
>         > > https://gist.githubusercontent.com/boucher/b18593b9da2782d17e95/raw/strace.txt
>         > >
>         > > It's rather long. I'm not really sure how to read the strace output.
>         >
>         > Thank you.
>         >
>         > 16272 write(1023, "(00.139818)      1: \t\tCreate transport fd /crtools-fd-1-5\n", 58) = 58
>         > 16272 socket(PF_LOCAL, SOCK_DGRAM, 0)   = 0
>         > 16272 bind(0, {sa_family=AF_LOCAL, sun_path=@"/crtools-fd-1-5"}, 18) = 0
>         > 16272 fcntl(5, F_GETFD)                 = -1 EBADF (Bad file descriptor)
>         > 16272 dup2(0, 5)                        = 5
>         >
>         > 16272 write(1023, "(00.142054)      1: Found id pipe:[122747] (fd 8) in inherit fd list\n", 69) = 69
>         > 16272 dup(8)                            = 9
>         > 16272 write(1023, "(00.142074)      1: File pipe:[122747] will be restored from fd 9 duped from inherit fd 8\n", 90) = 90
>         > 16272 fcntl(5, F_GETFD)                 = 0
>         > 16272 write(1023, "(00.142095)      1: Error (util.c:131): fd 5 already in use (called at files.c:872)\n", 84) = 84
>         >
>         > Looks like we meet both ends of an inherited pipe.
>         >
>         > The same problem can be reproduced by the pipes test with the following
>         > patch:
>         >
>         > diff --git a/test/pipes/pipe.c b/test/pipes/pipe.c
>         > index cb34703..03efccc 100644
>         > --- a/test/pipes/pipe.c
>         > +++ b/test/pipes/pipe.c
>         > @@ -232,7 +232,7 @@ int main(int argc, char *argv[])
>         >
>         >                 child_pid = getpid();
>         >
>         > -               close_safe(pipefd[READ_FD]);
>         > +//             close_safe(pipefd[READ_FD]);
>         >                 setsid();
>         >                 logfd = open_safe(OLD_LOG_FILE, O_WRONLY | O_APPEND | O_CREAT);
>         >                 dup2_safe(logfd, 1);
>         >
>         > >
>         > > On Tue, Jun 23, 2015 at 11:39 PM, Pavel Emelyanov <xemul at parallels.com> wrote:
>         > >
>         > >     On 06/24/2015 01:42 AM, Ross Boucher wrote:
>         > >     > If I run strace on the docker daemon, criu fails to restore with a
>         > >     > different error:
>         > >     >
>         > >     > https://gist.github.com/boucher/bef6e944ae700526a979
>         > >     > (I included both the restore log and the strace)
>         > >     >
>         > >     > Without strace, I get the same fd already in use error.
>         > >
>         > >     Hm... The new error is because criu tries to PTRACE_SEIZE the init to do
>         > >     the --restore-sibling restore and can't do it since strace is already there.
>         > >
>         > >     Can you (for experiment only) patch out the --restore-sibling option from
>         > >     the code that calls criu? Or (!) call criu restore manually on the existing
>         > >     images with all the options being "correct" but yet again w/o the
>         > >     --restore-sibling?
>         > >
>         > >     -- Pavel
>         > >
>         > >
>         > >
>         >
>         > > _______________________________________________
>         > > CRIU mailing list
>         > > CRIU at openvz.org
>         > > https://lists.openvz.org/mailman/listinfo/criu
>         >
> 
> 
> 
> 

