[CRIU] failure dumping nginx in docker container
Andrew Vagin
avagin at odin.com
Thu Jul 2 02:52:31 PDT 2015
On Tue, Jun 30, 2015 at 03:39:28PM -0700, Ross Boucher wrote:
> Just following up from our irc conversation, here's the daemon strace from the
> docker run -d nginx:
>
> https://gist.github.com/boucher/07af3dd6faa480323698
Thanks. There is nothing wrong. The problem is that we found the same
inherited pipe a few times. My patch fixes this case too.
You forgot to say that my patch works with criu 1.6. The issue with
cgroup is a new one and Cyrill is going to investigate it.
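
For anyone following along, the "fd already in use" check visible in the
strace further down (fcntl(fd, F_GETFD) before dup2) can be sketched
roughly as below. This is only an illustration under stated assumptions,
not the actual criu code: fd_in_use() and place_fd() are hypothetical
names, and EBUSY is an arbitrary choice of error code.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Return 1 if fd refers to an open descriptor, 0 if the slot is free.
 * fcntl(F_GETFD) fails with EBADF only when nothing is open at fd. */
static int fd_in_use(int fd)
{
	return fcntl(fd, F_GETFD) != -1 || errno != EBADF;
}

/* Hypothetical helper: move src onto target, but refuse to clobber a
 * descriptor that is already open there.
 * Returns target on success, -1 with errno set on failure. */
static int place_fd(int src, int target)
{
	if (src == target)
		return target;

	if (fd_in_use(target)) {
		errno = EBUSY;	/* assumption: any distinct error code would do */
		return -1;
	}

	if (dup2(src, target) < 0)
		return -1;

	close(src);
	return target;
}
```

Placing the first dup of a pipe end on a free slot succeeds. Placing a
second dup of the same pipe on that slot is refused, which is the
situation you get when one inherited pipe is found more than once.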
Thanks,
Andrew
>
>
> On Tue, Jun 30, 2015 at 11:12 AM, Ross Boucher <rboucher at gmail.com> wrote:
>
> Attached the failure log. It's quite short, though. I don't think I've
> changed anything else, but if this makes no sense then I can try rebuilding
> everything to ensure I've got the right setup.
>
> https://gist.github.com/boucher/74805891264e042881aa
>
> On Tue, Jun 30, 2015 at 8:29 AM, Andrew Vagin <avagin at odin.com> wrote:
>
> Hi Boucher,
>
> Could you try out the attached patch?
>
> On Tue, Jun 30, 2015 at 05:00:36PM +0300, Andrew Vagin wrote:
> > On Wed, Jun 24, 2015 at 12:34:51PM -0700, Ross Boucher wrote:
> > > Here's the output of the same procedure without the restore-sibling option:
> > > https://gist.githubusercontent.com/boucher/b18593b9da2782d17e95/raw/strace.txt
> > >
> > > It's rather long. I'm not really sure how to read the strace output.
> >
> > Thank you.
> >
> > 16272 write(1023, "(00.139818) 1: \t\tCreate transport fd /crtools-fd-1-5\n", 58) = 58
> > 16272 socket(PF_LOCAL, SOCK_DGRAM, 0) = 0
> > 16272 bind(0, {sa_family=AF_LOCAL, sun_path=@"/crtools-fd-1-5"}, 18) = 0
> > 16272 fcntl(5, F_GETFD) = -1 EBADF (Bad file descriptor)
> > 16272 dup2(0, 5) = 5
> >
> > 16272 write(1023, "(00.142054) 1: Found id pipe:[122747] (fd 8) in inherit fd list\n", 69) = 69
> > 16272 dup(8) = 9
> > 16272 write(1023, "(00.142074) 1: File pipe:[122747] will be restored from fd 9 duped from inherit fd 8\n", 90) = 90
> > 16272 fcntl(5, F_GETFD) = 0
> > 16272 write(1023, "(00.142095) 1: Error (util.c:131): fd 5 already in use (called at files.c:872)\n", 84) = 84
> >
> > Looks like we hit both ends of an inherited pipe.
> >
> > The same problem can be reproduced by the pipes test with the following
> > patch:
> >
> > diff --git a/test/pipes/pipe.c b/test/pipes/pipe.c
> > index cb34703..03efccc 100644
> > --- a/test/pipes/pipe.c
> > +++ b/test/pipes/pipe.c
> > @@ -232,7 +232,7 @@ int main(int argc, char *argv[])
> >
> > child_pid = getpid();
> >
> > - close_safe(pipefd[READ_FD]);
> > +// close_safe(pipefd[READ_FD]);
> > setsid();
> > logfd = open_safe(OLD_LOG_FILE, O_WRONLY | O_APPEND | O_CREAT);
> > dup2_safe(logfd, 1);
> >
> > >
> > > On Tue, Jun 23, 2015 at 11:39 PM, Pavel Emelyanov <xemul at parallels.com> wrote:
> > >
> > > On 06/24/2015 01:42 AM, Ross Boucher wrote:
> > > > > If I run strace on the docker daemon, criu fails to restore with a
> > > > > different error:
> > > >
> > > > https://gist.github.com/boucher/bef6e944ae700526a979
> > > > (I included both the restore log and the strace)
> > > >
> > > > Without strace, I get the same fd already in use error.
> > >
> > > > Hm... The new error is because criu tries to PTRACE_SEIZE the init to do
> > > > the --restore-sibling restore and can't do it since strace is already
> > > > there.
> > > >
> > > > Can you (for experiment only) patch out the --restore-sibling option from
> > > > the code that calls criu? Or (!) call criu restore manually on the
> > > > existing images with all the options being "correct" but yet again w/o
> > > > the --restore-sibling?
> > >
> > > -- Pavel
> > >
> > >
> > >
> >
> > > _______________________________________________
> > > CRIU mailing list
> > > CRIU at openvz.org
> > > https://lists.openvz.org/mailman/listinfo/criu
> >