[CRIU] Dumping Process - lxc-ls -f Problem

Pavel Emelyanov xemul at parallels.com
Mon Jun 15 11:35:11 PDT 2015


On 06/15/2015 09:29 PM, Thouraya TH wrote:
> Hello;
> 
> I ran new tests on my old container and hit the same problem:
> 
> restore.log
> Warn  (cr-restore.c:1029): Set CLONE_PARENT | CLONE_NEWPID but it might cause restore problem,because not all kernels support such clone flags combinations!

This warning is harmless. It only says that the restore may fail on
some old kernels that do not support this combination of clone flags.
Fortunately, no modern distro ships such an old kernel.

> RTNETLINK answers: File exists
> RTNETLINK answers: File exists
> RTNETLINK answers: File exists
>    920: Error (sk-inet.c:610): Can't bind inet socket: Invalid argument

This one is likely what causes the failure. Can you show the full restore.log, please?
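
When reading through a long restore.log, a small filter can pull out the
warnings and errors first. The Python sketch below assumes only the line
shapes visible in the excerpts in this thread (an optional "pid:" prefix,
then Warn or Error, then "(file.c:line):"). The names parse_criu_log and
errors_only are made up for illustration and are not part of CRIU. Lines
that do not match, such as the RTNETLINK output, are skipped.

```python
import re

# Shapes taken from the excerpts in this thread, e.g.
#   "Warn  (cr-restore.c:1029): Set CLONE_PARENT | CLONE_NEWPID ..."
#   "   920: Error (sk-inet.c:610): Can't bind inet socket: Invalid argument"
LOG_LINE = re.compile(
    r"^\s*(?:(?P<pid>\d+):\s*)?(?P<level>Warn|Error)\s+"
    r"\((?P<src>[^():]+):(?P<line>\d+)\):\s*(?P<msg>.*)$"
)


def parse_criu_log(text):
    """Return one dict per Warn/Error line found in text (hypothetical helper)."""
    entries = []
    for raw in text.splitlines():
        m = LOG_LINE.match(raw)
        if m is None:
            continue  # e.g. "RTNETLINK answers: File exists"
        entries.append({
            "pid": int(m.group("pid")) if m.group("pid") else None,
            "level": m.group("level"),
            "source": m.group("src"),
            "line": int(m.group("line")),
            "message": m.group("msg"),
        })
    return entries


def errors_only(entries):
    """Keep only the entries logged at Error level."""
    return [e for e in entries if e["level"] == "Error"]


# Sample assembled from the lines quoted in this thread.
SAMPLE = """\
Warn  (cr-restore.c:1029): Set CLONE_PARENT | CLONE_NEWPID but it might cause restore problem,because not all kernels support such clone flags combinations!
RTNETLINK answers: File exists
   920: Error (sk-inet.c:610): Can't bind inet socket: Invalid argument
"""

for entry in errors_only(parse_criu_log(SAMPLE)):
    print(entry["pid"], entry["source"], entry["line"], entry["message"])
```

The filter is only a reading aid. The full log is still what is needed to
see what led up to the failing bind.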

> 
> root at g-23:/tmp# lxc-ls -f
> ^CTraceback (most recent call last):
>   File "/usr/bin/lxc-ls", line 432, in <module>
>     containers = get_containers(root=True)
>   File "/usr/bin/lxc-ls", line 261, in get_containers
>     if container.controllable:
> KeyboardInterrupt
> 
> 
> > Do you have a way to reproduce this? I'm not sure what it means
> > exactly, perhaps Pavel can elaborate. However, it would be nice to
> > have a small testcase so I can try and fix it.
> 
> In this old container I have the lxc bridge network, a JDK and some other libraries, and I have modified /etc/sudoers, etc.
> I can't work out the source of the problem!
> 
> Best Regards.
> 
> 
> 
> 2015-06-15 14:45 GMT+01:00 Tycho Andersen <tycho.andersen at canonical.com>:
> 
>     On Sun, Jun 14, 2015 at 02:02:13PM +0100, Thouraya TH wrote:
>     > Hello all;
>     >
>     > I have done two tests:  (criu 1.6)
>     >
>     > 1- *Test 1:*
>     >    I have created a new container using:
>     >                      lxc-create -t ubuntu -n worker2 http_proxy=True
>     >
>     >  *I have not installed any tool in this container.*
>     >
>     >   lxc-checkpoint -s -D /tmp/dire -n worker2
>     >   lxc-checkpoint -r -D /tmp/dire -n worker2
>     >
>     > root at localhost:~# lxc-ls -f
>     > NAME     STATE    IPV4       IPV6  GROUPS  AUTOSTART
>     > ----------------------------------------------------
>     > worker   STOPPED  -          -     -       NO
>     > *worker2*  RUNNING  10.0.3.48  -     -       NO
>     >
>     > 2- *Test 2: *
>     >
>     > lxc-start -n worker  (this is an old container: I have installed many tools in
>     > it, a JDK, etc.)
>     > root at localhost:/tmp# lxc-ls -f
>     > NAME     STATE    IPV4        IPV6  GROUPS  AUTOSTART
>     > -----------------------------------------------------
>     > worker   RUNNING  10.0.3.109  -     -       NO
>     > worker2  RUNNING  10.0.3.48   -     -       NO
>     >
>     > lxc-checkpoint -s -D /tmp/direworker -n worker
>     > lxc-checkpoint -r -D /tmp/direworker -n worker
>     >
>     > lxc-ls -f
>     > ^CTraceback (most recent call last):
>     >   File "/usr/bin/lxc-ls", line 432, in <module>
>     >     containers = get_containers(root=True)
>     >   File "/usr/bin/lxc-ls", line 261, in get_containers
>     >     if container.controllable:
>     > KeyboardInterrupt
> 
>     This means that the restore failed, but lxc-checkpoint didn't
>     notice the failure and is still waiting. I think we recently
>     fixed a bug (there is a patch about SIGCHLD), so with that fix
>     this should no longer hang.
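
While the hang is being debugged, one way to keep lxc-ls from blocking a
script is to run it under a timeout. The following is only a sketch using
Python's standard subprocess module. run_with_timeout is a hypothetical
helper, not part of LXC, and it does not fix the failed restore. It only
stops waiting.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s):
    """Run cmd and return (returncode, output).

    Returns (None, None) if the command did not finish within timeout_s
    seconds; subprocess.run kills the child in that case.
    """
    try:
        done = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return None, None
    return done.returncode, done.stdout


# Self-contained demonstration with a command that always exists.
rc, out = run_with_timeout([sys.executable, "-c", "print('ok')"], 10)
print(rc, out.strip())
```

On the affected host the call would be run_with_timeout(["lxc-ls", "-f"], 10).
A (None, None) result then means lxc-ls is still stuck on the container.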
> 
>     > dump.log:
>     > Warn  (fsnotify.c:188): fsnotify:       Handle 800003:3c605 cannot be opened
>     > Warn  (fsnotify.c:188): fsnotify:       Handle 800003:4a4ba cannot be opened
>     > Warn  (arch/x86/crtools.c:132): Will restore 4271 with interrupted system call
>     > Warn  (arch/x86/crtools.c:132): Will restore 4454 with interrupted system call
>     > Warn  (arch/x86/crtools.c:132): Will restore 4455 with interrupted system call
>     > Warn  (arch/x86/crtools.c:132): Will restore 4460 with interrupted system call
>     > Warn  (arch/x86/crtools.c:132): Will restore 4461 with interrupted system call
>     > Warn  (arch/x86/crtools.c:132): Will restore 4463 with interrupted system call
>     >
>     > restore.log:
>     > Warn  (cr-restore.c:1029): Set CLONE_PARENT | CLONE_NEWPID but it might cause restore problem,because not all kernels support such clone flags combinations!
>     > RTNETLINK answers: File exists
>     > RTNETLINK answers: File exists
>     > RTNETLINK answers: File exists
>     >
>     >
>     > Can you please explain to me what the problem is? Is it because of some
>     > tools installed in the container?
> 
>     Do you have a way to reproduce this? I'm not sure what it means
>     exactly, perhaps Pavel can elaborate. However, it would be nice to
>     have a small testcase so I can try and fix it.
> 
>     Thanks,
> 
>     Tycho
> 
>     >
>     > Thanks a lot for help.
>     > Best regards.
> 
>     > _______________________________________________
>     > CRIU mailing list
>     > CRIU at openvz.org
>     > https://lists.openvz.org/mailman/listinfo/criu
> 
> 


