[CRIU] lxc-checkpoint restore failed

Pavel Emelyanov xemul at parallels.com
Thu Oct 15 02:27:53 PDT 2015


On 10/14/2015 09:54 PM, Tycho Andersen wrote:

>>> Warn  (cr-restore.c:1041): Set CLONE_PARENT | CLONE_NEWPID but it might cause restore problem,because not all kernels support such clone flags combinations!
>>> RTNETLINK answers: File exists
>>> RTNETLINK answers: File exists
>>> RTNETLINK answers: File exists
>>> RTNETLINK answers: File exists
>>>      1: Warn  (sk-unix.c:1229): sk unix: Can't unlink stale socket 0x36a8 peer 0 (name /run/systemd/notify dir -)
>>>      1: Warn  (sk-unix.c:1229): sk unix: Can't unlink stale socket 0x36aa peer 0 (name /run/systemd/private dir -)
>>>      1: Warn  (sk-unix.c:1229): sk unix: Can't unlink stale socket 0x36b4 peer 0 (name /run/systemd/shutdownd dir -)
>>>      1: Warn  (sk-unix.c:1229): sk unix: Can't unlink stale socket 0x36b6 peer 0 (name /run/systemd/journal/dev-log dir -)
>>>      1: Warn  (sk-unix.c:1229): sk unix: Can't unlink stale socket 0x36ba peer 0 (name /run/systemd/journal/stdout dir -)
>>>      1: Warn  (sk-unix.c:1229): sk unix: Can't unlink stale socket 0x36bc peer 0 (name /run/systemd/journal/socket dir -)
>>>      1: Warn  (sk-unix.c:1229): sk unix: Can't unlink stale socket 0x5bad peer 0x70ea (name /run/systemd/journal/stdout dir -)
>>>      1: Warn  (sk-unix.c:1229): sk unix: Can't unlink stale socket 0x6da7 peer 0x3788 (name /run/systemd/journal/stdout dir -)
>>>      1: Warn  (sk-unix.c:1229): sk unix: Can't unlink stale socket 0x6da6 peer 0x5f21 (name /run/systemd/journal/stdout dir -)
>>>      1: Warn  (sk-unix.c:1229): sk unix: Can't unlink stale socket 0x6da8 peer 0x784b (name /run/systemd/journal/stdout dir -)
>>>      1: Warn  (sk-unix.c:1229): sk unix: Can't unlink stale socket 0x6da9 peer 0x6b10 (name /run/systemd/journal/stdout dir -)
>>>      1: Warn  (sk-unix.c:1229): sk unix: Can't unlink stale socket 0x6daa peer 0x6159 (name /run/systemd/journal/stdout dir -)
>>>     68: Error (sk-packet.c:419): Can't bind packet socket: Invalid argument
>>> Error (cr-restore.c:1236): 3159 killed by signal 19
>>> Error (cr-restore.c:1236): 3159 killed by signal 19
>>> Error (cr-restore.c:1933): Restoring FAILED.
> 
> Here's the real problem. bind() is failing, probably because the unlink
> above failed. Unfortunately, we don't log the reason for the bind()
> failing, can you try with the attached patch?

The unlink failed with ENOENT, so there is nothing left for bind() to
conflict with. Also note that we unlink a unix socket but bind a packet
one :) so the conflict is not merely absent, it is not possible at all ;)
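
To illustrate why ENOENT from unlink() is harmless before a bind(), here is
a minimal sketch. remove_stale_path() is a hypothetical helper written for
this example, not CRIU's unlink_stale(); it only shows that "file was not
there" and "file was removed" leave the path equally free.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper (an assumption for illustration, not CRIU code).
 * Returns 0 when the path does not exist after the call, -1 otherwise.
 */
static int remove_stale_path(const char *path)
{
	assert(path != NULL);

	if (unlink(path) == 0)
		return 0;

	/* ENOENT: nothing was there, so nothing can clash with a later bind() */
	if (errno == ENOENT)
		return 0;

	/* Any other error means the path may still be occupied */
	fprintf(stderr, "Can't unlink %s: %s\n", path, strerror(errno));
	return -1;
}
```

Calling it twice on the same path succeeds both times: the first call
removes the file, the second hits ENOENT and still reports the path as free.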

After looking at the kernel sources, I guess this is a failure in criu --
there are two types of packet socket, and each imposes different requirements
on the sockaddr structure that is passed into bind(). And it looks like criu
supports only one of them :\
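
A rough sketch of the two address layouts, as I read them from packet(7) and
af_packet.c: an old-style SOCK_PACKET socket is bound with a plain struct
sockaddr carrying the device name in sa_data, while SOCK_RAW / SOCK_DGRAM
packet sockets are bound with a struct sockaddr_ll carrying the interface
index. The helper names fill_spkt_addr() and fill_ll_addr() are made up for
this example and are not CRIU functions; the exact length checks the kernel
applies are stated here as an assumption.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <arpa/inet.h>
#include <linux/if_packet.h>
#include <linux/if_ether.h>

/*
 * Hypothetical helper: address for socket(AF_PACKET, SOCK_PACKET, ...).
 * The device is named by string in sa_data; the length passed to bind()
 * is assumed to have to be exactly sizeof(struct sockaddr).
 */
static socklen_t fill_spkt_addr(struct sockaddr *sa, const char *ifname)
{
	assert(ifname != NULL);

	memset(sa, 0, sizeof(*sa));
	sa->sa_family = AF_PACKET;
	/* Leave the last byte zero so the name stays NUL-terminated */
	strncpy(sa->sa_data, ifname, sizeof(sa->sa_data) - 1);
	return sizeof(*sa);
}

/*
 * Hypothetical helper: address for socket(AF_PACKET, SOCK_RAW or
 * SOCK_DGRAM, ...). The device is named by index; the length passed to
 * bind() is assumed to have to be at least sizeof(struct sockaddr_ll).
 */
static socklen_t fill_ll_addr(struct sockaddr_ll *sll, int ifindex,
			      unsigned short proto)
{
	memset(sll, 0, sizeof(*sll));
	sll->sll_family = AF_PACKET;
	sll->sll_protocol = htons(proto);	/* network byte order */
	sll->sll_ifindex = ifindex;
	return sizeof(*sll);
}
```

The two structures have different sizes, so handing the sockaddr_ll form to
a SOCK_PACKET socket (or the other way round) would be expected to fail the
kernel's length check, which would fit the "Invalid argument" in the log.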

Jason, can you show the decoded contents of packetsk.img file?

> Pavel, perhaps we should apply this so it does report the error?

Applied. Would you send one more patch making unlink_stale() return void? :)

-- Pavel


