[CRIU] lxc-checkpoint restore failed

Pavel Emelyanov xemul at parallels.com
Thu Oct 15 05:04:30 PDT 2015


On 10/15/2015 02:58 PM, Jason Lee wrote:
> OK!
> Here it is:
> 
> root at dslab:/home/dslab/tools/criu# ./crit show /home/checkpoint/c2/packetsk.img 
> {
>     "magic": "PACKETSK", 
>     "entries": [
>         {
>             "id": 61, 
>             "type": 10, 

Here it is. This is SOCK_PACKET, which we don't support (and we didn't put a
check for it on dump). Which software uses this thing? AF_PACKET sockets are
typically SOCK_RAW or SOCK_DGRAM; SOCK_PACKET is, frankly speaking, new to me :)
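
For reference, a minimal sketch of how the two numeric fields in this entry
("type" and "protocol") can be read. It is an illustration only, not crit or
criu code: the constants are the usual Linux values (SOCK_PACKET is 10 in
include/linux/net.h), describe_entry() is a hypothetical helper, and it assumes
the image keeps "protocol" in network byte order as written by a little-endian
host.

```python
# Usual Linux socket type values. SOCK_PACKET is spelled out by hand
# because Python's socket module does not export it.
SOCK_DGRAM = 2
SOCK_RAW = 3
SOCK_PACKET = 10

TYPE_NAMES = {
    SOCK_DGRAM: "SOCK_DGRAM",
    SOCK_RAW: "SOCK_RAW",
    SOCK_PACKET: "SOCK_PACKET",
}

# A few ethertypes from include/uapi/linux/if_ether.h.
ETHERTYPE_NAMES = {
    0x0003: "ETH_P_ALL",
    0x0800: "ETH_P_IP",
    0x0806: "ETH_P_ARP",
}


def swap16(value):
    """Swap the two bytes of a 16-bit value, independent of host endianness."""
    return int.from_bytes(value.to_bytes(2, "little"), "big")


def describe_entry(entry):
    """Return (socket type name, ethertype name) for a packetsk entry.

    Hypothetical helper. Assumes entry["protocol"] is a big-endian value
    that was stored as-is on a little-endian machine.
    """
    type_name = TYPE_NAMES.get(entry["type"], "unknown")
    ethertype = swap16(entry["protocol"])
    proto_name = ETHERTYPE_NAMES.get(ethertype, hex(ethertype))
    return type_name, proto_name


if __name__ == "__main__":
    print(describe_entry({"type": 10, "protocol": 768}))
```

With the values from this dump, 768 is 0x0300, which byte-swaps to 0x0003, so
the entry reads as a SOCK_PACKET socket opened for ETH_P_ALL.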

>             "protocol": 768, 
>             "flags": "0x80002", 
>             "ifindex": 73, 
>             "fown": {
>                 "uid": 0, 
>                 "euid": 0, 
>                 "signum": 0, 
>                 "pid_type": 0, 
>                 "pid": 0
>             }, 
>             "opts": {
>                 "so_sndbuf": 212992, 
>                 "so_rcvbuf": 212992, 
>                 "so_snd_tmo_sec": 0, 
>                 "so_snd_tmo_usec": 0, 
>                 "so_rcv_tmo_sec": 0, 
>                 "so_rcv_tmo_usec": 0, 
>                 "reuseaddr": false, 
>                 "so_priority": 0, 
>                 "so_rcvlowat": 1, 
>                 "so_mark": 0, 
>                 "so_passcred": false, 
>                 "so_passsec": false, 
>                 "so_dontroute": false, 
>                 "so_no_check": false, 
>                 "so_filter": [
>                     1.1258999068426252e+16, 
>                     5911008870664192.0, 
>                     1.3510798882111512e+16, 
>                     5911000280727569.0, 
>                     1.125899906842626e+16, 
>                     1.942617143955456e+16, 
>                     4.982107087778613e+16, 
>                     2.026619832316725e+16, 
>                     5910978805891140.0, 
>                     1688854155231231.0, 
>                     1688849860263936.0
>                 ]
>             }, 
>             "version": 0, 
>             "reserve": 0, 
>             "aux_data": false, 
>             "orig_dev": false, 
>             "vnet_hdr": false, 
>             "loss": false, 
>             "timestamp": 0, 
>             "copy_thresh": 0
>         }
>     ]
> }
> 
> 
> 2015-10-15 17:27 GMT+08:00 Pavel Emelyanov <xemul at parallels.com>:
> 
>     On 10/14/2015 09:54 PM, Tycho Andersen wrote:
> 
>     >>> Warn  (cr-restore.c:1041): Set CLONE_PARENT | CLONE_NEWPID but it might cause restore problem,because not all kernels support such clone flags combinations!
>     >>> RTNETLINK answers: File exists
>     >>> RTNETLINK answers: File exists
>     >>> RTNETLINK answers: File exists
>     >>> RTNETLINK answers: File exists
>     >>>      1: Warn  (sk-unix.c:1229): sk unix: Can't unlink stale socket 0x36a8 peer 0 (name /run/systemd/notify dir -)
>     >>>      1: Warn  (sk-unix.c:1229): sk unix: Can't unlink stale socket 0x36aa peer 0 (name /run/systemd/private dir -)
>     >>>      1: Warn  (sk-unix.c:1229): sk unix: Can't unlink stale socket 0x36b4 peer 0 (name /run/systemd/shutdownd dir -)
>     >>>      1: Warn  (sk-unix.c:1229): sk unix: Can't unlink stale socket 0x36b6 peer 0 (name /run/systemd/journal/dev-log dir -)
>     >>>      1: Warn  (sk-unix.c:1229): sk unix: Can't unlink stale socket 0x36ba peer 0 (name /run/systemd/journal/stdout dir -)
>     >>>      1: Warn  (sk-unix.c:1229): sk unix: Can't unlink stale socket 0x36bc peer 0 (name /run/systemd/journal/socket dir -)
>     >>>      1: Warn  (sk-unix.c:1229): sk unix: Can't unlink stale socket 0x5bad peer 0x70ea (name /run/systemd/journal/stdout dir -)
>     >>>      1: Warn  (sk-unix.c:1229): sk unix: Can't unlink stale socket 0x6da7 peer 0x3788 (name /run/systemd/journal/stdout dir -)
>     >>>      1: Warn  (sk-unix.c:1229): sk unix: Can't unlink stale socket 0x6da6 peer 0x5f21 (name /run/systemd/journal/stdout dir -)
>     >>>      1: Warn  (sk-unix.c:1229): sk unix: Can't unlink stale socket 0x6da8 peer 0x784b (name /run/systemd/journal/stdout dir -)
>     >>>      1: Warn  (sk-unix.c:1229): sk unix: Can't unlink stale socket 0x6da9 peer 0x6b10 (name /run/systemd/journal/stdout dir -)
>     >>>      1: Warn  (sk-unix.c:1229): sk unix: Can't unlink stale socket 0x6daa peer 0x6159 (name /run/systemd/journal/stdout dir -)
>     >>>     68: Error (sk-packet.c:419): Can't bind packet socket: Invalid argument
>     >>> Error (cr-restore.c:1236): 3159 killed by signal 19
>     >>> Error (cr-restore.c:1236): 3159 killed by signal 19
>     >>> Error (cr-restore.c:1933): Restoring FAILED.
>     >
>     > Here is the real problem. bind() is failing, probably because the unlink
>     > above failed. Unfortunately, we don't log the reason for the bind()
>     > failure. Can you try with the attached patch?
> 
>     The unlink failed with ENOENT, so there's no conflict for bind. Also note
>     that we unlink a unix socket, but bind a packet one :) so the conflict is
>     not only absent, but not possible at all ;)
> 
>     After looking at the kernel sources, I guess this is a failure in criu -- there
>     are two types of packet socket, and each imposes different requirements on the
>     sockaddr structure that is passed into bind. And criu appears to support
>     only one of them :\
> 
>     Jason, can you show the decoded contents of the packetsk.img file?
> 
>     > Pavel, perhaps we should apply this so it does report the error?
> 
>     Applied. Would you send one more patch making the unlink_stale() void? :)
> 
>     -- Pavel
> 
> 
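
On the two kinds of packet socket mentioned in the quoted message: a rough
sketch of how the bind address differs between them. This models the C
structures with ctypes and is not criu code. It assumes the layouts from
packet(7) and the kernel headers (struct sockaddr_ll for SOCK_RAW/SOCK_DGRAM,
a plain struct sockaddr carrying the device name for SOCK_PACKET);
make_bind_addr() and the "eth0" name are hypothetical, while ifindex 73 and
protocol 768 are taken from the dump.

```python
import ctypes

AF_PACKET = 17  # Linux address family value
SOCK_DGRAM, SOCK_RAW, SOCK_PACKET = 2, 3, 10  # usual Linux values


class SockaddrLL(ctypes.Structure):
    """Mirror of struct sockaddr_ll (SOCK_RAW / SOCK_DGRAM packet sockets)."""
    _fields_ = [
        ("sll_family", ctypes.c_ushort),
        ("sll_protocol", ctypes.c_ushort),  # network byte order
        ("sll_ifindex", ctypes.c_int),
        ("sll_hatype", ctypes.c_ushort),
        ("sll_pkttype", ctypes.c_ubyte),
        ("sll_halen", ctypes.c_ubyte),
        ("sll_addr", ctypes.c_ubyte * 8),
    ]


class Sockaddr(ctypes.Structure):
    """Mirror of generic struct sockaddr; for SOCK_PACKET the device
    name is assumed to travel in sa_data."""
    _fields_ = [
        ("sa_family", ctypes.c_ushort),
        ("sa_data", ctypes.c_char * 14),
    ]


def make_bind_addr(sock_type, ifindex, ifname, protocol_be):
    """Build the address layout matching the socket type (hypothetical helper).

    protocol_be is the protocol value already in network byte order.
    """
    if sock_type == SOCK_PACKET:
        addr = Sockaddr()
        addr.sa_family = AF_PACKET
        addr.sa_data = ifname.encode()  # interface is named, not indexed
        return addr
    addr = SockaddrLL()
    addr.sll_family = AF_PACKET
    addr.sll_protocol = protocol_be
    addr.sll_ifindex = ifindex  # interface is given by index
    return addr


if __name__ == "__main__":
    for sock_type in (SOCK_RAW, SOCK_PACKET):
        addr = make_bind_addr(sock_type, 73, "eth0", 768)
        print(sock_type, type(addr).__name__, ctypes.sizeof(addr))
```

The two layouts differ in both size and content, so handing the sockaddr_ll
form to bind() on a SOCK_PACKET socket is the kind of mismatch that would
plausibly come back as "Invalid argument".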


