[CRIU] lxc-checkpoint restore failed
Pavel Emelyanov
xemul at parallels.com
Thu Oct 15 05:04:30 PDT 2015
On 10/15/2015 02:58 PM, Jason Lee wrote:
> OK!
> Here it is:
>
> root at dslab:/home/dslab/tools/criu# ./crit show /home/checkpoint/c2/packetsk.img
> {
> "magic": "PACKETSK",
> "entries": [
> {
> "id": 61,
> "type": 10,
Here it is. This is SOCK_PACKET, which we don't support (and we never added a
check for it on dump). Which software uses this thing? AF_PACKET sockets are
typically SOCK_RAW or SOCK_DGRAM; SOCK_PACKET is, frankly speaking, new to me :)
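For anyone reading along, here is a rough sketch of how the two numeric fields in the entry can be decoded. It is only an illustration, not criu code: the helper names packet_type_name() and packet_proto_host() are made up, and it assumes the image keeps "protocol" in network byte order, the way it is passed to socket(). Under that assumption, 768 (0x0300) on a little-endian host corresponds to ETH_P_ALL (0x0003), and "type": 10 is SOCK_PACKET on Linux.

```c
#include <arpa/inet.h>      /* ntohs, htons */
#include <linux/if_ether.h> /* ETH_P_ALL */
#include <string.h>
#include <sys/socket.h>     /* SOCK_DGRAM, SOCK_RAW, SOCK_PACKET */

/* Hypothetical helper: map the numeric "type" field to a socket type name. */
static const char *packet_type_name(int type)
{
	switch (type) {
	case SOCK_DGRAM:  return "SOCK_DGRAM";
	case SOCK_RAW:    return "SOCK_RAW";
	case SOCK_PACKET: return "SOCK_PACKET";
	default:          return "unknown";
	}
}

/*
 * Hypothetical helper: convert the "protocol" field to host byte order.
 * Assumption: the image stores it in network byte order.
 */
static unsigned short packet_proto_host(unsigned short proto_be)
{
	return ntohs(proto_be);
}
```

With these, packet_type_name(10) gives "SOCK_PACKET", and packet_proto_host() applied to htons(ETH_P_ALL) gives back ETH_P_ALL.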
> "protocol": 768,
> "flags": "0x80002",
> "ifindex": 73,
> "fown": {
> "uid": 0,
> "euid": 0,
> "signum": 0,
> "pid_type": 0,
> "pid": 0
> },
> "opts": {
> "so_sndbuf": 212992,
> "so_rcvbuf": 212992,
> "so_snd_tmo_sec": 0,
> "so_snd_tmo_usec": 0,
> "so_rcv_tmo_sec": 0,
> "so_rcv_tmo_usec": 0,
> "reuseaddr": false,
> "so_priority": 0,
> "so_rcvlowat": 1,
> "so_mark": 0,
> "so_passcred": false,
> "so_passsec": false,
> "so_dontroute": false,
> "so_no_check": false,
> "so_filter": [
> 1.1258999068426252e+16,
> 5911008870664192.0,
> 1.3510798882111512e+16,
> 5911000280727569.0,
> 1.125899906842626e+16,
> 1.942617143955456e+16,
> 4.982107087778613e+16,
> 2.026619832316725e+16,
> 5910978805891140.0,
> 1688854155231231.0,
> 1688849860263936.0
> ]
> },
> "version": 0,
> "reserve": 0,
> "aux_data": false,
> "orig_dev": false,
> "vnet_hdr": false,
> "loss": false,
> "timestamp": 0,
> "copy_thresh": 0
> }
> ]
> }
>
>
> 2015-10-15 17:27 GMT+08:00 Pavel Emelyanov <xemul at parallels.com>:
>
> On 10/14/2015 09:54 PM, Tycho Andersen wrote:
>
> >>> Warn (cr-restore.c:1041): Set CLONE_PARENT | CLONE_NEWPID but it might cause restore problem,because not all kernels support such clone flags combinations!
> >>> RTNETLINK answers: File exists
> >>> RTNETLINK answers: File exists
> >>> RTNETLINK answers: File exists
> >>> RTNETLINK answers: File exists
> >>> 1: Warn (sk-unix.c:1229): sk unix: Can't unlink stale socket 0x36a8 peer 0 (name /run/systemd/notify dir -)
> >>> 1: Warn (sk-unix.c:1229): sk unix: Can't unlink stale socket 0x36aa peer 0 (name /run/systemd/private dir -)
> >>> 1: Warn (sk-unix.c:1229): sk unix: Can't unlink stale socket 0x36b4 peer 0 (name /run/systemd/shutdownd dir -)
> >>> 1: Warn (sk-unix.c:1229): sk unix: Can't unlink stale socket 0x36b6 peer 0 (name /run/systemd/journal/dev-log dir -)
> >>> 1: Warn (sk-unix.c:1229): sk unix: Can't unlink stale socket 0x36ba peer 0 (name /run/systemd/journal/stdout dir -)
> >>> 1: Warn (sk-unix.c:1229): sk unix: Can't unlink stale socket 0x36bc peer 0 (name /run/systemd/journal/socket dir -)
> >>> 1: Warn (sk-unix.c:1229): sk unix: Can't unlink stale socket 0x5bad peer 0x70ea (name /run/systemd/journal/stdout dir -)
> >>> 1: Warn (sk-unix.c:1229): sk unix: Can't unlink stale socket 0x6da7 peer 0x3788 (name /run/systemd/journal/stdout dir -)
> >>> 1: Warn (sk-unix.c:1229): sk unix: Can't unlink stale socket 0x6da6 peer 0x5f21 (name /run/systemd/journal/stdout dir -)
> >>> 1: Warn (sk-unix.c:1229): sk unix: Can't unlink stale socket 0x6da8 peer 0x784b (name /run/systemd/journal/stdout dir -)
> >>> 1: Warn (sk-unix.c:1229): sk unix: Can't unlink stale socket 0x6da9 peer 0x6b10 (name /run/systemd/journal/stdout dir -)
> >>> 1: Warn (sk-unix.c:1229): sk unix: Can't unlink stale socket 0x6daa peer 0x6159 (name /run/systemd/journal/stdout dir -)
> >>> 68: Error (sk-packet.c:419): Can't bind packet socket: Invalid argument
> >>> Error (cr-restore.c:1236): 3159 killed by signal 19
> >>> Error (cr-restore.c:1236): 3159 killed by signal 19
> >>> Error (cr-restore.c:1933): Restoring FAILED.
> >
> > Here is the real problem. bind() is failing, probably because the unlink
> > above failed. Unfortunately, we don't log the reason for the bind()
> > failing; can you try with the attached patch?
>
> The unlink failed with ENOENT, so there's no conflict for bind. Also note
> that we unlink a unix socket but bind a packet one :) so the conflict is
> not only absent, but not possible at all ;)
>
> After looking at the kernel sources, I guess this is a failure in criu -- there
> are two types of packet socket, and each imposes different requirements on the
> sockaddr structure that is passed into bind. And criu appears to support
> only one of them :\
>
> Jason, can you show the decoded contents of the packetsk.img file?
>
> > Pavel, perhaps we should apply this so it does report the error?
>
> Applied. Would you send one more patch making the unlink_stale() void? :)
>
> -- Pavel
>
>
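To make the "different requirements on the sockaddr structure" point from the quoted message concrete, here is a minimal sketch of the two address layouts. It is an assumption based on my reading of net/packet/af_packet.c, not criu's restore code: SOCK_RAW and SOCK_DGRAM sockets are bound with a struct sockaddr_ll keyed by interface index, while SOCK_PACKET sockets appear to expect a plain struct sockaddr of exactly sizeof(struct sockaddr), with the device name in sa_data. Passing the sockaddr_ll form to a SOCK_PACKET socket would then fail the length check, which would be consistent with the "Invalid argument" in the log. The helper names fill_ll_addr() and fill_spkt_addr() are hypothetical.

```c
#include <assert.h>
#include <linux/if_packet.h> /* struct sockaddr_ll */
#include <string.h>
#include <sys/socket.h>      /* struct sockaddr, AF_PACKET, socklen_t */

/*
 * SOCK_RAW / SOCK_DGRAM: the bind address is a sockaddr_ll that names the
 * interface by index. proto_be is expected in network byte order.
 * Returns the address length to pass to bind().
 */
static socklen_t fill_ll_addr(struct sockaddr_ll *sll, int ifindex,
			      unsigned short proto_be)
{
	memset(sll, 0, sizeof(*sll));
	sll->sll_family = AF_PACKET;
	sll->sll_protocol = proto_be;
	sll->sll_ifindex = ifindex;
	return sizeof(*sll);
}

/*
 * SOCK_PACKET (assumption from reading the kernel source): the bind address
 * is a plain sockaddr whose sa_data carries the NUL-terminated device name.
 * Returns the address length to pass to bind().
 */
static socklen_t fill_spkt_addr(struct sockaddr *sa, const char *ifname)
{
	/* The name plus its terminating NUL must fit into sa_data. */
	assert(strlen(ifname) < sizeof(sa->sa_data));
	memset(sa, 0, sizeof(*sa));
	sa->sa_family = AF_PACKET;
	memcpy(sa->sa_data, ifname, strlen(ifname));
	return sizeof(*sa);
}
```

A restore path along these lines would pick the helper by socket type and hand the result to bind(). Since the image records only "ifindex", the SOCK_PACKET case would also need the index mapped back to a device name first, for example with if_indextoname().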