[CRIU] problem restoring unix queues?
Tycho Andersen
tycho.andersen at canonical.com
Mon Jul 20 07:28:31 PDT 2015
On Mon, Jul 20, 2015 at 02:13:52PM +0300, Pavel Emelyanov wrote:
> On 07/18/2015 02:17 AM, Tycho Andersen wrote:
> > On Fri, Jul 17, 2015 at 07:13:55PM +0300, Pavel Emelyanov wrote:
> >> On 07/17/2015 05:33 PM, Tycho Andersen wrote:
> >>> Hi all,
> >>>
> >>> Sometimes I see something like:
> >>>
> >>> (00.095976) 77: Error (sk-queue.c:238): Failed to send packet: Resource temporarily unavailable
> >>>
> >>> when restoring (full log here: http://paste.ubuntu.com/11893011/).
> >>>
> >>> The sk-queues.img is: http://paste.ubuntu.com/11893015/ so I don't
> >>> /think/ it should be filling the buffer, so I'm not sure why we'd get
> >>> EAGAIN.
> >>>
> >>> Thoughts?
> >>
> >> Is it stream or datagram socket? AFAIK the EAGAIN is only reported when
> >> you hit socket buffer limit, but we try to raise one :\
> >
> > Assuming the "id" is the same as "id_for" above, it looks like it is
> > SOCK_DGRAM (output from unixsk.img):
> >
> > {
> > "id": 14,
> > "ino": 16798503,
> > "type": 2, # SOCK_DGRAM == 2
> > "state": 7,
> > "flags": "0x80802",
> > "uflags": "0x0",
> > "backlog": 0,
> > "peer": 0,
> > "fown": {
> > "uid": 0,
> > "euid": 0,
> > "signum": 0,
> > "pid_type": 0,
> > "pid": 0
> > },
> > "opts": {
> > "so_sndbuf": 212992,
> > "so_rcvbuf": 16777216,
> > "so_snd_tmo_sec": 0,
> > "so_snd_tmo_usec": 0,
> > "so_rcv_tmo_sec": 0,
> > "so_rcv_tmo_usec": 0,
> > "reuseaddr": true,
> > "so_priority": 0,
> > "so_rcvlowat": 1,
> > "so_mark": 0,
> > "so_passcred": true,
> > "so_passsec": true,
> > "so_dontroute": false,
> > "so_no_check": false
> > },
> > "name": "L3J1bi9zeXN0ZW1kL2pvdXJuYWwvc29ja2V0AA==\n",
> > "file_perms": {
> > "mode": 49590,
> > "uid": 0,
> > "gid": 0
> > }
> > },
> >
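For reference, a few fields of the entry above can be decoded by hand. This is a minimal Python sketch, assuming that "name" holds the base64-encoded bound path and that "mode" is an ordinary st_mode value; decode_unixsk_entry is a hypothetical helper, not part of CRIU.

```python
import base64
import stat

def decode_unixsk_entry(entry):
    """Decode a few fields of a unixsk.img entry (hypothetical helper, not CRIU code)."""
    # The name is base64 text with a trailing newline; the decoded path is NUL-terminated.
    raw = base64.b64decode(entry["name"].strip())
    mode = entry["file_perms"]["mode"]
    return {
        "path": raw.rstrip(b"\0").decode(),
        "is_socket": stat.S_ISSOCK(mode),      # file type bits of st_mode
        "perms": oct(stat.S_IMODE(mode)),      # permission bits only
    }

# Only the fields used here are copied from the quoted image entry.
entry = {
    "name": "L3J1bi9zeXN0ZW1kL2pvdXJuYWwvc29ja2V0AA==\n",
    "file_perms": {"mode": 49590},
}
print(decode_unixsk_entry(entry))
```

The name decodes to /run/systemd/journal/socket, and the mode is a socket file with permissions 0666.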
> > Based on my read of the criu/kernel sources, we're setting
> > sk->sk_sndbuf, while the kernel source in unix_dgram_sendmsg is checking
> > sk->sk_receive_queue for sending EAGAIN; since we pass the backlog of
> > 0 to listen() (viz. above), I guess this is what's causing the
> > problem? Seems like a backlog of 0 must be wrong, since the queue is
> > of non-zero length.
>
> Wait a second, if you send packets to socket it cannot be in listen
> state. Moreover, according to your image the state of the socket is
> 7 which is CLOSED :)

Hrm. So perhaps we should not be sending stuff to this socket at all?
Should we add a check for TCP_CLOSE to guard the call to
dump_sk_queue?

Tycho
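
The point about the receive queue can be checked without CRIU. This is a minimal sketch, assuming Linux and Python 3; fill_dgram_queue is a made-up helper, not anything from CRIU. It sends one-byte datagrams from a non-blocking sender to a bound SOCK_DGRAM socket that never reads, and reports how many went through before the send failed and with which errno.

```python
import errno
import os
import socket
import tempfile

def fill_dgram_queue(payload=b"x", limit=100000):
    """Send datagrams to a bound receiver that never reads.

    Returns (datagrams_sent, errno_of_first_failure_or_None).
    Hypothetical helper for illustration only.
    """
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "rx.sock")
        rx = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
        tx = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
        try:
            rx.bind(path)
            # Non-blocking, so a full queue yields an error instead of a hang.
            tx.setblocking(False)
            sent, err = 0, None
            while sent < limit:
                try:
                    tx.sendto(payload, path)
                except OSError as e:
                    err = e.errno
                    break
                sent += 1
            return sent, err
        finally:
            rx.close()
            tx.close()

sent, err = fill_dgram_queue()
print(sent, errno.errorcode.get(err))
```

If the count is small compared to so_sndbuf (212992 in the image), the limit being hit is the receiver's queue length rather than the send buffer. On Linux that length is, as far as I can tell, governed by the net.unix.max_dgram_qlen sysctl.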