[CRIU] problem restoring unix queues?
Pavel Emelyanov
xemul at parallels.com
Mon Jul 20 07:43:38 PDT 2015
On 07/20/2015 05:28 PM, Tycho Andersen wrote:
> On Mon, Jul 20, 2015 at 02:13:52PM +0300, Pavel Emelyanov wrote:
>> On 07/18/2015 02:17 AM, Tycho Andersen wrote:
>>> On Fri, Jul 17, 2015 at 07:13:55PM +0300, Pavel Emelyanov wrote:
>>>> On 07/17/2015 05:33 PM, Tycho Andersen wrote:
>>>>> Hi all,
>>>>>
>>>>> Sometimes I see something like:
>>>>>
>>>>> (00.095976) 77: Error (sk-queue.c:238): Failed to send packet: Resource temporarily unavailable
>>>>>
>>>>> when restoring (full log here: http://paste.ubuntu.com/11893011/).
>>>>>
>>>>> The sk-queues.img is: http://paste.ubuntu.com/11893015/ so I don't
>>>>> /think/ it should be filling the buffer, so I'm not sure why we'd get
>>>>> EAGAIN.
>>>>>
>>>>> Thoughts?
>>>>
>>>> Is it a stream or a datagram socket? AFAIK EAGAIN is only reported when
>>>> you hit the socket buffer limit, but we try to raise that limit :\
>>>
>>> Assuming the "id" is the same as "id_for" above, it looks like it is
>>> SOCK_DGRAM (output from unixsk.img):
>>>
>>> {
>>>   "id": 14,
>>>   "ino": 16798503,
>>>   "type": 2, # SOCK_DGRAM == 2
>>>   "state": 7,
>>>   "flags": "0x80802",
>>>   "uflags": "0x0",
>>>   "backlog": 0,
>>>   "peer": 0,
>>>   "fown": {
>>>     "uid": 0,
>>>     "euid": 0,
>>>     "signum": 0,
>>>     "pid_type": 0,
>>>     "pid": 0
>>>   },
>>>   "opts": {
>>>     "so_sndbuf": 212992,
>>>     "so_rcvbuf": 16777216,
>>>     "so_snd_tmo_sec": 0,
>>>     "so_snd_tmo_usec": 0,
>>>     "so_rcv_tmo_sec": 0,
>>>     "so_rcv_tmo_usec": 0,
>>>     "reuseaddr": true,
>>>     "so_priority": 0,
>>>     "so_rcvlowat": 1,
>>>     "so_mark": 0,
>>>     "so_passcred": true,
>>>     "so_passsec": true,
>>>     "so_dontroute": false,
>>>     "so_no_check": false
>>>   },
>>>   "name": "L3J1bi9zeXN0ZW1kL2pvdXJuYWwvc29ja2V0AA==\n",
>>>   "file_perms": {
>>>     "mode": 49590,
>>>     "uid": 0,
>>>     "gid": 0
>>>   }
>>> },
>>>
>>> Based on my reading of the criu/kernel sources, we're setting
>>> sk->sk_sndbuf, while the kernel code in unix_dgram_sendmsg checks
>>> sk->sk_receive_queue to decide whether to return EAGAIN; since we pass
>>> the backlog of 0 to listen() (viz. above), I guess this is what's
>>> causing the problem? Seems like a backlog of 0 must be wrong, since the
>>> queue is of non-zero length.
>>
>> Wait a second, if you send packets to a socket, it cannot be in the
>> listen state. Moreover, according to your image the state of the socket
>> is 7, which is CLOSED :)
>
> Hrm. So perhaps we should not be sending stuff to this socket at all?
> Should we add a check for TCP_CLOSE to guard the call to
> dump_sk_queue?
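
For reference, the fields of the quoted unixsk.img entry that matter to this discussion can be decoded with a few lines of Python. This is only a sketch for reading the image by hand, not part of CRIU: the `entry` dict is a hand-copied subset of the quoted output, and the state table holds just the one value discussed here (7 is TCP_CLOSE in the kernel's include/net/tcp_states.h).

```python
import base64
import socket
import stat

# Hand-copied subset of the unixsk.img entry quoted above.
entry = {
    "type": 2,
    "state": 7,
    "name": "L3J1bi9zeXN0ZW1kL2pvdXJuYWwvc29ja2V0AA==\n",
    "mode": 49590,
}

# Only the state seen in this thread; value taken from the kernel's
# include/net/tcp_states.h.
STATE_NAMES = {7: "TCP_CLOSE"}

# The bound name is base64-encoded and carries a trailing NUL byte.
path = base64.b64decode(entry["name"]).rstrip(b"\0").decode()
sock_type = socket.SocketKind(entry["type"]).name
state = STATE_NAMES[entry["state"]]
perms = stat.filemode(entry["mode"])

print(path, sock_type, state, perms)
```

So the entry describes a bound, unconnected datagram socket, which is consistent with the state being 7 rather than a listening state.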

We dump only the read queue, which can still hold data for closed
sockets. E.g. the test zdtm/live/static/socket02.c does exactly that.

-- Pavel
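
The two behaviours discussed in this thread can be reproduced outside CRIU with a small Python sketch: a bound but unconnected datagram socket accumulates packets in its read queue, and a non-blocking sender eventually gets EAGAIN. The helper name `fill_unconnected_dgram_socket` and the `rx.sock` path are made up for illustration. Which limit trips first (the receiver's queue length or the sender's send buffer) depends on the kernel version and sysctl settings, and the sketch does not try to tell them apart.

```python
import errno
import os
import socket
import tempfile


def fill_unconnected_dgram_socket(max_packets=100000):
    """Send 1-byte datagrams to a bound, unconnected unix socket until
    the kernel refuses one, then read the first queued packet back."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "rx.sock")  # hypothetical socket path
        rx = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
        tx = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
        try:
            rx.bind(path)          # bound, never connected, never listening
            tx.setblocking(False)  # report EAGAIN instead of blocking
            sent, err = 0, None
            for _ in range(max_packets):
                try:
                    tx.sendto(b"x", path)
                    sent += 1
                except OSError as e:
                    err = e.errno
                    break
            # The receiver never called recv() so far, yet its queue
            # holds the packets that were accepted.
            first = rx.recv(16)
            return sent, err, first
        finally:
            tx.close()
            rx.close()


if __name__ == "__main__":
    sent, err, first = fill_unconnected_dgram_socket()
    print("accepted:", sent, "errno:", errno.errorcode.get(err), "first:", first)
```

The number of accepted packets varies between systems, so the sketch prints it rather than assuming a value.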