[CRIU] problem restoring unix queues?

Mon Jul 20 07:43:38 PDT 2015

On 07/20/2015 05:28 PM, Tycho Andersen wrote:
> On Mon, Jul 20, 2015 at 02:13:52PM +0300, Pavel Emelyanov wrote:
>> On 07/18/2015 02:17 AM, Tycho Andersen wrote:
>>> On Fri, Jul 17, 2015 at 07:13:55PM +0300, Pavel Emelyanov wrote:
>>>> On 07/17/2015 05:33 PM, Tycho Andersen wrote:
>>>>> Hi all,
>>>>>
>>>>> Sometimes I see something like:
>>>>>
>>>>> (00.095976)     77: Error (sk-queue.c:238): Failed to send packet: Resource temporarily unavailable
>>>>>
>>>>> when restoring (full log here: http://paste.ubuntu.com/11893011/).
>>>>>
>>>>> The sk-queues.img is: http://paste.ubuntu.com/11893015/ so I don't
>>>>> /think/ it should be filling the buffer, so I'm not sure why we'd get
>>>>> EAGAIN.
>>>>>
>>>>> Thoughts?
>>>>
>>>> Is it a stream or datagram socket? AFAIK EAGAIN is only reported when
>>>> you hit the socket buffer limit, but we try to raise that one :\
>>>
>>> Assuming the "id" is the same as "id_for" above, it looks like it is
>>> SOCK_DGRAM (output from unixsk.img):
>>>
>>>         {
>>>             "id": 14, 
>>>             "ino": 16798503, 
>>>             "type": 2, # SOCK_DGRAM == 2
>>>             "state": 7, 
>>>             "flags": "0x80802", 
>>>             "uflags": "0x0", 
>>>             "backlog": 0, 
>>>             "peer": 0, 
>>>             "fown": {
>>>                 "uid": 0, 
>>>                 "euid": 0, 
>>>                 "signum": 0, 
>>>                 "pid_type": 0, 
>>>                 "pid": 0
>>>             }, 
>>>             "opts": {
>>>                 "so_sndbuf": 212992, 
>>>                 "so_rcvbuf": 16777216, 
>>>                 "so_snd_tmo_sec": 0, 
>>>                 "so_snd_tmo_usec": 0, 
>>>                 "so_rcv_tmo_sec": 0, 
>>>                 "so_rcv_tmo_usec": 0, 
>>>                 "reuseaddr": true, 
>>>                 "so_priority": 0, 
>>>                 "so_rcvlowat": 1, 
>>>                 "so_mark": 0, 
>>>                 "so_passcred": true, 
>>>                 "so_passsec": true, 
>>>                 "so_dontroute": false, 
>>>                 "so_no_check": false
>>>             }, 
>>>             "name": "L3J1bi9zeXN0ZW1kL2pvdXJuYWwvc29ja2V0AA==\n", 
>>>             "file_perms": {
>>>                 "mode": 49590, 
>>>                 "uid": 0, 
>>>                 "gid": 0
>>>             }
>>>         }, 
>>>
>>> Based on my read of the criu/kernel sources, we're setting
>>> sk->sk_sndbuf, while the kernel source in unix_dgram_sendmsg is checking
>>> sk->sk_receive_queue for sending EAGAIN; since we pass the backlog of
>>> 0 to listen() (viz. above), I guess this is what's causing the
>>> problem? Seems like a backlog of 0 must be wrong, since the queue is
>>> of non-zero length.
>>
>> Wait a second, if you send packets to socket it cannot be in listen
>> state. Moreover, according to your image the state of the socket is
>> 7 which is CLOSED :)
> 
> Hrm. So perhaps we should not be sending stuff to this socket at all?
> Should we add a check for TCP_CLOSE to guard the call to
> dump_sk_queue?

We dump only the read queue, which can still hold data for closed
sockets. E.g. the test zdtm/live/static/socket02.c does exactly that.
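
To illustrate the point about read queues outliving the other end, here is a
minimal Python sketch. It is not taken from socket02.c and makes no claim about
what that test does internally; the function name drain_after_peer_close is
hypothetical. It assumes Linux and uses a datagram socketpair rather than a
bound socket: one datagram is queued, the sender is closed, and the receiver
can still read what was queued.

```python
import socket


def drain_after_peer_close(payload=b"queued before close"):
    """Queue one datagram, close the sender, then read from the receiver."""
    rx, tx = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
    try:
        tx.send(payload)   # lands in rx's receive queue
        tx.close()         # the sending end goes away...
        return rx.recv(4096)  # ...but the queued datagram is still readable
    finally:
        rx.close()


if __name__ == "__main__":
    print(drain_after_peer_close())
```

This is why a dump cannot simply skip the queue based on the socket's state:
unread data may still be sitting there and has to be restored.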

-- Pavel
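
The EAGAIN discussed in this thread can be reproduced outside CRIU. The sketch
below is an illustration under assumptions, not CRIU code: it assumes Linux,
and count_sends_until_eagain is a hypothetical helper name. It sends 1-byte
datagrams over a non-blocking AF_UNIX SOCK_DGRAM socketpair that nobody reads
from, and reports how many sends succeed before the kernel refuses one. On
Linux the receiver's queue length is limited per datagram (see the
net.unix.max_dgram_qlen sysctl), so the failure can arrive after only a small
number of bytes, long before a large SO_SNDBUF would matter.

```python
import errno
import socket


def count_sends_until_eagain(payload=b"x", limit=100_000):
    """Return (successful_sends, errno) for a non-blocking dgram socketpair.

    errno is None if `limit` sends succeeded without the kernel refusing one.
    """
    rx, tx = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
    try:
        tx.setblocking(False)  # fail immediately instead of sleeping
        sent = 0
        while sent < limit:
            try:
                tx.send(payload)
            except BlockingIOError as exc:  # EAGAIN / EWOULDBLOCK
                return sent, exc.errno
            sent += 1
        return sent, None
    finally:
        rx.close()
        tx.close()


if __name__ == "__main__":
    sent, err = count_sends_until_eagain()
    name = errno.errorcode.get(err, "none")
    print(f"{sent} datagrams queued before failure ({name})")
```

The exact count depends on the kernel and its sysctl settings, so none is
given here. Comparing the count against the number of packets recorded in
sk-queues.img for one socket is one way to check whether a restore could hit
this limit.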