[CRIU] Error (sk-inet.c:202): Name resolved on unconnected socket

Pavel Emelyanov xemul at virtuozzo.com
Thu Jul 14 04:08:23 PDT 2016


On 07/13/2016 07:10 PM, Adrian Reber wrote:
> On Wed, Jul 13, 2016 at 05:48:16PM +0300, Pavel Emelyanov wrote:
>> On 07/13/2016 04:59 PM, Adrian Reber wrote:
>>> On Wed, Jul 13, 2016 at 03:54:58PM +0300, Pavel Emelyanov wrote:
>>>> On 07/05/2016 04:33 PM, Adrian Reber wrote:
>>>>> Now that CRIU can drop in-flight connections I get much better results
>>>>> checkpointing and restarting my test container while running ab against
>>>>> the tomcat server in the container: ab -n 1000000 -c 20
>>>>>
>>>>> I am still testing with LXC using lxc-checkpoint and once every 100 test
>>>>> runs I get the following error:
>>>>>
>>>>> (00.232915) 6693 fdinfo 10: pos:                0 flags:          2000002/0x1
>>>>> (00.232917) fdinfo: type: 0x5 flags: 02000002/01 pos:        0 fd: 10
>>>>> (00.232923) 6693 fdinfo 11: pos:                0 flags:                2/0
>>>>> (00.232926)     Searching for socket 52f7d (family 2.6)
>>>>> (00.232928) Error (sk-inet.c:202): Name resolved on unconnected socket
>>>>> (00.232929) ----------------------------------------
>>>>> (00.232939) Error (cr-dump.c:1323): Dump files (pid: 6693) failed with -1
>>>>> (00.232954) Waiting for 6693 to trap
>>>>> (00.232961) Daemon 6693 exited trapping
>>>>>
>>>>> Can this also be solved in a way like the dropping of in-flight connections?
>>>>
>>>> This is a socket that turned into the connected state while we were locking
>>>> the network, isn't it?
>>>
>>> I have no idea, that's why I am asking ;-) But it could be exactly this.
>>>
>>> This sounds like something that cannot be just dropped, because an
>>> established connection would then be missing during restore. If I
>>> understand it correctly the only option is to retry later, right?
>>
>> No, I'd say the way to go is to check that the socket's state is established
>> and dump it. It looks like we should first lock the network and only _after_
>> that collect the sockets via diag, not vice versa.
> 
> It sounds great if this is solvable, but I think I do not fully
> understand it. If I look at the log the network seems to get locked
> pretty early:
> 
> http://lisas.de/~adrian/dump.log-name-resolved-on-unconnected-socket

Yes, you're right. As Andrey points out, network_lock() is called before
collect_namespaces(). So we need to get more info about this socket.
In particular, gen_uncon_sk() reads more data about the socket, and
info.tcpi_state is the most interesting part for us. What is it?
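
In case it helps to see what that field looks like, below is a minimal
sketch (Python, Linux only, not CRIU code) that reads tcpi_state from a
socket via getsockopt(TCP_INFO). The helper name tcp_state and the
TCP_STATES table are made up for illustration; the numbers follow the
kernel's include/net/tcp_states.h.

```python
import socket

# State numbers as defined in the kernel's include/net/tcp_states.h.
TCP_STATES = {
    1: "TCP_ESTABLISHED",
    2: "TCP_SYN_SENT",
    3: "TCP_SYN_RECV",
    4: "TCP_FIN_WAIT1",
    5: "TCP_FIN_WAIT2",
    6: "TCP_TIME_WAIT",
    7: "TCP_CLOSE",
    8: "TCP_CLOSE_WAIT",
    9: "TCP_LAST_ACK",
    10: "TCP_LISTEN",
    11: "TCP_CLOSING",
}


def tcp_state(sock):
    """Return the numeric tcpi_state of a TCP socket (hypothetical helper)."""
    # tcpi_state is the first byte of struct tcp_info, so a short buffer
    # is enough; the kernel copies only as many bytes as requested.
    raw = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_INFO, 8)
    return raw[0]


def tcp_state_name(sock):
    """Return the symbolic name of the socket's TCP state."""
    state = tcp_state(sock)
    return TCP_STATES.get(state, "UNKNOWN(%d)" % state)
```

A freshly created socket should report TCP_CLOSE (7) and a listening one
TCP_LISTEN (10). A value of TCP_ESTABLISHED (1) on the socket from the log
would fit the guess that it became connected while the dump was running.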

> If you can give me some hints where this needs to be fixed I can try to
> come up with a patch, but right now I am still lost ;-)
> 
> 		Adrian


