[CRIU] Error (sk-inet.c:202): Name resolved on unconnected socket

Adrian Reber adrian at lisas.de
Wed Jul 13 09:10:49 PDT 2016


On Wed, Jul 13, 2016 at 05:48:16PM +0300, Pavel Emelyanov wrote:
> On 07/13/2016 04:59 PM, Adrian Reber wrote:
> > On Wed, Jul 13, 2016 at 03:54:58PM +0300, Pavel Emelyanov wrote:
> >> On 07/05/2016 04:33 PM, Adrian Reber wrote:
> >>> Now that CRIU can drop in-flight connections I get much better results
> >>> checkpointing and restarting my test container while running ab against
> >>> the tomcat server in the container: ab -n 1000000 -c 20
> >>>
> >>> I am still testing with LXC using lxc-checkpoint and once every 100 test
> >>> runs I get the following error:
> >>>
> >>> (00.232915) 6693 fdinfo 10: pos:                0 flags:          2000002/0x1
> >>> (00.232917) fdinfo: type: 0x5 flags: 02000002/01 pos:        0 fd: 10
> >>> (00.232923) 6693 fdinfo 11: pos:                0 flags:                2/0
> >>> (00.232926)     Searching for socket 52f7d (family 2.6)
> >>> (00.232928) Error (sk-inet.c:202): Name resolved on unconnected socket
> >>> (00.232929) ----------------------------------------
> >>> (00.232939) Error (cr-dump.c:1323): Dump files (pid: 6693) failed with -1
> >>> (00.232954) Waiting for 6693 to trap
> >>> (00.232961) Daemon 6693 exited trapping
> >>>
> >>> Can this also be solved in a way like the dropping of in-flight connections?
> >>
> >> This is a socket that turned into the connected state while we were
> >> locking the network, isn't it?
> > 
> > I have no idea, that's why I am asking ;-) But it could be exactly this.
> > 
> > This sounds like something that cannot just be dropped, because an
> > established connection would then be missing during restore. If I
> > understand it correctly, the only option is to retry later, right?
> 
> No, I'd say the way to go is to check that the socket's state is established
> and dump it. It looks like we should first lock the network and only _after_
> that collect the sockets via diag, not vice versa.

It sounds great if this is solvable, but I do not think I fully
understand it. Looking at the log, the network seems to get locked
pretty early:

http://lisas.de/~adrian/dump.log-name-resolved-on-unconnected-socket
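
To check that I read the suggestion correctly, here is how I picture the
race, as a toy Python model. None of this is CRIU code: Sock, Net,
stale_entries and the two state names are made up for illustration, and
collect() is only a stand-in for collecting sockets via diag. The inode
is the one from the log above.

```python
from dataclasses import dataclass


@dataclass
class Sock:
    ino: int
    state: str  # simplified: "UNCONNECTED" or "ESTABLISHED"


class Net:
    """Toy network: sockets can only change state while it is unlocked."""

    def __init__(self, socks):
        self.socks = {s.ino: s for s in socks}
        self.locked = False

    def lock(self):
        self.locked = True

    def handshake_completes(self, ino):
        # Models a packet that connects the socket; a locked network
        # drops it, so the state stays as it was.
        if not self.locked:
            self.socks[ino].state = "ESTABLISHED"

    def collect(self):
        # Hypothetical stand-in for the diag collection: a snapshot of
        # every socket's state at this moment.
        return {ino: s.state for ino, s in self.socks.items()}


def stale_entries(net, snapshot):
    """Inodes whose live state no longer matches the collected snapshot."""
    return [ino for ino, st in snapshot.items()
            if net.socks[ino].state != st]


def collect_then_lock():
    net = Net([Sock(0x52f7d, "UNCONNECTED")])
    snap = net.collect()
    net.handshake_completes(0x52f7d)  # race window before the lock
    net.lock()
    return stale_entries(net, snap)


def lock_then_collect():
    net = Net([Sock(0x52f7d, "UNCONNECTED")])
    net.lock()
    net.handshake_completes(0x52f7d)  # dropped, network already locked
    snap = net.collect()
    return stale_entries(net, snap)
```

In this model only collect_then_lock() ends up with a socket whose
collected state disagrees with what the dump sees later, which is how I
understand the error. Whether the real code has such a window is
exactly what I cannot tell from the log.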

If you can give me some hints on where this needs to be fixed, I can try to
come up with a patch, but right now I am still lost ;-)

		Adrian

