[CRIU] Error (sk-inet.c:202): Name resolved on unconnected socket

Adrian Reber adrian at lisas.de
Thu Jul 21 07:19:49 PDT 2016


On Fri, Jul 15, 2016 at 09:43:33PM -0700, Andrew Vagin wrote:
> On Thu, Jul 14, 2016 at 02:08:23PM +0300, Pavel Emelyanov wrote:
> > On 07/13/2016 07:10 PM, Adrian Reber wrote:
> > > On Wed, Jul 13, 2016 at 05:48:16PM +0300, Pavel Emelyanov wrote:
> > >> On 07/13/2016 04:59 PM, Adrian Reber wrote:
> > >>> On Wed, Jul 13, 2016 at 03:54:58PM +0300, Pavel Emelyanov wrote:
> > >>>> On 07/05/2016 04:33 PM, Adrian Reber wrote:
> > >>>>> Now that CRIU can drop in-flight connections I get much better results
> > >>>>> checkpointing and restarting my test container while running ab against
> > >>>>> the tomcat server in the container: ab -n 1000000 -c 20
> > >>>>>
> > >>>>> I am still testing with LXC using lxc-checkpoint and once every 100 test
> > >>>>> runs I get the following error:
> > >>>>>
> > >>>>> (00.232915) 6693 fdinfo 10: pos:                0 flags:          2000002/0x1
> > >>>>> (00.232917) fdinfo: type: 0x5 flags: 02000002/01 pos:        0 fd: 10
> > >>>>> (00.232923) 6693 fdinfo 11: pos:                0 flags:                2/0
> > >>>>> (00.232926)     Searching for socket 52f7d (family 2.6)
> > >>>>> (00.232928) Error (sk-inet.c:202): Name resolved on unconnected socket
> > >>>>> (00.232929) ----------------------------------------
> > >>>>> (00.232939) Error (cr-dump.c:1323): Dump files (pid: 6693) failed with -1
> > >>>>> (00.232954) Waiting for 6693 to trap
> > >>>>> (00.232961) Daemon 6693 exited trapping
> > >>>>>
> > >>>>> Can this also be solved in a way like the dropping of in-flight connections?
> > >>>>
> > >>>> This is a socket, that turned into connected state while we've been locking
> > >>>> the network, is it?
> > >>>
> > >>> I have no idea, that's why I am asking ;-) But it could be exactly this.
> > >>>
> > >>> This sounds like something that cannot be just dropped, because an
> > >>> established connection would then be missing during restore. If I
> > >>> understand it correctly the only option is to retry later, right?
> > >>
> > >> No, I'd say the way to go is to check the socket's state to be established
> > >> and dump it. It looks like we should first lock the network and only _after_
> > >> it collect the sockets via diag, not the vice-versa.
> > > 
> > > It sounds great if this is solvable, but I think I do not fully
> > > understand it. If I look at the log the network seems to get locked
> > > pretty early:
> > > 
> > > http://lisas.de/~adrian/dump.log-name-resolved-on-unconnected-socket
> > 
> > Yes, you're right. As Andrey points out network_lock() is called before
> > collect_namespaces(). Then we need to get more info about this socket,
> > in particular, the gen_uncon_sk() reads more stuff about socket and
> > the info.tcpi_state is the most interesting for us. What is it?
> 
> https://ci.openvz.org/job/CRIU/job/CRIU-x86_64-dedup/branch/criu-dev/570/console
> Start test
> ./socket-closed-tcp --pidfile=socket-closed-tcp.pid
> --outfile=socket-closed-tcp.out
> Run criu pre-dump
> Run criu pre-dump
> Run criu dump
> =[log]=> dump/zdtm/static/socket-closed-tcp/24/3/dump.log
> ------------------------ grep Error ------------------------
> (00.053676) Error (sk-inet.c:202): Name resolved on unconnected socket
> (00.053786) Error (cr-dump.c:1322): Dump files (pid: 24) failed with -1
> (00.056076) Error (cr-dump.c:1633): Dumping FAILED.
> ------------------------ ERROR OVER ------------------------
> 
> Is it the same error?

This looks like the same error. Do I still need to provide additional
information, or can this now be reproduced some other way?

		Adrian

