[CRIU] Error (sk-inet.c:202): Name resolved on unconnected socket

Andrew Vagin avagin at virtuozzo.com
Fri Jul 15 21:43:33 PDT 2016


On Thu, Jul 14, 2016 at 02:08:23PM +0300, Pavel Emelyanov wrote:
> On 07/13/2016 07:10 PM, Adrian Reber wrote:
> > On Wed, Jul 13, 2016 at 05:48:16PM +0300, Pavel Emelyanov wrote:
> >> On 07/13/2016 04:59 PM, Adrian Reber wrote:
> >>> On Wed, Jul 13, 2016 at 03:54:58PM +0300, Pavel Emelyanov wrote:
> >>>> On 07/05/2016 04:33 PM, Adrian Reber wrote:
> >>>>> Now that CRIU can drop in-flight connections I get much better results
> >>>>> checkpointing and restarting my test container while running ab against
> >>>>> the tomcat server in the container: ab -n 1000000 -c 20
> >>>>>
> >>>>> I am still testing with LXC using lxc-checkpoint and once every 100 test
> >>>>> runs I get the following error:
> >>>>>
> >>>>> (00.232915) 6693 fdinfo 10: pos:                0 flags:          2000002/0x1
> >>>>> (00.232917) fdinfo: type: 0x5 flags: 02000002/01 pos:        0 fd: 10
> >>>>> (00.232923) 6693 fdinfo 11: pos:                0 flags:                2/0
> >>>>> (00.232926)     Searching for socket 52f7d (family 2.6)
> >>>>> (00.232928) Error (sk-inet.c:202): Name resolved on unconnected socket
> >>>>> (00.232929) ----------------------------------------
> >>>>> (00.232939) Error (cr-dump.c:1323): Dump files (pid: 6693) failed with -1
> >>>>> (00.232954) Waiting for 6693 to trap
> >>>>> (00.232961) Daemon 6693 exited trapping
> >>>>>
> >>>>> Can this also be solved in a way like the dropping of in-flight connections?
> >>>>
> >>>> This is a socket that turned into the connected state while we were locking
> >>>> the network, isn't it?
> >>>
> >>> I have no idea, that's why I am asking ;-) But it could be exactly this.
> >>>
> >>> This sounds like something that cannot be just dropped, because an
> >>> established connection would then be missing during restore. If I
> >>> understand it correctly the only option is to retry later, right?
> >>
> >> No, I'd say the way to go is to check that the socket's state is established
> >> and dump it. It looks like we should first lock the network and only _after_
> >> that collect the sockets via diag, not vice versa.
> > 
> > It sounds great if this is solvable, but I think I do not fully
> > understand it. If I look at the log the network seems to get locked
> > pretty early:
> > 
> > http://lisas.de/~adrian/dump.log-name-resolved-on-unconnected-socket
> 
> Yes, you're right. As Andrey points out, network_lock() is called before
> collect_namespaces(). Then we need to get more info about this socket.
> In particular, gen_uncon_sk() reads more data about the socket, and
> info.tcpi_state is the most interesting part for us. What is it?
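
For anyone who wants to look at tcpi_state by hand, here is a minimal
sketch that reads it from a live socket with getsockopt(TCP_INFO). It is
Python rather than CRIU's C and is not the code from sk-inet.c. The helper
names tcp_state() and state_name() are hypothetical; the numeric values are
the ones from the kernel's include/net/tcp_states.h. Linux only.

```python
import socket

# TCP state values as defined in the kernel's include/net/tcp_states.h.
TCP_ESTABLISHED = 1
TCP_CLOSE = 7
TCP_LISTEN = 10

_NAMES = {
    TCP_ESTABLISHED: "TCP_ESTABLISHED",
    TCP_CLOSE: "TCP_CLOSE",
    TCP_LISTEN: "TCP_LISTEN",
}


def tcp_state(sock):
    """Return tcpi_state for a TCP socket (hypothetical helper).

    tcpi_state is the first byte of struct tcp_info, so only the start
    of the buffer returned by getsockopt() is inspected.
    """
    info = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_INFO, 104)
    return info[0]


def state_name(state):
    """Map a numeric state to a readable name, falling back to the number."""
    return _NAMES.get(state, str(state))
```

A freshly created TCP socket should report TCP_CLOSE, a listening one
TCP_LISTEN, and a connected one TCP_ESTABLISHED, e.g.
state_name(tcp_state(socket.socket())).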

https://ci.openvz.org/job/CRIU/job/CRIU-x86_64-dedup/branch/criu-dev/570/console
Start test
./socket-closed-tcp --pidfile=socket-closed-tcp.pid --outfile=socket-closed-tcp.out
Run criu pre-dump
Run criu pre-dump
Run criu dump
=[log]=> dump/zdtm/static/socket-closed-tcp/24/3/dump.log
------------------------ grep Error ------------------------
(00.053676) Error (sk-inet.c:202): Name resolved on unconnected socket
(00.053786) Error (cr-dump.c:1322): Dump files (pid: 24) failed with -1
(00.056076) Error (cr-dump.c:1633): Dumping FAILED.
------------------------ ERROR OVER ------------------------

Is it the same error?

> 
> > If you can give me some hints where this needs to be fixed I can try to
> > come up with a patch, but right now I am still lost ;-)
> > 
> > 		Adrian
> > .
> > 