[CRIU] Dump failure

Pavel Emelyanov xemul at parallels.com
Fri Jul 17 05:09:18 PDT 2015


On 07/17/2015 02:33 AM, Ross Boucher wrote:
> I can't reproduce it reliably, but it seems to happen about 10% of the time in my setup
> (though, there isn't a ton of data at this point). I can try to gather some more information 
> for you if that would be helpful.

Yes, please. And I'll try to think about what kind of debug output would be useful for it.

> On Thu, Jul 16, 2015 at 2:30 AM, Pavel Emelyanov <xemul at parallels.com> wrote:
> 
>     On 07/16/2015 02:56 AM, Ross Boucher wrote:
>     > I got this failure today when checkpointing a container in my system:
>     >
>     > https://gist.github.com/boucher/ac5ac25c358e5a24665b
>     >
>     > Any idea what the cause might be?
> 
>     Yup
> 
>     Error (sk-inet.c:188): Name resolved on unconnected socket
> 
>     We see reports about this from time to time. The error means that there is
>     a socket in the system that is owned by a process (via an fd), but when all
>     the sockets were collected via the sock-diag API (in collect_sockets) this
>     particular one was _not_ there. This should happen only if the socket was
>     freshly created with a socket() call and is not yet bound or connected. The
>     get_unconn_sk() function is called for such sockets -- found via some task's
>     fd, but not found in the diag output. We check that the socket in question is
>     truly unbound and unconnected, but in your case the check fails.
> 
>     That's the best guess we have, but we cannot verify it, since this situation
>     occurs rarely. Do you know how to reproduce it more or less reliably?
> 
>     -- Pavel
> 
> 
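
For reference, the socket state described in the quoted explanation (created
with socket(), not yet bound or connected) can be looked at from userspace.
This is only a rough sketch and not CRIU code: the helper names are made up for
illustration, and /proc/net/tcp is used here as a stand-in for the sock-diag
dump that collect_sockets performs. It assumes Linux and an IPv4 TCP socket.

```python
import os
import socket


def socket_inode(sock):
    """Return the inode number of the socket behind this fd."""
    return os.fstat(sock.fileno()).st_ino


def listed_inodes(path="/proc/net/tcp"):
    """Collect the inode column of the kernel's IPv4 TCP socket table.

    Assumption: this table is only a stand-in for a sock-diag dump.
    Returns an empty set if the file is not available.
    """
    inodes = set()
    try:
        with open(path) as table:
            next(table, None)  # skip the header line
            for line in table:
                fields = line.split()
                if len(fields) > 9:
                    inodes.add(int(fields[9]))  # 10th column is the inode
    except OSError:
        pass
    return inodes


def is_unbound_unconnected(sock):
    """Approximate the 'truly unbound and unconnected' check.

    Hypothetical helper: no local port assigned and no peer address.
    """
    local_port = sock.getsockname()[1]
    try:
        sock.getpeername()
        connected = True
    except OSError:  # ENOTCONN on an unconnected socket
        connected = False
    return local_port == 0 and not connected


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    print("inode:", socket_inode(s))
    print("in /proc/net/tcp:", socket_inode(s) in listed_inodes())
    print("unbound and unconnected:", is_unbound_unconnected(s))
    s.close()
```

A freshly created socket is reachable through the process's fd but is not
expected to show up in the table. If a socket is missing from the table and the
last check still reports False, that would resemble the failing case above.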


