[CRIU] Checkpoint fails intermittently

Frederico Araujo araujof at gmail.com
Tue Oct 7 12:42:06 PDT 2014


Thanks a lot, Pavel!
I think this was the problem :)

Thanks for the great work on CRIU!

Fred



On Mon, Oct 6, 2014 at 6:24 AM, Pavel Emelyanov <xemul at parallels.com> wrote:

> On 10/06/2014 02:12 AM, Frederico Araujo wrote:
> > Dear list,
> >
> > I've noticed that checkpoint sometimes fails with the following error
> > (please see attached log):
> >
> > Error (sk-tcp.c:66): Can't turn TCP repair mode ON: Operation not permitted
> >
> > However, I'm running criu with sudo. Could it be that the file
> > descriptor for the socket being dumped is set as RO or immutable?
> > In which cases would this error occur?
>
> It also happens if the socket we're turning repair on for is not in the
> closed/established state. Since this error does not happen every time,
> I think I know why it happens.
>
> The TCP socket dump goes in 3 steps
>
> 1. collect
>
>   Here we talk to the kernel's tcp_diag subsystem to fetch info about TCP
>   sockets. At this stage we get the state the socket is in, and the next
>   two steps only occur if the state is closed/established.
>
> 2. lock
>
>   Here we block the packets with netfilter
>
> 3. dump
>
>   Here we turn repair on and write the socket state into the image file
>
>
> So, if a FIN packet arrives between steps 1 and 2, the socket moves into
> one of the closing states and we fail to turn repair mode on for it.
>
> Thanks,
> Pavel
>
>
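
For anyone reading this thread later, the race Pavel describes can be
illustrated with a small toy model. This is only a sketch under stated
assumptions, not CRIU's code: ToySocket, collect, lock, dump, checkpoint and
the fin_at parameter are all hypothetical names made up for illustration. The
only behaviour taken from the thread is that repair mode is refused with
"Operation not permitted" unless the socket is closed or established.

```python
import errno
from dataclasses import dataclass

# States in which repair mode may be turned on, per the explanation above.
REPAIRABLE = {"CLOSE", "ESTABLISHED"}


@dataclass
class ToySocket:
    """Hypothetical stand-in for a TCP socket; not a real kernel object."""
    state: str = "ESTABLISHED"
    locked: bool = False  # True once packets are blocked (step 2)

    def receive_fin(self):
        # A FIN from the peer moves an established socket into a closing
        # state, unless packets are already being dropped.
        if not self.locked and self.state == "ESTABLISHED":
            self.state = "CLOSE_WAIT"


def collect(sock):
    """Step 1: look at the socket state and decide whether to dump it."""
    return sock.state in REPAIRABLE


def lock(sock):
    """Step 2: block packets for this connection."""
    sock.locked = True


def dump(sock):
    """Step 3: turn repair mode on and read out the socket state."""
    if sock.state not in REPAIRABLE:
        raise PermissionError(errno.EPERM, "Can't turn TCP repair mode ON")
    return {"state": sock.state}


def checkpoint(sock, fin_at=None):
    """Run the three steps; fin_at picks when the toy FIN arrives."""
    if not collect(sock):
        return None
    if fin_at == "before_lock":
        sock.receive_fin()  # the race window between steps 1 and 2
    lock(sock)
    if fin_at == "after_lock":
        sock.receive_fin()  # dropped, the socket is already locked
    return dump(sock)
```

With this model, checkpoint(ToySocket()) and
checkpoint(ToySocket(), fin_at="after_lock") both succeed, while
checkpoint(ToySocket(), fin_at="before_lock") raises PermissionError. That
matches the intermittent failure: it only shows up when the FIN lands in the
window between collecting the state and locking the connection.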