[CRIU] Checkpoint fails intermittently
Pavel Emelyanov
xemul at parallels.com
Mon Oct 6 04:24:57 PDT 2014
On 10/06/2014 02:12 AM, Frederico Araujo wrote:
> Dear list,
>
> I've noticed that sometimes checkpoint fails with the following error (please see attached log):
>
> /Error (sk-tcp.c:66): Can't turn TCP repair mode ON: Operation not permitted/
>
> However, I'm executing criu as sudo. Could it be that the file descriptor for the socket being
> dumped is set as RO or immutable? In which cases would this error occur?
It also happens if the socket we're turning repair on for is not in the closed
or established state. Since the error doesn't happen every time, I think I know
why it happens.
The TCP socket dump goes in 3 steps:
1. collect
Here we talk to the kernel's tcp_diag subsystem, fetching the info about TCP sockets.
At this stage we get the state the socket is in, and the next two steps only occur
if the state is closed/established.
2. lock
Here we block the packets with netfilter.
3. dump
Here we turn repair on and save the socket state into the image file.
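To make the "turn repair on" part of step 3 concrete, here is a minimal sketch in Python (CRIU itself is written in C, so this is not CRIU's code). It assumes Linux, where TCP_REPAIR is socket option 19 in <linux/tcp.h>; the helper names try_enable_repair and describe are made up for this illustration.

```python
import errno
import os
import socket

# TCP_REPAIR is option 19 in Linux's <linux/tcp.h>. Python's socket module
# may not export the constant, so fall back to the numeric value.
TCP_REPAIR = getattr(socket, "TCP_REPAIR", 19)


def try_enable_repair(sock):
    """Try to switch `sock` into repair mode. Return 0 on success, else the errno."""
    try:
        sock.setsockopt(socket.IPPROTO_TCP, TCP_REPAIR, 1)
    except OSError as exc:
        return exc.errno
    return 0


def describe(rc):
    """Turn the result of try_enable_repair() into a readable message."""
    if rc == 0:
        return "repair mode ON"
    if rc == errno.EPERM:
        # The same errno covers both causes discussed in this thread.
        return ("Operation not permitted: missing privilege, or the socket "
                "is not in the closed/established state")
    return "unexpected error: " + os.strerror(rc)


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)  # fresh socket, closed state
    print(describe(try_enable_repair(s)))
    s.close()
```

Run without privileges, this is expected to report "Operation not permitted" even though the fresh socket is in the closed state. Run with the needed privilege (CAP_NET_ADMIN, as far as I know), the same message can only come from the socket state, which is the case in the log above.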
So, if a FIN packet arrives between steps 1 and 2, the socket moves into one of
the closing states and we fail to turn repair ON on it.
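The race can be reproduced in a toy model. The sketch below is plain Python with no real sockets; ToySocket, checkpoint and the fin_after parameter are invented for the illustration and are not CRIU names. It only mirrors the three steps and the rule that repair mode needs a closed or established socket.

```python
from enum import Enum


class State(Enum):
    ESTABLISHED = "established"
    CLOSE = "closed"
    CLOSE_WAIT = "close-wait"  # one of the closing states


REPAIRABLE = {State.ESTABLISHED, State.CLOSE}


class ToySocket:
    def __init__(self):
        self.state = State.ESTABLISHED
        self.locked = False  # True once netfilter blocks its packets

    def receive_fin(self):
        # A FIN only reaches the socket while packets are not blocked.
        if not self.locked and self.state is State.ESTABLISHED:
            self.state = State.CLOSE_WAIT


def collect(sock):
    """Step 1: read the state; only closed/established sockets go further."""
    return sock.state in REPAIRABLE


def lock(sock):
    """Step 2: block the packets."""
    sock.locked = True


def dump(sock):
    """Step 3: turn repair on, which fails for a socket in a closing state."""
    if sock.state not in REPAIRABLE:
        raise PermissionError("Can't turn TCP repair mode ON: Operation not permitted")
    return sock.state


def checkpoint(sock, fin_after=None):
    """Run the steps; `fin_after` names the step after which a FIN arrives."""
    if not collect(sock):
        return None
    if fin_after == "collect":
        sock.receive_fin()  # the racy window between steps 1 and 2
    lock(sock)
    if fin_after == "lock":
        sock.receive_fin()  # dropped, packets are already blocked
    return dump(sock)


if __name__ == "__main__":
    print(checkpoint(ToySocket()))                    # no FIN: dump succeeds
    print(checkpoint(ToySocket(), fin_after="lock"))  # FIN blocked: dump succeeds
    try:
        checkpoint(ToySocket(), fin_after="collect")  # FIN in the window
    except PermissionError as exc:
        print("Error:", exc)
```

In this model only the FIN that arrives after collect and before lock makes the dump fail, which matches the intermittent nature of the report.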
Thanks,
Pavel