[CRIU] Checkpoint fails intermittently

Pavel Emelyanov xemul at parallels.com
Mon Oct 6 04:24:57 PDT 2014


On 10/06/2014 02:12 AM, Frederico Araujo wrote:
> Dear list,
> 
> I've noticed that sometimes checkpoint fails with the following error (please see attached log):
> 
> /Error (sk-tcp.c:66): Can't turn TCP repair mode ON: Operation not permitted/
> 
> However, I'm executing criu with sudo. Could it be that the file descriptor for the socket being
> dumped is set as RO or immutable? In which cases would this error occur?

It also happens if the socket we're turning repair mode on for is not in the
closed/established state. Since the error doesn't happen every time, I think I
know why it happens.

The TCP socket dump goes in 3 steps:

1. collect

  Here we talk to the kernel's tcp_diag subsystem to fetch info about TCP sockets.
  At this stage we get the state the socket is in, and the next two steps only
  occur if that state is closed/established.

2. lock

  Here we block the packets with netfilter.

3. dump

  Here we turn repair mode on and write the socket state into the image file.
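
For illustration only, here is a minimal sketch of what "turning repair on"
amounts to at the syscall level. It is not CRIU's code (CRIU is written in C);
the helper name try_repair_on is hypothetical, and the fallback value 19 for
TCP_REPAIR is taken from <linux/tcp.h> on the assumption that Python's socket
module may not export the constant.

```python
import errno
import socket

# TCP_REPAIR is socket option 19 in <linux/tcp.h>. Fall back to that value
# if this Python build does not export the constant (assumption).
TCP_REPAIR = getattr(socket, "TCP_REPAIR", 19)


def try_repair_on(sk):
    """Try to switch `sk` into TCP repair mode.

    Returns None on success, or the symbolic errno name (e.g. "EPERM")
    if the kernel refuses. Hypothetical helper, for illustration only.
    """
    try:
        sk.setsockopt(socket.IPPROTO_TCP, TCP_REPAIR, 1)
    except OSError as e:
        return errno.errorcode.get(e.errno, str(e.errno))
    return None
```

Note that the caller cannot tell from the return value alone why the kernel
said no: EPERM comes back both when the process lacks CAP_NET_ADMIN and when
the socket is not in the closed/established state.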


So, if a FIN packet arrives between steps 1 and 2, the socket moves into one of
the closing states, the state we collected in step 1 is stale, and we fail to
turn repair mode on for it.
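
The race can be shown with a toy model. This is only a sketch of the reasoning
above, not of CRIU or the kernel: the ToySocket class, the dump function, the
reduced set of states, and the returned strings are all made up for the
illustration.

```python
from enum import Enum


class State(Enum):
    ESTABLISHED = 1
    CLOSE_WAIT = 2   # stands in for "one of the closing states"
    CLOSED = 3


# States in which repair mode can be turned on, per the explanation above.
REPAIRABLE = {State.CLOSED, State.ESTABLISHED}


class ToySocket:
    def __init__(self, state):
        self.state = state
        self.locked = False  # True once netfilter blocks its packets

    def receive_fin(self):
        # A FIN only reaches the socket if packets are not blocked yet.
        if not self.locked and self.state is State.ESTABLISHED:
            self.state = State.CLOSE_WAIT


def dump(sk, fin_before_lock):
    # 1. collect: look at the state once
    if sk.state not in REPAIRABLE:
        return "not dumped"
    # The window between steps 1 and 2 where a FIN may slip in
    if fin_before_lock:
        sk.receive_fin()
    # 2. lock: block the packets
    sk.locked = True
    # 3. dump: repair mode is refused if the state changed meanwhile
    if sk.state not in REPAIRABLE:
        return "EPERM"
    return "dumped"
```

With fin_before_lock=False the model returns "dumped"; with
fin_before_lock=True it returns "EPERM", which matches the intermittent
failure. A FIN that arrives after the lock step changes nothing, because
receive_fin ignores packets once locked is set.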

Thanks,
Pavel


