[CRIU] criu + threaded program + TCP_REPAIR

Sowmini Varadhan sowmini.varadhan at oracle.com
Tue Oct 14 03:04:01 PDT 2014



On (10/14/14 10:34), Pavel Emelyanov wrote:
    :
> > 4. Add the server's address on dummy0 on the client
> >       client# ip link add dummy0 type dummy
> >       client# ip addr add <srvaddr>  dev dummy0
> > 
> > 5. Copy the checkpoint files over to the client (duplicate the dir structure)
> >    and restore
> 
> First of all, it's not enough to just copy the files. If you want
> to move a TCP connection you should at least make sure that
> 
> a) the same IP address as was on source node is available on destination

Yes, you can see that I did that in step 4 (otherwise bind() would fail).

> b) the netfilter rule that CRIU created on dump to lock the connection
>    exists on the destination (http://criu.org/TCP_connection)

That web page seems to say that it should be enough to use
--tcp-established on both dump and restore. Was there something else I
needed to do?
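
A rough sketch of that usage, for reference. The image directory and the
PID are placeholders (assumptions, not values taken from this run), and the
two helper functions are made up for illustration: they only print the
command lines rather than invoking criu.

```shell
# Placeholder image directory; an assumption, not the path used in this thread.
IMG_DIR=/tmp/ckpt

# Print the dump command for the process tree rooted at PID $1.
# --tcp-established is the option discussed above; -t and -D select the
# process tree and the image directory.
criu_dump_cmd() {
    echo criu dump -t "$1" -D "$IMG_DIR" --tcp-established
}

# Print the matching restore command; the same option is passed again.
criu_restore_cmd() {
    echo criu restore -D "$IMG_DIR" --tcp-established
}

criu_dump_cmd 10188
criu_restore_cmd
```

Whether anything beyond passing the option on both sides is required (for
example the netfilter rule mentioned above) is exactly the open question.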

> >     pie: Restoring EXE link
> >     pie: Restoring scheduler params 0.0.0
> >     pie: Restoring scheduler params 0.0.0
> >     pie: Error (pie/restorer.c:351): Thread pid mismatch 10189/10188
> 
> This means, that the PID of the iperf process is busy on the destination
> node and CRIU cannot create the process (well, in this case thread) with
> the same PID as it used to have.

Yes, that might have been my problem. I see 10188 being used
by another sshd process when I do 'ps -eLf'.
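
A minimal sketch of checking for such a clash before restoring, assuming a
Linux /proc mounted for the current PID namespace. The helper name
tid_in_use is hypothetical. It relies on /proc/<id> resolving for thread
IDs as well as process IDs, even though non-leader threads are not listed
when reading the /proc directory itself.

```shell
# Hypothetical helper: succeed if the given ID is currently in use as a
# PID or thread ID in this PID namespace (requires a mounted /proc).
tid_in_use() {
    [ -d "/proc/$1" ]
}

# 10188 is the thread ID reported in the restore error above.
if tid_in_use 10188; then
    echo "10188 is busy; a restore needing that ID may fail"
else
    echo "10188 is free"
fi
```

This only reports the state at the moment of the check; the ID can still be
taken by a new process or thread before the restore runs.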

> One of the ways is to run iperf inside the PID namespace. But provided
> you would also have to somehow manage the IP address, you might also
> want to use the net namespace too (in this case, btw, the connection
> locking would work the other way).

I see. I'm not sure I'd actually need the netns, the dummy interface
should suffice, no?

> Do you really need to copy the image files on another box? If you just
> want to play with it it's enough to restore from them on the same box.

I was actually trying to see if I could get a simplified version
of live migration. (I see the lxc migration work went in very recently,
and I wanted something a little less bleeding-edge so that I could
get past my user errors first.)

Thanks for taking the time to respond!

--Sowmini



