[CRIU] Extraction/restoration of network state only

Pavel Emelyanov xemul at parallels.com
Wed Jul 29 09:17:40 PDT 2015


On 07/29/2015 06:11 PM, mark.wohlleben at gmx.de wrote:
> Dear all,
> 
> I am currently working on a project involving the migration of running http sessions among a number
> of instances of webservers for the purpose of load balancing. I am very excited about criu, which
> could be used for the job, however dumps a lot of state that I actually do not need and that renders
> my use case inefficient. What I actually want is to extract the state of a TCP socket + the state of
> the HTTP session and restore the whole session (TCP+HTTP) on a different machine.

Yup, makes sense.

> Thus, I was thinking about reusing the network part of criu in my webserver project. My first idea
> was to include the sk-inet.h, as the two methods 
> 
> extern int dump_one_tcp(int sk, struct inet_sk_desc *sd);
> extern int restore_one_tcp(int sk, struct inet_sk_info *si);
> 
> seem to do what I want.

Yes, they do, but they cannot be used as-is.

First of all, they only dump the TCP state, leaving the IP part to the caller. But
that IP part is essential, because during restore both bind() and connect() need the
IP addresses.
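
A minimal sketch of what "the IP part" could look like on the caller's side, assuming
an IPv4 connection. The struct ip_endpoints type and both helper names are hypothetical
and are not part of criu; only the standard getsockname(), getpeername(), bind() and
connect() calls are used. The restore helper assumes the socket was already switched
into TCP repair mode, so that connect() does not start a handshake.

```c
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Hypothetical container for the IP-level state; not a criu structure. */
struct ip_endpoints {
	struct sockaddr_in local;	/* our side: what bind() needs on restore */
	struct sockaddr_in peer;	/* remote side: what connect() needs on restore */
};

/* Collect both addresses of a connected IPv4 TCP socket. Returns 0 or -1. */
int save_ip_endpoints(int sk, struct ip_endpoints *ep)
{
	socklen_t len = sizeof(ep->local);

	memset(ep, 0, sizeof(*ep));
	if (getsockname(sk, (struct sockaddr *)&ep->local, &len) < 0)
		return -1;

	len = sizeof(ep->peer);
	if (getpeername(sk, (struct sockaddr *)&ep->peer, &len) < 0)
		return -1;

	return 0;
}

/*
 * Re-attach the saved addresses to a fresh socket. Assumption: sk is
 * already in TCP repair mode, otherwise connect() sends a real SYN.
 */
int restore_ip_endpoints(int sk, const struct ip_endpoints *ep)
{
	if (bind(sk, (const struct sockaddr *)&ep->local, sizeof(ep->local)) < 0)
		return -1;

	return connect(sk, (const struct sockaddr *)&ep->peer, sizeof(ep->peer));
}
```

The saved struct would travel to the destination machine together with the TCP and
HTTP state.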

The second issue is connection locking. During both dump and restore no packets
must reach the socket, so in criu we lock the connection either with netfilter
or by unplugging the container's virtual NIC. dump_one_tcp() only does the locking
in the non-container case, while restore_one_tcp() assumes the lock is already in
place (and does NOT remove it at the end).
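
To illustrate the netfilter variant, here is a rough sketch of a helper that formats
an iptables command dropping all packets of one TCP 4-tuple. The function name and the
exact rule layout are assumptions made for illustration and are not taken from the
criu sources; only standard iptables options are used. A caller would presumably need
one rule for the INPUT chain (source = peer) and one for OUTPUT (source = local
address), and would have to delete both after restore.

```c
#include <stdio.h>

/*
 * Format one iptables command that adds (add != 0) or deletes (add == 0)
 * a DROP rule for a single TCP 4-tuple on the given chain.
 * Returns 0 on success, -1 if the buffer is too small.
 */
int format_drop_rule(char *buf, size_t size, int add, const char *chain,
		     const char *src, int sport, const char *dst, int dport)
{
	int n = snprintf(buf, size,
			 "iptables -t filter %s %s -p tcp -s %s --sport %d "
			 "-d %s --dport %d -j DROP",
			 add ? "-A" : "-D", chain, src, sport, dst, dport);

	return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

The resulting string can be handed to a shell or split into an argument vector; running
it requires root privileges (CAP_NET_ADMIN).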

And the last thing is that both routines dup() the socket descriptor for a later
repair-off call: after dump it's done in tcp_unlock_one(), after restore -- in
rst_tcp_repair_off().
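
The dup()-then-repair-off pattern can be sketched roughly as below. The helper names
are hypothetical and this is not the criu code; it only shows the Linux TCP_REPAIR
socket option being switched on, a duplicate descriptor being kept aside, and repair
mode being switched off through that duplicate later. Setting TCP_REPAIR needs
CAP_NET_ADMIN.

```c
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>

#ifndef TCP_REPAIR
#define TCP_REPAIR 19	/* Linux value, for older libc headers */
#endif

/* Switch the socket into repair mode. Returns 0 or -1. */
int tcp_repair_on(int sk)
{
	int on = 1;

	return setsockopt(sk, IPPROTO_TCP, TCP_REPAIR, &on, sizeof(on));
}

/*
 * Keep a private duplicate of the socket so that repair mode can still be
 * turned off after the caller has closed or handed over the original.
 */
int keep_for_repair_off(int sk)
{
	return dup(sk);
}

/* Leave repair mode via the duplicate, then drop the duplicate. */
int tcp_repair_off_and_close(int sk_copy)
{
	int off = 0;
	int ret = setsockopt(sk_copy, IPPROTO_TCP, TCP_REPAIR, &off, sizeof(off));

	close(sk_copy);
	return ret;
}
```

A library built from this code would have to decide who owns the duplicate and when
repair-off happens, since the connection lock has to be removed around the same time.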

> However, before I start walking into the wrong direction, I would like to ask for some advice from 
> someone knowing the code.

Overall the direction is correct -- you need to save and restore info from 3 levels:
HTTP, TCP and IP. CRIU can do the last two, but the code doing this is partially
scattered over the sources :) so patching is required.

And it's built into the criu executable, so moving it out into a .so/.a is apparently
also required.

-- Pavel

