[CRIU] Process Migration using Sockets - PATCH 1/2

Thu Oct 1 16:29:17 PDT 2015

Hi Cyrill,

the code I am sending enables CRIU to transfer all image files from the node
where the process is running to the node where it is going to be migrated.

Currently, to migrate a process from one node to another, you must have a way
to transfer both application resources (files) and dump images. CRIU's web
page suggests NFS for this.

For my work, NFS wasn't an option, so I needed to transfer all images directly
to the destination node. Using the page-server wasn't an option either, because
it only handles memory pages. Besides, if implemented right, I believe that
transferring images directly is faster than using NFS (I'm not saying this code
is that efficient, though; I will measure that in the future).

To be able to change the image backend (from disk files to sockets), I added
a new command line flag "--remote" which tells CRIU to use sockets as the
backend for image files. Whenever CRIU opens an image, instead of opening
a file, it opens a socket and returns the file descriptor. The rest of the
code does not notice the difference, so the changes to the existing code are
relatively small.
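
To illustrate the idea, here is a rough Python sketch (not the C code from the
patch). The function that opens an image either opens a file or connects a
socket, and the caller writes to whatever it gets back. The names open_image,
dump_image and connect, and the one-line header carrying the image name, are
made up for this example.

```python
import os
import socket


def open_image(name, image_dir, connect=None):
    """Return a writable binary stream for image `name`.

    connect is None -> regular file under image_dir (disk backend).
    connect is set  -> call it to get a connected socket (socket backend).
    """
    if connect is None:
        return open(os.path.join(image_dir, name), "wb")
    sock = connect()
    # Hypothetical framing: the first line tells the receiver which image follows.
    sock.sendall(name.encode() + b"\n")
    stream = sock.makefile("wb")
    # The stream keeps the underlying socket open until it is closed itself.
    sock.close()
    return stream


def dump_image(stream, payload):
    """Caller code: identical for both backends."""
    with stream:
        stream.write(payload)
```

With a real receiver, connect could be something like
lambda: socket.create_connection((host, port)), where host and port are
whatever the receiving side listens on (again, an assumption for the example).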

The image-cache and image-proxy come into play because:
1- CRIU dumps images in a different order than the one restore needs. Therefore,
piping the images straight from CRIU dump to CRIU restore wouldn't work;
2- with these two components, CRIU restore can start working before all the
images are fully transferred, even before CRIU dump finishes;
3- the code is more modular and the changes to existing CRIU code are
dramatically reduced.

Both image-cache and image-proxy are similar:
- both components listen on two ports: one port is used for writing
image files, and the other one is used for reading;
- both components receive images and keep them in memory (using a linked
list of buffers).
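
To make the buffering more concrete, here is a minimal Python sketch of such an
in-memory store (not the patch's C code). Each image is a chain of buffers, and
a reader that asks for an image that has not fully arrived yet simply blocks.
The class and method names (ImageCache, write, finish, read) are hypothetical,
and a deque stands in for the linked list.

```python
import threading
from collections import deque


class CachedImage:
    """One image held in memory as a chain of byte buffers."""

    def __init__(self):
        self.buffers = deque()  # stands in for the linked list of buffers
        self.complete = False


class ImageCache:
    def __init__(self):
        self._images = {}
        self._cond = threading.Condition()

    def write(self, name, chunk):
        """Writer side: append one received buffer to image `name`."""
        with self._cond:
            self._images.setdefault(name, CachedImage()).buffers.append(chunk)

    def finish(self, name):
        """Writer side: mark `name` as fully received and wake up readers."""
        with self._cond:
            self._images.setdefault(name, CachedImage()).complete = True
            self._cond.notify_all()

    def read(self, name, timeout=None):
        """Reader side: block until `name` is complete, then return its bytes."""
        with self._cond:
            ready = self._cond.wait_for(
                lambda: name in self._images and self._images[name].complete,
                timeout,
            )
            if not ready:
                raise TimeoutError(name)
            return b"".join(self._images[name].buffers)
```

Because images are looked up by name, the order in which they are written does
not have to match the order in which they are read.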

The difference between image-cache and image-proxy is that the proxy forwards
all images to the cache. Example:

Node A: CRIU dump    -> (sends images using a local socket)    -> image-proxy
                                                                      |
                                                                      V
Node B: CRIU restore <- (receives images from a local socket) <- image-cache

The image-proxy proactively forwards each image to the image-cache as soon as
it finishes receiving that image from the local CRIU dump.
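
As a rough sketch of that forwarding behaviour (Python, not the actual patch
code): the proxy collects the buffers of an image and pushes the whole image
to the cache the moment the writer is done with it. ImageProxy and the forward
callback are hypothetical names; in practice forward would send the data over
the connection to the image-cache.

```python
class ImageProxy:
    """Buffers images from the local dump and forwards each one when complete."""

    def __init__(self, forward):
        self._forward = forward  # callable(name, data), e.g. a send to the cache
        self._images = {}        # image name -> list of received buffers

    def write(self, name, chunk):
        """Store one buffer received from the local CRIU dump."""
        self._images.setdefault(name, []).append(chunk)

    def finish(self, name):
        """The image is complete: forward it right away, keep it in memory."""
        data = b"".join(self._images.get(name, []))
        self._forward(name, data)
```

Nothing is forwarded while an image is still being written; each image goes out
on its own, without waiting for the rest of the dump.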

Example of running everything together (node names as in the diagram above):

Node B (destination node):
Step 1- start image-cache
/* restore will block waiting for images that are not yet available
   in the cache */
Step 2- start criu restore

Node A (source node):
Step 3- start image-proxy
Step 4- pre-dump process (optional and can be called multiple times)
Step 5- dump process

I think this is the overall idea. I have been using this to migrate HotSpot
JVMs running SPEC and DaCapo benchmarks, and so far I have had no problems.

I am proposing this to CRIU because I think you might be interested in having
live migration with no NFS dependency.

best,

Rodrigo Bruno

On Thu, 1 Oct 2015 00:03:45 +0300
Cyrill Gorcunov <gorcunov at gmail.com> wrote:

> On Wed, Sep 30, 2015 at 10:26:31PM +0300, Pavel Emelyanov wrote:
> > On 09/27/2015 01:45 AM, Rodrigo Bruno wrote:
> > > Hi,
> > > 
> > > sorry about the previous patch, I obviously got it wrong...
> > 
> > Yes, this one is review-able :)
> > 
> > > I hope this one is right, otherwise I will re-iterate the process.
> > 
> > Well, one patch per email is very welcome; this one has two. Plus,
> > each patch, especially one THAT big, deserves a good and descriptive
> > comment.
> > 
> > Plus, find more comments inline.
> 
> These are all just details :) Rodrigo, could you please explain
> _what_ we're trying to achieve with this series? Is it some kind
> of underlying transport for transferring images between several
> nodes? What are image-cache and image-proxy for? How are they
> supposed to work in "general"?
> 
> Don't get me wrong, I simply don't understand and having some
> big general/common picture of what we're doing here would
> be really helpful.

-- 
Rodrigo Bruno <rbruno at gsd.inesc-id.pt>