[CRIU] Process Migration using Sockets v2 - Patch 1/2

Rodrigo Bruno rbruno at gsd.inesc-id.pt
Mon Oct 19 03:47:01 PDT 2015


Hi, 

On Thu, 15 Oct 2015 14:27:29 +0300
Pavel Emelyanov <xemul at parallels.com> wrote:

> On 10/13/2015 05:06 PM, Rodrigo Bruno wrote:
> 
> 
> >>>> And why should receiver check for any match? What can happen in the remote side,
> >>>> that the requestor gets different object than it asked for?
> >>>>
> >>>>> 4. Write connections only write the image header before sending the actual content.
> >>>>>
> >>>>> 5. the image header is a protobuf object that contains two strings: the image name
> >>>>> and the image namespace (the namespace identifies the process that created the image).
> >>>>>
> >>>>> 6. only connections between the image-proxy and the image-cache (which are TCP) check
> >>>>> for file boundaries. Imagine the image-proxy starts to forward an image to image-cache:
> >>>>> 	a) write image-header
> >>>>> 	b) write image size (uint64_t)
> >>>>
> >>>> Why isn't the size in the header?
> >>>
> >>> Good point. The image-header object is used in all connections (dump, proxy, cache, restore).
> >>> This image size is only used between the proxy and cache (because I know the size of the image).
> >>>
> >>> Maybe I can have a different image header with the size?
> >>
> >> I haven't yet fully understood the difference between criu-proxy/cache and proxy-cache
> >> protocols. But from what I have :) I think that it's worth having two different headers,
> >> one for "local" communications (criu-proxy/cache) and the other one for remote (cache-proxy).
> > 
> > Yes, one header with size for proxy->cache communication, and a header without size for
> > all others (dump<->proxy, and restore<-cache).
> > 
> > Both protocols are very similar. The only difference is that in cache<-proxy communication
> > I send the size of the file before the actual file.
> > 
> > Example:
> > open connection to write
> > write header (name + namespace) (protobuf object)
> > write size uint64_t
> > write image
> > close
> > 
> > The new version (discussed in these emails) will result in:
> > get the already open connection (inherited from the user)
> > write header (name + namespace + size) (protobuf object)
> > write image
> > write closing header (to replace the close)
> 
> Yup. This would be simple, extendable and easy to read :)
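
To make the new framing concrete, here is a minimal sketch of "header (name +
namespace + size), image, closing header" on an already open descriptor. It is
only an illustration under assumptions: the real header is a protobuf object,
while this sketch uses a hypothetical fixed-layout struct (img_header) sent raw,
which is only safe between processes on the same host. The helpers (write_all,
read_all, send_image, send_closing, is_closing) and the convention that an empty
name marks the closing header are made up for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical stand-in for the protobuf image header. */
struct img_header {
	char name[64];   /* image name, e.g. "pages-1.img" */
	char ns[64];     /* image namespace */
	uint64_t size;   /* number of payload bytes that follow */
};

/* Write exactly len bytes, retrying on short writes and EINTR. */
static int write_all(int fd, const void *buf, size_t len)
{
	const char *p = buf;

	while (len > 0) {
		ssize_t n = write(fd, p, len);

		if (n < 0) {
			if (errno == EINTR)
				continue;
			return -1;
		}
		p += n;
		len -= (size_t)n;
	}
	return 0;
}

/* Read exactly len bytes; early EOF counts as an error. */
static int read_all(int fd, void *buf, size_t len)
{
	char *p = buf;

	while (len > 0) {
		ssize_t n = read(fd, p, len);

		if (n < 0) {
			if (errno == EINTR)
				continue;
			return -1;
		}
		if (n == 0)
			return -1;
		p += n;
		len -= (size_t)n;
	}
	return 0;
}

/* Send one image: header (name + namespace + size), then the payload. */
static int send_image(int fd, const char *name, const char *ns,
		      const void *data, uint64_t size)
{
	struct img_header h;

	assert(name != NULL && ns != NULL);
	memset(&h, 0, sizeof(h));
	strncpy(h.name, name, sizeof(h.name) - 1);
	strncpy(h.ns, ns, sizeof(h.ns) - 1);
	h.size = size;

	if (write_all(fd, &h, sizeof(h)) < 0)
		return -1;
	return write_all(fd, data, (size_t)size);
}

/* Assumed convention: an all-zero header replaces close(). */
static int send_closing(int fd)
{
	struct img_header h;

	memset(&h, 0, sizeof(h));
	return write_all(fd, &h, sizeof(h));
}

static int is_closing(const struct img_header *h)
{
	return h->name[0] == '\0';
}
```

Because the size travels in the header, the receiver knows where each image
ends without relying on the connection being closed, so one inherited
connection can carry several images in a row.
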
> 
> >>>>> +    static char buf[4096];
> >>>>> +    int n = 0;
> >>>>> +    unsigned long curr = 0;
> >>>>> +
> >>>>> +    for(; curr < len; ) {
> >>>>> +	    n = read(fd, buf, MIN(len - curr, 4096));
> >>>>
> >>>> This bytes skipping looks incorrect. The skip_img_bytes() is called for pages.img
> >>>> on low-level snapshots to correctly forward the file position. Thus, if we get here
> >>>> with criu restore --remote, this means that the low-level images should be already
> >>>> here, on the node, and there's no need in reading the data in vain.
> >>>
> >>> Sorry, I couldn't get your idea. This function replaces an lseek inside 
> >>> skip_pagemap_pages. Since lseek does not work in sockets (as far as I know),
> >>> I use this function.
> >>
> >> My point is -- you shouldn't get to this place in case of remote connection at all.
> >> The skip_img_bytes is called in the situation when we have a stack of images and
> >> we read data from the top-most one and want to skip the duplicate data from the
> >> lower one(s).
> >>
> >> Next, the restore happens when the whole stack is already obtained and if we get to
> >> the skip_img_bytes routine, this means that we want to skip bytes from some image
> >> namespace, excluding the top-most. But the non-top images cannot sit behind the socket,
> >> they have already been transferred.
> > 
> > Well, I added this method because it was not working and I found this could be a potential
> > bug of my solution (since lseek does not work for sockets).
> > 
> > I don't see much difference here between files and sockets. In remote mode, we have two
> > sockets (for example, one from a predump pages-1.img and other from a dump pages-1.img).
> > 
> > These two connections are retrieving data from the local image-cache. If you need to skip
> > bytes from one pages-1.img, you have to consume them from the socket.
> > 
> > Right?
> 
> Right, but in normal operations we do not skip bytes from the top-level image, only
> from the low level ones, and these (low) cannot (or can they?) sit on the proxy side
> at restore time.

At restore time, all images are cached in the image-cache. They have already been
transferred from the image-proxy to the image-cache (if CRIU restore tries to open
an image that is not cached yet, the open call blocks until the image has arrived
at the image-cache).

Therefore, CRIU restore can have multiple UNIX socket connections to the
image-cache, one for each pages image. If CRIU restore needs to skip bytes from a
particular image, it can consume them from that image's connection without
interfering with the other images (which come from separate connections).

It looks like this:

CRIU Restore      image-cache                 image-proxy
|<---(img1.img)-  |          |               |
|<---(img2.img)-  |          |<-----(TCP)----|
|<---(img3.img)-  |          |               |
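
For reference, skipping on such a connection can be sketched as a
read-and-discard loop, since lseek() fails with ESPIPE on sockets and pipes.
This is a simplified, hedged variant of the loop in the patch, not the patch
itself: the name skip_bytes_by_reading and its return convention are
hypothetical, and unlike the quoted hunk it checks read() for errors and for
an early end of stream.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

#define SKIP_BUF_SIZE 4096

/*
 * Hypothetical helper: discard len bytes from fd by reading them.
 * Returns 0 once len bytes were consumed, -1 on a read error or if the
 * peer closes the stream before len bytes arrive.
 */
static int skip_bytes_by_reading(int fd, size_t len)
{
	char buf[SKIP_BUF_SIZE];
	size_t done = 0;

	assert(fd >= 0);

	while (done < len) {
		size_t want = len - done;
		ssize_t n;

		if (want > sizeof(buf))
			want = sizeof(buf);

		n = read(fd, buf, want);
		if (n < 0) {
			if (errno == EINTR)
				continue; /* interrupted, retry the read */
			return -1;
		}
		if (n == 0)
			return -1; /* EOF before all bytes were skipped */

		done += (size_t)n;
	}
	return 0;
}
```

Each pages image has its own descriptor, so draining one stream this way
leaves the read position of the other images untouched.
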

> And since they can't there's nothing we should read from anywhere 
> to skip them.
> 
> -- Pavel


-- 
Rodrigo Bruno <rbruno at gsd.inesc-id.pt>

