[CRIU] Process Migration using Sockets v2 - Patch 1/2

Pavel Emelyanov xemul at parallels.com
Thu Oct 15 04:27:29 PDT 2015


On 10/13/2015 05:06 PM, Rodrigo Bruno wrote:


>>>> And why should the receiver check for any match? What can happen on the remote side
>>>> that makes the requestor get a different object than the one it asked for?
>>>>
>>>>> 4. Write connections only write the image header before sending the actual content.
>>>>>
>>>>> 5. the image header is a protobuf object that contains two strings: the image name
>>>>> and the image namespace (the namespace identifies the process that created the image).
>>>>>
>>>>> 6. only connections between the image-proxy and the image-cache (which are TCP) check
>>>>> for file boundaries. Imagine the image-proxy starts to forward an image to image-cache:
>>>>> 	a) write image-header
>>>>> 	b) write image size (uint64_t)
>>>>
>>>> Why isn't the size in the header?
>>>
>>> Good point. The image-header object is used in all connections (dump, proxy, cache, restore).
>>> This image size is only used between the proxy and cache (because I know the size of the image).
>>>
>>> Maybe I can have a different image header with the size?
>>
>> I haven't yet fully understood the difference between criu-proxy/cache and proxy-cache
>> protocols. But from what I have :) I think that it's worth having two different headers,
>> one for "local" communications (criu-proxy/cache) and the other one for remote (cache-proxy).
> 
> Yes, one header with size for proxy->cache communication, and a header without size for
> all others (dump<->proxy, and restore<-cache).
> 
> Both protocols are very similar. The only difference is that in cache<-proxy communication
> I send the size of the file before the actual file.
> 
> Example:
> open connection to write
> write header (name + namespace) (protobuf object)
> write size uint64_t
> write image
> close
> 
> The new version (discussed in these emails) will result in:
> get the already open connection (inherited from the user)
> write header (name + namespace + size) (protobuf object)
> write image
> write closing header (to replace the close)

Yup. This would be simple, extendable and easy to read :)
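
The framing agreed on above (header carrying name + namespace + size, then the
image, then a closing header instead of close()) could be sketched roughly as
follows. This is only an illustration: the thread uses a protobuf object for the
header, while the fixed-size struct img_header here is a hypothetical stand-in,
and send_image(), send_closing() and recv_image() are invented names, not
functions from the patch. The raw struct is written as-is, so the sketch assumes
both ends share the same architecture and compiler layout.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

#define IMG_NAME_MAX 64

/* Hypothetical stand-in for the protobuf image header discussed above. */
struct img_header {
	char name[IMG_NAME_MAX];  /* image name, e.g. "pages-1.img" */
	char ns[IMG_NAME_MAX];    /* namespace: who created the image */
	uint64_t size;            /* number of payload bytes that follow */
	uint8_t closing;          /* 1 = closing header, no payload follows */
};

/* Write exactly len bytes, retrying on short writes and EINTR. */
static int write_all(int fd, const void *buf, size_t len)
{
	const char *p = buf;

	while (len > 0) {
		ssize_t n = write(fd, p, len);
		if (n < 0) {
			if (errno == EINTR)
				continue;
			return -1;
		}
		p += n;
		len -= (size_t)n;
	}
	return 0;
}

/* Read exactly len bytes; EOF in the middle of a message is an error. */
static int read_all(int fd, void *buf, size_t len)
{
	char *p = buf;

	while (len > 0) {
		ssize_t n = read(fd, p, len);
		if (n < 0) {
			if (errno == EINTR)
				continue;
			return -1;
		}
		if (n == 0)
			return -1;
		p += n;
		len -= (size_t)n;
	}
	return 0;
}

/* Send one image: header (name + namespace + size), then the payload. */
int send_image(int fd, const char *name, const char *ns,
	       const void *data, uint64_t size)
{
	struct img_header h;

	if (strlen(name) >= IMG_NAME_MAX || strlen(ns) >= IMG_NAME_MAX)
		return -1;
	memset(&h, 0, sizeof(h));  /* also zeroes padding bytes */
	strcpy(h.name, name);
	strcpy(h.ns, ns);
	h.size = size;
	if (write_all(fd, &h, sizeof(h)) < 0)
		return -1;
	return write_all(fd, data, (size_t)size);
}

/* Send the closing header that replaces close() on a reused connection. */
int send_closing(int fd)
{
	struct img_header h;

	memset(&h, 0, sizeof(h));
	h.closing = 1;
	return write_all(fd, &h, sizeof(h));
}

/* Returns 1 if an image was read, 0 on a closing header, -1 on error. */
int recv_image(int fd, struct img_header *h, void *buf, size_t cap)
{
	assert(h != NULL && buf != NULL);
	if (read_all(fd, h, sizeof(*h)) < 0)
		return -1;
	h->name[IMG_NAME_MAX - 1] = '\0';  /* never trust the peer's strings */
	h->ns[IMG_NAME_MAX - 1] = '\0';
	if (h->closing)
		return 0;
	if (h->size > cap)
		return -1;
	return read_all(fd, buf, (size_t)h->size) < 0 ? -1 : 1;
}
```

Because the size travels in the header, the receiver knows where each image
ends without relying on the connection being closed, which is what allows the
inherited connection to be reused.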

>>>>> +    static char buf[4096];
>>>>> +    int n = 0;
>>>>> +    unsigned long curr = 0;
>>>>> +
>>>>> +    for(; curr < len; ) {
>>>>> +	    n = read(fd, buf, MIN(len - curr, 4096));
>>>>
>>>> This bytes skipping looks incorrect. The skip_img_bytes() is called for pages.img
>>>> on low-level snapshots to correctly forward the file position. Thus, if we get here
>>>> with criu restore --remote, this means that the low-level images should be already
>>>> here, on the node, and there's no need to read the data in vain.
>>>
>>> Sorry, I couldn't get your idea. This function replaces an lseek inside 
>>> skip_pagemap_pages. Since lseek does not work in sockets (as far as I know),
>>> I use this function.
>>
>> My point is -- you shouldn't get to this place in case of remote connection at all.
>> The skip_img_bytes is called in the situation when we have a stack of images and
>> we read data from the top-most one and want to skip the duplicate data from the
>> lower one(s).
>>
>> Next, the restore happens when the whole stack is already obtained and if we get to
>> the skip_img_bytes routine, this means that we want to skip bytes from some image
>> namespace, excluding the top-most. But the non-top images cannot sit behind the socket,
>> they have already been transferred.
> 
> Well, I added this method because it was not working and I found this could be a potential
> bug of my solution (since lseek does not work for sockets).
> 
> I don't see much difference here between files and sockets. In remote mode, we have two
> sockets (for example, one from a predump pages-1.img and the other from a dump pages-1.img).
> 
> These two connections are retrieving data from the local image-cache. If you need to skip
> bytes from one pages-1.img, you have to consume them from the socket.
> 
> Right?

Right, but in normal operations we do not skip bytes from the top-level image, only
from the low-level ones, and these (low) cannot (or can they?) sit on the proxy side
at restore time. And since they can't, there's nothing we should read from anywhere
to skip them.

-- Pavel
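
For reference, the "consume the bytes from the socket" approach debated above
can be sketched as below. It is a minimal illustration, not the patch's
skip_img_bytes(): the name skip_bytes_sketch() is hypothetical. It tries lseek()
first and falls back to read-and-discard only when the descriptor is not
seekable (ESPIPE, as for sockets and pipes), and unlike the quoted loop it
treats EOF and read errors as failures instead of looping. Whether the remote
restore path should reach such a routine at all is exactly the open question in
this thread.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Skip len bytes on fd (hypothetical helper).
 * Returns 0 on success, -1 on error or if EOF arrives before len bytes.
 */
int skip_bytes_sketch(int fd, unsigned long len)
{
	char buf[4096];

	/* Regular files: just move the file position. */
	if (lseek(fd, (off_t)len, SEEK_CUR) != (off_t)-1)
		return 0;
	if (errno != ESPIPE)
		return -1;

	/* Sockets and pipes cannot seek: read the bytes and drop them. */
	while (len > 0) {
		size_t want = len < sizeof(buf) ? (size_t)len : sizeof(buf);
		ssize_t n = read(fd, buf, want);

		if (n < 0) {
			if (errno == EINTR)
				continue;
			return -1;
		}
		if (n == 0)
			return -1;  /* peer closed before len bytes arrived */
		len -= (unsigned long)n;
	}
	return 0;
}
```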

