[CRIU] Process Migration Using Sockets - PATCH

Rodrigo Bruno rbruno at gsd.inesc-id.pt
Mon Sep 21 16:04:15 PDT 2015


Hi, 

sorry for taking so long to reply. Please see my answers inline.

On Mon, 14 Sep 2015 13:41:43 +0300
Pavel Emelyanov <xemul at parallels.com> wrote:

> On 09/11/2015 06:38 PM, Rodrigo Bruno wrote:
> > 
> >>
> >>
> >>>> Can you then describe your protocol in more detail? What I don't quite understand
> >>>> is how you mix text info with binary data (images and pages) and how you define
> >>>> borders between objects.
> >>>
> >>> Yes. Two communications can exist: writing and reading remote images. They both use the
> >>> same protocol:
> >>>
> >>> 1. open socket on read or write port (both cache and proxy have one of each)
> >>> 2. write image pathname (32 bytes)
> >>> 3. write image namespace (32 bytes)
> >>> 4. read image pathname (32 bytes)
> >>> 5. read image namespace (32 bytes)
> >>
> >> I see. I would still suggest to switch this header onto protobuf format so that we
> >> could extend one later (and easily drop the 32-bytes limitation).
> >>
> >>> The 32 bytes is a constant. It is a limitation on the size of the names and namespaces. It 
> >>> could be solved by adding a first field (size).
> >>>
> >>> Steps 4 and 5 are used to check if the image exists (if it doesn't, an error is read from the 
> >>> socket).
> >>>
> >>> Then, I simply return the socket file descriptor and let criu use it for writing (dump) or 
> >>> reading (restore).
> >>>
> >>> Note that I do not unpack the objects being sent (pagemap entries, for example). Nor do I
> >>> check whether the file was sent correctly. When the socket FD is closed I assume that the
> >>> operation (reading or writing) is complete, and I let the other side (restore, for example)
> >>> check whether the image is correct.
> >>
> >> I see one problem with it. The other side cannot always verify the image correctness. E.g.
> >> if a single object is lost from image, this can only be found out at the actual restore
> >> time, but not always. E.g. fdtable.img contains file descriptors. If one is lost from there
> >> a task will be restored w/o one descriptor and criu has no clues to check this.
> >>
> >> We've seen this with the page-server, so to finalize the transfer we use a special command
> >> (PS_IOV_FLUSH) that requires a response from the server side indicating that everything is OK.
> > 
> > Okay, so two things to do:
> > 
> > a- initial "handshake" with protobuf object in which
> > 	1- the side making the request sends a protobuf object with the name and namespace and 
> > 	2- the side answering replies with another protobuf object with an ack/nack
> > 
> > b- final "handshake" in which
> > 	1- the sender reports that it is done writing and
> > 	2- the receiver answers that everything is okay.
> 
> And, probably, c) -- anything in between that describes objects being transferred.

I have already implemented protobuf objects for handling the image name and namespace.
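
As a rough illustration of how a variable-length header can travel over a stream
socket without the 32-byte limit, here is a minimal sketch. It is not the code from
the patch: send_field() and recv_field() are made-up names, a plain string stands in
for the packed protobuf message, and the 4-byte network-order length prefix is an
assumption about the framing.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Write exactly len bytes, retrying on short writes and EINTR. */
static int write_all(int fd, const void *buf, size_t len)
{
	const char *p = buf;

	while (len > 0) {
		ssize_t n = write(fd, p, len);
		if (n < 0) {
			if (errno == EINTR)
				continue;
			return -1;
		}
		p += n;
		len -= (size_t)n;
	}
	return 0;
}

/* Read exactly len bytes; EOF before len bytes arrive is an error. */
static int read_all(int fd, void *buf, size_t len)
{
	char *p = buf;

	while (len > 0) {
		ssize_t n = read(fd, p, len);
		if (n < 0) {
			if (errno == EINTR)
				continue;
			return -1;
		}
		if (n == 0)
			return -1;
		p += n;
		len -= (size_t)n;
	}
	return 0;
}

/* Hypothetical helper: send one field as a u32 length followed by the bytes. */
static int send_field(int fd, const char *s)
{
	uint32_t len;

	assert(s != NULL);
	len = htonl((uint32_t)strlen(s));
	if (write_all(fd, &len, sizeof(len)) < 0)
		return -1;
	return write_all(fd, s, strlen(s));
}

/*
 * Hypothetical helper: receive one field sent by send_field().
 * Returns a malloc'ed NUL-terminated string, or NULL on error or if the
 * announced length exceeds max_len (so a bad peer cannot force a huge
 * allocation).
 */
static char *recv_field(int fd, uint32_t max_len)
{
	uint32_t len;
	char *s;

	if (read_all(fd, &len, sizeof(len)) < 0)
		return NULL;
	len = ntohl(len);
	if (len > max_len)
		return NULL;
	s = malloc((size_t)len + 1);
	if (!s)
		return NULL;
	if (read_all(fd, s, len) < 0) {
		free(s);
		return NULL;
	}
	s[len] = '\0';
	return s;
}
```

The requesting side would call send_field() once for the name and once for the
namespace, and the answering side would call recv_field() twice in the same order
before replying with its ack/nack.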

I have a question regarding the "final handshake". It is easy to do it for the pre-dump/dump
side where I only need to handle the situation when CRIU calls close_image.

The problem seems more difficult on the restore side. To do it properly for all types of
image files, we would have to go through every location where an image is read and check for
the handshake bytes there. Right?

Since only the image-proxy and image-cache communicate over less reliable sockets (through a LAN or WAN),
wouldn't it suffice to do this "final handshake" between the image-proxy and the image-cache? The 
communication between dump/pre-dump > image-proxy and image-cache > restore is supposed to be local and
therefore very reliable.
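
To make the proposal concrete, a final handshake limited to the image-proxy and
image-cache link could look roughly like the sketch below. Everything in it is an
assumption for discussion: the names (send_done, ack_done, wait_ack, XFER_OK) are
hypothetical, the trailer is simply the payload byte count, and it is not modeled on
how PS_IOV_FLUSH is actually implemented. It also assumes the receiver already knows
where the payload ends, which this sketch does not cover.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical status codes carried in the one-byte reply. */
enum { XFER_OK = 0, XFER_MISMATCH = 1 };

/* Write exactly len bytes, retrying on short writes and EINTR. */
static int write_all(int fd, const void *buf, size_t len)
{
	const char *p = buf;

	while (len > 0) {
		ssize_t n = write(fd, p, len);
		if (n < 0) {
			if (errno == EINTR)
				continue;
			return -1;
		}
		p += n;
		len -= (size_t)n;
	}
	return 0;
}

/* Read exactly len bytes; EOF before len bytes arrive is an error. */
static int read_all(int fd, void *buf, size_t len)
{
	char *p = buf;

	while (len > 0) {
		ssize_t n = read(fd, p, len);
		if (n < 0) {
			if (errno == EINTR)
				continue;
			return -1;
		}
		if (n == 0)
			return -1;
		p += n;
		len -= (size_t)n;
	}
	return 0;
}

/* Sender (image-proxy): announce how many payload bytes were written. */
static int send_done(int fd, uint64_t sent)
{
	unsigned char buf[8];
	int i;

	assert(fd >= 0);
	/* Encode the count big-endian so both ends agree on byte order. */
	for (i = 7; i >= 0; i--) {
		buf[i] = (unsigned char)(sent & 0xff);
		sent >>= 8;
	}
	return write_all(fd, buf, sizeof(buf));
}

/*
 * Receiver (image-cache): compare the announced count with the number of
 * bytes actually received and send the verdict back.
 * Returns 0 on a match, -1 on a mismatch or I/O error.
 */
static int ack_done(int fd, uint64_t received)
{
	unsigned char buf[8], status;
	uint64_t announced = 0;
	int i;

	if (read_all(fd, buf, sizeof(buf)) < 0)
		return -1;
	for (i = 0; i < 8; i++)
		announced = (announced << 8) | buf[i];
	status = (announced == received) ? XFER_OK : XFER_MISMATCH;
	if (write_all(fd, &status, 1) < 0)
		return -1;
	return status == XFER_OK ? 0 : -1;
}

/* Sender (image-proxy): wait for the receiver's verdict. */
static int wait_ack(int fd)
{
	unsigned char status;

	if (read_all(fd, &status, 1) < 0)
		return -1;
	return status == XFER_OK ? 0 : -1;
}
```

With this shape, the image-proxy would call send_done() followed by wait_ack() when
CRIU closes the image, and the image-cache would call ack_done() once it has seen the
end of the payload. A byte count only detects truncation; catching corrupted or
missing objects would need a checksum or per-object framing on top.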


> 
> > I will see how PS_IOV_FLUSH is implemented to make the second one.
> > 
> > 
> > You want me to do these changes before sending the patches per components, right?
> 
> Yes, please :)
> 


-- 
Rodrigo Bruno <rbruno at gsd.inesc-id.pt>

