[CRIU] Attempt for process migration
Pavel Emelyanov
xemul at parallels.com
Tue Mar 6 07:00:41 EST 2012
On 03/06/2012 03:19 PM, Adrian Reber wrote:
>
> I have been successfully using criu.
Cool! :)
> Now I looked into migrating one
> process from one machine to another. I have written some code but before I am
> going to continue I wanted to ask if there has been already any
> discussion in that direction?
Not really. The plan is to do it the way we do it in openvz -- we just write
the dump files locally and then copy them to the destination node with plain
scp (or use NFS for this).
I actually planned to use a similar scheme with criu, but I'm open to discussion.
The other thing is that the image files in criu are currently assumed to be
seek-able at restore time, which is not the case for sockets, so your approach
will require more fixes in the restore code.
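To illustrate the seek-ability point: a regular file accepts lseek(), while a
socket refuses it (ESPIPE on Linux). A minimal Python sketch, not criu code;
the helper names is_seekable and demo are made up for illustration:

```python
import os
import socket
import tempfile


def is_seekable(fd: int) -> bool:
    """Return True if lseek() works on fd, False if the kernel refuses it."""
    try:
        os.lseek(fd, 0, os.SEEK_CUR)  # query the offset without moving it
        return True
    except OSError:
        # Sockets and pipes end up here, so code that jumps around inside
        # an image cannot read that image straight from a connection.
        return False


def demo() -> tuple:
    """Check a regular file and one end of a connected socket pair."""
    with tempfile.TemporaryFile() as img:
        file_ok = is_seekable(img.fileno())
    a, b = socket.socketpair()
    try:
        sock_ok = is_seekable(a.fileno())
    finally:
        a.close()
        b.close()
    return file_ok, sock_ok
```

Any restore path that wants to read from a socket would have to consume each
image strictly front to back, or buffer it somewhere seek-able first.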
The other problem with this set is that for every single image file we have
to set up a new TCP connection from scratch. This is not very good, I suppose,
since image files are typically only several bytes long, so the handshake
latency will kill all the performance.
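A rough way to see the handshake problem is a back-of-the-envelope model. The
sketch below assumes one new connection per image and one round trip per
handshake before any payload can go out. Both function names and any numbers
fed into them are hypothetical, not measurements:

```python
def connect_overhead(n_images: int, rtt_s: float) -> float:
    """Lower bound on the time spent in TCP handshakes alone.

    Assumes one connection per image and one round trip per handshake.
    """
    return n_images * rtt_s


def transfer_time(total_bytes: int, bandwidth_bps: float) -> float:
    """Time to push the payload itself, ignoring slow start and headers."""
    return total_bytes * 8 / bandwidth_bps
```

With, say, 200 small images and a 1 ms round trip, the handshakes alone cost
about 0.2 s, which can easily exceed the time needed to send the data. A single
long-lived connection pays that round trip once.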
And one more thing about all this -- we haven't yet thought about (but will
surely need to) what to do with the filesystem used by the tasks we're migrating.
IOW, you should make sure the filesystem at restoration time is the same as it
was at dumping time. For openvz we use rsync for this, but that's not strictly
necessary -- we can assume the tasks are on NFS or that some sort of mirroring
like drbd is used.
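One simple way to check that the two sides see the same filesystem is to
compare a digest of the tree before dump and before restore. This is only a
sketch of the idea, with a hypothetical helper tree_digest. It covers relative
paths and file contents only, and ignores ownership, modes, timestamps, empty
directories and special files, all of which a real check would need:

```python
import hashlib
import os


def tree_digest(root: str) -> str:
    """Hash relative paths and file contents under root in a stable order."""
    h = hashlib.sha256()
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # make the traversal order deterministic
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            rel = os.path.relpath(path, root)
            h.update(rel.encode() + b"\0")
            with open(path, "rb") as f:
                # Read in chunks so large files do not have to fit in memory.
                for chunk in iter(lambda: f.read(65536), b""):
                    h.update(chunk)
            h.update(b"\0")
    return h.hexdigest()
```

If the digests computed on the source and destination nodes differ, the
restore should not proceed until the trees are synchronized again.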
Plus :) we have plans to implement preliminary working set migration for criu.
That is -- you take the app's memory, push it to the destination host, then freeze
the tasks and push only the part of memory that has changed since the last time (the
kernel support is not up to this yet, but we'll patch it). If we're going to write
images right into an open connection, then we should think about how to mix that
with the pre-migration data as well.
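The two-phase idea can be sketched as a toy simulation. Memory is modelled as
a dict from page number to bytes, and the run_workload callback stands in for
the task that keeps running during the first pass and reports which pages it
touched. In reality that set would have to come from kernel dirty tracking,
which does not exist yet, so everything here is an illustrative assumption:

```python
from typing import Callable, Dict, Set


def precopy(src: Dict[int, bytes],
            run_workload: Callable[[Dict[int, bytes]], Set[int]]
            ) -> Dict[int, bytes]:
    """Full pass while the task runs, then only dirty pages after freeze."""
    dst = dict(src)            # pass 1: push every page, task still running
    dirty = run_workload(src)  # meanwhile the task modifies some pages
    # The freeze happens here: from now on src no longer changes.
    for page_no in dirty:      # pass 2: push only what changed since pass 1
        dst[page_no] = src[page_no]
    return dst
```

The downtime then depends on the size of the dirty set rather than on the
total memory size, which is the whole point of the scheme.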
> Also, the current code makes it hard to
> implement migration over TCP and therefore I wanted some feedback before
> continuing. Not that I am doing it completely wrong. Attached is a patch
> which I am using to write the image to another machine.
>
> It basically works that way:
>
> * the user supplies, with -r [host:port], the machine on which a server is
> listening to receive the checkpoint image
>
> * after that I changed cr_fdset_open() to open a socket instead of
> a file to which the checkpoint can be transferred
>
> So far it still seems correct. Unfortunately I had to add the opts
> structure as a parameter to many functions to have it available in the
> cr_fdset_open() function. The problem I have with cr_fdset_open() is
> that it is writing the magic string just after the file has been opened
> and that is something I cannot do with the socket if I want to use only
> one TCP connection for the complete migration. Would it be somehow
> possible to move the writing of the magic string to the low level
> write_img_buf() function? Is it at all possible with the current code to
> use one socket to write all images serialized over the network to
> another machine?
Frankly speaking, the whole cr_fdset machinery is rather raw at the moment. It was
adopted at the very beginning for simplicity, then was cleaned up several times,
but we're open to changing it this way or that :)
But the biggest problem with it is not where to write the magics, but, as I
said -- we assume the images to be seek-able, and this is the biggest obstacle
to live migration implemented in this way.
> Does it make any sense how I started with this?
>
> Are there ideas how to correctly implement this?
As I said, the existing openvz scheme looks very sane to me (put files locally and
scp them), but if you see any problems with this, please share.
If we implement the ability to write images into a socket, then, first of all, we
need to fix the restore not to rely on seek. Then, I assume, we should take care
of developing some "migration protocol" which will allow us to push the data we want
over the socket to the remote box. And it will most likely differ from a simple
<type><length><payload> set of packets.
Cyrill's proposal to add one more abstraction level looks sane, and we'll also
have to integrate criu with openvz live migration, which has its own image format.
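For reference, the simple type/length/payload framing mentioned above, which a
real migration protocol would probably have to go beyond, could look like the
sketch below. The header layout (two 32-bit big-endian integers) and all
function names are assumptions made for the example, not criu's image format:

```python
import io
import struct

# Hypothetical header: 32-bit image type, then 32-bit payload length.
HEADER = struct.Struct("!II")


def write_frame(stream, img_type: int, payload: bytes) -> None:
    """Append one image to the stream as header + payload."""
    stream.write(HEADER.pack(img_type, len(payload)))
    stream.write(payload)


def read_exact(stream, n: int) -> bytes:
    """Read exactly n bytes; a socket-backed stream may return short reads."""
    buf = bytearray()
    while len(buf) < n:
        chunk = stream.read(n - len(buf))
        if not chunk:
            raise EOFError("stream ended in the middle of a frame")
        buf += chunk
    return bytes(buf)


def read_frames(stream):
    """Yield (type, payload) pairs front to back, never seeking."""
    while True:
        first = stream.read(1)
        if not first:
            return  # clean end of stream between frames
        header = first + read_exact(stream, HEADER.size - 1)
        img_type, length = HEADER.unpack(header)
        yield img_type, read_exact(stream, length)
```

The same functions work on any file-like object, for example the one returned
by a socket's makefile("rwb"), so all images can share one connection. The
receiver still sees them strictly in order, which is why the restore code must
stop relying on seek first.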
> Adrian
Thanks,
Pavel