[CRIU] Process Migration Using Sockets

Fri Jul 31 05:35:53 PDT 2015

On 07/31/2015 04:06 AM, Rodrigo Bruno wrote:
> On Thu, 30 Jul 2015 18:04:20 +0300
> Pavel Emelyanov <xemul at parallels.com> wrote:
> 
>> On 07/30/2015 03:42 AM, Rodrigo Bruno wrote:
>>> Hi,
>>>
>>> I am using CRIU and I extended it to support process live migration using sockets.
>>
>> Have you looked at the p.haul stuff we use for the same?
> 
> No, I will take a look.

Yup. It's not yet 100% functional, but it demonstrates the intention.
https://github.com/xemul/p.haul

>>
>>> The idea is to write to a file descriptor which corresponds to a socket instead of a file.
>>
>> You mean write the image files into a socket, don't you?
> 
> Yes.

OK. Yes, this is a cool feature that is sometimes asked about.
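
For illustration only, here is a minimal sketch of pushing named image
files through a socket. The length-prefixed framing and the helper names
(send_image, recv_image) are assumptions made up for this example; they
are not CRIU's image format and not code from the patch under discussion.

```python
import socket
import struct

# Hypothetical framing: 2-byte name length, 8-byte payload length (network order).
HEADER = struct.Struct("!HQ")


def send_image(sock, name, data):
    """Write one image file (name + contents) to a connected socket."""
    encoded = name.encode()
    sock.sendall(HEADER.pack(len(encoded), len(data)) + encoded + data)


def _recv_exact(sock, n):
    """Read exactly n bytes, or raise if the peer closes early."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("socket closed mid-message")
        buf.extend(chunk)
    return bytes(buf)


def recv_image(sock):
    """Read one framed image file and return (name, contents)."""
    name_len, data_len = HEADER.unpack(_recv_exact(sock, HEADER.size))
    name = _recv_exact(sock, name_len).decode()
    return name, _recv_exact(sock, data_len)
```

With a connected pair of sockets, send_image(a, "pages.img", data) on one
end is matched by recv_image(b) on the other. Framing each file with its
name lets several image files share one stream.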

>>
>>> The amount of code needed for this modification is very small.
>>>
>>> So far my experiments are running smoothly. With this, I do not need a background NFS
>>> deployment and performance is much better. The user only needs to specify, in the command
>>> line args, that this migration is done using sockets. For now I am using SSH tunnels to
>>> redirect and cipher the data between different hosts.
>>>
>>> I don't know if this will break any other functionality though.
>>
>> Well, if it's all about image files, then two things to keep in mind.
>>
>> First, the contents of the pages.img files can already be sent to
>> sockets using page server (http://criu.org/Disk-less_migration).
> 
> I only realized that after having my solution working... However, as far as I 
> understood, this does not give a full live migration because other img files 
> still need to get transferred.

:)

>>
>> Second, image objects are read from images in a different order from the
>> one they were written in. So right now it's not easily possible to
>> pipe-line CRIU dump into CRIU restore.
> 
> Right. I solved this problem with in-memory file caches (separate process) that hold files' 
> contents. On the dump side, the cache receives the files' contents and simply redirects it 
> to the restore side cache.

Do you hold them in some hand-made cache, or use tmpfs for this?

> The restore side cache holds all files in memory until they are 
> requested by the CRIU restore process. This enables multiple files to be sent concurrently,
> allowing the restore mechanism to start while the dump mechanism is still running.

That's interesting :) So you also "lock" reading from images when more objects
are requested, but they have not yet arrived, don't you?
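
To make the question concrete, here is a rough sketch of a restore-side
cache whose reads block until the requested image has arrived. The class
and method names are hypothetical, and the image names in the demo are
only illustrative. It is not taken from the implementation being
discussed.

```python
import threading


class ImageCache:
    """Hypothetical in-memory cache: image contents keyed by file name."""

    def __init__(self):
        self._images = {}
        self._cond = threading.Condition()

    def put(self, name, data):
        """Called as images arrive from the dump side, in any order."""
        with self._cond:
            self._images[name] = data
            self._cond.notify_all()

    def get(self, name, timeout=None):
        """Called by the restore side; blocks until the image is present."""
        with self._cond:
            arrived = self._cond.wait_for(lambda: name in self._images, timeout)
            if not arrived:
                raise TimeoutError(name)
            return self._images[name]


def demo():
    cache = ImageCache()
    # One image is already there; the one restore wants first arrives later.
    cache.put("pages.img", b"pages")
    late = threading.Timer(0.05, cache.put, args=("pstree.img", b"tree"))
    late.start()
    data = cache.get("pstree.img", timeout=5)  # waits for the late arrival
    late.join()
    return data
```

Because get() waits on a condition rather than failing, the reader can ask
for images in its own order while the writer is still producing them.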

>>
>>> I am sending this mail to ask you if this contribution is of any interest for the project.
>>
>> Of course!
>>
>>> If it is, I will be glad to help, providing a patch or whatever you need.
>>
>> Sure! The patch is always welcome.
> 
> I will wrap up my modifications and submit a patch. =)

Looking forward to seeing them :)

-- Pavel