[CRIU] [PATCH 01/10] p.haul: implement migration over existing connections

Pavel Emelyanov xemul at parallels.com
Wed Oct 21 05:12:56 PDT 2015


On 10/19/2015 11:55 PM, Tycho Andersen wrote:
> On Mon, Oct 19, 2015 at 11:39:35AM +0300, Pavel Emelyanov wrote:
>> On 10/15/2015 10:20 PM, Tycho Andersen wrote:
>>> On Thu, Oct 15, 2015 at 12:21:35PM +0300, Pavel Emelyanov wrote:
>>>> On 10/14/2015 10:27 PM, Tycho Andersen wrote:
>>>>> Hi Nikita,
>>>>>
>>>>> Thanks for this work, it will be very useful for us.
>>>>>
>>>>> On Fri, Oct 09, 2015 at 09:11:33PM +0400, Nikita Spiridonov wrote:
>>>>>> Remove standalone mode; p.haul now works only over existing
>>>>>> connections, specified via command line arguments as file
>>>>>> descriptors.
>>>>>>
>>>>>> Three arguments are required: --fdrpc for RPC calls, --fdmem for
>>>>>> c/r image migration, and --fdfs for disk migration. Each file
>>>>>> descriptor is expected to represent a socket opened in blocking
>>>>>> mode, with domain AF_INET and type SOCK_STREAM.
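
For the record, the caller side then looks roughly like the sketch
below. This is just an illustration: the host, ports and the exact
p.haul argument order are made up, and subprocess's pass_fds needs
Python 3.2+.

import socket
import subprocess

def connect_blocking(host, port):
    # p.haul expects blocking AF_INET/SOCK_STREAM sockets
    sk = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sk.setblocking(True)
    sk.connect((host, port))
    return sk

rpc_sk = connect_blocking("dst.example.com", 9999)   # --fdrpc channel
mem_sk = connect_blocking("dst.example.com", 9998)   # --fdmem channel
fs_sk = connect_blocking("dst.example.com", 9997)    # --fdfs channel

# Hand the raw fds to p.haul; pass_fds keeps them open across exec.
subprocess.check_call(
    ["p.haul", "lxc", "myct",
     "--fdrpc", str(rpc_sk.fileno()),
     "--fdmem", str(mem_sk.fileno()),
     "--fdfs", str(fs_sk.fileno())],
    pass_fds=(rpc_sk.fileno(), mem_sk.fileno(), fs_sk.fileno()))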
>>>>>
>>>>> Do we have to require --fdfs here for anything? I haven't looked
>>>>> through the code to see why exactly it is required.
>>>>
>>>> The --fdfs socket is needed to copy the filesystem, but (!) only
>>>> when that is actually required. If the storage the container's
>>>> files live on is shared, this fd effectively goes unused.
>>>>
>>>> I think we can do it like this: one can omit the parameter, but if
>>>> the htype driver says that fs migration _is_ required, then p.haul
>>>> will fail with the error "no data channel for fs migration". Does
>>>> this sound OK to you?
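
I.e., roughly this check inside p.haul (a sketch only; the names here
are hypothetical, not actual p.haul code):

# --fdfs becomes optional; complain only when the htype driver
# actually needs a filesystem data channel.
if htype.fs_migration_needed() and fdfs is None:
    raise Exception("no data channel for fs migration")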
>>>
>>> Yep, that sounds fine.
>>>
>>>>> In LXD (and I guess openvz as well, with your ploop patch) we are
>>>>> managing our own storage backends, and have our own mechanism for
>>>>> transporting the rootfs. 
>>>>
>>>> Can you shed more light on this? :) If there's some backend that we
>>>> could use as well, maybe it would make sense to put the migration
>>>> code into p.haul?
>>>
>>> Right now we have backends for zfs, lvm, btrfs, and just a regular
>>> directory on a filesystem. I'm not aware of us planning support for
>>> any other backends right now, but it's not out of the question.
>>> Additionally, we also want to migrate a container's snapshots when we
>>> migrate the container, which requires something to know about how we
>>> handle snapshotting for these various storage backends as well.
>>
>> Yup, pretty much the same for us :)
>>
>>> We also support non-live copying of containers, so we need the code
>>> even without p.haul, and ideally it would be good not to maintain it
>>> in two places, but,
>>
>> You mean off-line copying of a container across nodes?
> 
> Yep exactly.
> 
>>>>> Ideally, I could invoke p.haul over an fd to
>>>>> just do the criu iterative piece, and potentially do some callbacks to
>>>>> tell LXD when the process is stopped so that we can do a final fs
>>>>> sync.
>>>>
>>>> The fs sync issue is tightly coupled with the memory migration
>>>> iterations; that's why I planned to put all this stuff into p.haul.
>>>> If, while you are doing the final fs sync, the amount of memory to
>>>> be copied increases, it might make sense to do one more pre-copy
>>>> iteration. Without p.haul having full control over both (memory and
>>>> fs), that's hardly possible.
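
To be concrete, the control loop I have in mind is roughly this (pure
pseudocode, every name in it is hypothetical):

# Combined pre-copy loop: memory and fs deltas together drive the
# decision to do one more iteration or to freeze.
while True:
    mem_left = memory_precopy_iteration()   # e.g. a criu pre-dump pass
    fs_left = fs_sync_iteration()           # rsync / zfs send / ...
    if mem_left < MEM_THRESHOLD and fs_left < FS_THRESHOLD:
        break                               # cheap enough to freeze now
freeze_container()
dump_memory_final()
sync_fs_final()    # without p.haul controlling this step, the
                   # "one more iteration?" decision cannot be made
restore_on_destination()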
>>>
>>> What about passing p.haul a socket and inventing a messaging protocol?
>>> Then p.haul could ask LXD (or whoever) to sync the filesystem, but
>>> also report any errors during migration better than just exit(1).
>>
>> Let's try. Could you suggest what such a protocol might look like?
> 
> What about something like,
> 
> enum phaulmsgtype {
> 	ERROR		= 0;
> 	SYNCFS		= 1;
> 	SUCCESS		= 2;
> 	UNSUP		= 3;
> 	/* other message types as necessary */
> }
> 
> message phaul {
> 	required phaulmsgtype	type		= 1;
> 
> 	/* for ERROR and SUCCESS, perhaps just the contents of the
> 	 * CRIU log?
> 	 */
> 	optional string		message		= 2;
> }
> 
> which you pass to p.haul via a --msgfd. I can think of a few ways it
> could work:
> 
> * if you pass msgfd, your client always has to move the filesystem.
>   This seems a little ugly though, as getting the logs (and not just
>   p.haul's exit code) may be useful to others, who shouldn't have to
>   know how p.haul drives CRIU just to find the logs.
> 
> * when you pass msgfd, p.haul will send a SYNCFS message. If it gets
>   an UNSUP message back, it falls back to the htype driver's storage
>   backend (or fails if that also fails). If fs sync is supported, the
>   p.haul caller sends either a SUCCESS or an ERROR message depending
>   on what happened.
> 
> Does that make sense? I haven't looked at the p.haul code much, so I
> could be totally off base.

At first glance, it does. Then we need some fs_haul_external.py module
that p_haul_lxc would return, and which would take care of communicating
with the caller about moving the FS around :)
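
On the caller's (LXD's) side that would boil down to something like the
sketch below. I'm assuming a 4-byte length prefix for framing on the
msgfd socket; in reality the payload would be the protobuf message
above, but here I fake the serialization with a 4-byte type field just
to show the flow:

import struct

# Message types, matching the enum proposed above
ERROR, SYNCFS, SUCCESS, UNSUP = 0, 1, 2, 3

def recv_exactly(sk, n):
    buf = b""
    while len(buf) < n:
        chunk = sk.recv(n - len(buf))
        if not chunk:
            raise IOError("msgfd connection closed")
        buf += chunk
    return buf

def make_phaul_msg(msg_type, text=""):
    # stand-in for protobuf serialization of the 'phaul' message
    return struct.pack(">I", msg_type) + text.encode("utf-8")

def parse_phaul_msg(data):
    # stand-in for protobuf parsing; returns (type, message)
    msg_type, = struct.unpack(">I", data[:4])
    return msg_type, data[4:].decode("utf-8")

def recv_msg(sk):
    size, = struct.unpack(">I", recv_exactly(sk, 4))
    return recv_exactly(sk, size)

def send_msg(sk, msg_type, text=""):
    data = make_phaul_msg(msg_type, text)
    sk.sendall(struct.pack(">I", len(data)) + data)

def serve_msgfd(sk, sync_rootfs):
    msg_type, _ = parse_phaul_msg(recv_msg(sk))
    if msg_type == SYNCFS:
        try:
            sync_rootfs()          # LXD's own storage backend callback
            send_msg(sk, SUCCESS)
        except Exception as e:
            send_msg(sk, ERROR, str(e))
    # a caller that doesn't handle fs itself would answer UNSUP

And the fs_haul_external driver on the p.haul side would be the mirror
image: send SYNCFS at the right point of the migration, wait for
SUCCESS/ERROR, and fall back to the htype backend on UNSUP.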

-- Pavel


