[CRIU] [PATCH v2 5/5] UFFD: Support lazy-pages restore between two hosts
Pavel Emelyanov
xemul at virtuozzo.com
Tue Mar 29 02:42:27 PDT 2016
On 03/29/2016 08:58 AM, Mike Rapoport wrote:
> On Mon, Mar 28, 2016 at 06:28:54PM +0300, Pavel Emelyanov wrote:
>> On 03/24/2016 06:52 PM, Adrian Reber wrote:
>>> From: Adrian Reber <areber at redhat.com>
>>
>> Here are my comments on the patch :) I probably should have sent them earlier,
>> so sorry for the not-so-fast response.
>>
>>> This enhances lazy-pages mode to work with two different hosts. Instead
>>> of lazily restoring a process on the same host, this makes it possible to
>>> keep the memory pages on the source system and transfer them only on
>>> demand from the source to the destination system.
>>>
>>> The previous, single-host lazy restore consisted of two processes:
>>>
>>> criu restore --lazy-pages --address /path/to/unix-domain-socket
>>>
>>> and
>>>
>>> criu lazy-pages --address /path/to/unix-domain-socket
>>
>> I would say it's OK to have a separate command to start the lazy server.
>> Mike's suggestion to spawn the server automatically after restore also
>> makes sense, and I will accept such a patch, but for debugging purposes I'd
>> keep the separate lazy-pages action.
>>
>>> The unix domain socket was used to transfer the userfault FD (UFFD) from
>>> the 'criu restore' process to the 'criu lazy-pages' process. The 'criu
>>> lazy-pages' process then listened on the UFFD for userfaultfd messages.
>>> For each message it retrieved the requested memory page from the
>>> checkpoint directory and transferred that page into the process being
>>> restored.
>>>
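For readers unfamiliar with descriptor passing, the handoff described above can be sketched as follows. This is only an illustration of SCM_RIGHTS passing over a unix domain socket, not CRIU's code: a socketpair stands in for the socket at /path/to/unix-domain-socket, a pipe stands in for the UFFD, and the helper names send_fd and recv_fd are hypothetical. It assumes Python 3.9+ on a Unix system.

```python
import os
import socket


def send_fd(sock, fd):
    """Send one descriptor as SCM_RIGHTS ancillary data (hypothetical helper)."""
    # At least one byte of ordinary payload must accompany the ancillary data.
    socket.send_fds(sock, [b"F"], [fd])


def recv_fd(sock):
    """Receive one descriptor sent by send_fd (hypothetical helper)."""
    _msg, fds, _flags, _addr = socket.recv_fds(sock, 1, 1)
    return fds[0]


# Stand-ins: a socketpair plays the unix domain socket shared by the
# "restore" and "lazy-pages" sides, and a pipe plays the userfault FD.
restore_side, lazy_side = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
r, w = os.pipe()

send_fd(restore_side, r)      # the "restore" side hands the descriptor over
received = recv_fd(lazy_side) # the "lazy-pages" side gets its own duplicate

# Data written to the pipe is readable through the received descriptor,
# showing that both numbers refer to the same open file description.
os.write(w, b"fault")
data = os.read(received, 5)
```

The receiver gets a new descriptor number that refers to the same kernel object, which is why the receiving process can listen on a descriptor it did not open itself.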
>>> This commit introduces the ability to keep the pages on the remote host
>>> and only request the transfer of the required pages over TCP on demand.
>>> Therefore criu needs to be started differently than previously.
>>>
>>> Host1:
>>>
>>> criu restore --lazy-pages --address /path/to/unix-domain-socket
>>>
>>> and
>>>
>>> criu lazy-pages --address /path/to/unix-domain-socket \
>>> --lazy-client ADDR-Host2 --port 27
>>>
>>> Host2:
>>>
>>> criu lazy-pages --lazy-server --port 27
>>
>> And this patch definitely requires tuning. First, as Mike noticed, we already
>> have the code that makes dump send pages over the network -- the dump
>> --page-server makes this work, so for the server side I'd just extend the
>> dump action with the --lazy-pages option that would feed _more_ data into
>> page_server_xfer after dump.
>
> For actual post-copy, the dump action needs to remain active until all the
> pages are transferred to the destination. Moreover, the dump action should be
> able to respond to requests rather than feed more data to the page_server
> running on the destination node. I think page_server should gain some
> symmetry and the ability to run on the source node.
This fits well with the page_xfer driver :)
>> For the destination side, according to https://criu.org/Userfaultfd I planned
>> to add a 3rd page_read driver that gets memory from the network, and to teach
>> the lazy_pages action (and daemon) to use this page_read.
>
> Again, this 3rd page_reader driver will need a peer on the source side that
> will be able to respond to random page requests :)
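A minimal sketch of such a request/response peer, under stated assumptions: the wire format (an 8-byte big-endian page address answered by exactly one 4096-byte page) and the names serve_pages and fetch_page are hypothetical and are not CRIU's page_xfer or page_read protocol. A local socketpair stands in for the TCP connection between the two hosts.

```python
import socket
import struct
import threading

PAGE_SIZE = 4096
REQ = struct.Struct("!Q")  # hypothetical request: one page-aligned address


def recv_exact(sock, n):
    """Read exactly n bytes or raise ConnectionError if the peer closes."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        buf += chunk
    return bytes(buf)


def serve_pages(sock, pages):
    """Source side: answer page requests in any order until the peer closes."""
    try:
        while True:
            (addr,) = REQ.unpack(recv_exact(sock, REQ.size))
            sock.sendall(pages[addr])
    except ConnectionError:
        pass
    finally:
        sock.close()


def fetch_page(sock, addr):
    """Destination side: request one page and block until it arrives."""
    sock.sendall(REQ.pack(addr))
    return recv_exact(sock, PAGE_SIZE)


# Stand-in for the dumped memory kept on the source host.
pages = {0x1000: b"a" * PAGE_SIZE, 0x5000: b"b" * PAGE_SIZE}

src, dst = socket.socketpair()  # local stand-in for the TCP link
server = threading.Thread(target=serve_pages, args=(src, pages))
server.start()

# Requests arrive in fault order, not address order.
second = fetch_page(dst, 0x5000)
first = fetch_page(dst, 0x1000)

dst.close()
server.join()
```

The point of the sketch is the direction of control: the destination decides which page it needs next, so the source side has to stay alive and serve arbitrary addresses rather than stream pages in a fixed order.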
-- Pavel