[CRIU] [PATCH v2 5/5] UFFD: Support lazy-pages restore between two hosts

Pavel Emelyanov xemul at virtuozzo.com
Mon Mar 28 08:17:13 PDT 2016


On 03/28/2016 04:04 PM, Mike Rapoport wrote:
> On Mon, Mar 28, 2016 at 01:19:00PM +0300, Pavel Emelyanov wrote:
>> On 03/28/2016 10:12 AM, Mike Rapoport wrote:
>>> Hi Adrian,
>>>
>>> On Thu, Mar 24, 2016 at 03:52:54PM +0000, Adrian Reber wrote:
>>>> From: Adrian Reber <areber at redhat.com>
>>>>
>>>> This enhances lazy-pages mode to work across two different hosts. Instead
>>>> of lazily restoring a process on the same host, this makes it possible to
>>>> keep the memory pages on the source system and transfer them only on
>>>> demand from the source to the destination system.
>>>>
>>>> The previous, single-host lazy restore consisted of two processes:
>>>>
>>>>  criu restore --lazy-pages --address /path/to/unix-domain-socket
>>>>
>>>> and
>>>>
>>>>  criu lazy-pages --address /path/to/unix-domain-socket
>>>>
>>>> The unix domain socket was used to transfer the userfault FD (UFFD) from
>>>> the 'criu restore' process to the 'criu lazy-pages' process. The 'criu
>>>> lazy-pages' was then listening on the UFFD for userfaultfd messages
>>>> which were used to retrieve the requested memory page from the
>>>> checkpoint directory and transfer that page into the process to be
>>>> restored.
>>>>
>>>> This commit introduces the ability to keep the pages on the remote host
>>>> and only request the transfer of the required pages over TCP on demand.
>>>> Therefore criu needs to be started differently than previously.
>>>>
>>>> Host1:
>>>>
>>>>    criu restore --lazy-pages --address /path/to/unix-domain-socket
>>>>
>>>>   and
>>>>
>>>>    criu lazy-pages --address /path/to/unix-domain-socket \
>>>>    --lazy-client ADDR-Host2 --port 27
>>>>
>>>> Host2:
>>>>
>>>>    criu lazy-pages --lazy-server --port 27
>>>>
>>>> On Host1 the process is now restored (as criu always does) except that
>>>> the memory pages are not read from pages.img and that the appropriate
>>>> pages are marked as being userfaultfd handled. As soon as the restored
>>>> process tries to access one of the pages, a UFFD MSG is received by the
>>>> lazy-client (on Host1). This UFFD MSG is then transferred via TCP to the
>>>> lazy-server (on Host2). The lazy-server retrieves the memory page from
>>>> the local checkpoint and returns a UFFDIO COPY answer back to the
>>>> lazy-client, which can then forward this message to the local UFFD, which
>>>> inserts the page into the restored process.
>>>>
>>>> The remote lazy restore also shares the fallback behavior of the local
>>>> lazy restore: if no more messages arrive for 5 seconds on the socket
>>>> waiting for UFFD MSG, it switches to copy-remaining-pages mode, where
>>>> all pages not yet requested via UFFD are transferred into the restored
>>>> process.
>>>>
>>>> TODO:
>>>>   * Create from the checkpoint directory a checkpoint without the memory
>>>>     pages which are UFFD handled. This would enable a real UFFD remote
>>>>     restore where the UFFD pages do not need to be transferred to the
>>>>     destination host.
>>>>
>>>> Signed-off-by: Adrian Reber <areber at redhat.com>
>>>> ---
>>>>  criu/uffd.c | 269 +++++++++++++++++++++++++++++++++++++++++++++++++++++-------
>>>>  1 file changed, 240 insertions(+), 29 deletions(-)
>>>  
>>> I have some concerns regarding the proposed design. It could be that I'm
>>> jumping in late, and you've already had discussions about how to use
>>> userfaultfd in CRIU, but I'll share my thoughts anyway...
>>>
>>> I think that post-copy migration in CRIU may be implemented without
>>> creating so many entities taking care of different sides of userfaultfd.
>>> The restore part may create a daemon for userfault handling transparently
>>> to the user, and the dump side may be enhanced with the ability to send
>>> pages on demand, again without starting another process explicitly.
>>>
>>> For instance, using 'criu dump --lazy-pages' would do everything required
>>> to serve pages for both on demand requests and for background copying and
>>> 'criu restore --lazy-pages' would be able to handle userfaultfd.
>>
>> I don't disagree with that, but this all applies to the API part of the
>> feature. In particular, the lazy pages daemon itself would be required,
>> the question is how to start one. The current code starts one explicitly,
>> you propose to fork() it as part of the restore action, which is
>> also fine.
> 
> It's not just who does fork(), the user from shell or criu from code.
> For instance, criu already has a page-server that can receive pages. Maybe
> instead of adding another page server to uffd.c it would be worth teaching
> the existing page-server to send pages.

Of course it should.

>>> Regarding the patch itself, I believe that mixing socket and uffd handling
>>> in the same functions is not a very good idea. Although there are lots of
>>> similarities between their behaviour, socket and uffd are semantically
>>> different and should be handled by different code paths.
>>
>> Can you clarify this more? The same daemon should poll on two descriptors --
>> the uffd one to handle #PFs from restored tree and the socket to read
>> pages from the source node.
> 
> As far as I understood Adrian's patch, one daemon that runs on dst polls
> for uffd and the daemon that runs on src polls for the socket. In addition,
> the daemon that runs on dst may communicate with the daemon that runs on
> src to retrieve the pages from there.

That seems to be correct.
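
To make the "poll on two descriptors" point concrete, here is a minimal
sketch of a wait step that only reports which descriptor became readable,
so the uffd and socket cases can be dispatched to separate handlers. The
names wait_for_event, EV_UFFD, EV_SOCK, EV_TIMEOUT and EV_ERROR are
hypothetical and do not come from criu/uffd.c; the function treats both
descriptors as plain pollable fds and assumes nothing about the message
format on either of them.

```c
#include <poll.h>
#include <unistd.h>

/* Hypothetical event kinds; these names are not taken from criu/uffd.c. */
enum lazy_event {
	EV_ERROR = -1,
	EV_TIMEOUT = 0,
	EV_UFFD = 1,
	EV_SOCK = 2,
};

/*
 * Wait until either the userfaultfd or the page socket becomes readable,
 * or until timeout_ms expires. Only the source of the event is reported;
 * reading and handling the message is left to a per-descriptor handler.
 */
static enum lazy_event wait_for_event(int uffd, int sock, int timeout_ms)
{
	struct pollfd fds[2] = {
		{ .fd = uffd, .events = POLLIN },
		{ .fd = sock, .events = POLLIN },
	};
	int ret = poll(fds, 2, timeout_ms);

	if (ret < 0)
		return EV_ERROR;
	if (ret == 0)
		return EV_TIMEOUT;
	/* If both are ready, page faults are reported first. */
	if (fds[0].revents & POLLIN)
		return EV_UFFD;
	if (fds[1].revents & POLLIN)
		return EV_SOCK;
	/* POLLHUP, POLLERR or POLLNVAL without readable data. */
	return EV_ERROR;
}
```

A caller could treat EV_TIMEOUT with a 5000 ms timeout as the cue to switch
to copying the remaining pages, which is the fallback the patch description
mentions; that mapping is an assumption of this sketch, not something the
patch is known to do this way.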

> IMHO, using the same handle_requests for both daemons makes it more
> difficult to follow the way each one of the three cases is handled. Maybe
> a finer-grained division into smaller functions would help both to
> avoid code duplication and to give a clearer code structure.

Yes, the exact implementation is worth refining. I was also about to send my
suggestions to the patch :)

-- Pavel