[CRIU] [PATCH] UFFD: Support lazy-pages restore between two hosts

Pavel Emelyanov xemul at virtuozzo.com
Thu Mar 24 06:16:25 PDT 2016


On 03/24/2016 01:27 PM, Adrian Reber wrote:
> From: Adrian Reber <areber at redhat.com>
> 
> This enhances lazy-pages mode to work with two different hosts. Instead
> of lazily restoring a process on the same host, this makes it possible
> to keep the memory pages on the source system and transfer them to the
> destination system only on demand.
> 
> The previous, single-host lazy restore consisted of two processes:
> 
>  criu restore --lazy-pages --address /path/to/unix-domain-socket
> 
> and
> 
>  criu lazy-pages --address /path/to/unix-domain-socket
> 
> The unix domain socket was used to transfer the userfault FD (UFFD) from
> the 'criu restore' process to the 'criu lazy-pages' process. The 'criu
> lazy-pages' was then listening on the UFFD for userfaultfd messages
> which were used to retrieve the requested memory page from the
> checkpoint directory and transfer that page into the process to be
> restored.
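
A minimal sketch of the descriptor hand-off described above, assuming the usual SCM_RIGHTS mechanism for passing a file descriptor over a unix domain socket. It is not CRIU code: a pipe stands in for the userfaultfd, a socketpair stands in for the --address socket, and the function name is hypothetical. It needs Python 3.9 or later on a Unix system.

```python
import os
import socket


def pass_fd_demo() -> bytes:
    """Send a descriptor over a Unix socket and use the received copy."""
    # Stand-in for the userfaultfd: a plain pipe (assumption, for illustration).
    read_end, write_end = os.pipe()
    # Stand-in for the --address unix domain socket.
    restore_side, lazy_side = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        # "criu restore" side: ship the descriptor as SCM_RIGHTS ancillary data.
        socket.send_fds(restore_side, [b"uffd"], [write_end])
        # "criu lazy-pages" side: receive a new descriptor for the same open file.
        _msg, fds, _flags, _addr = socket.recv_fds(lazy_side, 16, 1)
        received_fd = fds[0]
        # Prove the received descriptor refers to the same pipe.
        os.write(received_fd, b"page fault event")
        os.close(received_fd)
        os.close(write_end)
        return os.read(read_end, 64)
    finally:
        os.close(read_end)
        restore_side.close()
        lazy_side.close()
```

The receiving process gets its own descriptor number, but it refers to the same open file, which is what lets a separate process listen on a descriptor created by another one.
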
> 
> This commit introduces the ability to keep the pages on the remote host
> and only request the transfer of the required pages over TCP on demand.
> Therefore criu needs to be started differently than previously.
> 
> Host1:
> 
>    criu restore --lazy-pages --address /path/to/unix-domain-socket
> 
>   and
> 
>    criu lazy-pages --address /path/to/unix-domain-socket \
>    --lazy-client ADDR-Host2 --port 27
> 
> Host2:
> 
>    criu lazy-pages --lazy-server --port 27
> 
> On Host1 the process is now restored (as criu always does) except that
> the memory pages are not read from pages.img and that the appropriate
> pages are marked as being userfaultfd handled. As soon as the restored
> process tries to access one of the pages a UFFD MSG is received by the
> lazy-client (on Host1). This UFFD MSG is then transferred via TCP to the
> lazy-server (on Host2). The lazy-server retrieves the memory page from
> the local checkpoint and returns a UFFDIO COPY answer back to the
> lazy-client which can then forward this message to the local UFFD which
> inserts the page into the restored process.
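
The request/response exchange above can be sketched roughly as follows. This is a simplified illustration, not the patch's wire format: the 8-byte address request, the raw page reply, the 4096-byte page size, and all function names are assumptions, and a socketpair stands in for the TCP connection between Host1 and Host2.

```python
import socket
import struct
import threading

PAGE_SIZE = 4096  # assumed page size


def recv_exact(sock, n):
    """Read exactly n bytes or raise if the peer closes early."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        buf.extend(chunk)
    return bytes(buf)


def serve_pages(sock, pages):
    """lazy-server side: answer each page-address request with the page data."""
    try:
        while True:
            (addr,) = struct.unpack("<Q", recv_exact(sock, 8))
            sock.sendall(pages[addr])
    except ConnectionError:
        pass  # client is done


def fetch_page(sock, fault_addr):
    """lazy-client side: page-align the faulting address and request that page."""
    page_addr = fault_addr & ~(PAGE_SIZE - 1)
    sock.sendall(struct.pack("<Q", page_addr))
    return page_addr, recv_exact(sock, PAGE_SIZE)


def demo():
    # Hypothetical checkpoint contents keyed by page address.
    pages = {0x7F0000001000: b"A" * PAGE_SIZE, 0x7F0000002000: b"B" * PAGE_SIZE}
    client, server = socket.socketpair()
    worker = threading.Thread(target=serve_pages, args=(server, pages))
    worker.start()
    result = fetch_page(client, 0x7F0000002ABC)
    client.close()
    worker.join()
    server.close()
    return result
```

In the real flow the returned page would then be handed to the local UFFD to be inserted into the restored process; the sketch stops at receiving the data.
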
> 
> The remote lazy restore behaves like the local lazy restore in one
> respect: if no more messages are received for 5 seconds on the socket
> waiting for UFFD MSG, it switches to copy-remaining-pages mode, where
> all non-UFFD-requested pages are transferred into the restored process.
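
A rough sketch of that idle-timeout fallback, assuming a select()-based wait. The function names and the 8-byte request format are hypothetical; only the 5-second default comes from the description above, and the demo uses a much shorter timeout so it finishes quickly.

```python
import select
import socket
import struct


def handle_faults(sock, all_pages, idle_timeout=5.0):
    """Serve requested pages until the socket is idle, then list the rest."""
    remaining = set(all_pages)
    served = []
    while remaining:
        ready, _, _ = select.select([sock], [], [], idle_timeout)
        if not ready:
            break  # no request within the timeout: switch to bulk copy
        data = sock.recv(8, socket.MSG_WAITALL)
        if len(data) < 8:
            break  # peer closed the connection
        (addr,) = struct.unpack("<Q", data)
        if addr in remaining:
            remaining.discard(addr)
            served.append(addr)
    # Pages never requested are transferred in one pass at the end.
    return served, sorted(remaining)


def demo():
    requester, handler = socket.socketpair()
    try:
        # One on-demand request arrives, then the socket goes quiet.
        requester.sendall(struct.pack("<Q", 0x2000))
        return handle_faults(handler, [0x1000, 0x2000, 0x3000], idle_timeout=0.1)
    finally:
        requester.close()
        handler.close()
```
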
> 
> TODO:
>   * Create from the checkpoint directory a checkpoint without the memory
>     pages which are UFFD handled. This would enable a real UFFD remote
>     restore where the UFFD pages do not need to be transferred to the
>     destination host.
> 
> Signed-off-by: Adrian Reber <areber at redhat.com>
> ---
>  criu/crtools.c            |  25 ++-
>  criu/include/cr_options.h |   2 +
>  criu/uffd.c               | 482 ++++++++++++++++++++++++++++++++--------------

Adrian,

This patch changes uffd.c quite heavily. It's not the number of lines added
that bothers me: a patch adding even 1k lines is fine when they all go in one
hunk. What I mean is the overall amount of changes made.

Is there any chance of splitting the changes into several patches, so that
they would be easier to review?

-- Pavel


