[CRIU] [PATCH] UFFD: Support lazy-pages restore between two hosts
Pavel Emelyanov
xemul at virtuozzo.com
Thu Mar 24 06:16:25 PDT 2016
On 03/24/2016 01:27 PM, Adrian Reber wrote:
> From: Adrian Reber <areber at redhat.com>
>
> This enhances lazy-pages mode to work across two different hosts. Instead
> of lazily restoring a process on the same host, the memory pages can now
> stay on the source system and are transferred to the destination system
> only on demand.
>
> The previous, single-host lazy restore consisted of two processes:
>
> criu restore --lazy-pages --address /path/to/unix-domain-socket
>
> and
>
> criu lazy-pages --address /path/to/unix-domain-socket
>
> The unix domain socket was used to transfer the userfault FD (UFFD) from
> the 'criu restore' process to the 'criu lazy-pages' process. The 'criu
> lazy-pages' process then listened on the UFFD for userfaultfd messages.
> For each message it retrieved the requested memory page from the
> checkpoint directory and transferred that page into the process being
> restored.
>
> This commit introduces the ability to keep the pages on the remote host
> and only request the transfer of the required pages over TCP on demand.
> Therefore criu needs to be started differently than before.
>
> Host1:
>
> criu restore --lazy-pages --address /path/to/unix-domain-socket
>
> and
>
> criu lazy-pages --address /path/to/unix-domain-socket \
> --lazy-client ADDR-Host2 --port 27
>
> Host2:
>
> criu lazy-pages --lazy-server --port 27
>
> On Host1 the process is now restored (as criu always does) except that
> the memory pages are not read from pages.img and that the appropriate
> pages are marked as being userfaultfd handled. As soon as the restored
> process tries to access one of the pages, a UFFD MSG is received by the
> lazy-client (on Host1). This UFFD MSG is then transferred via TCP to the
> lazy-server (on Host2). The lazy-server retrieves the memory page from
> the local checkpoint and returns a UFFDIO COPY answer to the
> lazy-client, which can then forward this message to the local UFFD, which
> inserts the page into the restored process.
>
> The remote lazy restore behaves like the local lazy restore in that, if
> no more messages are received for 5 seconds on the socket waiting for
> UFFD MSG, it switches to copy-remaining-pages mode, in which all pages
> not yet requested via UFFD are transferred into the restored process.
>
> TODO:
> * Create, from the checkpoint directory, a checkpoint without the memory
> pages that are UFFD handled. This would enable a real UFFD remote
> restore where the UFFD pages do not need to be transferred to the
> destination host.
>
> Signed-off-by: Adrian Reber <areber at redhat.com>
> ---
> criu/crtools.c | 25 ++-
> criu/include/cr_options.h | 2 +
> criu/uffd.c | 482 ++++++++++++++++++++++++++++++++--------------
Adrian,
This patch reworks uffd.c quite heavily. I don't mean the number of lines
added (it's OK for a patch to add even 1k lines when they all go in one
hunk), but the overall amount of changes made.
Is there any chance of splitting the changes into several patches, so that
they would be easier to review?
-- Pavel
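
For readers following the quoted description, here is a minimal sketch of the
flow it outlines: the lazy-client forwards each faulting address to the
lazy-server, gets the page back, and after an idle timeout copies all pages
that were never requested. This is an illustration in Python, not the C code
from the patch. The wire format (one 64-bit address per request), the function
names, the in-memory "checkpoint" dict and the queue standing in for UFFD
messages are all assumptions made for the example.

```python
import queue
import socket
import struct
import threading

PAGE_SIZE = 4096
REQ = struct.Struct("!Q")  # hypothetical wire format: one 64-bit page address


def recv_exact(sock, n):
    """Read exactly n bytes or raise if the peer closes the connection."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        buf.extend(chunk)
    return bytes(buf)


def serve_pages(sock, checkpoint):
    """Lazy-server role: answer each address request with that page."""
    try:
        while True:
            (addr,) = REQ.unpack(recv_exact(sock, REQ.size))
            sock.sendall(checkpoint[addr])
    except ConnectionError:
        pass  # the client is done and closed its end


def fetch_page(sock, addr):
    """Lazy-client role: request one page from the server."""
    sock.sendall(REQ.pack(addr))
    return recv_exact(sock, PAGE_SIZE)


def lazy_restore(sock, faults, all_addrs, idle_timeout):
    """Serve faults until idle_timeout passes, then copy remaining pages."""
    memory = {}
    while True:
        try:
            addr = faults.get(timeout=idle_timeout)  # stands in for a UFFD MSG
        except queue.Empty:
            break  # no fault within the timeout: copy-remaining mode
        if addr not in memory:
            memory[addr] = fetch_page(sock, addr)
    for addr in all_addrs:
        if addr not in memory:
            memory[addr] = fetch_page(sock, addr)
    return memory


def demo():
    # Four fake pages; a socketpair stands in for the TCP connection.
    checkpoint = {i * PAGE_SIZE: bytes([i]) * PAGE_SIZE for i in range(4)}
    client, server = socket.socketpair()
    worker = threading.Thread(target=serve_pages, args=(server, checkpoint))
    worker.start()
    faults = queue.Queue()
    faults.put(2 * PAGE_SIZE)  # the restored process touches page 2 first
    try:
        # The description uses 5 seconds; the demo uses a short timeout.
        memory = lazy_restore(client, faults, sorted(checkpoint), 0.05)
    finally:
        client.close()
        worker.join()
        server.close()
    return memory == checkpoint
```

Calling demo() runs both roles in one process and reports whether every page
ended up in the client's copy of memory, whether it arrived through a fault or
through the copy-remaining pass.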