[CRIU] [PATCH] UFFD: Support lazy-pages restore between two hosts

Adrian Reber adrian at lisas.de
Thu Mar 24 06:35:21 PDT 2016


On Thu, Mar 24, 2016 at 04:16:25PM +0300, Pavel Emelyanov wrote:
> On 03/24/2016 01:27 PM, Adrian Reber wrote:
> > From: Adrian Reber <areber at redhat.com>
> > 
> > This enhances lazy-pages mode to work with two different hosts. Instead
> > of lazily restoring a process on the same host, this makes it possible
> > to keep the memory pages on the source system and transfer them on
> > demand from the source to the destination system.
> > 
> > The previous, single-host lazy restore consisted of two processes:
> > 
> >  criu restore --lazy-pages --address /path/to/unix-domain-socket
> > 
> > and
> > 
> >  criu lazy-pages --address /path/to/unix-domain-socket
> > 
> > The unix domain socket was used to transfer the userfault FD (UFFD) from
> > the 'criu restore' process to the 'criu lazy-pages' process. The 'criu
> > lazy-pages' was then listening on the UFFD for userfaultfd messages
> > which were used to retrieve the requested memory page from the
> > checkpoint directory and transfer that page into the process to be
> > restored.
> > 
> > This commit introduces the ability to keep the pages on the remote host
> > and only request the transfer of the required pages over TCP on demand.
> > Therefore criu needs to be started differently than previously.
> > 
> > Host1:
> > 
> >    criu restore --lazy-pages --address /path/to/unix-domain-socket
> > 
> >   and
> > 
> >    criu lazy-pages --address /path/to/unix-domain-socket \
> >    --lazy-client ADDR-Host2 --port 27
> > 
> > Host2:
> > 
> >    criu lazy-pages --lazy-server --port 27
> > 
> > On Host1 the process is now restored (as criu always does) except that
> > the memory pages are not read from pages.img and that the appropriate
> > pages are marked as being userfaultfd handled. As soon as the restored
> > process tries to access one of the pages, a UFFD MSG is received by the
> > lazy-client (on Host1). This UFFD MSG is then transferred via TCP to the
> > lazy-server (on Host2). The lazy-server retrieves the memory page from
> > the local checkpoint and returns a UFFDIO COPY answer to the
> > lazy-client, which can then forward this message to the local UFFD,
> > which inserts the page into the restored process.
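
The request/response exchange between lazy-client and lazy-server described
above can be sketched roughly as follows. This is an illustration only and
not code from the patch: the wire format (a single 64-bit faulting address
answered by one raw page), the fixed 4096-byte page size, and all sketch_*
names are assumptions made for the example. In a real restore the received
page would then be handed to the kernel with the UFFDIO_COPY ioctl on the
userfault FD, which is left out here.

```c
/* Illustrative sketch only; names and wire format are hypothetical. */
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>

#define SKETCH_PAGE_SIZE 4096 /* assumed page size */

/* Hypothetical request: the page-aligned address that faulted. */
struct sketch_page_request {
	uint64_t address;
};

/* Read exactly len bytes; 0 on success, -1 on error or EOF. */
static int sketch_read_all(int fd, void *buf, size_t len)
{
	char *p = buf;

	while (len > 0) {
		ssize_t n = read(fd, p, len);
		if (n <= 0)
			return -1;
		p += n;
		len -= (size_t)n;
	}
	return 0;
}

/* Write exactly len bytes; 0 on success, -1 on error. */
static int sketch_write_all(int fd, const void *buf, size_t len)
{
	const char *p = buf;

	while (len > 0) {
		ssize_t n = write(fd, p, len);
		if (n <= 0)
			return -1;
		p += n;
		len -= (size_t)n;
	}
	return 0;
}

/* lazy-client: forward the faulting address to the lazy-server. */
int sketch_client_send_request(int sock, uint64_t fault_address)
{
	struct sketch_page_request req;

	/* Round down to the start of the page containing the fault. */
	req.address = fault_address & ~((uint64_t)SKETCH_PAGE_SIZE - 1);
	return sketch_write_all(sock, &req, sizeof(req));
}

/* lazy-client: receive one page; the caller would then UFFDIO_COPY it. */
int sketch_client_recv_page(int sock, void *page)
{
	assert(page != NULL);
	return sketch_read_all(sock, page, SKETCH_PAGE_SIZE);
}

/*
 * lazy-server: answer one request from an in-memory copy of the pages.
 * image_base is the address that image[0] corresponds to.
 */
int sketch_server_serve_one(int sock, const unsigned char *image,
			    uint64_t image_base, size_t image_len)
{
	struct sketch_page_request req;
	uint64_t off;

	if (sketch_read_all(sock, &req, sizeof(req)))
		return -1;
	if (req.address < image_base)
		return -1;
	off = req.address - image_base;
	if (off + SKETCH_PAGE_SIZE > image_len)
		return -1;
	return sketch_write_all(sock, image + off, SKETCH_PAGE_SIZE);
}
```

The client side is split into a send and a receive step so that both ends can
be exercised from one thread over a socketpair(); over TCP the same functions
would be used on a connected socket.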
> > 
> > The remote lazy restore behaves like the local lazy restore: if no
> > more messages are received for 5 seconds on the socket waiting for
> > UFFD MSG, it switches to copy-remaining-pages mode, where all pages
> > not yet requested via UFFD are transferred into the restored process.
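
The 5 second idle timeout described above could be implemented along these
lines with poll(). Again this is only a sketch under assumptions, not the
code in uffd.c: the sketch_* names are hypothetical, and each message is
assumed to be a single 64-bit address for simplicity.

```c
/* Illustrative sketch only; names and message layout are hypothetical. */
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stdint.h>
#include <unistd.h>
#include <sys/types.h>

#define SKETCH_IDLE_TIMEOUT_MS 5000 /* the 5 seconds from the description */

/* Returns 1 if fd is readable, 0 if the timeout expired, -1 on error. */
int sketch_wait_for_fault(int fd, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN };
	int ret;

	assert(timeout_ms >= 0);

	/* Retry if interrupted by a signal (this restarts the full timeout). */
	do {
		ret = poll(&pfd, 1, timeout_ms);
	} while (ret < 0 && errno == EINTR);

	if (ret < 0)
		return -1;
	if (ret == 0)
		return 0;
	return (pfd.revents & POLLIN) ? 1 : -1;
}

/*
 * Handle messages until the fd has been idle for timeout_ms.
 * Returns the number of messages handled, or -1 on error. When this
 * returns, the caller would switch to copying the remaining pages.
 */
int sketch_serve_until_idle(int fd, int timeout_ms)
{
	int handled = 0;

	for (;;) {
		uint64_t address;
		ssize_t n;
		int ready = sketch_wait_for_fault(fd, timeout_ms);

		if (ready < 0)
			return -1;
		if (ready == 0)
			return handled; /* idle: no more on-demand requests */

		n = read(fd, &address, sizeof(address));
		if (n != (ssize_t)sizeof(address))
			return -1;
		/* A real handler would fetch and insert the page here. */
		handled++;
	}
}
```

A caller would invoke sketch_serve_until_idle(fd, SKETCH_IDLE_TIMEOUT_MS) and
start transferring the remaining pages once it returns a non-negative count.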
> > 
> > TODO:
> >   * Create, from the checkpoint directory, a checkpoint that omits the
> >     memory pages which are UFFD handled. This would enable a real UFFD
> >     remote restore where the UFFD pages do not need to be transferred
> >     to the destination host up front.
> > 
> > Signed-off-by: Adrian Reber <areber at redhat.com>
> > ---
> >  criu/crtools.c            |  25 ++-
> >  criu/include/cr_options.h |   2 +
> >  criu/uffd.c               | 482 ++++++++++++++++++++++++++++++++--------------
> 
> Adrian,
> 
> This patch changes uffd.c quite heavily. I don't mean the number of lines
> added (it's OK for a patch to add even 1k lines when they all go in one
> hunk); I mean the overall amount of changes made.
> 
> Is there any chance to split the changes into several patches, so that it
> would be easier to review it?

Yes, you are right. That's also what I thought. I will try to split it
into multiple smaller changes.

		Adrian

