[CRIU] [PATCH v2 5/5] UFFD: Support lazy-pages restore between two hosts

Mike Rapoport mike.rapoport at gmail.com
Mon Mar 28 06:06:42 PDT 2016


On Mon, Mar 28, 2016 at 11:20:32AM +0200, Adrian Reber wrote:
> On Mon, Mar 28, 2016 at 10:12:39AM +0300, Mike Rapoport wrote:
> > > From: Adrian Reber <areber at redhat.com>
> > > 
> > > This enhances lazy-pages mode to work across two different hosts. Instead
> > > of lazily restoring a process on the same host, this makes it possible to
> > > keep the memory pages on the source system and transfer them to the
> > > destination system only on demand.
> > > 
> > > The previous lazy restore, limited to one host, consisted of two processes.
> > > 
> > >  criu restore --lazy-pages --address /path/to/unix-domain-socket
> > > 
> > > and
> > > 
> > >  criu lazy-pages --address /path/to/unix-domain-socket
> > > 
> > > The unix domain socket was used to transfer the userfault FD (UFFD) from
> > > the 'criu restore' process to the 'criu lazy-pages' process. The 'criu
> > > lazy-pages' was then listening on the UFFD for userfaultfd messages
> > > which were used to retrieve the requested memory page from the
> > > checkpoint directory and transfer that page into the process to be
> > > restored.
> > > 
> > > This commit introduces the ability to keep the pages on the remote host
> > > and only request the transfer of the required pages over TCP on demand.
> > > Therefore criu needs to be started differently than previously.
> > > 
> > > Host1:
> > > 
> > >    criu restore --lazy-pages --address /path/to/unix-domain-socket
> > > 
> > >   and
> > > 
> > >    criu lazy-pages --address /path/to/unix-domain-socket \
> > >    --lazy-client ADDR-Host2 --port 27
> > > 
> > > Host2:
> > > 
> > >    criu lazy-pages --lazy-server --port 27
> > > 
> > > On Host1 the process is now restored (as criu always does) except that
> > > the memory pages are not read from pages.img and that the appropriate
> > > pages are marked as being userfaultfd handled. As soon as the restored
> > > process tries to access one of the pages, a UFFD MSG is received by the
> > > lazy-client (on Host1). This UFFD MSG is then transferred via TCP to the
> > > lazy-server (on Host2). The lazy-server retrieves the memory page from
> > > the local checkpoint and returns a UFFDIO COPY answer back to the
> > > lazy-client which can then forward this message to the local UFFD which
> > > inserts the page into the restored process.
> > > 
> > > The remote lazy restore behaves like the local lazy restore: if no more
> > > messages are received for 5 seconds on the socket waiting for UFFD MSG,
> > > it switches to copy-remaining-pages mode, where all pages not requested
> > > via UFFD are transferred into the restored process.
> > > 
> > > TODO:
> > >   * Create from the checkpoint directory a checkpoint without the memory
> > >     pages which are UFFD handled. This would enable a real UFFD remote
> > >     restore where the UFFD pages do not need to be transferred to the
> > >     destination host.
> > > 
> > > Signed-off-by: Adrian Reber <areber at redhat.com>
> > > ---
> > >  criu/uffd.c | 269 +++++++++++++++++++++++++++++++++++++++++++++++++++++-------
> > >  1 file changed, 240 insertions(+), 29 deletions(-)
> >  
> > I have some concerns regarding the proposed design. It could be that I'm
> > jumping in late, and you've already had discussions about how to use
> > userfaultfd in CRIU, but I'll share my thoughts anyway...
> 
> No, there haven't been any design discussions. The patches I am submitting
> are the design. It is a design discussion by patches.
> 
> > I think that post-copy migration in CRIU may be implemented without
> > creating so many entities taking care of different sides of userfaultfd.
> > The restore part may create a daemon for userfault handling transparently
> > to the user, and the dump side may be enhanced with the ability to send
> > pages on demand, again without starting another process explicitly.
> > 
> > For instance, using 'criu dump --lazy-pages' would do everything required
> > to serve pages for both on demand requests and for background copying and
> > 'criu restore --lazy-pages' would be able to handle userfaultfd.
> 
> Something like what you described would of course be the optimal solution.
> Right now I am trying to incrementally combine criu and userfaultfd to
> reach a fully integrated lazy restore. I do not think, however, that it
> makes much sense to implement the complete solution disconnected from
> the upstream repository. Upstream (criu) changes too much and too fast
> to implement the complete solution and submit it once it has been
> finished. The small steps I am currently taking already require
> frequent rebases. So anybody is welcome to continue to better integrate
> lazy restore in the current code. To most of the people involved
> userfaultfd is a new technology, so it requires (at least for me) a better
> understanding of how it works and how it can be integrated into criu.
> 
> > Regarding the patch itself, I believe that mixing socket and uffd handling
> > in the same functions is not a very good idea. Although there are lots of
> > similarities between their behaviour, socket and uffd are semantically
> > different and should be handled by different code paths.
> 
> Currently that would mean the same code twice. Once for uffd and a
> second time for the socket. If necessary it can still be divided into
> two code paths, but I rather like the current 'symmetry' and would like
> to avoid, right now, unnecessary over-engineering.

I personally think that the 'symmetry' reduces the readability. ;-)
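
To illustrate what I mean by different code paths, here is a rough sketch.
It is not taken from the patch: uffd_read_fault(), sk_read_request() and
struct lazy_req are names made up for illustration, and the request format
on the socket is an assumption. As far as I understand, userfaultfd hands
out whole struct uffd_msg records, while a TCP socket is a byte stream whose
requests may arrive in pieces, so the two readers need different handling
of short reads.

```c
#include <errno.h>
#include <stdint.h>
#include <unistd.h>
#include <linux/userfaultfd.h>

/*
 * Hypothetical wire format for a remote page request (not from the patch).
 * A real protocol between two hosts would also have to fix the byte order.
 */
struct lazy_req {
	uint64_t addr;
};

/*
 * uffd path: messages are delivered as whole struct uffd_msg records,
 * so a short read is treated as an error instead of being retried.
 * Returns 0 and stores the faulting address, or -1 on error.
 */
int uffd_read_fault(int uffd, uint64_t *addr)
{
	struct uffd_msg msg;
	ssize_t n;

	do {
		n = read(uffd, &msg, sizeof(msg));
	} while (n < 0 && errno == EINTR);

	if (n != (ssize_t)sizeof(msg))
		return -1;
	if (msg.event != UFFD_EVENT_PAGEFAULT)
		return -1;

	*addr = msg.arg.pagefault.address;
	return 0;
}

/*
 * Socket path: the request may arrive in several chunks and has to be
 * reassembled before it can be used.
 * Returns 0 and stores the requested address, or -1 on error or EOF.
 */
int sk_read_request(int sk, uint64_t *addr)
{
	struct lazy_req req;
	size_t done = 0;

	while (done < sizeof(req)) {
		ssize_t n = read(sk, (char *)&req + done, sizeof(req) - done);

		if (n < 0 && errno == EINTR)
			continue;
		if (n <= 0)
			return -1;
		done += (size_t)n;
	}

	*addr = req.addr;
	return 0;
}
```

Both helpers return just an address, so whatever looks up and copies the
page can stay shared; only the reading of the request differs per fd type.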
 
> Depending on upstream's (Pavel's) decision I would rather see the
> current 'design' (if somebody really wants to call it that) or patches
> applied and then adapted, assuming this last patch has no obvious
> mistakes. With criu's new criu-dev branch model the userfaultfd
> integration can happen in that branch and thus it has much more public
> visibility before the final solution can be merged into the master
> branch.
> 
> Lazy restore, indeed, still needs a lot of integration into different criu
> parts.
> 
> 		Adrian

--
Sincerely yours,
Mike.

