[CRIU] [PATCH v2 5/5] UFFD: Support lazy-pages restore between two hosts
Mike Rapoport
mike.rapoport at gmail.com
Mon Mar 28 22:58:53 PDT 2016
On Mon, Mar 28, 2016 at 06:28:54PM +0300, Pavel Emelyanov wrote:
> On 03/24/2016 06:52 PM, Adrian Reber wrote:
> > From: Adrian Reber <areber at redhat.com>
>
> Here are my comments on the patch :) I probably should have sent them earlier,
> so sorry for the not-so-fast response.
>
> > This enhances lazy-pages mode to work with two different hosts. Instead
> > of lazily restoring a process on the same host, this makes it possible to
> > keep the memory pages on the source system and transfer them to the
> > destination system only on demand.
> >
> > The previous, single-host lazy restore consisted of two processes:
> >
> > criu restore --lazy-pages --address /path/to/unix-domain-socket
> >
> > and
> >
> > criu lazy-pages --address /path/to/unix-domain-socket
>
> I would say that's OK to have separate command to start the lazy server.
> Mike's suggestion to spawn the server automatically after restore also
> makes sense, I will accept such a patch, but for debugging purpose I'd
> keep the separate lazy-pages action.
>
> > The unix domain socket was used to transfer the userfault FD (UFFD) from
> > the 'criu restore' process to the 'criu lazy-pages' process. The 'criu
> > lazy-pages' was then listening on the UFFD for userfaultfd messages
> > which were used to retrieve the requested memory page from the
> > checkpoint directory and transfer that page into the process to be
> > restored.
> >
> > This commit introduces the ability to keep the pages on the remote host
> > and only request the transfer of the required pages over TCP on demand.
> > Therefore criu needs to be started differently than previously.
> >
> > Host1:
> >
> > criu restore --lazy-pages --address /path/to/unix-domain-socket
> >
> > and
> >
> > criu lazy-pages --address /path/to/unix-domain-socket \
> > --lazy-client ADDR-Host2 --port 27
> >
> > Host2:
> >
> > criu lazy-pages --lazy-server --port 27
>
> And this patch definitely requires tuning. First, as Mike noted, we already
> have the code that makes dump send pages over the network -- the dump
> --page-server makes this work, so for the server side I'd just extend the
> dump action with the --lazy-pages option that would feed _more_ data into
> page_sever_xfer after dump.

For actual post-copy, the dump action needs to remain active until all the
pages are transferred to the destination. Moreover, the dump action should be
able to respond to requests rather than feed more data to the page_server
running on the destination node. I think page_server should gain some
symmetry and the ability to run on the source node.

> For the destination side, according to https://criu.org/Userfaultfd I planned
> to see the 3rd page_read driver, that gets memory from the network and teach
> the lazy_pages action (and daemon) to use this page read.
Again, this 3rd page_reader driver will need a peer on the source side that
will be able to respond to random page requests :)
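
As a rough illustration of the request/response peer described above, here is
a minimal Python sketch, not CRIU code. The wire format (a 64-bit address plus
a 32-bit page count), the names serve_pages and fetch_pages, the dict-backed
page store, and answering unknown addresses with zero pages are all
assumptions made for this example; CRIU's actual page server protocol differs.

```python
import socket
import struct
import threading

PAGE_SIZE = 4096
# Hypothetical request header (not CRIU's format): address, number of pages.
REQ = struct.Struct("!QI")


def recv_exact(sock, n):
    """Read exactly n bytes or raise ConnectionError if the peer closes."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        buf += chunk
    return bytes(buf)


def serve_pages(sock, pages):
    """Source side: stay alive and answer random page requests."""
    while True:
        try:
            header = recv_exact(sock, REQ.size)
        except ConnectionError:
            return  # destination is done, the source may exit now
        addr, nr = REQ.unpack(header)
        for i in range(nr):
            # Assumption: pages missing from the store are sent as zeroes.
            page = pages.get(addr + i * PAGE_SIZE, bytes(PAGE_SIZE))
            sock.sendall(page)


def fetch_pages(sock, addr, nr=1):
    """Destination side: request nr pages starting at addr, on demand."""
    sock.sendall(REQ.pack(addr, nr))
    return recv_exact(sock, nr * PAGE_SIZE)


def demo():
    """Run both peers over a local socket pair standing in for TCP."""
    pages = {0x1000: b"A" * PAGE_SIZE, 0x2000: b"B" * PAGE_SIZE}
    src, dst = socket.socketpair()
    server = threading.Thread(target=serve_pages, args=(src, pages))
    server.start()
    second = fetch_pages(dst, 0x2000)      # pages are requested out of order
    both = fetch_pages(dst, 0x1000, 2)
    dst.close()
    server.join()
    src.close()
    return second, both
```

The point of the sketch is the direction of the traffic: the destination
drives the transfer by asking for arbitrary pages, and the source has to stay
around until the destination closes the connection.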
--
Sincerely yours,
Mike.