[CRIU] Remote lazy-restore design discussion
Pavel Emelyanov
xemul at virtuozzo.com
Mon Apr 4 06:06:50 PDT 2016
On 03/31/2016 05:25 PM, Adrian Reber wrote:
> Hello Pavel,
>
> after Mike asked if there have been any design discussions and after I
> am not 100% sure how the page-server fits into the remote restore, it
> seems to be a good idea to get a common understanding what the right
> implementation for remote lazy-restore should look like.
>
> I am using my implementation as a starting point for the discussion.
>
> I think we need three different processes for remote lazy restore,
> independent of how they are started. The 'destination system' is the
> system the process should be migrated to and the 'source system' is the
> system the original process was running on before the migration.
>
> 1. The actual restore process (destination system):
> This is a 'normal' restore with the difference that memory pages
> (MAP_ANONYMOUS and MAP_PRIVATE) are not copied into place but are
> marked as being handled by userfaultfd. Therefore
> a userfaultfd FD (UFFD) is opened and passed to a second process.
>
> 2. The local lazy restore UFFD handler (destination system):
> This process listens on the UFFD for userfault requests and tries to
> handle them, either by reading the required pages from a local
> checkpoint (a rather unlikely use case) or by requesting the pages
> from the remote (source) system via the network.
>
> 3. The remote lazy restore page request handler (source system):
> This process opens a network port and listens for page requests
> and reads the requested pages from a local checkpoint (or even
> better, directly from a stopped process).
Agreed. And the process #1 would eventually turn into the restored process(es).
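For illustration, here is a minimal sketch (not actual CRIU code) of what step 1 amounts to
on the restore side: the anonymous private area is left unpopulated, registered with
userfaultfd, and the descriptor is handed to the handler process. The send_fd_to_handler()
helper is hypothetical and stands for passing the fd over a unix socket with SCM_RIGHTS.

#include <fcntl.h>
#include <linux/userfaultfd.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>

/* hypothetical: ship the fd over a unix socket (SCM_RIGHTS) */
extern int send_fd_to_handler(int fd);

static int lazy_register(void *addr, unsigned long len)
{
	struct uffdio_api api = { .api = UFFD_API };
	struct uffdio_register reg = {
		.range = { .start = (unsigned long)addr, .len = len },
		.mode  = UFFDIO_REGISTER_MODE_MISSING,
	};
	int uffd;

	/* faults in [addr, addr + len) will be reported on uffd */
	uffd = syscall(__NR_userfaultfd, O_CLOEXEC | O_NONBLOCK);
	if (uffd < 0)
		return -1;
	if (ioctl(uffd, UFFDIO_API, &api) || ioctl(uffd, UFFDIO_REGISTER, &reg))
		return -1;

	/* hand the UFFD over to the lazy-pages handler (process #2) */
	return send_fd_to_handler(uffd);
}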
I would also add that process 3 should not only listen for page requests, but
also send other pages in the background. Probably the ideal process 3 should
1. Have a queue of pages to be sent (struct page_server_iov-s)
2. Fill it with pages that were not transferred (ANON|PRIVATE)
3. Start sending them one by one
4. Receive messages from process #2 that can move some items in
the queue to the top (i.e. -- the pages that are needed right now)
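As a rough, non-authoritative sketch of such a queue (the iov layout below is a
simplified stand-in, not the real struct page_server_iov): background ranges sit
on a list in dump order, and a page request from process #2 promotes the covering
range to the head so it is sent next.

#include <stdbool.h>
#include <stdint.h>

struct lazy_iov {
	uint64_t vaddr;          /* start of the range in the target task */
	uint32_t nr_pages;       /* length of the range in pages */
	struct lazy_iov *next;
};

struct send_queue {
	struct lazy_iov *head;   /* next range to push over the network */
};

/* A fault report for vaddr promotes the covering range to the front. */
static bool promote_iov(struct send_queue *q, uint64_t vaddr, uint64_t page_size)
{
	struct lazy_iov **p, *iov;

	for (p = &q->head; (iov = *p); p = &iov->next) {
		if (vaddr >= iov->vaddr &&
		    vaddr < iov->vaddr + (uint64_t)iov->nr_pages * page_size) {
			*p = iov->next;       /* unlink */
			iov->next = q->head;  /* and push to the head */
			q->head = iov;
			return true;
		}
	}
	return false;                 /* page already sent or never queued */
}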
> As this describes the solution I have implemented, it all sounds correct
> to me. In addition to handling requests for pages (processes 2. and 3.)
> both page handlers need to know how to push unrequested pages at some
> point in time to make sure the migration can finish.
>
> Looking at the page-server it is currently not clear to me how it fits
> into this scenario. Currently it listens on a network port (like process
> 3. from above) and writes the received pages to the local disk.
Not exactly. It redirects pages from the socket into a particular page_xfer. Right
now the page server process only uses the local xfer, which results in pages
being written to disk.
Also, the page server includes page_server_xfer, which is used by criu dump
to send the pages, and this thing should be used by process 3.
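To make that concrete, a rough sketch of the idea (the struct below is a simplified
stand-in for CRIU's page_xfer interface, not its actual definition): the page server
pulls a range off the socket and forwards it to whichever backend was selected, be it
the local one writing image files or a remote one pushing pages further on.

#include <stddef.h>
#include <stdint.h>

struct xfer_backend {
	/* record that [vaddr, vaddr + len) belongs to the image */
	int (*write_pagemap)(void *priv, uint64_t vaddr, size_t len);
	/* push the page payload itself (to disk, or back onto a socket) */
	int (*write_pages)(void *priv, const void *buf, size_t len);
	void *priv;
};

static int forward_range(struct xfer_backend *xfer,
			 uint64_t vaddr, const void *buf, size_t len)
{
	if (xfer->write_pagemap(xfer->priv, vaddr, len))
		return -1;
	return xfer->write_pages(xfer->priv, buf, len);
}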
> To serve as the process described above as process 3., it would need to
> learn all the functionality that has currently been implemented.
You mean the page server should be taught to work with uffd? Well, kinda yes.
When I was talking about the uffd daemon using the page server, I meant that the
uffd process (#2 in your classification) should use the page server protocol and
a new page_xfer to transfer pages between hosts. And process #3 should use the
standard page_server_xfer to transfer pages onto the remote host.
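Roughly, the fault-handling path of such a uffd daemon could look like the sketch
below; request_remote_page() is a hypothetical placeholder for a page-server-protocol
request sent to process #3 on the source host.

#include <linux/userfaultfd.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* hypothetical: ask the source host for one page and copy it into buf */
extern int request_remote_page(uint64_t vaddr, void *buf, size_t page_size);

static int handle_one_fault(int uffd, void *page_buf, size_t page_size)
{
	struct uffd_msg msg;
	uint64_t addr;

	if (read(uffd, &msg, sizeof(msg)) != sizeof(msg))
		return -1;
	if (msg.event != UFFD_EVENT_PAGEFAULT)
		return 0;                     /* nothing to do for other events */

	addr = msg.arg.pagefault.address & ~((uint64_t)page_size - 1);

	/* fetch the faulting page from the source side (process #3) */
	if (request_remote_page(addr, page_buf, page_size))
		return -1;

	/* drop the page into the restored task and wake the faulting thread */
	struct uffdio_copy copy = {
		.dst = addr,
		.src = (unsigned long)page_buf,
		.len = page_size,
		.mode = 0,
	};
	return ioctl(uffd, UFFDIO_COPY, &copy);
}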
> Instead of receiving pages and writing them to disk, it needs to
> receive page requests and read the pages from disk and send them over the network.
Why to disk? For post-copy live migration, using the disk for images should
be avoided as much as possible.
> This sounds like the opposite of what it is currently doing and,
> from my point of view, it is either a completely separate process,
> like my implementation, or all the functionality needs to be added.
> Also the logic to handle unrequested pages does not seem like
> something which the page-server can currently do or is designed to do.
>
> So, from my point of view, page-server and remote page request handler
> seem rather different in their functionality (besides being a TCP
> server). I suppose there are some points I am not seeing so I hope to
> understand the situation better from the answers to this email. Thanks.
Probably I was not correct when I used the word "page-server". I meant the
components used by it, but you thought of it as a process itself :)
-- Pavel