[CRIU] Remote lazy-restore design discussion

Wed Apr 6 00:38:19 PDT 2016

On Tue, Apr 05, 2016 at 07:04:45PM +0300, Pavel Emelyanov wrote:
> Well, how about this:
> 
> I. Dump side.
> 
> The criu dump process dumps everything but the lazy pagemaps, lazy pagemaps
> are skipped and are queued.

Agreed.

> Then criu dump spawns a daemon that opens a connection to the remote host,
> creates page_server_xfer, takes the queue of pagemaps that are to be sent
> to it and starts polling the xfer socket.

I am not sure I understand this yet. There is now a daemon running which
has access to the memory of the dumped process, and that memory is still
in its original place in the dumped process. This is something I see as
important to make sure the pages of the dumped process are copied as
seldom as possible.

Why is the daemon connecting to the restore process and not the other
way around?

From what I have done so far, it seems more logical the other way around
than you described.

 * First the process is dumped (without the lazy pages).
 * Second the dumped information is transferred (scp/rsync) to the
   destination.
 * Third, on the destination host, the process is restored in lazy pages
   mode and now the uffd page handler connects to the dump daemon on the
   source host.
 * The process is now restored.

From my current understanding it makes no sense for the dumping process
to connect to the uffd process on the destination system, as it is unknown
when that process will be available.

> When available for read, it gets a request for a particular pagemap and pops
> that one up in the queue.
> 
> When available for write it gets the next pagemap from queue and ->writepage
> one to the page_xfer.
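
To check that I read the queue handling correctly, here is a minimal sketch
in C. It assumes that "pops one up" means moving the requested pagemap to the
front of the queue so it is sent next. struct lazy_pagemap, struct lazy_queue
and the lazy_queue_*() helpers are names made up for illustration, not
existing CRIU code, and the page size is passed in by the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one queued lazy pagemap; not CRIU's real type. */
struct lazy_pagemap {
	uint64_t vaddr;            /* start address of the region */
	uint32_t nr_pages;         /* number of pages in the region */
	struct lazy_pagemap *next;
};

/* Singly linked FIFO of pagemaps still waiting to be sent. */
struct lazy_queue {
	struct lazy_pagemap *head;
	struct lazy_pagemap *tail;
};

/* Dump time: append a skipped lazy pagemap to the queue. */
static void lazy_queue_push(struct lazy_queue *q, struct lazy_pagemap *pm)
{
	assert(q && pm);
	pm->next = NULL;
	if (q->tail)
		q->tail->next = pm;
	else
		q->head = pm;
	q->tail = pm;
}

/*
 * Socket readable: a request for addr arrived. Move the pagemap covering
 * addr to the front of the queue. Returns 0 if found, -1 otherwise.
 */
static int lazy_queue_promote(struct lazy_queue *q, uint64_t addr,
			      uint64_t page_size)
{
	struct lazy_pagemap *prev = NULL, *pm;

	for (pm = q->head; pm; prev = pm, pm = pm->next) {
		uint64_t end = pm->vaddr + (uint64_t)pm->nr_pages * page_size;

		if (addr < pm->vaddr || addr >= end)
			continue;
		if (prev) {                    /* not already at the front */
			prev->next = pm->next;
			if (q->tail == pm)
				q->tail = prev;
			pm->next = q->head;
			q->head = pm;
		}
		return 0;
	}
	return -1;
}

/* Socket writable: take the next pagemap to send, or NULL if none is left. */
static struct lazy_pagemap *lazy_queue_pop(struct lazy_queue *q)
{
	struct lazy_pagemap *pm = q->head;

	if (!pm)
		return NULL;
	q->head = pm->next;
	if (!q->head)
		q->tail = NULL;
	pm->next = NULL;
	return pm;
}
```

With this reading, a request only changes the order in which pagemaps are
sent; every queued pagemap is still sent exactly once.
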
> 
> II. Restore side
> 
> The uffd daemon is spawned, it opens a port (to which dump will connect, or
> uses the opts.ps_socket provided connection, the connect_to_page_server()
> knows this), creates a hash with (pid, uffd, pagemaps) structures (called
> lazy_data below) and listens.
> 
> Restore prepares processes and mappings (Adrian's code already does this), sending
> uffd-s to the uffd daemon (already there).
> 
> The uffd daemon starts polling all uffds it has and the connection from the
> dump side.
> 
> When uffd is available for read, it gets the #PF info, then goes to the new
> page_read that sends the page_server_iov request for out-of-order page (note,
> that in case of lazy restore from images the regular page_read is used).
> 
> Most of this code is already in criu-dev from Adrian and you, but we need to
> add multi-uffd polling and lazy_data thing and the ability to handle "page
> will be available later" response from the page_read.
> 
> When dump side connection is available for reading it calls the core part of
> the page_server_serve() routine that reads from socket and handles PS_IOV_FOO
> commands. The page_xfer used in _this_ case is the one that finds the appropriate
> lazy_data and calls map + wakeup ioctls.
> 
> This part is not ready and this is what I meant when I was talking about re-using
> page-server code with new page_xfer and page_read.
> 
> Does this make sense?

I am confused about which side connects to the other side.
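
Whichever side ends up connecting, the restore-side polling over several
uffds plus the dump connection might look roughly like the sketch below. The
names struct lazy_fd, lazy_poll_ready() and LAZY_MAX_FDS are invented for
illustration and are not existing CRIU code; only poll() is the real
interface. Handling the #PF info or the PS_IOV_FOO commands is left to the
caller.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>	/* read()/close() for callers that consume ready fds */

/* Hypothetical tag telling the loop what a descriptor is used for. */
enum lazy_fd_kind {
	LAZY_FD_UFFD,		/* userfaultfd of one restored process */
	LAZY_FD_DUMP_CONN,	/* connection to the dump side */
};

struct lazy_fd {
	int fd;
	enum lazy_fd_kind kind;
};

#define LAZY_MAX_FDS 64		/* arbitrary limit for this sketch */

/*
 * Poll all descriptors once for readability. The indexes of the readable
 * entries are stored in ready[] (which must have room for nr entries).
 * Returns the number of readable entries, 0 on timeout (or if only error
 * or hangup events were reported), or -1 if poll() failed.
 */
static int lazy_poll_ready(const struct lazy_fd *fds, size_t nr,
			   int timeout_ms, size_t *ready)
{
	struct pollfd pfds[LAZY_MAX_FDS];
	size_t i;
	int n, found = 0;

	assert(fds && ready && nr <= LAZY_MAX_FDS);

	for (i = 0; i < nr; i++) {
		pfds[i].fd = fds[i].fd;
		pfds[i].events = POLLIN;
		pfds[i].revents = 0;
	}

	n = poll(pfds, (nfds_t)nr, timeout_ms);
	if (n <= 0)
		return n;

	for (i = 0; i < nr; i++)
		if (pfds[i].revents & POLLIN)
			ready[found++] = i;

	return found;
}
```

The caller would then dispatch on fds[ready[i]].kind: a readable uffd means
a page fault to look up, a readable dump connection means incoming pages.
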

		Adrian