[CRIU] Remote lazy-restore design discussion

Pavel Emelyanov xemul at virtuozzo.com
Thu Apr 7 05:37:30 PDT 2016


On 04/06/2016 10:38 AM, Adrian Reber wrote:
> On Tue, Apr 05, 2016 at 07:04:45PM +0300, Pavel Emelyanov wrote:
>> Well, how about this:
>>
>> I. Dump side.
>>
>> The criu dump process dumps everything but the lazy pagemaps; the lazy
>> pagemaps are skipped and queued.
> 
> Agreed.
> 
>> Then criu dump spawns a daemon that opens a connection to the remote host,
>> creates a page_server_xfer, takes the queue of pagemaps that are to be sent
>> to it and starts polling the xfer socket.
> 
> Not sure I understand this yet. There is now a daemon running which has
> access to the memory of the dumped process, which is still sitting in its
> original place.

Yes.

> This is something I
> see as important to make sure the pages of the dumped process are copied
> as seldom as possible.

Absolutely agree.
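
To make the dump side a bit more concrete, here's roughly how I picture the
queue of lazy pagemaps this daemon would work with. All the names below are
made up for illustration (leaning on the list/xmalloc helpers we already have
in criu); nothing of this exists yet:

/*
 * Illustration only: one queued lazy pagemap -- which task it belongs
 * to and which part of its address space was left behind at dump time.
 */
struct lazy_pagemap {
	struct list_head l;	/* linked into the lazy_pagemaps queue */
	int pid;		/* virtual pid of the dumped task */
	unsigned long addr;	/* start of the lazy region */
	unsigned long nr_pages;	/* number of pages in it */
};

static LIST_HEAD(lazy_pagemaps);

/* At dump time: don't write the pages, just remember the region. */
static int queue_lazy_pagemap(int pid, unsigned long addr,
			      unsigned long nr_pages)
{
	struct lazy_pagemap *lp = xmalloc(sizeof(*lp));

	if (!lp)
		return -1;

	lp->pid = pid;
	lp->addr = addr;
	lp->nr_pages = nr_pages;
	list_add_tail(&lp->l, &lazy_pagemaps);
	return 0;
}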

> Why is the daemon connecting to the restore process and not the other
> way around?

Well, this is how dump --page-server already works -- the dump side connects
to the restore side. So I thought that doing the symmetrical thing for lazy
pages would make sense.

> From what I have done so far it seems more logical the other way around
> than you described.
> 
>  * First the process is dumped (without the lazy pages).

And what about lazy pages? Where are they? In the dump-side images
or in memory?

>  * Second the dumped information is transferred (scp/rsync) to the
>    destination.

Note that _some_ memory contents will be sent to the page server using the
dump-connects-to-restore method.

>  * Third, on the destination host, the process is restored in lazy pages
>    mode and now the uffd page handler connects to the dump daemon on the
>    source host.

Hm... OK.

>  * The process is now restored.
> 
> From my current understanding it makes no sense for the dumping process to
> connect to the uffd process on the destination system, as it is unknown
> when it will become available.

Well, OK, from this perspective it may be useful to have the restore side
connect to the dump side. But this shouldn't affect the described model, since
it mostly describes what happens once the two sides are interconnected.

>> When available for read, it gets a request for a particular pagemap and
>> pops that one up in the queue.
>>
>> When available for write, it takes the next pagemap from the queue and
>> ->writepage-s it to the page_xfer.
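
In pseudo-C the daemon's loop around the xfer socket would then be something
like the below. read_page_request() and send_lazy_pages() are just
placeholders here, the latter being the ->writepage part mentioned above:

#include <poll.h>

/*
 * Sketch of the dump-side daemon loop: serve out-of-order requests
 * first, push the rest of the queue whenever the socket is writable.
 */
static void lazy_dump_loop(int xfer_sk, struct page_xfer *xfer)
{
	while (!list_empty(&lazy_pagemaps)) {
		struct pollfd pfd = {
			.fd	= xfer_sk,
			.events	= POLLIN | POLLOUT,
		};
		struct lazy_pagemap *lp;

		if (poll(&pfd, 1, -1) < 0)
			break;

		if (pfd.revents & POLLIN) {
			/* The restore side asks for a particular pagemap --
			 * move it to the head of the queue. */
			lp = read_page_request(xfer_sk);
			if (lp)
				list_move(&lp->l, &lazy_pagemaps);
		}

		if (pfd.revents & POLLOUT) {
			/* Nothing urgent -- push the next queued pagemap
			 * through the page_xfer. */
			lp = list_first_entry(&lazy_pagemaps,
					      struct lazy_pagemap, l);
			send_lazy_pages(xfer, lp);
			list_del(&lp->l);
			xfree(lp);
		}
	}
}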
>>
>> II. Restore side
>>
>> The uffd daemon is spawned, it opens a port (to which dump will connect, or
>> uses the opts.ps_socket provided connection, the connect_to_page_server()
>> knows this), creates a hash with (pid, uffd, pagemaps) structures (called
>> lazy_data below) and listens.
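
By lazy_data I mean roughly this (purely illustrative, the exact layout
doesn't matter):

/* Per-task state kept by the uffd daemon on the restore side. */
struct lazy_data {
	struct hlist_node hash;		/* hashed by pid */
	int pid;			/* virtual pid of the restored task */
	int uffd;			/* userfaultfd received from restore */
	struct list_head pagemaps;	/* lazy regions the task still misses */
};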
>>
>> Restore prepares processes and mappings (Adrian's code already does this), sending
>> uffd-s to the uffd daemon (already there).
>>
>> The uffd daemon starts polling all uffds it has and the connection from the
>> dump side.
>>
>> When a uffd is available for read, it gets the #PF info, then goes to the new
>> page_read that sends the page_server_iov request for the out-of-order page
>> (note that in case of lazy restore from images the regular page_read is used).
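
Reading the #PF info itself is just the standard userfaultfd protocol, roughly
the below; request_remote_page() here stands for the new page_read path and
doesn't exist yet:

#include <unistd.h>
#include <linux/userfaultfd.h>

/*
 * Read one fault event from the task's uffd and turn it into a page
 * request towards the dump side.
 */
static int handle_uffd_event(struct lazy_data *ld)
{
	unsigned long ps = sysconf(_SC_PAGESIZE);
	struct uffd_msg msg;

	if (read(ld->uffd, &msg, sizeof(msg)) != sizeof(msg))
		return -1;

	if (msg.event != UFFD_EVENT_PAGEFAULT)
		return 0;	/* nothing else is expected from the tasks */

	/* Ask the dump side for this page out of order. */
	return request_remote_page(ld, msg.arg.pagefault.address & ~(ps - 1));
}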
>>
>> Most of this code is already in criu-dev from Adrian and you, but we need to
>> add multi-uffd polling, the lazy_data thing and the ability to handle a "page
>> will be available later" response from the page_read.
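
For the multi-uffd polling I'd simply put all the uffds plus the dump-side
socket into one epoll set, along these lines. handle_remote_pages() is a
placeholder for the page_server_serve() core described below,
handle_uffd_event() is the #PF handler from the previous sketch:

#include <sys/epoll.h>

/*
 * One epoll set drives the whole daemon: the connection to the dump
 * side plus one uffd per restored task. Each uffd is registered with
 * data.ptr pointing to its lazy_data, the dump socket with a NULL ptr.
 */
static int lazy_pages_loop(int epollfd, int dump_sk)
{
	struct epoll_event evs[32];
	int i, n;

	for (;;) {
		n = epoll_wait(epollfd, evs, 32, -1);
		if (n < 0)
			return -1;

		for (i = 0; i < n; i++) {
			struct lazy_data *ld = evs[i].data.ptr;

			if (!ld)
				handle_remote_pages(dump_sk);	/* PS_IOV_FOO */
			else
				handle_uffd_event(ld);		/* #PF */
		}
	}
}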
>>
>> When the dump-side connection is available for reading, it calls the core part
>> of the page_server_serve() routine that reads from the socket and handles the
>> PS_IOV_FOO commands. The page_xfer used in _this_ case is the one that finds
>> the appropriate lazy_data and calls the map + wakeup ioctls.
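
The map + wakeup part is the usual uffdio_copy -- note that UFFDIO_COPY wakes
the faulting thread by itself unless the DONTWAKE mode is set, so a separate
UFFDIO_WAKE is only needed if we decide to copy first and wake later. A sketch:

#include <sys/ioctl.h>
#include <linux/userfaultfd.h>

/*
 * Inject a page that arrived from the dump side into the restored
 * task's address space; UFFDIO_COPY also wakes the faulting thread.
 */
static int uffd_copy_page(struct lazy_data *ld, unsigned long addr,
			  void *buf, unsigned long len)
{
	struct uffdio_copy uc = {
		.dst	= addr,
		.src	= (unsigned long)buf,
		.len	= len,
		.mode	= 0,
	};

	return ioctl(ld->uffd, UFFDIO_COPY, &uc) ? -1 : 0;
}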
>>
>> This part is not ready and this is what I meant when I was talking about
>> re-using the page-server code with a new page_xfer and page_read.
>>
>> Does this make sense?
> 
> I am still confused about which side connects to which.

OK :) Let's then try to resolve this issue.

I don't have strong arguments for the dump->restore connection, since if you
look at how p.haul works, it doesn't use this criu connect feature at all --
it passes criu a pre-established descriptor, on both sides. So the question of
which side connects to which can be solved either way.

-- Pavel


