[CRIU] Lazy-restore design discussion - round 2
Adrian Reber
adrian at lisas.de
Mon Apr 18 00:46:54 PDT 2016
It seems we have reached some kind of agreement and therefore
I am trying to summarize, from my point of view, our current discussion
results.
* On the source system there will be a process listening on a network
socket. In the first implementation it will use a checkpoint
directory as the basis for the UFFD pages and in a later version
it will transfer the pages directly from the checkpointed process.
* The transport protocol between the source system and the UFFD daemon
on the destination will be page-server based (something like Mike's patch).
* The UFFD daemon will be able to handle multiple restore requests
(also Mike's patch "lazy-pages: handle multiple processes")
* The UFFD daemon does not need a checkpoint directory to run; all
required information (e.g. PID and pages) will be transferred over
the network.
* The page-server protocol needs to be extended to transfer the
lazy-restore pages list from the source system to the UFFD daemon.
* The UFFD daemon is the instance which decides which pages are pushed
via UFFD into the restored process, and when.
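
To make the protocol-extension point above more concrete, here is a
minimal sketch of how a lazy-restore pages list might be encoded for
transfer to the UFFD daemon. Everything in it is an assumption made for
illustration: the message layout, the field sizes, and the names
encode_lazy_list and decode_lazy_list are hypothetical and are not the
existing page-server format or anything from Mike's patches.

```python
import struct
from typing import List, Tuple

# Hypothetical wire layout (an assumption, not the CRIU page-server format):
#   header: pid (u32), number of ranges (u32)
#   range:  start address (u64), number of pages (u32)
# "!" selects network byte order with no padding between fields.
HEADER = struct.Struct("!II")
RANGE = struct.Struct("!QI")

Range = Tuple[int, int]  # (start address, number of pages)


def encode_lazy_list(pid: int, ranges: List[Range]) -> bytes:
    """Serialize the PID and its lazy page ranges into one message."""
    parts = [HEADER.pack(pid, len(ranges))]
    for vaddr, nr_pages in ranges:
        parts.append(RANGE.pack(vaddr, nr_pages))
    return b"".join(parts)


def decode_lazy_list(buf: bytes) -> Tuple[int, List[Range]]:
    """Parse a message produced by encode_lazy_list."""
    pid, count = HEADER.unpack_from(buf, 0)
    offset = HEADER.size
    ranges = []
    for _ in range(count):
        ranges.append(RANGE.unpack_from(buf, offset))
        offset += RANGE.size
    return pid, ranges
```

With something along these lines the source side would send one such
message per process, and the daemon would learn both the PID and the
ranges it may later request, without needing a checkpoint directory.
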
Do we agree on these points? If yes, I would like to start implementing
it that way. Even once this works, it will still require a lot of work
on the tooling, for example how to split the lazy pages out of an
existing dump so that only the non-lazy pages are actually transferred
to the destination system.
Adrian