[CRIU] criu and userfaultfd
adrian at lisas.de
Thu Sep 17 02:52:16 PDT 2015
On Wed, Sep 09, 2015 at 02:51:04PM +0300, Pavel Emelyanov wrote:
> > Sounds like it should work, for simple cases at least. Which would be a
> > good point to know that and how it works. So I will continue/start to
> > work on userfaultfd in combination with CRIU and once I have something I
> > can update the wiki page.
> That's great! Thanks a lot!
It took me a while, but I think I have now found the right place for
userfaultfd to hook into. Right now I am registering a single page to be
handled by userfaultfd, using UFFDIO_REGISTER_MODE_MISSING.
I do this in the restorer after the memory has been remapped. This seems
to be the right place to register my memory page with userfaultfd,
because earlier the page does not exist yet and the corresponding
userfaultfd ioctl fails. In addition to registering the page with
userfaultfd, I am also calling madvise() on it with MADV_DONTNEED.
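A minimal sketch of this registration step, written as a standalone
libc program rather than restorer code (the restorer cannot use libc, so
the real code would need CRIU's own syscall wrappers). The helper names
uffd_open() and uffd_register_missing() are made up for illustration,
and the range is assumed to be page-aligned anonymous private memory:

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <linux/userfaultfd.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper: open a userfaultfd and do the UFFDIO_API
 * handshake. Returns the fd, or -1 on error (for example when the
 * kernel has no userfaultfd support or restricts its use).
 */
static int uffd_open(void)
{
	struct uffdio_api api = { .api = UFFD_API, .features = 0 };
	int fd = (int)syscall(SYS_userfaultfd, O_CLOEXEC);

	if (fd < 0)
		return -1;
	if (ioctl(fd, UFFDIO_API, &api) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}

/*
 * Hypothetical helper: register [addr, addr + len) for missing-page
 * faults, then drop whatever is currently mapped there so that the
 * next access actually faults. Returns 0 on success, -1 on error.
 */
static int uffd_register_missing(int uffd, void *addr, size_t len)
{
	struct uffdio_register reg = {
		.range = { .start = (uintptr_t)addr, .len = len },
		.mode = UFFDIO_REGISTER_MODE_MISSING,
	};

	if (ioctl(uffd, UFFDIO_REGISTER, &reg) < 0)
		return -1;
	return madvise(addr, len, MADV_DONTNEED);
}
```

Calling UFFDIO_REGISTER on an address range that is not mapped yet
fails, which matches the failure described above.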
The uffd FD is opened in sigreturn_restore() and then passed into the
restorer as an additional field of struct task_args. I am opening
the uffd FD before entering the restorer because I am also sending the
open uffd FD to another process via unix sockets. That process later uses
it to respond to the uffd page faults once the process is restored and
accesses memory handled by uffd.
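For reference, passing an open FD over a unix socket is done with an
SCM_RIGHTS control message. The following is only a generic sketch of
that mechanism; the names sketch_send_fd() and sketch_recv_fd() are
hypothetical and this is not the code from my patch:

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Properly aligned buffer for one control message carrying one int. */
union fd_cmsg_buf {
	char buf[CMSG_SPACE(sizeof(int))];
	struct cmsghdr align;
};

/* Send fd over the unix socket sock. Returns 0 on success, -1 on error. */
static int sketch_send_fd(int sock, int fd)
{
	char dummy = 'F';	/* at least one byte of real data is needed */
	struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
	union fd_cmsg_buf u;
	struct msghdr msg = {
		.msg_iov = &iov, .msg_iovlen = 1,
		.msg_control = u.buf, .msg_controllen = sizeof(u.buf),
	};
	struct cmsghdr *cmsg;

	memset(&u, 0, sizeof(u));
	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

	return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one fd from the unix socket sock. Returns the fd or -1. */
static int sketch_recv_fd(int sock)
{
	char dummy;
	struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
	union fd_cmsg_buf u;
	struct msghdr msg = {
		.msg_iov = &iov, .msg_iovlen = 1,
		.msg_control = u.buf, .msg_controllen = sizeof(u.buf),
	};
	struct cmsghdr *cmsg;
	int fd;

	memset(&u, 0, sizeof(u));
	if (recvmsg(sock, &msg, 0) != 1)
		return -1;
	cmsg = CMSG_FIRSTHDR(&msg);
	if (!cmsg || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS ||
	    cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
		return -1;
	memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
	return fd;
}
```

The receiver gets its own descriptor number referring to the same open
uffd, so it can serve faults for the restored task.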
Right now I am only marking a single page in the restored process as
being handled by uffd. The process blocks after restore when accessing
this page for the first time, and I can now insert whatever content I
want via uffd. As I am only printing the content of this page, the
restored process still works but prints out different content.
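A rough sketch of what the process holding the uffd FD does to resolve
one such fault: read the fault message, then use UFFDIO_COPY to place a
page at the faulting address, which also wakes up the blocked task. The
function name uffd_serve_one() is hypothetical, and page_size is assumed
to be the system page size (a power of two):

```c
#include <linux/userfaultfd.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: wait for one missing-page fault on uffd and
 * resolve it by copying page_size bytes from src into the faulting
 * page. Returns 0 on success, -1 on error.
 */
static int uffd_serve_one(int uffd, const void *src, uint64_t page_size)
{
	struct uffd_msg msg;
	struct uffdio_copy copy;

	/* Blocks until the restored task touches a registered page. */
	if (read(uffd, &msg, sizeof(msg)) != (ssize_t)sizeof(msg))
		return -1;
	if (msg.event != UFFD_EVENT_PAGEFAULT)
		return -1;

	/* The reported address may point anywhere inside the page. */
	copy.dst = msg.arg.pagefault.address & ~(page_size - 1);
	copy.src = (uintptr_t)src;
	copy.len = page_size;
	copy.mode = 0;		/* wake the faulting thread after the copy */
	copy.copy = 0;

	return ioctl(uffd, UFFDIO_COPY, &copy) < 0 ? -1 : 0;
}
```

Whatever src contains at that moment is what the restored process sees,
which is why the printed content can be changed freely.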
So far it seems that CRIU and userfaultfd can be combined. Before
continuing further I wanted to know if this is the right way to go.
I am marking the pages as uffd handled after the madvise() bits are
restored. Using uffd means that I will overwrite the madvise()
information. Does this need to be handled better? Or is it okay to
mark all pages with MADV_DONTNEED in the case of uffd?
From my point of view the next step would be to implement a local lazy
restore. The page server should know how to get the uffd FD from the main
restore process and then transfer the memory on request to the process being
lazily restored. This also means that the uffd FD has to be set up in
sigreturn_restore() and passed to the restorer like I am doing it right
now.
For remote lazy restore a simple version of the page server is necessary
which also gets the uffd FD but which then requests the pages over the
network before copying them to the uffd FD.
Does this sound right?