[CRIU] Lazy-restore design discussion - round 3

Mike Rapoport mike.rapoport at gmail.com
Wed Apr 20 01:23:12 PDT 2016


On Wed, Apr 20, 2016 at 08:02:57AM +0300, Pavel Emelyanov wrote:
> On 04/19/2016 10:51 PM, Adrian Reber wrote:
> > On Tue, Apr 19, 2016 at 05:03:57PM +0200, Adrian Reber wrote:
> >>>>>> Maybe we really should implement it like Mike said. First try to get the
> >>>>>> existing patches, which currently live only on my system and on Mike's,
> >>>>>> into shape, and then we can decide if we want to move the page handling
> >>>>>> logic to the dump side on the destination system.
> >>>>>
> >>>>> OK, let's see how it goes.
> >>>>>
> >>>>> But I have one concern about having the brains on the restore side. Look, the uffd
> >>>>> can request two kinds (or types) of pages -- those that tasks are blocked on in #PF
> >>>>> (i.e. explicit uffd requests) and those that tasks haven't touched yet (i.e. pages
> >>>>> requested in advance). With the former the situation is clear: it's uffd that knows
> >>>>> what these pages are. It can even know something about the latter, e.g. along with
> >>>>> #PF-ed pages it can request adjacent pages, as Adrian proposed. That's clear. But what
> >>>>> to do with the other "in advance" pages? It seems better to request those pages in an
> >>>>> LRU manner, i.e. to request recently used pages before those that were used long ago.
> >>>>> But the problem I see is that this LRU information can only be obtained from the dump
> >>>>> side -- all these LRU statistics sit _there_. And what would be the way to share
> >>>>> this knowledge with the restore side (as we plan to make it "smart" or "active")?
> >>>>>
> >>>>> If we had the "brain" (or "active part") on the dump side, we could just scan this info
> >>>>> and make the decision. But what to do when we have the "brain" on the restore side and
> >>>>> all the LRU info on the dump side?
> >>>>
> >>>> Where does the LRU information come from? Does CRIU collect this during
> >>>> dump? Or can this be queried from the kernel?
> >>>
> >>> Right now CRIU doesn't collect it; it's only present in the kernel (and can be
> >>> collected, more or less, using the referenced bit from the proc pagemap file, but we
> >>> don't do that either). My point is that this information is only present on the dump side.
> >>
> >> Good to know that this information exists. Your proposal to insert
> >> additional pages on an LRU basis makes sense. I need to think about
> >> this a bit...
> > 
> > Knowing that the LRU data lives on the dump side means that either the
> > dump side decides which pages are transmitted or we have to transmit
> > the LRU data to the uffd daemon on the restore side. Transferring the
> > LRU data to the uffd daemon sounds like it will make the whole thing
> > unnecessarily complicated. Having the logic for which pages are transferred
> > when on the dump side means that the uffd daemon on the restore side
> > does nothing more than forward pages or userfault requests. This also
> > means it doesn't need to know anything about the restored process and
> > therefore it does not need to access the checkpoint directory.
>
> There's one point that might require it to do so -- the uffd daemon will
> need to work in 2 phases. The first one is when it accepts uffds from
> the restoring processes. And the second one is when it serves #PFs
> and injects arriving pages back. We need to know where stage one ends
> and stage two starts. And we need to know when stage two ends too. All this
> might require the uffd daemon to know at least the number of processes it
> has to deal with and the lazy regions it will have to fill with data.
> In turn, this knowledge sits in the images directory, so the uffd daemon might
> still need to read it.

The checkpoint directory is required on the restore side anyway, so whether
the lazy-pages daemon should parse it or not does not seem that important to
me. Besides, if the lazy-pages daemon is fork()'ed from 'criu restore', it
will get the parsed data for free.
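
To make the two phases concrete, here is a minimal sketch in Python (not
CRIU code) of the bookkeeping such a daemon might need. The class name,
`expected_tasks` and `lazy_pages` are hypothetical; the assumption is that
both values come from the images directory, either by parsing it or by
inheriting the parsed data across fork().

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import Dict, Set


class Phase(Enum):
    ACCEPTING = auto()  # phase 1: collecting uffds from restoring tasks
    SERVING = auto()    # phase 2: handling #PFs and injecting pages
    DONE = auto()       # every lazy page has been filled


@dataclass
class LazyPagesDaemon:
    # Hypothetical inputs, assumed to be derived from the images directory.
    expected_tasks: int                # number of tasks that will send a uffd
    lazy_pages: Dict[int, Set[int]]    # pid -> page addresses still missing
    registered: Set[int] = field(default_factory=set)
    phase: Phase = Phase.ACCEPTING

    def accept_uffd(self, pid: int) -> None:
        """Phase 1: record that task `pid` handed over its uffd."""
        if self.phase is not Phase.ACCEPTING:
            raise RuntimeError("no longer accepting uffds")
        self.registered.add(pid)
        # Phase 1 ends once every expected task has registered.
        if len(self.registered) == self.expected_tasks:
            self.phase = Phase.SERVING

    def page_filled(self, pid: int, addr: int) -> None:
        """Phase 2: record that the page at `addr` was injected into `pid`."""
        if self.phase is not Phase.SERVING:
            raise RuntimeError("not serving pages in this phase")
        self.lazy_pages[pid].discard(addr)
        # Phase 2 ends once no task has missing lazy pages left.
        if not any(self.lazy_pages.values()):
            self.phase = Phase.DONE
```

Without the task count and the lazy regions the daemon cannot tell when
either phase is over, which is the reason it may still need the image data.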

> > The way
> > I see it, this would lead to a solution similar to the one implemented
> > in my first remote lazy restore patchset, only based on the
> > page-server protocol.
> > 
> > Thinking more about how the protocol needs to be implemented between
> > dump side and uffd daemon it also sounds like the uffd daemon, which is
> > now only forwarding pages and userfault requests, will/should lose the
> > ability to read local checkpoint directories. It will only be used for
> > forwarding pages. A lazy-restore with source and destination on the same
> > machine will then probably also need to forward the pages through the
> > local uffd-page-forwarding-daemon.
>
> > So the above is the result of my thoughts about what it means that the
> > LRU data only exists on the dump side. This is not (yet) what I am
> > proposing to do, it is just what I think this might lead to.
> 
> -- Pavel
> 
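For illustration, the ordering discussed in this thread (pages a task is
blocked on first, then their neighbours, then everything else by recency)
could be sketched like this in Python. This is not CRIU code; `PageQueue`,
its methods and the `pages_by_recency` input are hypothetical, and it
assumes the dump side already has some recency ordering, which CRIU does
not collect today.

```python
import heapq

# Lower value = sent earlier.
FAULT, ADJACENT, BACKGROUND = 0, 1, 2


class PageQueue:
    """Dump-side queue deciding which page to send next."""

    def __init__(self, pages_by_recency, page_size=4096):
        # pages_by_recency: page addresses, most recently used first
        # (hypothetical input standing in for real LRU statistics).
        self._page_size = page_size
        self._pending = set(pages_by_recency)
        self._heap = [(BACKGROUND, rank, addr)
                      for rank, addr in enumerate(pages_by_recency)]
        heapq.heapify(self._heap)
        self._seq = 0  # keeps FIFO order among requests of the same class

    def _push(self, prio, addr):
        # Ignore pages that were already sent or never belonged to the task.
        if addr in self._pending:
            heapq.heappush(self._heap, (prio, self._seq, addr))
            self._seq += 1

    def fault(self, addr, window=1):
        """A task is blocked on `addr`: bump it and its neighbours."""
        self._push(FAULT, addr)
        for i in range(1, window + 1):
            self._push(ADJACENT, addr + i * self._page_size)
            self._push(ADJACENT, addr - i * self._page_size)

    def pop(self):
        """Return the next page address to transfer, or None when done."""
        while self._heap:
            _, _, addr = heapq.heappop(self._heap)
            if addr in self._pending:  # skip stale duplicate entries
                self._pending.discard(addr)
                return addr
        return None
```

With the queue on the dump side, the daemon on the restore side only has to
forward fault addresses one way and pages the other way.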

