[CRIU] [PATCH RFC 8/8] criu: lazy-pages: enable remoting of lazy pages

Mon May 30 08:23:41 PDT 2016

On Mon, May 30, 2016 at 06:21:53PM +0300, Pavel Emelyanov wrote:
> On 05/30/2016 06:16 PM, Mike Rapoport wrote:
> > On Mon, May 30, 2016 at 05:59:28PM +0300, Pavel Emelyanov wrote:
> >> On 05/30/2016 03:33 PM, Mike Rapoport wrote:
> >>> On Mon, May 30, 2016 at 01:58:53PM +0300, Pavel Emelyanov wrote:
> >>>> On 05/29/2016 09:58 AM, Mike Rapoport wrote:
> >>>>> On Fri, May 27, 2016 at 10:38:00PM +0300, Pavel Emelyanov wrote:
> >>>>>> On 05/21/2016 01:49 PM, Mike Rapoport wrote:
> >>>>>>> The remote lazy pages variant can be run as follows:
> >>>>>>>
> >>>>>>> src# criu dump -t <pid> --lazy-pages --port 9876 -D /tmp/1 &
> >>>>>>
> >>>>>> This thing starts dump and lazy page server that flushes pages remotely.
> >>>>>>
> >>>>>>> src# while ! sudo fuser 9876/tcp ; do sleep 1; done
> >>>>>>> src# scp -r /tmp/1/ dst:/tmp/
> >>>>>>>
> >>>>>>> dst# criu lazy-pages --lazy-addr /tmp/uffd.sock --page-server \
> >>>>>>>                      --address dst --port 9876 -D /tmp/1 &
> >>>>>>
> >>>>>> This will start lazy pages client that would connect to dump side
> >>>>>> and ... request for pages?
> >>>>>
> >>>>> Yep, lazy-pages daemon will connect to the dump side and forward page fault
> >>>>> requests there.
> >>>>>  
> >>>>>>> dst# criu restore --lazy-pages --lazy-addr /tmp/uffd.sock -D /tmp/1
> >>>>>>
> >>>>>> One more comment inline :)
> >>>>>>
> >>>>>>> Signed-off-by: Mike Rapoport <rppt at linux.vnet.ibm.com>
> >>>>>>> ---
> >>>>>>>  criu/cr-dump.c   | 54 ++++++++++++++++++++++++++++++++++++++++++++++++++++--
> >>>>>>>  criu/page-read.c |  2 +-
> >>>>>>>  criu/uffd.c      |  9 ++++++++-
> >>>>>>>  3 files changed, 61 insertions(+), 4 deletions(-)
> >>>>>>>
> >>>>>>> diff --git a/criu/cr-dump.c b/criu/cr-dump.c
> >>>>>>> index 1a551b4..faca73b 100644
> >>>>>>> --- a/criu/cr-dump.c
> >>>>>>> +++ b/criu/cr-dump.c
> >>>>>>> @@ -1298,7 +1298,7 @@ static int dump_one_task(struct pstree_item *item)
> >>>>>>>  		}
> >>>>>>>  	}
> >>>>>>>  
> >>>>>>> -	ret = parasite_dump_pages_seized(parasite_ctl, &vmas, false);
> >>>>>>> +	ret = parasite_dump_pages_seized(parasite_ctl, &vmas, opts.lazy_pages);
> >>>>>>>  	if (ret)
> >>>>>>>  		goto err_cure;
> >>>>>>>  
> >>>>>>> @@ -1338,7 +1338,10 @@ static int dump_one_task(struct pstree_item *item)
> >>>>>>>  		goto err;
> >>>>>>>  	}
> >>>>>>>  
> >>>>>>> -	ret = parasite_cure_seized(parasite_ctl);
> >>>>>>> +	if (opts.lazy_pages)
> >>>>>>> +		ret = parasite_cure_remote(parasite_ctl);
> >>>>>>> +	else
> >>>>>>> +		ret = parasite_cure_seized(parasite_ctl);
> >>>>>>>  	if (ret) {
> >>>>>>>  		pr_err("Can't cure (pid: %d) from parasite\n", pid);
> >>>>>>>  		goto err;
> >>>>>>> @@ -1525,6 +1528,49 @@ err:
> >>>>>>>  	return cr_pre_dump_finish(ret);
> >>>>>>>  }
> >>>>>>>  
> >>>>>>> +static int cr_lazy_mem_dump(void)
> >>>>>>> +{
> >>>>>>> +	struct pstree_item *item;
> >>>>>>> +	int ret = 0;
> >>>>>>> +
> >>>>>>> +	pr_info("Lazy pages: pre-dumping memory\n");
> >>>>>>> +	for_each_pstree_item(item) {
> >>>>>>> +		struct parasite_ctl *ctl = item->parasite_ctl;
> >>>>>>> +		struct page_xfer xfer;
> >>>>>>> +
> >>>>>>> +		timing_start(TIME_MEMWRITE);
> >>>>>>> +		ret = open_page_xfer(&xfer, CR_FD_PAGEMAP, ctl->pid.virt);
> >>>>>>> +		if (ret < 0)
> >>>>>>> +			goto err;
> >>>>>>> +
> >>>>>>> +		ret = page_xfer_dump_pages(&xfer, ctl->mem_pp, 0, false);
> >>>>>>
> >>>>>> If I got the set right :) 
> >>>>>
> >>>>> Hmm, not quite ;-)
> >>>>>
> >>>>>> this place just flushes all the lazy memory into
> >>>>>> socket ignoring the requests from restore side and that's it. No?
> >>>>>
> >>>>> This place flushes all the memory *except* lazy pages into the images.
> >>>>> The buffers that are marked with PPB_LAZY are not written.
> >>>>
> >>>> Ah! Indeed. The last 'false', yes. So here we flush all the memory __but__ the
> >>>> lazy one. OK, the question is -- why not write one using existing --page-server
> >>>> dump?
> >>>
> >>> Haven't got this. Write what? The PPB_LAZY buffer?
> >>
> >> No. You say that at this point you send into the network all the pages that
> >> are not lazy. My question is -- why we pull pages till this point, instead
> >> of sending them into socket immediately, as it's done during regular dump?
> > 
> > Well, I've started from "delayed" dump and then added remoting of the
> > pages, so that's what I've got :)
> > Anyway, there should be no problem to dump non-lazy pages immediately like
> > in regular dump, it just makes __parasite_dump_pages_seized too ugly
> > because of the three cases it needs to handle :)
> 
> Yes :( But this is somehow unavoidable. The problem with pulling page-pipes
> out of parasite_dump_pages_seized is that 
> 
> a) all the memory in pipes is unswappable
> b) there's a limit on the number of pipes per process and on the pipe size
>    itself. This limits the amount of mem we can carry.
> 
> so the fewer page-pipes we have in criu, the better :)

Then maybe we should consider keeping the parasite in the dumpee and having it
handle the remote page faults (#PFs), instead of collecting the dumpee's memory
into pipes?

> -- Pavel
>
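
For reference, the limits in (b) above can be inspected from userspace. Below
is a minimal sketch, assuming Linux. The helper names pipe_capacity_bytes()
and pipe_memory_bound() are made up for illustration and are not CRIU
functions. The bound is deliberately rough: it assumes every pipe keeps its
default capacity and that nothing else consumes file descriptors.

```c
#define _GNU_SOURCE /* exposes F_GETPIPE_SZ in <fcntl.h> */
#include <assert.h>
#include <fcntl.h>
#include <sys/resource.h>
#include <unistd.h>

#ifndef F_GETPIPE_SZ
#define F_GETPIPE_SZ 1032 /* F_LINUX_SPECIFIC_BASE + 8, from linux/fcntl.h */
#endif

/* Capacity in bytes of a freshly created pipe, or -1 on error. */
long pipe_capacity_bytes(void)
{
	int fds[2];
	long cap;

	if (pipe(fds) < 0)
		return -1;

	cap = fcntl(fds[0], F_GETPIPE_SZ);
	close(fds[0]);
	close(fds[1]);
	return cap;
}

/*
 * Rough upper bound on the bytes one process can park in pipes:
 * every pipe costs two file descriptors, and the soft RLIMIT_NOFILE
 * caps how many descriptors the process may hold.
 * Returns -1 on error or if the descriptor limit is unlimited.
 */
long long pipe_memory_bound(void)
{
	struct rlimit rl;
	long cap = pipe_capacity_bytes();

	if (cap < 0 || getrlimit(RLIMIT_NOFILE, &rl) < 0)
		return -1;
	if (rl.rlim_cur == RLIM_INFINITY)
		return -1;

	return (long long)(rl.rlim_cur / 2) * cap;
}
```

A pipe can be enlarged with fcntl(F_SETPIPE_SZ), and for an unprivileged
process that is capped by /proc/sys/fs/pipe-max-size, so the real ceiling
depends on how the pipes are sized. Either way the data sitting in them stays
in kernel memory, which is point (a).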