[CRIU] [PATCH v5 3/5] Try to include userfaultfd with criu (part 1)

Adrian Reber adrian at lisas.de
Mon Mar 14 14:02:20 PDT 2016


On Mon, Mar 14, 2016 at 11:42:11AM +0300, Pavel Emelyanov wrote:
> On 03/12/2016 07:54 PM, Adrian Reber wrote:
> > On Fri, Mar 11, 2016 at 06:25:52PM +0300, Pavel Emelyanov wrote:
> >> On 03/11/2016 06:03 PM, Adrian Reber wrote:
> >>> On Fri, Mar 11, 2016 at 04:08:10PM +0300, Pavel Emelyanov wrote:
> >>>>
> >>>>> +static void criu_init()
> >>>>> +{
> >>>>> +	/* TODO: return code checking */
> >>>>> +	check_img_inventory();
> >>>>> +	prepare_task_entries();
> >>>>> +	prepare_pstree();
> >>>>> +	collect_remaps_and_regfiles();
> >>>>> +	prepare_shared_reg_files();
> >>>>> +	prepare_remaps();
> >>>>> +	prepare_mm_pid(root_item);
> >>>>> +
> >>>>> +	/* We found a PID */
> >>>>> +	pr_debug("root_item->pid.virt %d\n", root_item->pid.virt);
> >>>>> +	pr_debug("root_item->pid.real %d\n", root_item->pid.real);
> >>>>> +}
> >>>>
> >>>> This portion should be really resolved before merging. All of the above
> >>>> has nothing to do with the page_read, so please, find the reason for
> >>>> page read engine non working due to absence of this. If you need help
> >>>> with the code, just drop me an e-mail, I'll help.
> >>>
> >>> I had a quick look, but need to look a bit in more detail.
> >>>
> >>> If I leave away all those lines I get a segfault, I haven't checked yet
> >>> but I think when accessing root_item->pid.virt.
> >>
> >> Ah! Indeed. You open the page read for init task only. I believe the proper
> >> fix would be to pass the pid of the process via socket you use to pass uffd.
> >>
> >> Since we'll have to do it anyway in the future, I think this is worth doing
> >> from the very beginning. And the lazy pages daemon should accept only one
> >> such message (for your initial case).
> > 
> > Getting the information about which directory contains the checkpoint
> > from the main restore process via the same mechanism as the userfaultfd
> > FD was also my initial plan. But, unfortunately, the lazy-pages server
> > needs to open the checkpoint directory on its own. Especially as I am
> > currently working on the code for remote lazy-restore.
> 
> No no no, I'm not talking about passing the directory with images via socket,
> but about finding out the PID of the task to work on. You get this value (the
> pid) from root_item, but this value should go via socket as a raw integer,
> together with the uffd descriptor.
> 
> This line from patch #3, uffd.c file, uffd_listen() function:
> 
> > +	rc = open_page_read(root_item->pid.virt, &pr, PR_TASK);
> 
> there should not be any root_item-> dereferences, instead, the value of pid.virt
> should be sent by the criu restore here:

Ah, okay. I understand now. If I do not use root_item->pid.virt to get
the PID but take it from somewhere else (hardcoded for a quick test), my
code still behaves the same as before (segfaults, doesn't work, strange
error message) unless I call the functions we are discussing. So I do
not need them to get the PID, but without them looping over the VMAs of
the checkpoint specified with -D just doesn't work.
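
Independent of the VMA problem, the PID transfer suggested above (a raw
integer sent together with the uffd descriptor) could look roughly like
the sketch below. It is only an illustration and not code from the patch:
the helper names send_fd_with_pid() and recv_fd_with_pid() are made up
here, and it assumes a connected AF_UNIX socket between the restore
process and the lazy-pages daemon. The descriptor travels as SCM_RIGHTS
ancillary data and the PID as the ordinary payload of the same message.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Properly aligned buffer for one SCM_RIGHTS control message carrying one fd. */
union fd_ctl_buf {
	char buf[CMSG_SPACE(sizeof(int))];
	struct cmsghdr align;
};

/* Hypothetical helper: send one descriptor and the task's PID in one message. */
static int send_fd_with_pid(int sock, int fd, pid_t pid)
{
	struct iovec iov = { .iov_base = &pid, .iov_len = sizeof(pid) };
	union fd_ctl_buf ctl;
	struct msghdr msg = { 0 };
	struct cmsghdr *cmsg;

	memset(&ctl, 0, sizeof(ctl));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = ctl.buf;
	msg.msg_controllen = sizeof(ctl.buf);

	/* The descriptor goes into the ancillary data, the PID is the payload. */
	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

	return sendmsg(sock, &msg, 0) == (ssize_t)sizeof(pid) ? 0 : -1;
}

/* Hypothetical helper: receive the descriptor and the PID sent above. */
static int recv_fd_with_pid(int sock, int *fd, pid_t *pid)
{
	struct iovec iov = { .iov_base = pid, .iov_len = sizeof(*pid) };
	union fd_ctl_buf ctl;
	struct msghdr msg = { 0 };
	struct cmsghdr *cmsg;

	memset(&ctl, 0, sizeof(ctl));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = ctl.buf;
	msg.msg_controllen = sizeof(ctl.buf);

	if (recvmsg(sock, &msg, 0) != (ssize_t)sizeof(*pid))
		return -1;

	/* Accept exactly one SCM_RIGHTS message holding a single descriptor. */
	cmsg = CMSG_FIRSTHDR(&msg);
	if (!cmsg || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS ||
	    cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
		return -1;

	memcpy(fd, CMSG_DATA(cmsg), sizeof(int));
	return 0;
}
```

On the daemon side the received PID would then replace the
root_item->pid.virt argument of open_page_read(), so that no root_item
dereference is needed for the PID itself.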

In my test case I get, for example, this output:

(02.306731) Opened page read 1 (parent 0)
(02.306733) lazy-pages: iov.iov_base 0x400000 (1 pages)
(02.306734) lazy-pages: iov.iov_base 0x600000 (2 pages)
(02.306736) lazy-pages: iov.iov_base 0x1ead000 (1 pages)
(02.306737) lazy-pages: iov.iov_base 0x7fbdaac63000 (8 pages)
(02.306738) lazy-pages: iov.iov_base 0x7fbdaac6c000 (1 pages)
(02.306739) lazy-pages: iov.iov_base 0x7fbdaae83000 (3 pages)
(02.306741) lazy-pages: iov.iov_base 0x7fbdaae8d000 (5 pages)
(02.306742) lazy-pages: iov.iov_base 0x7ffc33a55000 (2 pages)
(02.306743) lazy-pages: iov.iov_base 0x7ffc33a58000 (1 pages)
(02.306744) lazy-pages: iov.iov_base 0x7ffc33a5d000 (2 pages)
(02.306747) lazy-pages: Found 0 pages to be handled by UFFD

and not a single page has been detected as MAP_ANONYMOUS and MAP_PRIVATE.
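
To make clear what kind of check fails here, the following is a
simplified sketch and not the code from uffd.c. The struct vma_sketch,
the fixed 4 KiB page size and both function names are assumptions made
only for this illustration. A page from the iov counts only if it falls
into a known VMA that is both MAP_ANONYMOUS and MAP_PRIVATE, so with an
empty VMA list the result is always 0 pages.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Assumed page size for this sketch only. */
#define SKETCH_PAGE_SIZE 4096UL

/* Hypothetical, reduced stand-in for a VMA entry read from the image. */
struct vma_sketch {
	uint64_t start;
	uint64_t end;	/* exclusive */
	int flags;	/* MAP_* flags */
};

/* A VMA qualifies only if it is anonymous and private at the same time. */
static int vma_is_lazy(const struct vma_sketch *v)
{
	return (v->flags & MAP_ANONYMOUS) && (v->flags & MAP_PRIVATE);
}

/* Count the pages of one iov that lie in an anonymous private VMA. */
static size_t count_lazy_pages(const struct vma_sketch *vmas, size_t nr_vmas,
			       uint64_t base, size_t nr_pages)
{
	size_t i, p, found = 0;

	for (p = 0; p < nr_pages; p++) {
		uint64_t addr = base + p * SKETCH_PAGE_SIZE;

		for (i = 0; i < nr_vmas; i++) {
			if (addr < vmas[i].start || addr >= vmas[i].end)
				continue;
			if (vma_is_lazy(&vmas[i]))
				found++;
			break;	/* each address belongs to at most one VMA */
		}
	}

	return found;
}
```

With nr_vmas == 0 the inner loop never runs, which would match the
"Found 0 pages" line above if the VMA list is not populated.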

A debug print from a version with all the functions enabled in
uffd.c/criu_init() gives me:

(08.598388) Found 18 VMAs in image
(08.598392) vma 0x400000 0x401000
(08.598393) vma 0x600000 0x601000
(08.598395) vma 0x601000 0x602000
(08.598396) vma 0x1ead000 0x1ece000
(08.598397) vma 0x7fbdaa8ad000 0x7fbdaaa63000
(08.598399) vma 0x7fbdaaa63000 0x7fbdaac63000
(08.598400) vma 0x7fbdaac63000 0x7fbdaac67000
(08.598401) vma 0x7fbdaac67000 0x7fbdaac69000
(08.598402) vma 0x7fbdaac69000 0x7fbdaac6e000
(08.598403) vma 0x7fbdaac6e000 0x7fbdaac8f000
(08.598405) vma 0x7fbdaae83000 0x7fbdaae86000
(08.598406) vma 0x7fbdaae8d000 0x7fbdaae8f000
(08.598407) vma 0x7fbdaae8f000 0x7fbdaae90000
(08.598408) vma 0x7fbdaae90000 0x7fbdaae91000
(08.598410) vma 0x7fbdaae91000 0x7fbdaae92000
(08.598411) vma 0x7ffc33a37000 0x7ffc33a59000
(08.598412) vma 0x7ffc33a5d000 0x7ffc33a5f000
(08.598414) vma 0xffffffffff600000 0xffffffffff601000
(08.598424) Opened page read 1 (parent 0)
(08.598426) lazy-pages: iov.iov_base 0x400000 (1 pages)
(08.598427) lazy-pages: iov.iov_base 0x600000 (2 pages)
(08.598429) lazy-pages: iov.iov_base 0x1ead000 (1 pages)
(08.598430) lazy-pages: Adding 0x1ead000 to our list
(08.598432) lazy-pages: iov.iov_base 0x7fbdaac63000 (8 pages)
(08.598433) lazy-pages: Adding 0x7fbdaac69000 to our list
(08.598435) lazy-pages: Adding 0x7fbdaac6a000 to our list
(08.598436) lazy-pages: iov.iov_base 0x7fbdaac6c000 (1 pages)
(08.598437) lazy-pages: Adding 0x7fbdaac6c000 to our list
(08.598438) lazy-pages: iov.iov_base 0x7fbdaae83000 (3 pages)
(08.598440) lazy-pages: Adding 0x7fbdaae83000 to our list
(08.598441) lazy-pages: Adding 0x7fbdaae84000 to our list
(08.598442) lazy-pages: Adding 0x7fbdaae85000 to our list
(08.598443) lazy-pages: iov.iov_base 0x7fbdaae8d000 (5 pages)
(08.598445) lazy-pages: Adding 0x7fbdaae8d000 to our list
(08.598446) lazy-pages: Adding 0x7fbdaae8e000 to our list
(08.598447) lazy-pages: Adding 0x7fbdaae91000 to our list
(08.598448) lazy-pages: iov.iov_base 0x7ffc33a55000 (2 pages)
(08.598450) lazy-pages: Adding 0x7ffc33a55000 to our list
(08.598451) lazy-pages: Adding 0x7ffc33a56000 to our list
(08.598452) lazy-pages: iov.iov_base 0x7ffc33a58000 (1 pages)
(08.598453) lazy-pages: Adding 0x7ffc33a58000 to our list
(08.598455) lazy-pages: iov.iov_base 0x7ffc33a5d000 (2 pages)
(08.598456) lazy-pages: Adding 0x7ffc33a5d000 to our list
(08.598457) lazy-pages: Adding 0x7ffc33a5e000 to our list
(08.598460) lazy-pages: Found 15 pages to be handled by UFFD

So calling those functions initializes more than just the PID:

        check_img_inventory();
        prepare_task_entries();
        prepare_pstree();
        collect_remaps_and_regfiles();
        prepare_shared_reg_files();
        prepare_remaps();
        prepare_mm_pid(root_item);


		Adrian

