[CRIU] [PATCH 2/3] page-read: Add async arg to ->read_pages callback
Mike Rapoport
rppt at linux.vnet.ibm.com
Wed Nov 9 22:50:52 PST 2016
On Thu, Nov 10, 2016 at 01:02:01AM +0300, Pavel Emelyanov wrote:
> On 11/09/2016 09:44 PM, Mike Rapoport wrote:
> > On Mon, Nov 07, 2016 at 07:03:07PM +0300, Pavel Emelyanov wrote:
> >> Flag PR_ASYNC means, that the caller is OK if the routine
> >> returns w/o data read into the buffer provided.
> >>
> >> Flag PR_UNPLUG means, that the caller nonetheless wants the
> >> data to be available ASAP.
> >
> > This one does not seem to be used neither in this patch nor in the next
> > one.
>
> Yup. I created one when thought how this would work in UFFD case.
>
> > Can you elaborate on how it should/will be used? From the description
> > above, the name should be PR_ASAP if we'd ever use it ;-)
>
> Sure, so the PR_ASYNC flag means, that the ->read_pages callback may
> return w/o actually reading the data into the buffer. But with such a
> notation there's a radical difference between regular restore and lazy
> restore.
>
> In the former case PR_ASYNC may defer start reading the data from image
> until the very end stage of restore. In the latter case the read can
> (and should) be async, but the reading shouldn't be delayed even for a
> fraction of second, since it's called from the page-fault handler. Thus
> the 2nd flag -- PR_UNPLUG (or PR_ASAP) which means -- the caller is OK
> with not having the data in the buffer when the callback returns, but
> nobody wants to wait for the data for too long, so the IO should be
> started ASAP (i.e. right now).
>
> So regular restore should have only PR_ASYNC flag, and it really has
> one in restore_priv_vma_contents(), but the uffd restore should have
> PR_ASYNC | PR_UNPLUG combination.
And this combination would make read_pagemap_page grow from 5 screens to 7? ;-)
> It's not there since there's no code that uses page-reads in uffd remote
> case :)
For the uffd remote pages read I was planning adding open_remote_page_read
rather than adding if-clauses to read_pagemap_pages. If I may suggest,
let's move forward with PR_ASYNC for the local restore and we'll take care
of lazy remote pages later on.
BTW, local lazy restore can use PR_ASYNC in handle_remaining_pages, I
believe.
> -- Pavel
--
Sincerely yours,
Mike.
> >> Signed-off-by: Pavel Emelyanov <xemul at virtuozzo.com>
> >> ---
> >> criu/include/pagemap.h | 6 +++++-
> >> criu/mem.c | 4 ++--
> >> criu/pagemap.c | 5 +++--
> >> criu/shmem.c | 2 +-
> >> criu/uffd.c | 2 +-
> >> 5 files changed, 12 insertions(+), 7 deletions(-)
> >>
> >> diff --git a/criu/include/pagemap.h b/criu/include/pagemap.h
> >> index 705a9af..b445d82 100644
> >> --- a/criu/include/pagemap.h
> >> +++ b/criu/include/pagemap.h
> >> @@ -47,7 +47,7 @@ struct page_read {
> >> */
> >> int (*get_pagemap)(struct page_read *, struct iovec *iov);
> >> /* reads page from current pagemap */
> >> - int (*read_pages)(struct page_read *, unsigned long vaddr, int nr, void *);
> >> + int (*read_pages)(struct page_read *, unsigned long vaddr, int nr, void *, unsigned flags);
> >> void (*close)(struct page_read *);
> >> void (*skip_pages)(struct page_read *, unsigned long len);
> >> int (*seek_page)(struct page_read *pr, unsigned long vaddr, bool warn);
> >> @@ -74,6 +74,10 @@ struct page_read {
> >> int curr_pme;
> >> };
> >>
> >> +/* flags for ->read_pages */
> >> +#define PR_ASYNC 0x1 /* may exit w/o data in the buffer */
> >> +#define PR_UNPLUG 0x2 /* do async, but start the IO asap */
> >> +
> >> #define PR_SHMEM 0x1
> >> #define PR_TASK 0x2
> >>
> >> diff --git a/criu/mem.c b/criu/mem.c
> >> index 809b637..7da64f5 100644
> >> --- a/criu/mem.c
> >> +++ b/criu/mem.c
> >> @@ -790,7 +790,7 @@ static int restore_priv_vma_content(struct pstree_item *t)
> >> if (vma->ppage_bitmap) { /* inherited vma */
> >> clear_bit(off, vma->ppage_bitmap);
> >>
> >> - ret = pr.read_pages(&pr, va, 1, buf);
> >> + ret = pr.read_pages(&pr, va, 1, buf, 0);
> >> if (ret < 0)
> >> goto err_read;
> >>
> >> @@ -818,7 +818,7 @@ static int restore_priv_vma_content(struct pstree_item *t)
> >>
> >> nr = min_t(int, nr_pages - i, (vma->e->end - va) / PAGE_SIZE);
> >>
> >> - ret = pr.read_pages(&pr, va, nr, p);
> >> + ret = pr.read_pages(&pr, va, nr, p, PR_ASYNC);
> >> if (ret < 0)
> >> goto err_read;
> >>
> >> diff --git a/criu/pagemap.c b/criu/pagemap.c
> >> index cfc9659..0a072c8 100644
> >> --- a/criu/pagemap.c
> >> +++ b/criu/pagemap.c
> >> @@ -205,7 +205,8 @@ static inline void pagemap_bound_check(PagemapEntry *pe, unsigned long vaddr, in
> >> }
> >> }
> >>
> >> -static int read_pagemap_page(struct page_read *pr, unsigned long vaddr, int nr, void *buf)
> >> +static int read_pagemap_page(struct page_read *pr, unsigned long vaddr, int nr,
> >> + void *buf, unsigned flags)
> >> {
> >> int ret;
> >> unsigned long len = nr * PAGE_SIZE;
> >> @@ -242,7 +243,7 @@ static int read_pagemap_page(struct page_read *pr, unsigned long vaddr, int nr,
> >> if (p_nr > nr)
> >> p_nr = nr;
> >>
> >> - ret = read_pagemap_page(ppr, vaddr, p_nr, buf);
> >> + ret = read_pagemap_page(ppr, vaddr, p_nr, buf, flags);
> >> if (ret == -1)
> >> return ret;
> >>
> >> diff --git a/criu/shmem.c b/criu/shmem.c
> >> index 5065abd..05028a3 100644
> >> --- a/criu/shmem.c
> >> +++ b/criu/shmem.c
> >> @@ -485,7 +485,7 @@ static int restore_shmem_content(void *addr, struct shmem_info *si)
> >> if (vaddr + nr_pages * PAGE_SIZE > si->size)
> >> break;
> >>
> >> - pr.read_pages(&pr, vaddr, nr_pages, addr + vaddr);
> >> + pr.read_pages(&pr, vaddr, nr_pages, addr + vaddr, 0);
> >> }
> >>
> >> pr.close(&pr);
> >> diff --git a/criu/uffd.c b/criu/uffd.c
> >> index e6d4069..3336651 100644
> >> --- a/criu/uffd.c
> >> +++ b/criu/uffd.c
> >> @@ -398,7 +398,7 @@ static int get_page(struct lazy_pages_info *lpi, unsigned long addr, void *dest)
> >> if (pagemap_zero(lpi->pr.pe))
> >> return 0;
> >>
> >> - ret = lpi->pr.read_pages(&lpi->pr, addr, 1, buf);
> >> + ret = lpi->pr.read_pages(&lpi->pr, addr, 1, buf, 0);
> >> pr_debug("read_pages ret %d\n", ret);
> >> if (ret <= 0)
> >> return ret;
> >> --
> >> 2.1.4
> >>
> >> _______________________________________________
> >> CRIU mailing list
> >> CRIU at openvz.org
> >> https://lists.openvz.org/mailman/listinfo/criu
> >>
> >
> > .
> >
>
More information about the CRIU
mailing list