[CRIU] [RFC PATCH 07/12] lazy-pages: add handling of UFFD_EVENT_REMAP

Mike Rapoport rppt at linux.vnet.ibm.com
Tue Jan 10 23:02:10 PST 2017


On Tue, Jan 10, 2017 at 07:15:28PM +0300, Pavel Emelyanov wrote:
> On 01/10/2017 05:20 PM, Mike Rapoport wrote:
> > On Tue, Jan 10, 2017 at 05:07:01PM +0300, Pavel Emelyanov wrote:
> >> On 01/09/2017 11:23 AM, Mike Rapoport wrote:
> >>> When the restored process calls mremap(), we will see #PF's on the new
> >>> addresses and we have to create a correspondence between the addresses
> >>> found in the dump and the actual addresses the process uses. If the
> >>> mremap() call causes the mapping to grow, the additional part will receive
> >>> zero pages, as expected.
> >>>
> >>> FIXME: is the mapping shrinks, we will try to fill the dropped part and
> >>> lazy-pages daemon will fail. It should be possible to track VMA changes in
> >>> the lazy-pages daemon, but simpler, and, apparently, more correct would be
> >>> to add UFFD_EVENT_MUNMAP to the kernel.
> >>>
> >>> Signed-off-by: Mike Rapoport <rppt at linux.vnet.ibm.com>
> >>> ---
> >>>  criu/uffd.c | 47 ++++++++++++++++++++++++++++++++++++++++++++++-
> >>>  1 file changed, 46 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/criu/uffd.c b/criu/uffd.c
> >>> index 3bc7bc1..8bf4a14 100644
> >>> --- a/criu/uffd.c
> >>> +++ b/criu/uffd.c
> >>> @@ -53,7 +53,12 @@ struct lazy_iovec {
> >>>  	unsigned long len;
> >>>  };
> >>>  
> >>> -struct lazy_pages_info;
> >>> +struct lazy_remap {
> >>> +	struct list_head l;
> >>> +	unsigned long from;
> >>> +	unsigned long to;
> >>> +	unsigned long len;
> >>> +};
> >>>  
> >>>  struct pf_info {
> >>>  	unsigned long addr;
> >>> @@ -65,6 +70,7 @@ struct lazy_pages_info {
> >>>  
> >>>  	struct list_head iovs;
> >>>  	struct list_head pfs;
> >>> +	struct list_head remaps;
> >>>  
> >>>  	struct page_read pr;
> >>>  
> >>> @@ -92,6 +98,7 @@ static struct lazy_pages_info *lpi_init(void)
> >>>  	memset(lpi, 0, sizeof(*lpi));
> >>>  	INIT_LIST_HEAD(&lpi->iovs);
> >>>  	INIT_LIST_HEAD(&lpi->pfs);
> >>> +	INIT_LIST_HEAD(&lpi->remaps);
> >>>  	INIT_LIST_HEAD(&lpi->l);
> >>>  	lpi->lpfd.revent = handle_uffd_event;
> >>>  
> >>> @@ -101,12 +108,15 @@ static struct lazy_pages_info *lpi_init(void)
> >>>  static void lpi_fini(struct lazy_pages_info *lpi)
> >>>  {
> >>>  	struct lazy_iovec *p, *n;
> >>> +	struct lazy_remap *p1, *n1;
> >>>  
> >>>  	if (!lpi)
> >>>  		return;
> >>>  	free(lpi->buf);
> >>>  	list_for_each_entry_safe(p, n, &lpi->iovs, l)
> >>>  		xfree(p);
> >>> +	list_for_each_entry_safe(p1, n1, &lpi->remaps, l)
> >>> +		xfree(p1);
> >>>  	if (lpi->lpfd.fd > 0)
> >>>  		close(lpi->lpfd.fd);
> >>>  	if (lpi->pr.close)
> >>> @@ -515,9 +525,17 @@ out:
> >>>  static int uffd_copy(struct lazy_pages_info *lpi, __u64 address, int nr_pages)
> >>>  {
> >>>  	struct uffdio_copy uffdio_copy;
> >>> +	struct lazy_remap *r;
> >>>  	unsigned long len = nr_pages * page_size();
> >>>  	int rc;
> >>>  
> >>> +	list_for_each_entry(r, &lpi->remaps, l) {
> >>> +		if (address >= r->from && address < r->from + r->len) {
> >>> +			address += (r->to - r->from);
> >>> +			break;
> >>> +		}
> >>> +	}
> >>> +
> >>>  	uffdio_copy.dst = address;
> >>>  	uffdio_copy.src = (unsigned long)lpi->buf;
> >>>  	uffdio_copy.len = len;
> >>> @@ -670,9 +688,27 @@ static int handle_madv_dontneed(struct lazy_pages_info *lpi,
> >>>  	return 0;
> >>>  }
> >>>  
> >>> +static int handle_remap(struct lazy_pages_info *lpi, struct uffd_msg *msg)
> >>> +{
> >>> +	struct lazy_remap *remap;
> >>> +
> >>> +	remap = xmalloc(sizeof(*remap));
> >>> +	if (!remap)
> >>> +		return -1;
> >>> +
> >>> +	INIT_LIST_HEAD(&remap->l);
> >>> +	remap->from = msg->arg.remap.from;
> >>> +	remap->to = msg->arg.remap.to;
> >>> +	remap->len = msg->arg.remap.len;
> >>> +	list_add_tail(&remap->l, &lpi->remaps);
> >>
> >> Shouldn't we punch a hole in original region? Not to handle #PF there?
> > 
> > I don't quite follow you here. The #PF's are delivered to uffd only if the
> > VMA covering the range is registered with uffd. For moving remaps, the
> > original address range will not be covered by a VMA with uffd. 
> 
> Indeed :) Aren't you worried with the fact that the in-memory state of lazyd
> will differ from the real state in the kernel?

I am worried about all the UFFD_EVENT_* stuff. Except, perhaps,
MADV_DONTNEED. You example below with swap of two regions clearly shows
that list of (from,to) patches will not work. We do need to have more
complex state in lazyd that will be able to correspond old and new
addresses.
 
> > The only issue it the shrinking remaps (see changelog).
> > 
> >> Also, what if one region is remapped somewhere, then remapped back? Will
> >> we have two "patches" for it?
> >> Also, what if we swap two regs with remap (a->c, b->a, c->b) what kind of
> >> mess will we have here? %)
> > 
> > You forgot a combination of fork() and mremap() events in arbitrary order ;-)
> 
> :D
> 
> > This makes the things even more funny.
> > 
> > I can try to "remap" iovecs, but I remember that something bothered me when
> > I first started doing it, just cannot remember what exactly...
> 
> :\
> 
> >>



More information about the CRIU mailing list