[CRIU] [RFC PATCH 07/12] lazy-pages: add handling of UFFD_EVENT_REMAP

Mike Rapoport rppt at linux.vnet.ibm.com
Wed Jan 11 00:16:30 PST 2017


On Wed, Jan 11, 2017 at 11:14:51AM +0300, Pavel Emelyanov wrote:
> On 01/11/2017 11:05 AM, Mike Rapoport wrote:
> > On Wed, Jan 11, 2017 at 10:28:16AM +0300, Pavel Emelyanov wrote:
> >> On 01/11/2017 10:02 AM, Mike Rapoport wrote:
> >>> On Tue, Jan 10, 2017 at 07:15:28PM +0300, Pavel Emelyanov wrote:
> >>>> On 01/10/2017 05:20 PM, Mike Rapoport wrote:
> >>>>> On Tue, Jan 10, 2017 at 05:07:01PM +0300, Pavel Emelyanov wrote:
> >>>>>> On 01/09/2017 11:23 AM, Mike Rapoport wrote:
> >>>>>>> When the restored process calls mremap(), we will see #PF's on the new
> >>>>>>> addresses and we have to create a correspondence between the addresses
> >>>>>>> found in the dump and the actual addresses the process uses. If the
> >>>>>>> mremap() call causes the mapping to grow, the additional part will receive
> >>>>>>> zero pages, as expected.
> >>>>>>>
> >>>>>>> FIXME: if the mapping shrinks, we will try to fill the dropped part and
> >>>>>>> the lazy-pages daemon will fail. It should be possible to track VMA
> >>>>>>> changes in the lazy-pages daemon, but it would be simpler and, apparently,
> >>>>>>> more correct to add UFFD_EVENT_MUNMAP to the kernel.
> >>>>>>>
> >>>>>>> Signed-off-by: Mike Rapoport <rppt at linux.vnet.ibm.com>
> >>>>>>> ---
> >>>>>>>  criu/uffd.c | 47 ++++++++++++++++++++++++++++++++++++++++++++++-
> >>>>>>>  1 file changed, 46 insertions(+), 1 deletion(-)
> >>>>>>>
> >>>>>>> diff --git a/criu/uffd.c b/criu/uffd.c
> >>>>>>> index 3bc7bc1..8bf4a14 100644
> >>>>>>> --- a/criu/uffd.c
> >>>>>>> +++ b/criu/uffd.c
> >>>>>>> +static int handle_remap(struct lazy_pages_info *lpi, struct uffd_msg *msg)
> >>>>>>> +{
> >>>>>>> +	struct lazy_remap *remap;
> >>>>>>> +
> >>>>>>> +	remap = xmalloc(sizeof(*remap));
> >>>>>>> +	if (!remap)
> >>>>>>> +		return -1;
> >>>>>>> +
> >>>>>>> +	INIT_LIST_HEAD(&remap->l);
> >>>>>>> +	remap->from = msg->arg.remap.from;
> >>>>>>> +	remap->to = msg->arg.remap.to;
> >>>>>>> +	remap->len = msg->arg.remap.len;
> >>>>>>> +	list_add_tail(&remap->l, &lpi->remaps);
> >>>>>>
> >>>>>> Shouldn't we punch a hole in original region? Not to handle #PF there?
> >>>>>
> >>>>> I don't quite follow you here. The #PF's are delivered to uffd only if the
> >>>>> VMA covering the range is registered with uffd. For moving remaps, the
> >>>>> original address range will not be covered by a VMA with uffd. 
> >>>>
> >>>> Indeed :) Aren't you worried about the fact that the in-memory state of lazyd
> >>>> will differ from the real state in the kernel?
> >>>
> >>> I am worried about all the UFFD_EVENT_* stuff, except, perhaps,
> >>> MADV_DONTNEED. Your example below with the swap of two regions clearly shows
> >>> that a list of (from,to) patches will not work. We do need more
> >>> complex state in lazyd that can map old addresses to new
> >>> ones.
> >>
> >> How about extending the existing iovec element with 'request_addr' field
> >> that gets initialized with 1:1 map. Then upon mremap()-s the iovs get
> >> physically remapped in lists, but the 'request_addr' remains unchanged
> >> and is used when sending requests to page_read-s?
> > 
> > Yep, this should work. The drawback is that for each #PF we'll need to
> > traverse the iovec list. 
> 
> We can optimize this by keeping iovecs in an rbtree. Note that we'd need to
> look for the iovec at the #PF's address, not request_addr, so once remapped we
> will need to move the iovec in the tree to keep the search fast.
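
A minimal sketch of the request_addr idea as I read it, using a flat array
and a linear scan in place of the list or rbtree. The struct lazy_iov layout
and both function names are made up for illustration and are not taken from
criu/uffd.c. The sketch also assumes that a remap moves whole iovs and never
splits one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical iovec element extended with request_addr. */
struct lazy_iov {
	uint64_t start;		/* where the process sees the data now */
	uint64_t request_addr;	/* where the data lives in the dump, never changes */
	uint64_t len;		/* length in bytes */
};

/* Move every iov that lies fully inside [from, from + len) to 'to'. */
void lazy_iovs_remap(struct lazy_iov *iovs, size_t n,
		     uint64_t from, uint64_t to, uint64_t len)
{
	assert(iovs != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		struct lazy_iov *iov = &iovs[i];
		uint64_t off;

		if (iov->start < from)
			continue;
		off = iov->start - from;
		if (off >= len || iov->len > len - off)
			continue;
		/* Only the lookup key moves; request_addr stays as dumped. */
		iov->start = to + off;
	}
}

/* Find the dump address to request for a #PF at fault_addr. */
bool lazy_iovs_lookup(const struct lazy_iov *iovs, size_t n,
		      uint64_t fault_addr, uint64_t *request_addr)
{
	assert(iovs != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		const struct lazy_iov *iov = &iovs[i];

		if (fault_addr >= iov->start && fault_addr - iov->start < iov->len) {
			*request_addr = iov->request_addr + (fault_addr - iov->start);
			return true;
		}
	}
	return false;
}
```

Replaying a swap of two regions as three moves (A to tmp, B to A, tmp to B)
leaves each iov at the other's address while request_addr still points at
the original location in the dump.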

Keeping VMAs may also resolve shrinking mappings ;-)
 
> > I was thinking about adding lazy_vma with, e.g.
> > 'initial_start' and 'actual_start' for tracking remaps. The iovec lists
> > then will be per VMA.
> >  
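
A rough sketch of the lazy_vma tracking, to show how it could cover both
growing and shrinking. Only 'initial_start' and 'actual_start' come from the
proposal above; the 'backed_len' field, the function names and the 'new_len'
parameter are assumptions made for illustration. In particular, how lazyd
would learn the new length of the mapping is left open here, and the sketch
only handles a remap of the whole VMA.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-VMA state for tracking remaps. */
struct lazy_vma {
	uint64_t initial_start;	/* start address found in the dump */
	uint64_t actual_start;	/* start address the process uses now */
	uint64_t backed_len;	/* bytes from the start still backed by the dump */
};

/*
 * Record a remap of the whole VMA from 'from' to 'to'. Shrinking drops the
 * tail for good; growing does not extend the part backed by the dump, so
 * the additional part is left to receive zero pages.
 */
bool lazy_vma_remap(struct lazy_vma *vma, uint64_t from, uint64_t to,
		    uint64_t new_len)
{
	assert(vma != NULL);

	if (from != vma->actual_start)
		return false;	/* partial remaps are not handled in this sketch */

	vma->actual_start = to;
	if (new_len < vma->backed_len)
		vma->backed_len = new_len;
	return true;
}

/* Translate a #PF address into the address to look up in the dump. */
bool lazy_vma_to_dump_addr(const struct lazy_vma *vma, uint64_t fault_addr,
			   uint64_t *dump_addr)
{
	assert(vma != NULL);

	if (fault_addr < vma->actual_start ||
	    fault_addr - vma->actual_start >= vma->backed_len)
		return false;	/* outside the VMA, dropped, or newly grown */

	*dump_addr = vma->initial_start + (fault_addr - vma->actual_start);
	return true;
}
```

With per-VMA state like this, a fault in a dropped or newly grown part simply
fails the translation, so lazyd would not try to fill it from the dump.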
> >> -- Pavel
> >>
> > --
> > Sincerely yours,
> > Mike.
> > 
> > .
> > 
> 


