[CRIU] [RFC PATCH 07/12] lazy-pages: add handling of UFFD_EVENT_REMAP

Pavel Emelyanov xemul at virtuozzo.com
Wed Jan 11 00:14:51 PST 2017


On 01/11/2017 11:05 AM, Mike Rapoport wrote:
> On Wed, Jan 11, 2017 at 10:28:16AM +0300, Pavel Emelyanov wrote:
>> On 01/11/2017 10:02 AM, Mike Rapoport wrote:
>>> On Tue, Jan 10, 2017 at 07:15:28PM +0300, Pavel Emelyanov wrote:
>>>> On 01/10/2017 05:20 PM, Mike Rapoport wrote:
>>>>> On Tue, Jan 10, 2017 at 05:07:01PM +0300, Pavel Emelyanov wrote:
>>>>>> On 01/09/2017 11:23 AM, Mike Rapoport wrote:
>>>>>>> When the restored process calls mremap(), we will see #PF's on the new
>>>>>>> addresses and we have to create a correspondence between the addresses
>>>>>>> found in the dump and the actual addresses the process uses. If the
>>>>>>> mremap() call causes the mapping to grow, the additional part will receive
>>>>>>> zero pages, as expected.
>>>>>>>
>>>>>>> FIXME: if the mapping shrinks, we will try to fill the dropped part and
>>>>>>> lazy-pages daemon will fail. It should be possible to track VMA changes in
>>>>>>> the lazy-pages daemon, but simpler, and, apparently, more correct would be
>>>>>>> to add UFFD_EVENT_MUNMAP to the kernel.
>>>>>>>
>>>>>>> Signed-off-by: Mike Rapoport <rppt at linux.vnet.ibm.com>
>>>>>>> ---
>>>>>>>  criu/uffd.c | 47 ++++++++++++++++++++++++++++++++++++++++++++++-
>>>>>>>  1 file changed, 46 insertions(+), 1 deletion(-)
>>>>>>>
>>>>>>> diff --git a/criu/uffd.c b/criu/uffd.c
>>>>>>> index 3bc7bc1..8bf4a14 100644
>>>>>>> --- a/criu/uffd.c
>>>>>>> +++ b/criu/uffd.c
>>>>>>> +static int handle_remap(struct lazy_pages_info *lpi, struct uffd_msg *msg)
>>>>>>> +{
>>>>>>> +	struct lazy_remap *remap;
>>>>>>> +
>>>>>>> +	remap = xmalloc(sizeof(*remap));
>>>>>>> +	if (!remap)
>>>>>>> +		return -1;
>>>>>>> +
>>>>>>> +	INIT_LIST_HEAD(&remap->l);
>>>>>>> +	remap->from = msg->arg.remap.from;
>>>>>>> +	remap->to = msg->arg.remap.to;
>>>>>>> +	remap->len = msg->arg.remap.len;
>>>>>>> +	list_add_tail(&remap->l, &lpi->remaps);
>>>>>>
>>>>>> Shouldn't we punch a hole in original region? Not to handle #PF there?
>>>>>
>>>>> I don't quite follow you here. The #PF's are delivered to uffd only if the
>>>>> VMA covering the range is registered with uffd. For moving remaps, the
>>>>> original address range will not be covered by a VMA with uffd. 
>>>>
>>>> Indeed :) Aren't you worried about the fact that the in-memory state of lazyd
>>>> will differ from the real state in the kernel?
>>>
>>> I am worried about all the UFFD_EVENT_* stuff. Except, perhaps,
>>> MADV_DONTNEED. Your example below with swap of two regions clearly shows
>>> that list of (from,to) patches will not work. We do need to have more
>>> complex state in lazyd that will be able to correspond old and new
>>> addresses.
>>
>> How about extending the existing iovec element with a 'request_addr' field
>> that gets initialized with a 1:1 map. Then upon mremap()-s the iovs get
>> physically remapped in lists, but the 'request_addr' remains unchanged
>> and is used when sending requests to page_read-s?
> 
> Yep, this should work. The drawback is that for each #PF we'll need to
> traverse the iovec list. 

We can optimize this by keeping the iovecs in an rbtree. Note that we'd need
to look up the iovec by the #PF address, not by request_addr, so once an iovec
is remapped we will need to move it within the tree to keep the search fast.
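
To illustrate the request_addr idea, here is a rough sketch, not the actual
criu code. The struct and function names (lazy_iov, find_iov, fault_to_request,
remap_iovs) are all hypothetical. A plain array with a linear scan stands in
for the rbtree, and it assumes every iovec lies entirely inside the remapped
range, so splitting an iovec on a partial remap is left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical structure for illustration, not the existing CRIU one. */
struct lazy_iov {
	uint64_t base;		/* address the process uses right now */
	uint64_t request_addr;	/* address in the dump, never changes */
	uint64_t len;
};

/*
 * Find the iovec covering a #PF address. A linear scan is used here
 * only to keep the sketch short; an rbtree keyed by 'base' would
 * replace it.
 */
static struct lazy_iov *find_iov(struct lazy_iov *iovs, size_t n,
				 uint64_t addr)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (addr >= iovs[i].base && addr - iovs[i].base < iovs[i].len)
			return &iovs[i];
	return NULL;
}

/* Translate a #PF address into the address to ask page_read for. */
static int fault_to_request(struct lazy_iov *iovs, size_t n,
			    uint64_t addr, uint64_t *req)
{
	struct lazy_iov *iov = find_iov(iovs, n, addr);

	if (!iov)
		return -1;
	*req = iov->request_addr + (addr - iov->base);
	return 0;
}

/*
 * Handle a remap of [from, from + len) to 'to': only 'base' moves,
 * 'request_addr' stays as it was. Assumes whole iovecs are moved.
 */
static void remap_iovs(struct lazy_iov *iovs, size_t n,
		       uint64_t from, uint64_t to, uint64_t len)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (iovs[i].base >= from &&
		    iovs[i].base + iovs[i].len <= from + len)
			iovs[i].base = to + (iovs[i].base - from);
}
```

With this, a swap of two regions via a temporary address still resolves each
#PF to the right place in the dump, because only 'base' follows the remaps.
With an rbtree keyed by 'base', remap_iovs() would also have to erase and
re-insert every iovec it moves.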

> I was thinking about adding lazy_vma with, e.g.
> 'initial_start' and 'actual_start' for tracking remaps. The iovec lists
> then will be per VMA.
>  
>> -- Pavel
>>
> --
> Sincerely yours,
> Mike.
> 
> .
> 