[CRIU] [PATCH] vmsplice user-buffer(grabbed by process_vm_readv) to page-pipe
abhishek dubey
dubeyabhishek777 at gmail.com
Wed Jul 24 23:13:36 MSK 2019
Hi Andrei,
Please find inline reply.
On 24/07/19 12:55 PM, Andrei Vagin wrote:
> On Sun, Jul 21, 2019 at 05:54:34AM +0530, abhishek dubey wrote:
>> Hi Andrei,
>>
>> On 18/07/19 1:57 AM, Andrei Vagin wrote:
>>> On Fri, Jul 12, 2019 at 03:01:39AM +0530, abhishek dubey wrote:
>>>> Hi Andrei,
>>>>
>>>> On 09/07/19 9:38 PM, Andrei Vagin wrote:
>>>>> On Tue, Jul 09, 2019 at 01:50:57AM +0530, abhishek dubey wrote:
>>>>>> Hi Andrei,
>>>>>>
>>>>>> Please find inline replies.
>>>>>>
>>>>>> On 08/07/19 9:35 AM, Andrei Vagin wrote:
>>>>>>> On Fri, Jul 05, 2019 at 05:34:26PM +0530, Abhishek Dubey wrote:
>>>>>>>> This patch implements usage of process_vm_readv
>>>>>>>> syscall to collect memory pages from target process during
>>>>>>>> pre-dump. process_vm_readv collects pages in user-buffer,
>>>>>>>> which is later vmspliced to page-pipes.
>>>>>>> You have to be more detailed in the commit message.
>>>>>>>
>>>>>>> Frist of all, you have to explain why we need this patch.
>>>>>> I will take care of same from next time.
>>>>>>> If you improve performance, you need to give some numbers.
>>>>>> Still I am handling issue of "missing entry in parent pagemap". Next patch
>>>>>> will have comparative figures included.
>>>>>>> If you implement a new feature, you need to provide tests for it.
>>>>>>>
>>>>>>> If I understand this patch right, criu freezes all processes, collects
>>>>>>> page maps, unfreezes process, dump pages? If this is right,
>>>>>> Yes, this is right.
>>>>>>> you need to
>>>>>>> explain how it handles in this case:
>>>>>>>
>>>>>>> addr = map(NULL, 4 * PAGE_SIZE) |
>>>>>>> addr[0] = 1 |
>>>>>>> addr[PAGE_SIZE] = 2 |
>>>>>>> addr[2 * PAGE_SIZE] = 3 |
>>>>>>> addr[3 * PAGE_SIZE] = 4 |
>>>>>>> | criu freezes the process
>>>>>>> | criu collects page maps
>>>>>>> | criu unfreezes the process
>>>>>>> unmap(addr, PAGE_SIZE * 2) |
>>>>>>> | criu dumps pages
>>>>>> Need to explain this example somewhere along with code(as in page-pipe.h) or
>>>>>> some other place?
>>>>> here :)
>> testcase/maps007 does something like this. But instead of unmapping initial
>> 2 pages(as in example above), it always unmaps
>>
>> some intermediate page, like page 3 in this example. In current
>> implementation, I have handled it by pre-dumping just first 2 pages, 3rd is
>> unmapped so skip it
>>
>> and 4th page is not pre-dumped. I also adjust the length of iovec as per
>> pre-dumped page count. This approach is not correct and passes the test case
>>
>> very few times. maps007 fails with page-count mis-match error.
>>
>>>>> I think this can be non-trivial case. process_vm_readv will return
>>>>> ENOMEM for this segment, but a part of it still exists and we need to
>>>>> find what part is here and dump it.
>> Consider the following iovec:
>>
>> [A: 1 page] [B: 4 page] [C: 1 page] [D: 2 page]
>>
>> iovec_cnt: 4, PIPE_size =16
>>
>> Before pre-dump could start, process has unmapped complete 'B' region.
>>
>> process_vm_readv() will fail at 'B' and will only process region 'A' by
>> default.
>>
>> Current implementation done till now supports processing region 'C' and 'D'
>> and generate page-map accordingly.
>>
>> So here bad iovec (B) is skipped and remaining iovec(C,D) are processed.
> If we want to skip iovec, we need to remove these pages from a pagemap,
> don't we?
Yes, pagemap entry corresponding to skipped iov is bypassed.
Special handling of memory regions reflects in pagemap also.
>> ----------------------------------------------------------------------------------------------------------
>>
>> Considering the same above example, status of 4 pages of region B :
>>
>> {B1: unmapped} {B2: unmapped} {B3: mapped} {B4: mapped}
>>
>> The regions B3,B4 although mapped were skipped by pre-dumping.
>>
>>
>> For such case, can I implement something like:
>>
>> Make "temporary iovec" for region B, like:
>>
>> [B1: 1 page] [B2: 1 page] [B3: 1 page] [B4: 1 page]
> How and when are you going to generate this "temporary iovec"?
When:
Whenever process_vm_readv returns with less count of bytes than
expected. At this point,
find the failing iov using the count of bytes returned by syscall.
Two case arise here:
1) failing iov is of 1 PAGE_SIZE: it means page has been unmapped
meanwhile. skip pagemap entry and move to next iov in supplied iovec.
2) failing iov is of "n" PAGE_SIZE: it means unmapped region can be
anywhere from [0..n-1] PAGE within iov range
Temporary iovec will be generated only in second case.
How:
Take failing iov of length 'n' and form temporary iovec of n-entries,
each sized with 1 PAGE_SIZE. Since each entry in iovec is of 1 PAGE_SIZE,
either it will be processed completely or discarded completely(no
partial iov processing).
Mark down those iov in temporary iovec which fails and skip page map
generation for same.
Once temporary iovec processing is complete, resume processing of
remaining iovs of supplied iovec.
>
>>
>> With existing implementation(explained above dotted line), process_vm_readv
>> can skip iovec B1 and B2
>>
>> and move ahead processing B3 onward. Handling culprit iovec at finer
>> granularity.
>>
>>
>> During pagemap generation, instead of iovec of B, temporary iovec will be
>> supplied.
>>
>> Can I go ahead with this approach?
> I don't know. I need more details about it...
Please let me know, if I fail delivering clear idea(through explanation
above).
I will come up with pictorial representation.
-Abhishek
More information about the CRIU
mailing list