[CRIU] [PATCH] vmsplice user-buffer(grabbed by process_vm_readv) to page-pipe
Andrei Vagin
avagin at gmail.com
Wed Jul 24 10:25:34 MSK 2019
On Sun, Jul 21, 2019 at 05:54:34AM +0530, abhishek dubey wrote:
> Hi Andrei,
>
> On 18/07/19 1:57 AM, Andrei Vagin wrote:
> > On Fri, Jul 12, 2019 at 03:01:39AM +0530, abhishek dubey wrote:
> > > Hi Andrei,
> > >
> > > On 09/07/19 9:38 PM, Andrei Vagin wrote:
> > > > On Tue, Jul 09, 2019 at 01:50:57AM +0530, abhishek dubey wrote:
> > > > > Hi Andrei,
> > > > >
> > > > > Please find inline replies.
> > > > >
> > > > > On 08/07/19 9:35 AM, Andrei Vagin wrote:
> > > > > > On Fri, Jul 05, 2019 at 05:34:26PM +0530, Abhishek Dubey wrote:
> > > > > > > This patch implements usage of process_vm_readv
> > > > > > > syscall to collect memory pages from target process during
> > > > > > > pre-dump. process_vm_readv collects pages in user-buffer,
> > > > > > > which is later vmspliced to page-pipes.
> > > > > > You have to be more detailed in the commit message.
> > > > > >
> > > > > > Frist of all, you have to explain why we need this patch.
> > > > > I will take care of same from next time.
> > > > > > If you improve performance, you need to give some numbers.
> > > > > Still I am handling issue of "missing entry in parent pagemap". Next patch
> > > > > will have comparative figures included.
> > > > > > If you implement a new feature, you need to provide tests for it.
> > > > > >
> > > > > > If I understand this patch right, criu freezes all processes, collects
> > > > > > page maps, unfreezes process, dump pages? If this is right,
> > > > > Yes, this is right.
> > > > > > you need to
> > > > > > explain how it handles in this case:
> > > > > >
> > > > > > addr = map(NULL, 4 * PAGE_SIZE) |
> > > > > > addr[0] = 1 |
> > > > > > addr[PAGE_SIZE] = 2 |
> > > > > > addr[2 * PAGE_SIZE] = 3 |
> > > > > > addr[3 * PAGE_SIZE] = 4 |
> > > > > > | criu freezes the process
> > > > > > | criu collects page maps
> > > > > > | criu unfreezes the process
> > > > > > unmap(addr, PAGE_SIZE * 2) |
> > > > > > | criu dumps pages
> > > > > Need to explain this example somewhere along with code(as in page-pipe.h) or
> > > > > some other place?
> > > > here :)
>
> testcase/maps007 does something like this. But instead of unmapping initial
> 2 pages(as in example above), it always unmaps
>
> some intermediate page, like page 3 in this example. In current
> implementation, I have handled it by pre-dumping just first 2 pages, 3rd is
> unmapped so skip it
>
> and 4th page is not pre-dumped. I also adjust the length of iovec as per
> pre-dumped page count. This approach is not correct and passes the test case
>
> very few times. maps007 fails with page-count mis-match error.
>
> > > > I think this can be non-trivial case. process_vm_readv will return
> > > > ENOMEM for this segment, but a part of it still exists and we need to
> > > > find what part is here and dump it.
>
> Consider the following iovec:
>
> [A: 1 page] [B: 4 page] [C: 1 page] [D: 2 page]
>
> iovec_cnt: 4, PIPE_size =16
>
> Before pre-dump could start, process has unmapped complete 'B' region.
>
> process_vm_readv() will fail at 'B' and will only process region 'A' by
> default.
>
> Current implementation done till now supports processing region 'C' and 'D'
> and generate page-map accordingly.
>
> So here bad iovec (B) is skipped and remaining iovec(C,D) are processed.
If we want to skip iovec, we need to remove these pages from a pagemap,
don't we?
>
> ----------------------------------------------------------------------------------------------------------
>
> Considering the same above example, status of 4 pages of region B :
>
> {B1: unmapped} {B2: unmapped} {B3: mapped} {B4: mapped}
>
> The regions B3,B4 although mapped were skipped by pre-dumping.
>
>
> For such case, can I implement something like:
>
> Make "temporary iovec" for region B, like:
>
> [B1: 1 page] [B2: 1 page] [B3: 1 page] [B4: 1 page]
How and when are you going to generate this "temporary iovec"?
>
>
> With existing implementation(explained above dotted line), process_vm_readv
> can skip iovec B1 and B2
>
> and move ahead processing B3 onward. Handling culprit iovec at finer
> granularity.
>
>
> During pagemap generation, instead of iovec of B, temporary iovec will be
> supplied.
>
> Can I go ahead with this approach?
I don't know. I need more details about it...
>
> > > Pagemap cache won't be useful here.
> > >
> > > For now, I can only think of setting a threshold. If size of mapping is
> > > below threshold, we can process page-by-page(which is costly).
> > >
> > > Beyond threshold, we can leave such mapping to be handled by dump stage.
> Incorrect approach. Can't leave pages.
> > How do you detect that these pages have not been dumped on the dump
> > stage? Do we have a test for this case?
> >
> > >
> > > Implementation query:
> > >
> > > During "pre-dump", some of the iovs in iovec, fail to be processed by
> > > process_vm_readv(). Hence pagemap entry is skipped for same.
> > >
> > > Now, memory belonging to these failed iovs is not dirtied. During "dump" on
> > > generating iovs, such non-dirty memory areas are marked as
> > >
> > > holes and check for pages(in parent) backing these holes is done in
> > > page-xfer. Ultimately, holes are pointing to pages that are not pre-dumped.
> > >
> > > This check must be shifted to "dump stage", so that pages(iovs) skipped by
> > > process_vm_readv can't be marked as holes in incremental steps.
> > >
> > > Can I add new data structure to list down all the iovs failed by
> > > process_vm_readv. This list will be checked while generating holes in dump
> > > stage.
> > >
> > > is that a right approach?
> > >
> > >
> > > Please suggest if something more efficient or direct solution is possible.
> > >
> > >
> > > -Abhishek
> > >
> -Abhishek
More information about the CRIU
mailing list