[CRIU] [PATCH] vmsplice user-buffer(grabbed by process_vm_readv) to page-pipe
Andrei Vagin
avagin at gmail.com
Wed Jul 24 09:18:47 MSK 2019
On Sun, Jul 21, 2019 at 05:54:34AM +0530, abhishek dubey wrote:
> Hi Andrei,
>
> On 18/07/19 1:57 AM, Andrei Vagin wrote:
> > On Fri, Jul 12, 2019 at 03:01:39AM +0530, abhishek dubey wrote:
> > > Hi Andrei,
> > >
> > > On 09/07/19 9:38 PM, Andrei Vagin wrote:
> > > > On Tue, Jul 09, 2019 at 01:50:57AM +0530, abhishek dubey wrote:
> > > > > Hi Andrei,
> > > > >
> > > > > Please find inline replies.
> > > > >
> > > > > On 08/07/19 9:35 AM, Andrei Vagin wrote:
> > > > > > On Fri, Jul 05, 2019 at 05:34:26PM +0530, Abhishek Dubey wrote:
> > > > > > > This patch implements usage of process_vm_readv
> > > > > > > syscall to collect memory pages from target process during
> > > > > > > pre-dump. process_vm_readv collects pages in user-buffer,
> > > > > > > which is later vmspliced to page-pipes.
> > > > > > You have to be more detailed in the commit message.
> > > > > >
> > > > > > Frist of all, you have to explain why we need this patch.
> > > > > I will take care of same from next time.
> > > > > > If you improve performance, you need to give some numbers.
> > > > > Still I am handling issue of "missing entry in parent pagemap". Next patch
> > > > > will have comparative figures included.
> > > > > > If you implement a new feature, you need to provide tests for it.
> > > > > >
> > > > > > If I understand this patch right, criu freezes all processes, collects
> > > > > > page maps, unfreezes process, dump pages? If this is right,
> > > > > Yes, this is right.
> > > > > > you need to
> > > > > > explain how it handles in this case:
> > > > > >
> > > > > > addr = map(NULL, 4 * PAGE_SIZE) |
> > > > > > addr[0] = 1 |
> > > > > > addr[PAGE_SIZE] = 2 |
> > > > > > addr[2 * PAGE_SIZE] = 3 |
> > > > > > addr[3 * PAGE_SIZE] = 4 |
> > > > > > | criu freezes the process
> > > > > > | criu collects page maps
> > > > > > | criu unfreezes the process
> > > > > > unmap(addr, PAGE_SIZE * 2) |
> > > > > > | criu dumps pages
> > > > > Need to explain this example somewhere along with code(as in page-pipe.h) or
> > > > > some other place?
> > > > here :)
>
> testcase/maps007 does something like this. But instead of unmapping initial
> 2 pages(as in example above), it always unmaps
>
> some intermediate page, like page 3 in this example. In current
> implementation, I have handled it by pre-dumping just first 2 pages, 3rd is
> unmapped so skip it
>
> and 4th page is not pre-dumped. I also adjust the length of iovec as per
> pre-dumped page count. This approach is not correct and passes the test case
>
> very few times. maps007 fails with page-count mis-match error.
I disabled this check in zdtm.py:
diff --git a/test/zdtm.py b/test/zdtm.py
index 69be1df5a..47768ed2d 100755
--- a/test/zdtm.py
+++ b/test/zdtm.py
@@ -1204,7 +1204,7 @@ class criu:
if (stats_written != r_pages) or (r_off != 0):
print("ERROR: bad page counts, stats = %d real = %d(%d)" %
(stats_written, r_pages, r_off))
- raise test_fail_exc("page counts mismatch")
+ #raise test_fail_exc("page counts mismatch")
def dump(self, action, opts=[]):
self.__iter += 1
and maps007 fails each time:
[root at fc24 criu]# python test/zdtm.py run --pre 2 -t zdtm/transition/maps007
=== Run 1/1 ================ zdtm/transition/maps007
======================= Run zdtm/transition/maps007 in h =======================
Start test
Test is SUID
./maps007 --pidfile=maps007.pid --outfile=maps007.out
Run criu pre-dump
ERROR: bad page counts, stats = 3506 real = 14(0)
Run criu pre-dump
=[log]=> dump/zdtm/transition/maps007/36/2/pre-dump.log
------------------------ grep Error ------------------------
(00.269502) buf 477/433
(00.269639) Can't read remote iovs (225280(1953792)/477)
(00.269649) h 0x9cd000 [2]
(00.269657) Checking 0x9cd000/8192 hole
(00.269667) Error (criu/page-xfer.c:287): Missing 9cd000 in parent pagemap
(00.269676) Error (criu/page-xfer.c:331): Hole 0x9cd000/8192 not found in parent
(00.269752) Writing image inventory (version 1)
(00.269842) Error (criu/cr-dump.c:1548): Pre-dumping FAILED.
------------------------ ERROR OVER ------------------------
############## Test zdtm/transition/maps007 FAIL at CRIU pre-dump ##############
Send the 9 signal to 36
Wait for zdtm/transition/maps007(36) to die for 0.100000
##################################### FAIL #####################################
>
> > > > I think this can be non-trivial case. process_vm_readv will return
> > > > ENOMEM for this segment, but a part of it still exists and we need to
> > > > find what part is here and dump it.
>
> Consider the following iovec:
>
> [A: 1 page] [B: 4 page] [C: 1 page] [D: 2 page]
>
> iovec_cnt: 4, PIPE_size =16
>
> Before pre-dump could start, process has unmapped complete 'B' region.
>
> process_vm_readv() will fail at 'B' and will only process region 'A' by
> default.
>
> Current implementation done till now supports processing region 'C' and 'D'
> and generate page-map accordingly.
>
> So here bad iovec (B) is skipped and remaining iovec(C,D) are processed.
>
> ----------------------------------------------------------------------------------------------------------
>
> Considering the same above example, status of 4 pages of region B :
>
> {B1: unmapped} {B2: unmapped} {B3: mapped} {B4: mapped}
>
> The regions B3,B4 although mapped were skipped by pre-dumping.
>
>
> For such case, can I implement something like:
>
> Make "temporary iovec" for region B, like:
>
> [B1: 1 page] [B2: 1 page] [B3: 1 page] [B4: 1 page]
>
>
> With existing implementation(explained above dotted line), process_vm_readv
> can skip iovec B1 and B2
>
> and move ahead processing B3 onward. Handling culprit iovec at finer
> granularity.
>
>
> During pagemap generation, instead of iovec of B, temporary iovec will be
> supplied.
>
> Can I go ahead with this approach?
>
> > > Pagemap cache won't be useful here.
> > >
> > > For now, I can only think of setting a threshold. If size of mapping is
> > > below threshold, we can process page-by-page(which is costly).
> > >
> > > Beyond threshold, we can leave such mapping to be handled by dump stage.
> Incorrect approach. Can't leave pages.
> > How do you detect that these pages have not been dumped on the dump
> > stage? Do we have a test for this case?
> >
> > >
> > > Implementation query:
> > >
> > > During "pre-dump", some of the iovs in iovec, fail to be processed by
> > > process_vm_readv(). Hence pagemap entry is skipped for same.
> > >
> > > Now, memory belonging to these failed iovs is not dirtied. During "dump" on
> > > generating iovs, such non-dirty memory areas are marked as
> > >
> > > holes and check for pages(in parent) backing these holes is done in
> > > page-xfer. Ultimately, holes are pointing to pages that are not pre-dumped.
> > >
> > > This check must be shifted to "dump stage", so that pages(iovs) skipped by
> > > process_vm_readv can't be marked as holes in incremental steps.
> > >
> > > Can I add new data structure to list down all the iovs failed by
> > > process_vm_readv. This list will be checked while generating holes in dump
> > > stage.
> > >
> > > is that a right approach?
> > >
> > >
> > > Please suggest if something more efficient or direct solution is possible.
> > >
> > >
> > > -Abhishek
> > >
> -Abhishek
More information about the CRIU
mailing list