[CRIU] [PATCH 2/3] mem: if no parent image persists, can't rely on it

Pavel Tikhomirov snorcht at gmail.com
Tue Apr 15 05:48:33 PDT 2014


It seems to me that the kernel's clear_soft_dirty function is called for a
pte only after an explicit write to /proc/pid/clear_refs. Only after that
do all pages become write-protected, and only then does dirty tracking
really work. If a new process was created between snapshots, some part of
its memory may not be write-protected, so no page fault is generated for
that part and some soft-dirty bits are never set. Isn't that so?
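
To make the concern concrete, here is a minimal sketch, not CRIU code. It
relies on two documented kernel interfaces: writing "4" to
/proc/PID/clear_refs clears the soft-dirty bits, and bit 55 of a
/proc/PID/pagemap entry is the soft-dirty flag. The helper names
(arm_soft_dirty, page_can_be_skipped, new_task_example) and the
tracking_armed flag are hypothetical, introduced only for illustration.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

/* Bit 55 of a /proc/PID/pagemap entry is the soft-dirty flag. */
#define PME_SOFT_DIRTY (1ULL << 55)

/*
 * Arm soft-dirty tracking for one task by writing "4" to its
 * clear_refs file. Returns 0 on success, -1 on any failure
 * (no such task, no permission, kernel without soft-dirty support).
 */
int arm_soft_dirty(pid_t pid)
{
	char path[64];
	int fd, ret;

	snprintf(path, sizeof(path), "/proc/%d/clear_refs", (int)pid);
	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -1;

	ret = (write(fd, "4", 1) == 1) ? 0 : -1;
	close(fd);
	return ret;
}

/*
 * Hypothetical decision helper: a page may be treated as "clean"
 * (left to the parent image) only if tracking was armed for this
 * task at the previous snapshot AND the soft-dirty bit is clear.
 * A clear bit on a task that was never armed proves nothing.
 */
bool page_can_be_skipped(uint64_t pme, bool tracking_armed)
{
	return tracking_armed && !(pme & PME_SOFT_DIRTY);
}

/* A task that appeared after the last snapshot was never armed. */
void new_task_example(void)
{
	uint64_t pme_without_soft_dirty = 0;

	assert(!page_can_be_skipped(pme_without_soft_dirty, false));
}
```

The point of the second argument is the same as in the patch: the
soft-dirty bit alone is not enough, the dumper also needs to know whether
the previous snapshot actually covered this task.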


Best Regards, Tikhomirov Pavel.


2014-04-14 16:54 GMT+04:00 Pavel Emelyanov <xemul at parallels.com>:

> On 04/14/2014 04:52 PM, Pavel Tikhomirov wrote:
> > Try the next patch (3/3) without this one and you'll catch this (it triggers in roughly 1 of 5 runs):
> > (00.017322)  21253: Error (page-read.c:67): No parent for snapshot pagemap
> > This means the page is marked as being in the parent, but no parent exists.
>
> Plz, investigate.
>
> > sudo bash test/zdtm.sh -i 3 -s transition/fork
> > ================================= CRIU CHECK =================================
> > Looks good.
> > Execute zdtm/live/transition/fork
> > ./fork --pidfile=fork.pid --outfile=fork.out
> > Dump 19786
> > /home/snorch/temp_criu/criu/test/post-dump.sh: 3: [: post-dump: unexpected operator
> > Dump 19786
> > /home/snorch/temp_criu/criu/test/post-dump.sh: 3: [: post-dump: unexpected operator
> > Dump 19786
> > Restore
> > Test: zdtm/live/transition/fork, Result: FAIL
> > Test: zdtm/live/transition/fork, Namespace:
> > Dump log   : /home/snorch/temp_criu/criu/test/dump/fork/19786/3/dump.log
> > ==================================== ERROR ====================================
> > --------------------------------- grep Error ---------------------------------
> > (00.023035) Error (image.c:202): Unable to open pagemap-21253.img: No such file or directory
> > (00.023046) Error (image.c:202): Unable to open pages-21253.img: No such file or directory
> > (00.023049) Error (page-xfer.c:661): No parent image found, though parent directory is set: No such file or directory
> > ------------------------------------- END -------------------------------------
> > Restore log: /home/snorch/temp_criu/criu/test/dump/fork/19786/3/restore.log
> > --------------------------------- grep Error ---------------------------------
> > (00.017277)  21253: Error (image.c:202): Unable to open pagemap-21253.img: No such file or directory
> > (00.017297)  21253: Error (image.c:202): Unable to open pages-21253.img: No such file or directory
> > (00.017322)  21253: Error (page-read.c:67): No parent for snapshot pagemap
> > (00.017468)  19786: Error (cr-restore.c:1036): 21253 exited, status=1
> > (00.017502) Error (cr-restore.c:1579): Restoring FAILED.
> > ------------------------------------- END -------------------------------------
> > ================================= ERROR OVER =================================
> >
> >
> > Best Regards, Tikhomirov Pavel.
> >
> >
> > 2014-04-14 15:17 GMT+04:00 Pavel Emelyanov <xemul at parallels.com>:
> >
> >     On 04/09/2014 01:34 PM, Tikhomirov Pavel wrote:
> >     > There was a bug here: if, e.g., iterative snapshots are made and
> >     > a new process is created in the process tree between two of them,
> >     > criu assumes that all pages of this new process are "clean",
> >     > believing that a previous image exists for it and dirty tracking
> >     > is on. Neither is true, and the restore ends up failing.
> >     >
> >     > Also, this bug was not caught because of an error in zdtm; see 3/3.
> >     >
> >     > Signed-off-by: Tikhomirov Pavel <snorcht at gmail.com>
> >     > ---
> >     >  mem.c | 6 +++---
> >     >  1 file changed, 3 insertions(+), 3 deletions(-)
> >     >
> >     > diff --git a/mem.c b/mem.c
> >     > index ef1d010..6df198c 100644
> >     > --- a/mem.c
> >     > +++ b/mem.c
> >     > @@ -106,7 +106,7 @@ static inline bool page_in_parent(u64 pme)
> >     >   * the memory contents is present in the pagent image set.
> >     >   */
> >     >
> >     > -static int generate_iovs(struct vma_area *vma, struct page_pipe *pp, u64 *map, u64 *off)
> >     > +static int generate_iovs(struct vma_area *vma, struct page_pipe *pp, u64 *map, u64 *off, bool no_parent)
> >     >  {
> >     >       u64 *at = &map[PAGE_PFN(*off)];
> >     >       unsigned long pfn, nr_to_scan;
> >     > @@ -130,7 +130,7 @@ static int generate_iovs(struct vma_area *vma, struct page_pipe *pp, u64 *map, u
> >     >                * page. The latter would be checked in page-xfer.
> >     >                */
> >     >
> >     > -             if (page_in_parent(at[pfn])) {
> >     > +             if (page_in_parent(at[pfn]) && !no_parent) {
> >
> >     If xfer.parent == NULL then page_in_parent should never return true. Why is this happening?
> >
> >     >                       ret = page_pipe_add_hole(pp, vaddr);
> >     >                       pages[0]++;
> >     >               } else {
> >     > @@ -282,7 +282,7 @@ static int __parasite_dump_pages_seized(struct parasite_ctl *ctl,
> >     >               if (!map)
> >     >                       goto out_xfer;
> >     >  again:
> >     > -             ret = generate_iovs(vma_area, pp, map, &off);
> >     > +             ret = generate_iovs(vma_area, pp, map, &off, xfer.parent == NULL);
> >     >               if (ret == -EAGAIN) {
> >     >                       BUG_ON(pp_ret);
> >     >
> >     >
> >
> >
> >
>
>
>