[CRIU] [PATCH] zdtm: check that a command completes successfully after a fault (v2)
Andrew Vagin
avagin at virtuozzo.com
Tue Mar 1 07:50:36 PST 2016
On Tue, Mar 01, 2016 at 05:53:17PM +0300, Pavel Emelyanov wrote:
> On 03/01/2016 05:23 PM, Andrey Vagin wrote:
> > 2016-03-01 2:01 GMT-08:00 Pavel Emelyanov <xemul at virtuozzo.com>:
> >> On 03/01/2016 03:04 AM, Andrey Vagin wrote:
> >>> From: Andrew Vagin <avagin at virtuozzo.com>
> >>>
> >>> I suggest injecting a fault and then executing the same command
> >>> again without a fault, to check that it completes successfully.
> >>>
> >>> v2: skip a parasite blob when we are checking vma-s
> >>> Signed-off-by: Andrew Vagin <avagin at virtuozzo.com>
> >>> ---
> >>> test/zdtm.py | 40 +++++++++++++++++++++++++++++-----------
> >>> 1 file changed, 29 insertions(+), 11 deletions(-)
> >>>
> >>> diff --git a/test/zdtm.py b/test/zdtm.py
> >>> index 1ace919..27fa8d4 100755
> >>> --- a/test/zdtm.py
> >>> +++ b/test/zdtm.py
> >>> @@ -656,13 +656,31 @@ class criu_cli:
> >>>
> >>>          preexec = self.__user and self.set_user_id or None
> >>>
> >>> -        ret = self.__criu(action, s_args, self.__fault, strace, preexec)
> >>> -        grep_errors(os.path.join(self.__ddir(), log))
> >>> -        if ret != 0:
> >>> -            if self.__fault or self.__test.blocking() or (self.__sat and action == 'restore'):
> >>> -                raise test_fail_expected_exc(action)
> >>> -            else:
> >>> -                raise test_fail_exc("CRIU %s" % action)
> >>> +        faults = [ self.__fault ]
> >>> +        # try again after the first failed case
> >>> +        if self.__fault:
> >>> +            faults.append(None)
> >>> +        for fault in faults:
> >>> +            __ddir = self.__ddir()
> >>> +
> >>> +            ret = self.__criu(action, s_args, fault, strace, preexec)
> >>> +            grep_errors(os.path.join(__ddir, log))
> >>> +            if ret != 0:
> >>> +                if fault:
> >>> +                    try_run_hook(self.__test, ["--fault", action])
> >>> +                    if action == "dump":
> >>> +                        __ddir_fail = __ddir + ".fail"
> >>> +                        os.rename(__ddir, __ddir + ".fail")
> >>> +                        os.mkdir(__ddir)
> >>> +                        os.chmod(__ddir, 0777)
> >>> +                    else:
> >>> +                        os.rename(os.path.join(__ddir, log), os.path.join(__ddir, log + ".fail"))
> >>
> >> What does this dir manipulation do?
> >
> > On dump this directory will contain a partial set of images, so we move
> > the whole directory.
> > On restore we move only the log file.
>
> Move where and what for? There was no log file moving in this place.
Rename it to DIRNAME.fail. This avoids a situation where we are left with
images from a previous run. We have a few optional images, and if we execute
dump a second time, we can get a mix of images from the first and
second runs.
We need to move the log file to save it for future investigation. We
need to rename the log file because we don't know when a fault will be
injected.
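
To make the intent concrete, the renaming step could be sketched as a
standalone helper. This is only an illustration in Python 3 (hence 0o777
rather than 0777), not the code from the patch; the helper name
preserve_failed_run and its arguments are hypothetical.

```python
import os
import tempfile


def preserve_failed_run(ddir, log, action):
    """Set aside the output of a failed, fault-injected run before a retry.

    Hypothetical helper: ddir is the images directory, log is the log file
    name inside it, action is "dump" or "restore".
    """
    if action == "dump":
        # A failed dump may leave a partial set of images behind. Move the
        # whole directory away so the retry starts from an empty one and
        # cannot mix images from two runs.
        os.rename(ddir, ddir + ".fail")
        os.mkdir(ddir)
        os.chmod(ddir, 0o777)
    else:
        # On restore the images are still needed for the retry, so only
        # the log of the failed attempt is kept under a new name.
        os.rename(os.path.join(ddir, log),
                  os.path.join(ddir, log + ".fail"))


if __name__ == "__main__":
    # Small demonstration in a throwaway directory.
    root = tempfile.mkdtemp()
    ddir = os.path.join(root, "1")
    os.mkdir(ddir)
    for name in ("dump.log", "pages-1.img"):
        open(os.path.join(ddir, name), "w").close()
    preserve_failed_run(ddir, "dump.log", "dump")
    print(sorted(os.listdir(root)))
```

Note that os.rename() will fail if a non-empty DIRNAME.fail is already
there, so the sketch assumes at most one failed attempt per directory.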
>
> -- Pavel