[CRIU] [PATCHv3 0/7] Large pages support for aarch64/ppc64

Dmitry Safonov 0x7f454c46 at gmail.com
Mon Apr 9 19:46:29 MSK 2018


Hi Adrian,

Thank you again! You're awesome!

2018-04-09 16:40 GMT+01:00 Adrian Reber <areber at redhat.com>:
> I tested it on s390x, ppc64le and aarch64.
>
> s390x:
>
> ################### 1 TEST(S) FAILED (TOTAL 356/SKIPPED 34) ####################
>  * zdtm/static/s390x_gs_threads(h)
> ##################################### FAIL #####################################
>
> As the kernel has no guarded storage support enabled, that is a pretty
> good result.
>
> ppc64le:
>
> ################### 3 TEST(S) FAILED (TOTAL 353/SKIPPED 35) ####################
>  * zdtm/static/aio00(h)
>  * zdtm/static/aio01(h)
>  * zdtm/static/maps06(h)
> ##################################### FAIL #####################################

Are these regressions? Were they failing before the patch set?

> ========================= Run zdtm/static/maps06 in h ==========================
> Start test
> ./maps06 --pidfile=maps06.pid --outfile=maps06.out --filename=maps06.test
>  Test zdtm/static/maps06 FAIL at start: [Errno 2] No such file or directory: 'zdtm/static/maps06.pid'
> Test output: ================================
> 05:21:20.159:    47: ERR: test.c:252: Test exited unexpectedly with code 1
>
>  <<< ================================
>
> ========================== Run zdtm/static/aio00 in h ==========================
> Start test
> ./aio00 --pidfile=aio00.pid --outfile=aio00.out
> Run criu dump
> Run criu restore
> =[log]=> dump/zdtm/static/aio00/48/1/restore.log
> ------------------------ grep Error ------------------------
> (00.038557)pie: 48: vdso: image [vdso] 0x7fff86ac0000-0x7fff86ae0000 [vvar] 0xffffffffffffffff-0xffffffffffffffff
> (00.038575)pie: 48: vdso: Runtime vdso/vvar matches dumpee, remap inplace
> (00.038588)pie: 48: vdso: Remap rt-vdso 0x80000 -> 0x7fff86ac0000
> (00.038642)pie: 48: Error (criu/pie/restorer.c:702): Ring setup failed with -22
> (00.038661)pie: 48: Error (criu/pie/restorer.c:1781): Restorer fail 48
> (00.038666) Error (criu/cr-restore.c:2346): Failed to wait inprogress tasks
> (00.038689) Error (criu/cr-restore.c:2523): Restoring FAILED.
> ------------------------ ERROR OVER ------------------------
> ################# Test zdtm/static/aio00 FAIL at CRIU restore ##################
>
> ========================== Run zdtm/static/aio01 in h ==========================
> Start test
> ./aio01 --pidfile=aio01.pid --outfile=aio01.out
> Run criu dump
> Run criu restore
> =[log]=> dump/zdtm/static/aio01/48/1/restore.log
> ------------------------ grep Error ------------------------
> (00.038850)pie: 48: vdso: image [vdso] 0x7fffb12f0000-0x7fffb1310000 [vvar] 0xffffffffffffffff-0xffffffffffffffff
> (00.038864)pie: 48: vdso: Runtime vdso/vvar matches dumpee, remap inplace
> (00.038878)pie: 48: vdso: Remap rt-vdso 0x80000 -> 0x7fffb12f0000
> (00.038927)pie: 48: Error (criu/pie/restorer.c:702): Ring setup failed with -22
> (00.038946)pie: 48: Error (criu/pie/restorer.c:1781): Restorer fail 48
> (00.038952) Error (criu/cr-restore.c:2346): Failed to wait inprogress tasks
> (00.038978) Error (criu/cr-restore.c:2523): Restoring FAILED.
> ------------------------ ERROR OVER ------------------------
> ################# Test zdtm/static/aio01 FAIL at CRIU restore ##################
>
>
> and aarch64:
>
> ################### 21 TEST(S) FAILED (TOTAL 353/SKIPPED 35) ###################
>  * zdtm/static/aio00(h)
>  * zdtm/static/aio01(h)
>  * zdtm/static/fd(h)
>  * zdtm/static/userns00(uns)
>  * zdtm/static/userns01(uns)
>  * zdtm/static/userns02(uns)
>  * zdtm/static/userns-leaked-sock(uns)
>  * zdtm/static/netns_sub_veth(uns)
>  * zdtm/static/pidns01(uns)
>  * zdtm/static/maps06(h)
>  * zdtm/static/write_read10(h)
>  * zdtm/static/sk-unix-rel(h)
>  * zdtm/static/unlink_fstat02(h)
>  * zdtm/static/unlink_mmap02(h)
>  * zdtm/static/cow01(h)
>  * zdtm/static/sockets00(h)
>  * zdtm/static/socket_close_data01(h)
>  * zdtm/static/mntns_ghost01(ns)
>  * zdtm/static/del_standalone_un(h)
>  * zdtm/static/sk-unix-mntns(uns)
>  * zdtm/transition/maps007(h)
> ##################################### FAIL #####################################
>
> Not sure what to say about this. Before your patchset it wasn't working
> at all. Complete log of zdtm.py:
>
> https://lisas.de/~adrian/aarch64-zdtm.log

Here is a quick look at the failures, as I see them:

The same failures as on ppc (not sure what's going on there, needs debugging):
- aio*

ZDTM tests that have 4096 hardcoded as the page size (easily fixable):
- maps06
- unlink_mmap02
- cow01
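
A minimal sketch of the kind of fix meant here, assuming the tests only need the page size to size and align their mappings. The helper names runtime_page_size() and page_align_up() are hypothetical and not taken from ZDTM; only sysconf(_SC_PAGESIZE) is a standard POSIX call. On kernels built with 64K pages (a common configuration on ppc64le and aarch64) a literal 4096 gives the wrong value:

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: ask the kernel for the page size at run time
 * instead of assuming 4096. */
static size_t runtime_page_size(void)
{
	long sz = sysconf(_SC_PAGESIZE);

	/* sysconf() returns -1 on error; page sizes are powers of two. */
	assert(sz > 0 && (sz & (sz - 1)) == 0);
	return (size_t)sz;
}

/* Hypothetical helper: round len up to a whole number of pages.
 * page must be a power of two. */
static size_t page_align_up(size_t len, size_t page)
{
	return (len + page - 1) & ~(page - 1);
}
```

A test would then pass something like page_align_up(len, runtime_page_size()) to mmap() wherever it currently uses 4096 directly.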

Pretty worrisome failures, will need some debugging:
  (00.965055) Error (criu/cr-restore.c:1586): 73 killed by signal 7: Bus error
- userns0[012]
- netns_sub_veth
- pidns01

Also not sure what's wrong there, will need to debug:
 Test zdtm/static/userns-leaked-sock FAIL at start: [Errno 2] No such file or directory: 'zdtm/static/userns-leaked-sock.pid'
- userns-leaked-sock

05:42:36.786:    36: FAIL: write_read10.c:102: can't read write_read10.test: Resource temporarily unavailable
- write_read10

Device or inode differs after C/R o.O:
05:43:39.838:    36: FAIL: unlink_fstat02.c:85: files differ after restore
- unlink_fstat02

And socket-related failures (maybe the kernel is missing some support,
idk, needs a closer look):
(00.010146) sk unix: Resolving relative name sk-unix-rel.test for socket f2968
(00.010174) Error (criu/sk-unix.c:313): sk unix: Can't resolve name for socket 0x7
- sk-unix-rel, socket_close_data01

(00.027965)     36: sk unix: Opening standalone socket (id 0x8 ino 0x10652e peer 0x10652d)
(00.027969)     36: sk unix: Connect 0x10652e to 0x10652d
(00.028007)     36: Error (criu/sk-unix.c:1280): sk unix: Can't connect 0x10652e socket: No such file or directory
- sockets00
- sk-unix-mntns in uns

05:54:32.160:    36: FAIL: del_standalone_un.c:111: /home/criu/test/zdtm/static/del_standalone_un.test/sock doesn't exist after restore (errno = 2 (No such file or directory))
- del_standalone_un

Interesting failure in ns with a read-only rootfs:
(01.965788)      1: Remap rpath is zdtm/static/mntns_ghost01.test/test.ghost.cr.1.ghost
(01.965829)      1: Configuring remap 0xe -> 0x2
(01.965863)      1: Error (criu/files-reg.c:398): Can't create ghost regfile: Read-only file system
(01.965921) Error (criu/cr-restore.c:2346): Failed to wait inprogress tasks
- mntns_ghost01 in ns

So, if the ppc failures are not regressions, I would suggest applying the
patch set and working on the failures on top of it. Tests that have the
page size hardcoded are easy to fix; the others will need some care and
proper debugging. Fortunately, I've already set up an aarch64 VM, so I
will probably be able to reproduce them.

-- 
             Dmitry

