[CRIU] [PATCH 00/11] vDSO rework, part 3/3
Dmitry Safonov
dsafonov at virtuozzo.com
Mon Jul 17 15:39:51 MSK 2017
Hi guys,
That's the last part of the set. It consists of proxification
fixes: before patches 2-4, the code creating the proxy vdso
unmapped the original vvar image only if the vvar vma
was placed after the vdso in the vma list. As a result, after several
C/Rs with inserted trampolines, the virtual address space of the
task got polluted with rt-vvar vmas from previous restores.
There is a test (7) for checking that after several C/Rs with
inserted jump trampolines there is no pollution happening.
Besides, there are performance optimizations: by checking kernel
APIs in kdat tests, we can omit some unnecessary work.
The only thing I did not address in this set that may still go
wrong is the dropping of rt-vmas on further C/Rs. The application
may be executing in the rt-vdso at the moment of dump, and this seems racy.
Nevertheless, it's not a regression of this patch set: it's a
pre-existing race, which should be addressed by a later fix.
Attaching original set's cover:
=== Full description of the set ===
After facing a bug with mismatched pfn for vdso in vz7 CTs
and proposing a way to solve it and make vdso C/R faster:
https://lists.openvz.org/pipermail/criu/2017-June/037982.html
I've started working on it. Then I moved the vdso symtable to the
criu.kdat file (as I had previously proposed to Pasha)...
and found that the vDSO code is not very readable, and that another
quick fix on top might capsize the boat.
During the refactoring I also ran into some bugs, which are now fixed:
1. Leaving vdso/vvar after restore in task which didn't have them.
2. Bug with unmapping the original vvar vma on the second C/R if
jump trampolines & rt-vdso are used.
3. Keeping rt-vvar after each C/R with inserted jump trampolines.
4. Bug with ia32 unmapping vdso after trampolines insertion.
For (2) and (3), I introduced rt-vdso mark v3.
There are two new tests:
o Task with unmapped vdso (vdso02)
o Iterative C/R with inserting jump-trampolines (vdso-proxy)
Then the whole process of vdso C/R was reworked with the
criu.kdat file in mind:
=== Process of vDSO C/R ===
On *Dump*:
Before                               | After
-------------------------------------+---------------------------------
Checking vdso's pfn or filling       | No need to do this on post v3.16
symtable to find if VMa is vdso      | kernels, as "[vdso]" mark stays
or is mishinted by /proc/../maps     | in maps file after mremap().
-------------------------------------+---------------------------------
Parsing of self-maps to find         |
vdso/vvar position for remapping     | No need to do this on dump.
into restorer's parking zone.        |
-------------------------------------+---------------------------------
Parsing vdso's symtable to find      | No need to do this on dump.
if image's vdso matches host's vdso. |
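
To illustrate the first row: once the "[vdso]" and "[vvar]" hints in the
maps file can be trusted, locating the two areas is a plain text scan.
The Python sketch below is not CRIU code (CRIU is written in C); the
helper name find_vdso_vvar and the sample maps text are made up for
illustration only.

```python
import re

# One /proc/<pid>/maps line: "start-end perms offset dev inode [name]".
_MAPS_LINE = re.compile(
    r"^([0-9a-f]+)-([0-9a-f]+)\s+\S+\s+\S+\s+\S+\s+\S+\s*(.*)$"
)

def find_vdso_vvar(maps_text):
    """Return {"[vdso]": (start, end), "[vvar]": (start, end)} for hinted VMAs.

    Hypothetical helper: it relies purely on the name hint in the maps
    file, so it is only meaningful where that hint is reliable.
    """
    found = {}
    for line in maps_text.splitlines():
        m = _MAPS_LINE.match(line.strip())
        if not m:
            continue
        start, end = int(m.group(1), 16), int(m.group(2), 16)
        name = m.group(3)
        if name in ("[vdso]", "[vvar]"):
            found[name] = (start, end)
    return found

# Made-up sample, shaped like real maps output.
SAMPLE_MAPS = """\
00400000-00401000 r-xp 00000000 fd:01 1234 /usr/bin/busyloop00
7ffd1c5e0000-7ffd1c5e3000 r--p 00000000 00:00 0 [vvar]
7ffd1c5e3000-7ffd1c5e5000 r-xp 00000000 00:00 0 [vdso]
"""
```

On a Linux box the same helper can be fed the live file, e.g.
find_vdso_vvar(open("/proc/self/maps").read()).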
On *Restore*:
Before                               | After
-------------------------------------+------------------------------------
Parsing of self-maps to find         | Mapping vdso/vvar with
vdso/vvar position for remapping     | arch_prctl(MAP_VDSO_*)
into restorer's parking zone.        | to save sys_mremap() syscalls.
-------------------------------------+------------------------------------
Parsing vdso's symtable to find      | Keeping vdso symtable in criu.kdat.
if image's vdso matches host's vdso. |
-------------------------------------+------------------------------------
Checking vdso's pfn or filling       |
symtable to find if VMa is vdso      | No need to do this on restore.
or is mishinted by /proc/../maps     |
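
As a hedged illustration of probing for the mapping API: I assume here
that arch_prctl(MAP_VDSO_*) refers to the x86 ARCH_MAP_VDSO_X32/32/64
request codes. The Python sketch below is not the kdat test from this
set; the names interpret_probe and can_map_vdso are hypothetical, and
the reading of EEXIST ("a vdso is already mapped, so the request was
understood") is my understanding of the kernel's behaviour.

```python
import ctypes
import errno
import platform

SYS_ARCH_PRCTL = 158        # arch_prctl syscall number on x86_64
ARCH_MAP_VDSO_64 = 0x2003   # request code from the kernel's x86 asm/prctl.h

def interpret_probe(ret, err):
    """Decide whether the vdso mapping API looks present.

    ret == 0: the kernel mapped a vdso, so the API exists.
    EEXIST:   a vdso is already mapped and a second one was refused,
              which still shows the request code is known.
    Anything else (for example EINVAL) counts as "not available".
    """
    return ret == 0 or (ret == -1 and err == errno.EEXIST)

def can_map_vdso():
    """Probe arch_prctl(ARCH_MAP_VDSO_64, 0) in the calling process.

    Assumption: only meaningful on x86_64 Linux; elsewhere report False
    without issuing the syscall, since number 158 means something else.
    """
    if platform.system() != "Linux" or platform.machine() != "x86_64":
        return False
    libc = ctypes.CDLL(None, use_errno=True)
    libc.syscall.restype = ctypes.c_long
    ret = libc.syscall(
        ctypes.c_long(SYS_ARCH_PRCTL),
        ctypes.c_long(ARCH_MAP_VDSO_64),
        ctypes.c_ulong(0),
    )
    return interpret_probe(ret, ctypes.get_errno())
```

A real checker would probably run the probe in a forked child, so that
a successful call does not leave a stray vdso mapping in the caller.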
=== Result: some numbers ===
I've tested the performance impact of the patch set on a
4.12.0-rc5+ kernel (the needed arch_prctl() is available since v4.9),
with the current criu-dev af6399cc5 ("vma: Fix badly inherited FD
in filemap_open")
and my `wip/new-vdso' branch on github.
Done in Qemu with 2 GB of memory and 4 CPUs on fedora-26.
Only values that differ are presented.
Mean test results over 30 iterations of busyloop00 (single-process impact)
and session00 (~10 processes impact).
The deviation is taken from the session00 (before) test, as it's the largest there.
*Dump*                          | === busyloop00 === | === session00 ===   max dev
                                |  before:    after: |  before:    after:
syscalls:sys_enter_splice       |       13        12 |       96        96
syscalls:sys_enter_fcntl        |       30        29 |      200       199
syscalls:sys_enter_unlinkat     |       25        26 |       31        31
syscalls:sys_enter_pipe         |        6         5 |       45        44
syscalls:sys_enter_newfstatat   |      157       159 |      334       334
syscalls:sys_enter_read         |      100        96 |      378       374
syscalls:sys_enter_write        |       17        11 |       57        56
syscalls:sys_enter_pread64      |       16        17 |      137        73
syscalls:sys_enter_writev       |        2         0 |        4         4
syscalls:sys_enter_sendfile64   |        3         0 |        0         0
syscalls:sys_enter_open         |       85        83 |       83        83
syscalls:sys_enter_openat       |       68        62 |      436       426
syscalls:sys_enter_close        |      172       160 |      655       641
syscalls:sys_enter_brk          |       10         4 |       14        14
syscalls:sys_enter_munmap       |        5         2 |       12         9
syscalls:sys_enter_kcmp         |        5         5 |      111       105  +-0.45%
syscalls:sys_enter_getpid       |       25        24 |       88        87
syscalls:sys_enter_kill         |        2         0 |        2         0
syscalls:sys_enter_exit         |        1         0 |        1         0
syscalls:sys_enter_wait4        |       19        17 |      138       136
syscalls:sys_enter_mmap         |       26        25 |       33        32
syscalls:sys_enter_arch_prctl   |        2         1 |        2         1
seconds time elapsed            | 0.085441  0.082569 | 0.107197  0.103572  +-2.98%
*Restore*                       | === busyloop00 === | === session00 ===   max dev
                                |  before:    after: |  before:    after:
syscalls:sys_enter_dup2         |      245       241 |     1748      1615  +-6.45%
syscalls:sys_enter_fcntl        |      627       617 |     4450      4119  +-6.34%
syscalls:sys_enter_pipe         |        1         0 |        5         4
syscalls:sys_enter_read         |       50        43 |      140       133
syscalls:sys_enter_write        |       16        15 |       33        32
syscalls:sys_enter_pread64      |        1         0 |      176       168
syscalls:sys_enter_open         |      165       163 |      915       849  +-6.17%
syscalls:sys_enter_openat       |       82        78 |      208       204
syscalls:sys_enter_close        |      355       343 |     2074      1933  +-5.44%
syscalls:sys_enter_mremap       |        8         6 |      319       303
syscalls:sys_enter_munmap       |       14        11 |       70        67
syscalls:sys_enter_futex        |       32        32 |      472       426  +-7.29%
syscalls:sys_enter_getpid       |       36        33 |      133       130
syscalls:sys_enter_rt_sigproc.. |      266       262 |     1786      1654  +-6.32%
syscalls:sys_enter_kill         |      122       118 |      878       810  +-6.43%
syscalls:sys_enter_exit         |        1         0 |        1         0
syscalls:sys_enter_wait4        |       22        20 |      392       350  +-7.47%
syscalls:sys_enter_mmap         |       68        67 |       96        95
syscalls:sys_enter_arch_prctl   |        6         6 |       20        27
seconds time elapsed            | 0.041698 0.0328033 | 0.106096  0.097462  +-1.83%
Given the deviation, dumping takes about the same time, heh,
but with fewer syscalls anyway.
In terms of time:
Around 21% faster restore of a single busyloop!
And 8% faster restore for ~10 processes.
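
The percentages follow from the "seconds time elapsed" rows. A small
Python sketch of the arithmetic (the helper name speedup_percent is
mine; the inputs are copied from the tables above):

```python
def speedup_percent(before, after):
    """Relative time saved, as a percentage of the 'before' time."""
    return (before - after) / before * 100.0

# "seconds time elapsed" from the *Restore* table
busyloop_restore = speedup_percent(0.041698, 0.0328033)   # about 21%
session_restore = speedup_percent(0.106096, 0.097462)     # about 8%

# "seconds time elapsed" from the *Dump* table, to compare against the
# +-2.98% deviation reported for that row
busyloop_dump = speedup_percent(0.085441, 0.082569)
session_dump = speedup_percent(0.107197, 0.103572)
```

Both dump values come out a little above 3%, which is close to the
reported deviation, hence "about the same time".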
Cc: Cyrill Gorcunov <gorcunov at openvz.org>
Dmitry Safonov (11):
kdat: Add test for presence of vdso mapping API
vdso: Introduce vdso mark v3
vdso: Don't drop original VVAR VMA on dump
vdso: Don't miss rt-vvar while searching
vdso: Split parasite_fixup_vdso() once more
vdso: Add a comment about rt-vdso and decreasing nr. of symbols
vdso/zdtm: Add iterative proxification test
vdso/restorer: Don't map compatible vdso if it was unmapped
vdso: Don't parse self-maps if kdat.can_map_vdso
vdso/kdat: Add test for preserving "[vdso]" hint after mremap()
vdso: Don't read pagemap or parse symtable under vdso_hint_reliable
criu/arch/aarch64/include/asm/restorer.h | 2 +
criu/arch/arm/include/asm/restorer.h | 2 +
criu/arch/ppc64/include/asm/restorer.h | 2 +
criu/arch/x86/crtools.c | 65 ++++--
criu/arch/x86/include/asm/restorer.h | 6 +
criu/arch/x86/restorer.c | 10 +
criu/cr-check.c | 10 +
criu/cr-restore.c | 1 +
criu/include/kerndat.h | 2 +
criu/include/parasite-vdso.h | 71 +++---
criu/include/parasite.h | 5 +-
criu/include/restorer.h | 1 +
criu/include/vdso.h | 2 +
criu/kerndat.c | 14 +-
criu/pie/parasite-vdso.c | 48 ++--
criu/pie/parasite.c | 14 +-
criu/pie/restorer.c | 32 ++-
criu/vdso.c | 383 ++++++++++++++++++++-----------
test/jenkins/criu-fault.sh | 1 +
test/zdtm/static/Makefile | 1 +
test/zdtm/static/vdso-proxy.c | 147 ++++++++++++
21 files changed, 586 insertions(+), 233 deletions(-)
create mode 100644 test/zdtm/static/vdso-proxy.c
--
2.13.1