[CRIU] [PATCH 00/11] vDSO rework, part 1/3

Dmitry Safonov dsafonov at virtuozzo.com
Thu Jun 15 19:36:04 MSK 2017


Hi guys,

This is the first part of the set, which is 30 patches long,
so I've split it into 3 parts.
The whole set is available at:
https://github.com/0x7f454c46/criu/commits/wip/new-vdso

=== Full description of the set ===

After hitting a bug with a mismatched pfn for vdso in vz7 CTs
and proposing a way to solve it and make vdso C/R faster:
https://lists.openvz.org/pipermail/criu/2017-June/037982.html

I started working on it. Then I moved the vdso symtable into the
criu.kdat file (as I had previously proposed to Pasha)...
and found that the vDSO code is not very readable, and another
quick fix on top might roll the yacht over ;-)

During the refactoring I also ran into some bugs, which are now fixed:
1. Leaving vdso/vvar after restore in a task which didn't have them.
2. Bug with unmapping the original vvar vma on the second C/R if
   jump trampolines & rt-vdso are used.
3. Keeping rt-vvar after each C/R with inserted jump trampolines.
4. Bug with ia32 unmapping vdso after trampolines insertion.

For (2) and (3), I introduced rt-vdso mark v3.

There are two new tests:
o Task with unmapped vdso (vdso02)
o Iterative C/R with inserting jump-trampolines (vdso-proxy)

Then the whole process of vdso C/R has been reworked with the
criu.kdat file in mind:

=== Process of vDSO C/R ===

On *Dump*:
        Before                                  After
        ------                                  -----
Checking vdso's pfn or filling          No need to do this on post v3.16
symtable to find if VMA is vdso         kernels, as "[vdso]" mark stays
or is mishinted by /proc/../maps        in maps file after mremap().
        ------                                  -----
Parsing of self-maps to find
vdso/vvar position for remapping        No need to do this on dump.
into restorer's parking zone.
        ------                                  -----
Parsing vdso's symtable to find         No need to do this on dump.
if image's vdso matches host's vdso.
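
The dump-side simplification above relies on the "[vdso]" name staying
in the maps file. A minimal sketch of such name-based detection is
below; it is not CRIU's actual maps parser, and classify_maps_line()
and find_vdso() are hypothetical names used only for illustration.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper type for this sketch, not a CRIU structure. */
enum vma_mark { VMA_OTHER, VMA_VDSO, VMA_VVAR };

/*
 * Classify one line of /proc/<pid>/maps by its name field.
 * Line layout: "start-end perms offset dev inode   [name]".
 * On a parsable line *start and *end are filled in.
 */
enum vma_mark classify_maps_line(const char *line,
				 unsigned long *start, unsigned long *end)
{
	char name[64] = "";

	/* The name is optional: 2 conversions (no name) or 3 are both fine. */
	if (sscanf(line, "%lx-%lx %*s %*s %*s %*s %63s", start, end, name) < 2)
		return VMA_OTHER;

	if (!strcmp(name, "[vdso]"))
		return VMA_VDSO;
	if (!strcmp(name, "[vvar]"))
		return VMA_VVAR;
	return VMA_OTHER;
}

/*
 * Scan a maps file for the "[vdso]" mark.
 * Returns 0 and fills *start / *end if found, -1 otherwise.
 * Lines longer than the buffer are split; that is acceptable for a sketch.
 */
int find_vdso(const char *maps_path, unsigned long *start, unsigned long *end)
{
	char line[512];
	int ret = -1;
	FILE *f = fopen(maps_path, "r");

	if (!f)
		return -1;

	while (fgets(line, sizeof(line), f)) {
		unsigned long s, e;

		if (classify_maps_line(line, &s, &e) == VMA_VDSO) {
			*start = s;
			*end = e;
			ret = 0;
			break;
		}
	}

	fclose(f);
	return ret;
}
```

Calling find_vdso("/proc/self/maps", &start, &end) in a task that has
unmapped its vdso would simply return -1, which is the case the vdso02
test is about.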


On *Restore*:
        Before                                  After
        ------                                  -----
Parsing of self-maps to find            Mapping vdso/vvar with
vdso/vvar position for remapping        arch_prctl(ARCH_MAP_VDSO_*)
into restorer's parking zone.           to save sys_mremap() syscalls.
        ------                                  -----
Parsing vdso's symtable to find         Keeping vdso symtable in criu.kdat.
if image's vdso matches host's vdso.
        ------                                  -----
Checking vdso's pfn or filling
symtable to find if VMA is vdso         No need to do this on restore.
or is mishinted by /proc/../maps
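
For reference, a minimal sketch of the arch_prctl() call from the
restore table. In the kernel's <asm/prctl.h> the codes are named
ARCH_MAP_VDSO_X32, ARCH_MAP_VDSO_32 and ARCH_MAP_VDSO_64, and they
need CONFIG_CHECKPOINT_RESTORE. This is not the restorer code: the
restorer cannot use libc, while the sketch goes through libc's
syscall(), and map_vdso_64() is a hypothetical name.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <errno.h>
#include <unistd.h>
#include <sys/syscall.h>

/* Values from the kernel's <asm/prctl.h>, in case libc headers lack them. */
#ifndef ARCH_MAP_VDSO_X32
#define ARCH_MAP_VDSO_X32	0x2001
#define ARCH_MAP_VDSO_32	0x2002
#define ARCH_MAP_VDSO_64	0x2003
#endif

/*
 * Ask the kernel to map a fresh 64-bit vdso/vvar pair at `addr`.
 * Returns a non-negative value on success, -1 with errno set on failure.
 * The kernel refuses with EEXIST while the task still has a vdso or vvar
 * mapped, so the old blob has to be unmapped first.
 */
long map_vdso_64(unsigned long addr)
{
#if defined(__x86_64__) && defined(SYS_arch_prctl)
	return syscall(SYS_arch_prctl, ARCH_MAP_VDSO_64, addr);
#else
	/* Assumption of this sketch: report ENOSYS on other architectures. */
	(void)addr;
	errno = ENOSYS;
	return -1;
#endif
}
```

In an ordinary process, which already has its vdso mapped, the call is
expected to fail with EEXIST; unmapping the vdso under a running libc
is not safe, since libc may call into it.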

=== Result: some numbers ===

I've tested the performance impact of the patch set on a
4.12.0-rc5+ kernel (the needed arch_prctl() is available since the v4.9
kernel), with the current criu-dev af6399cc5 ("vma: Fix badly inherited FD
in filemap_open")
and my `wip/new-vdso' branch on github.
Done in Qemu with 2 GB of memory and 4 CPUs on fedora-26.
Only the values that differ are presented.

Mean test results over 30 iterations of busyloop00 (single-process impact)
and session00 (~10 processes impact).
The deviation shown is from the session00 (before) runs, as it's the
largest there.

*Dump*                         | === busyloop00 === | === session00 === max dev
                               | before:     after: | before:    after:
syscalls:sys_enter_splice      |      13         12 |      96        96
syscalls:sys_enter_fcntl       |      30         29 |     200       199
syscalls:sys_enter_unlinkat    |      25         26 |      31        31
syscalls:sys_enter_pipe        |       6          5 |      45        44
syscalls:sys_enter_newfstatat  |     157        159 |     334       334
syscalls:sys_enter_read        |     100         96 |     378       374
syscalls:sys_enter_write       |      17         11 |      57        56
syscalls:sys_enter_pread64     |      16         17 |     137        73
syscalls:sys_enter_writev      |       2          0 |       4         4
syscalls:sys_enter_sendfile64  |       3          0 |       0         0
syscalls:sys_enter_open        |      85         83 |      83        83
syscalls:sys_enter_openat      |      68         62 |     436       426
syscalls:sys_enter_close       |     172        160 |     655       641
syscalls:sys_enter_brk         |      10          4 |      14        14
syscalls:sys_enter_munmap      |       5          2 |      12         9
syscalls:sys_enter_kcmp        |       5          5 |     111       105 +-0.45%
syscalls:sys_enter_getpid      |      25         24 |      88        87
syscalls:sys_enter_kill        |       2          0 |       2         0
syscalls:sys_enter_exit        |       1          0 |       1         0
syscalls:sys_enter_wait4       |      19         17 |     138       136
syscalls:sys_enter_mmap        |      26         25 |      33        32
syscalls:sys_enter_arch_prctl  |       2          1 |       2         1
seconds time elapsed           |0.085441   0.082569 |0.107197  0.103572 +-2.98%

*Restore*                      | === busyloop00 === | === session00 === max dev
                               | before:     after: | before:    after:
syscalls:sys_enter_dup2        |     245        241 |    1748      1615 +-6.45%
syscalls:sys_enter_fcntl       |     627        617 |    4450      4119 +-6.34%
syscalls:sys_enter_pipe        |       1          0 |       5         4
syscalls:sys_enter_read        |      50         43 |     140       133
syscalls:sys_enter_write       |      16         15 |      33        32
syscalls:sys_enter_pread64     |       1          0 |     176       168
syscalls:sys_enter_open        |     165        163 |     915       849 +-6.17%
syscalls:sys_enter_openat      |      82         78 |     208       204
syscalls:sys_enter_close       |     355        343 |    2074      1933 +-5.44%
syscalls:sys_enter_mremap      |       8          6 |     319       303
syscalls:sys_enter_munmap      |      14         11 |      70        67
syscalls:sys_enter_futex       |      32         32 |     472       426 +-7.29%
syscalls:sys_enter_getpid      |      36         33 |     133       130
syscalls:sys_enter_rt_sigproc..|     266        262 |    1786      1654 +-6.32%
syscalls:sys_enter_kill        |     122        118 |     878       810 +-6.43%
syscalls:sys_enter_exit        |       1          0 |       1         0
syscalls:sys_enter_wait4       |      22         20 |     392       350 +-7.47%
syscalls:sys_enter_mmap        |      68         67 |      96        95
syscalls:sys_enter_arch_prctl  |       6          6 |      20        27
seconds time elapsed           |0.041698  0.0328033 |0.106096  0.097462 +-1.83%

Looking at the deviation, dumping takes about the same time, heh,
but with fewer syscalls anyway.

In terms of time:
Around 21% faster restore for a single busyloop!
And around 8% faster restore for ~10 processes.
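
The percentages follow from the "seconds time elapsed" row of the
Restore table. A small sketch of the arithmetic (speedup_percent() and
print_restore_speedups() are names made up for this example):

```c
#include <stdio.h>

/* Relative reduction in elapsed time, in percent of the `before` value. */
double speedup_percent(double before, double after)
{
	return (before - after) / before * 100.0;
}

/* Values are taken from the "seconds time elapsed" row of the Restore table. */
void print_restore_speedups(void)
{
	printf("busyloop00: %.1f%%\n", speedup_percent(0.041698, 0.0328033));
	printf("session00:  %.1f%%\n", speedup_percent(0.106096, 0.097462));
}
```

This gives roughly 21.3% for busyloop00 and 8.1% for session00.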

Cc: Cyrill Gorcunov <gorcunov at openvz.org>

Dmitry Safonov (11):
  vdso: Check kdat.compat_cr in vdso_fill_compat_symtable
  vdso: Keep {vvar,vdso} sizes in symtable instead of end address
  vdso: Don't park vdso/vvar if restoree doesn't have them
  zdtm/vdso: Add test for restoring task without vdso blob
  vdso/restorer: Simplify vdso/vvar order checking
  vdso/restore: Separate vdso/vvar blobs comparing logic
  vdso: Save vdso/vvar pair order inside vdso_symtable
  vdso: Exclude {vdso,vvar}_start from vdso_symtable
  vdso: Move parsing of self/maps outside vdso_fill_self_symtable()
  vdso: Separate vdso_init() on dump/restore
  vdso/kdat: Store symtable in kerndat_s

 criu/cr-dump.c                  |   4 +-
 criu/cr-restore.c               |  16 +--
 criu/include/asm-generic/vdso.h |  11 +-
 criu/include/kerndat.h          |   9 ++
 criu/include/parasite-vdso.h    |   3 +-
 criu/include/restorer.h         |   2 +-
 criu/include/util-vdso.h        |  38 +++----
 criu/include/vdso.h             |  14 ++-
 criu/kerndat.c                  |   4 +
 criu/pie/parasite-vdso.c        | 130 +++++++++++-----------
 criu/pie/restorer.c             |  27 ++++-
 criu/vdso-compat.c              |  15 ++-
 criu/vdso.c                     | 146 ++++++++++++++++++-------
 test/zdtm/static/Makefile       |   1 +
 test/zdtm/static/vdso02.c       | 231 ++++++++++++++++++++++++++++++++++++++++
 15 files changed, 494 insertions(+), 157 deletions(-)
 create mode 100644 test/zdtm/static/vdso02.c

-- 
2.12.2


