[CRIU] [PATCH] s390: Prevent GOT relocations

Thu Jul 20 21:47:00 MSK 2017

On Wed, 19 Jul 2017 17:36:26 +0200
Adrian Reber <areber at redhat.com> wrote:

> On Wed, Jul 19, 2017 at 11:00:15AM +0200, Michael Holzheu wrote:
> > > Thanks for helping to figure this out. It was not only sys_socket().
> > > Basically all syscalls from
> > > 
> > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=977108f89c989b1eeb5c8d938e1e71913391eb5f
> > > 
> > > which were present in
> > > compel/arch/s390/plugins/std/syscalls/syscall-s390.tbl had to be added
> > > to the kernel. Now zdtm seems to be quite happy:
> > 
> > That's good news.
> >  
> > > $ ./zdtm.py run -a -f h --keep-going -x zdtm/static/s390x_mmap_high
> > > [...]
> > > 
> > > ################### 1 TEST(S) FAILED (TOTAL 315/SKIPPED 115) ###################
> > >  * zdtm/static/stopped(h)
> > > ##################################### FAIL #####################################
> > > 
> > > I have to exclude zdtm/static/s390x_mmap_high as that test case just
> > > hangs. zdtm says ==== ALARM ==== and then prints a process list.
> > 
> > I assume you have already applied the kernel patch
> > ee71d16d22 ("s390/mm: make TASK_SIZE independent from the number
> > of page table levels").
> > 
> > But it should not hang without the patch either - so we have to look into
> > this issue.
> 
> Just saw an oops when running s390x_mmap_high:
> 
> [ 3279.609740] kernel BUG at mm/rmap.c:1144!
> [ 3279.609777] illegal operation: 0001 [#1] SMP
> [ 3279.609781] Modules linked in: binfmt_misc vmur ip_tables xfs libcrc32c dasd_fba_mod qeth_l2 dasd_eckd_mod dasd_mod lcs qeth ctcm qdio fsm ccwgroup dm_mirror dm_region_hash dm_log dm_mod
> [ 3279.609801] CPU: 1 PID: 1581 Comm: s390x_mmap_high Not tainted 3.10.0-693.el7.criu.s390x #1
> [ 3279.609804] task: 000000007f3a6348 ti: 000000007372c000 task.ti: 000000007372c000
> [ 3279.609807] Krnl PSW : 0704c00180000000 000000000029ba22 (__page_set_anon_rmap+0x92/0xa8)
> [ 3279.609818]            R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 EA:3
> Krnl GPRS: 00000000fffffffb 0000000000000000 000003d101dcd600 000000007db285b0
> [ 3279.609826]            001ffffffffff000 0000000000000000 0000000000000000 0000000000000000
> [ 3279.609829]            000000007db285b0 0000000077358215 00000000017b1c00 000003d101dcd600
> [ 3279.609833]            000000007c883fe8 000003d101dcd600 000000007372fdb0 000000007372fd90
> [ 3279.609848] Krnl Code: 000000000029ba14: e31010000004        lg      %r1,0(%r1)
>            000000000029ba1a: a7f4ffe4           brc     15,29b9e2
>           #000000000029ba1e: a7f40001           brc     15,29ba20
>           >000000000029ba22: b9040023           lgr     %r2,%r3
>            000000000029ba26: b9040034           lgr     %r3,%r4
>            000000000029ba2a: c0e500008d53       brasl   %r14,2ad4d0
>            000000000029ba30: a7f4ffeb           brc     15,29ba06
>            000000000029ba34: 0707               bcr     0,%r7
> [ 3279.609859] Call Trace:
> [ 3279.609860] ([<00000000017b1c00>] 0x17b1c00)
> [ 3279.609862]  [<0000000000291d64>] handle_mm_fault+0xb3c/0x1050
> [ 3279.609864]  [<00000000006c8060>] do_dat_exception+0x1f8/0x348
> [ 3279.609869]  [<00000000006c619e>] pgm_check_handler+0x16e/0x172
> [ 3279.609872]  [<0000000080001ffa>] 0x80001ffa
> [ 3279.609873] Last Breaking-Event-Address:
> [ 3279.609874]  [<000000000029ba1e>] __page_set_anon_rmap+0x8e/0xa8
> [ 3279.609876]
> [ 3279.609879] Kernel panic - not syncing: Fatal exception: panic_on_oops
> 
> This means that I probably did not backport the mentioned patch correctly ;-)

Yes ... and at least the test case was good enough to find this out ;-)

Michael