[CRIU] criu check --extra output and dump failure

Dmitry Safonov 0x7f454c46 at gmail.com
Tue Apr 11 04:43:05 PDT 2017


2017-04-11 11:56 GMT+03:00 Brinkmann, Harald
<Harald.Brinkmann at bst-international.com>:
>
> Hi Dmitry,
>
> On Mon, 2017-04-10 at 17:07 +0300, Dmitry Safonov wrote:
>> 2017-04-10 16:59 GMT+03:00 Dmitry Safonov <0x7f454c46 at gmail.com>:
>> > 2017-04-07 16:23 GMT+03:00 Dmitry Safonov <0x7f454c46 at gmail.com>:
>> >> 2017-04-07 16:14 GMT+03:00 Brinkmann, Harald
>> >> <Harald.Brinkmann at bst-international.com>:
>> >>>
>> >>> Hi Dmitry,
>> >>>
>> >>>> could you check it on the last release (which is 2.12.1)?
>> >>>> I assume that arm port was broken during last changes.
>> >>>> We'll try to check and fix them till v3.0 release.
>> >>>
>> >>> Yes, it is 2.12.1.
>> >>>
>> >>> Is there a "last known good" ARM-version I might try to go back to?
>> >>
>> >> Ok, try 2.11, please.
>> >
>> > Hi Harald,
>> >
>> > Have you managed your problems with CRIU on ARM32?
>> > Because I just tried it on RPI2 board with 4.11.0-rc5 kernel
>> > and it works. Yes, there are some problems with tests, etc,
>> > but simple setsid loop C/R normally, as many tests.
>> >
>> > Please, if you still have problems, provide info about your
>> > environment: gcc, ld versions, as this looks to depend on it.
>> >
>> > Will you be able to recompile criu and run some commands?
>>
>> More info about my env, JFI:
>>
>> [criu]# gcc --version
>> gcc (GCC) 6.3.1 20170306
>> Copyright (C) 2016 Free Software Foundation, Inc.
>> This is free software; see the source for copying conditions.  There is NO
>> warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
>> [criu]# ld --version
>> GNU ld (GNU Binutils) 2.28.0.20170322
>> Copyright (C) 2017 Free Software Foundation, Inc.
>> This program is free software; you may redistribute it under the terms of
>> the GNU General Public License version 3 or (at your option) a later version.
>> This program has absolutely no warranty.
>>
>> I've checked criu-dev, master and v2.12.1 tag.
>
> I have been away for a couple of days, so I didn't yet have a chance to
> do further tests.
>
> I am on a custom imx6q-based board. We do full cross-compilation from an
> x86-based Linux system. Our versions are quite old compared to yours:
>
>> platform-imx6/selected_toolchain/arm-v7a-linux-gnueabihf-gcc --version
> arm-v7a-linux-gnueabihf-gcc (OSELAS.Toolchain-2014.12.2) 4.9.2
> Copyright (C) 2014 Free Software Foundation, Inc.
> This is free software; see the source for copying conditions.  There is NO
> warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
>
>> platform-imx6/selected_toolchain/arm-v7a-linux-gnueabihf-ld --version
> GNU ld (GNU Binutils) 2.24
> Copyright 2013 Free Software Foundation, Inc.
> This program is free software; you may redistribute it under the terms of
> the GNU General Public License version 3 or (at your option) a later version.
> This program has absolutely no warranty.
>
> Yes, I can recompile everything and run some commands you tell me to.

Good.

> I had to convert the build process to the framework we use. Maybe
> something went wrong there?

That might be it.
If you made changes that are not specific to your framework but were
needed for cross-compilation, you may post them to the mailing list.

> Is there anything in particular that I need to look out for?

Well, there were issues with relocations in the parasite on arm.
Your issue may be related.

For x86 and ppc64, relocations in the parasite's object are handled this way:
1. link time: parse the ELF relocation table and write the relocations to
   a header file, which is used in the next step;
2. run time: CRIU applies the relocations in the remote task (adding the
   base address where the parasite was loaded).
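
To illustrate the run-time step, here is a minimal Python sketch. It is not
CRIU's actual code or data format: the function name, the list of offsets and
the blob contents are hypothetical, and it assumes each relocation is a
32-bit little-endian word holding a blob-relative address.

```python
import struct

def apply_relocations(blob, reloc_offsets, base):
    """Return a copy of blob with base added to the 32-bit word at each offset.

    Hypothetical helper: reloc_offsets stands in for the table that the
    link-time step would have written to a header file.
    """
    patched = bytearray(blob)
    for off in reloc_offsets:
        (value,) = struct.unpack_from("<I", patched, off)
        # Wrap to 32 bits, as address arithmetic on a 32-bit target would.
        struct.pack_into("<I", patched, off, (value + base) & 0xFFFFFFFF)
    return bytes(patched)

# Made-up blob: one instruction word, then a word holding blob offset 0x10.
blob = struct.pack("<II", 0xE1A00000, 0x10)
patched = apply_relocations(blob, [4], 0x76F00000)
print(hex(struct.unpack_from("<I", patched, 4)[0]))
```

Words that are not listed in the table (the first word here) are left
untouched; only the listed slots get the load address added.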

But for arm32/arm64 we do this in a different, less reliable way:
we expect that the toolchain (ld) will produce a valid PIE, so CRIU
does no relocations at run time.
I've noticed that the `-r' flag to ld on arm doesn't handle relocations
for PIE binaries. That was previously solved in criu/pie/Makefile with
the $(LD_R) thing, and is currently solved with NO_RELOCS in
compel/src/main-host.c on criu-dev.

So, the first thing to do is to use objdump from your toolchain to check
that relocs were handled during linking:
> [criu]# objdump -d criu/pie/parasite.built-in.o
> criu/pie/parasite.built-in.o:     file format elf32-littlearm
> Disassembly of section .crblob:
> 00000000 <__export_parasite_head_start>:
>        0:    e24f2008     sub    r2, pc, #8
>        4:    e28f0018     add    r0, pc, #24
>        8:    e5900000     ldr    r0, [r0]
>        c:    e28f100c     add    r1, pc, #12
>       10:    e5911000     ldr    r1, [r1]
>       14:    e0811002     add    r1, r1, r2
>       18:    eb0009ea     bl    27c8 <parasite_service>
>       1c:    e7f001f0     .word    0xe7f001f0

So, the jump to parasite_service() should look like the one above, not like
>      18:    ebfffffe     bl    27c8 <parasite_service>

Anyway, I think your issue is something different, not unhandled
relocs, because an unhandled relocation leaves the instruction
`ebfffffe', which is `bl .' (a jump to itself).
That would lead to busy-looping on dump, not to a segfault as
in your log.
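
The two encodings can be checked by hand. Below is a small Python sketch
(the function name is mine, not something from CRIU or binutils) that
decodes the branch target of an ARM-mode B/BL instruction: the low 24 bits
are a signed word offset relative to the instruction address plus 8. It
ignores the condition field, so it does not distinguish BLX.

```python
def arm_bl_target(insn, addr):
    """Branch target of an A32 B/BL instruction word located at addr."""
    if (insn >> 25) & 0x7 != 0b101:
        raise ValueError("not a B/BL instruction")
    imm24 = insn & 0x00FFFFFF
    if imm24 & 0x800000:
        # Sign-extend the 24-bit immediate.
        imm24 -= 1 << 24
    # In ARM mode the PC reads as the instruction address + 8.
    return (addr + 8 + (imm24 << 2)) & 0xFFFFFFFF

# Resolved branch from the objdump output above: lands on parasite_service.
print(hex(arm_bl_target(0xEB0009EA, 0x18)))
# Unresolved branch: the target is the instruction itself.
print(hex(arm_bl_target(0xEBFFFFFE, 0x18)))
```

The first call gives 0x27c8 and the second gives 0x18, i.e. the unresolved
form branches to its own address.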

So, to shed some more light on this, could you get the address
of the segmentation fault and the dump log again?
You can enable logging of fatal signals with this:
> echo 1 > /proc/sys/kernel/print-fatal-signals
and get the PC from dmesg afterward.

Then we can check where it is in the parasite blob, if it's not some
kind of null-pointer dereference.
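
The check itself is simple arithmetic. A rough Python sketch, assuming you
know the address where the parasite blob was mapped and its size (the
function name and all numbers below are made up for illustration):

```python
def offset_in_blob(pc, blob_base, blob_size):
    """Return the offset of pc inside the blob, or None if pc is outside it."""
    offset = pc - blob_base
    if 0 <= offset < blob_size:
        return offset
    return None

# Hypothetical values: blob mapped at 0x76f00000, 0x4000 bytes long.
print(offset_in_blob(0x76F027C8, 0x76F00000, 0x4000))
print(offset_in_blob(0x00010000, 0x76F00000, 0x4000))
```

An offset obtained this way can be looked up directly in the
`objdump -d' listing of parasite.built-in.o, since that listing starts
at address 0.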

> Also I don't do the 'make install', because I didn't
> want to add asciidoc and xmlto dependencies on the host just for the
> sake of the documentation.

That's completely ok.

-- 
             Dmitry

