[CRIU] ARM: SIGSEGV in parasite code
Alexander Kartashov
alekskartashov at parallels.com
Wed Jun 12 05:50:00 EDT 2013
Hi Chanho,
On 06/12/2013 06:15 AM, Chanho Park wrote:
>> I ran the script dump_test.sh 1000 times and I failed to reproduce the
>> problem.
> Hmm. I think this weird problem occurs only for me.
> Did you also use my criu tool?
I tried both the crtools binary you provided and one compiled from the
repository head.
> If so, can you tell me your environment?
I use a debootstrapped Linux environment running in the ARM Versatile
Express for Cortex-A9 QEMU model, compiled from the repository head.
I used to catch spurious segfaults with previous versions of QEMU,
but I haven't caught any of them recently.
>
>> Could you please reproduce the problem with the attached patch applied?
>> This patch aborts the dumper if the SIGSEGV is intercepted so we may
>> catch the SIGSEGV in the coredump.
> Yeah. I've attached the coredump and log after applying your patch.
> It might also have been generated in the parasite code.
> I also attached a log of a failed socket connection between the tool and pie.
> That run didn't raise a SIGSEGV signal,
> but dumping failed and the infected program was also killed.
> IMHO the parasite code might have some weird problems.
> Can you share your criu tool compiled in static mode?
Thank you for this dump. I suspect a kernel stack corruption.
Please verify my findings.
(gdb) i r
r0             0xffffffff       4294967295
r1             0xb6fa749c       3069867164
r2             0x0              0
r3             0x10             16
r4             0x0              0
r5             0x0              0
r6             0x0              0
r7             0x0              0
r8             0x0              0
r9             0x0              0
r10            0x0              0
r11            0x0              0
r12            0xb6fa9060       3069874272
sp             0xb6fa9088       0xb6fa9088
lr             0xb6fa46bc       -1225111876
pc             0x0              0
cpsr           0x60000010       1610612752
The suspicious thing is that the PC register is zero. Let's
analyze the code pointed to by the LR register:
(gdb) x /3i $lr - 8
0xb6fa46b4: mov r2, #0
0xb6fa46b8: bl 0xb6fa4b20
0xb6fa46bc: cmp r0, #0
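As a rough cross-check of the disassembly above (a sketch, not part of the original gdb session): the CPSR value from the register dump can be decoded by hand. The bit positions below follow the ARMv7-A CPSR layout. The helper name decode_cpsr is made up for this illustration. If the T bit is clear, the core was in ARM state, where every instruction is 4 bytes, so the call site should be LR - 4.

```python
# Values copied from the "i r" output in this message.
CPSR = 0x60000010
LR = 0xb6fa46bc

# Processor mode encodings from the low five bits of the CPSR.
MODES = {
    0b10000: "usr", 0b10001: "fiq", 0b10010: "irq", 0b10011: "svc",
    0b10111: "abt", 0b11011: "und", 0b11111: "sys",
}

def decode_cpsr(value):
    """Split a CPSR value into condition flags, Thumb bit, and mode name.

    Hypothetical helper written for this example only.
    """
    return {
        "N": (value >> 31) & 1,
        "Z": (value >> 30) & 1,
        "C": (value >> 29) & 1,
        "V": (value >> 28) & 1,
        "T": (value >> 5) & 1,
        "mode": MODES.get(value & 0x1f, "unknown"),
    }

fields = decode_cpsr(CPSR)
print(fields)

# In ARM state (T == 0) a "bl" is 4 bytes long and leaves LR pointing
# at the instruction that follows it, so the call itself sits at LR - 4.
call_site = LR - 4
print(hex(call_site))
```

Under these assumptions the dump shows user mode in ARM state, and LR - 4 lands on the bl at 0xb6fa46b8, which agrees with the listing.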
Analyzing the code at 0xb6fa4b20:
(gdb) x /10i 0xb6fa4b20
0xb6fa4b20: push {r7}
0xb6fa4b24: movw r7, #297 ; 0x129
0xb6fa4b28: svc 0x00000000
0xb6fa4b2c: pop {r7}
0xb6fa4b30: bx lr
This is the parasite wrapper for the recvmsg() syscall.
It seems the kernel restored the PC register incorrectly.
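The identification of the wrapper can be double-checked from the number loaded into r7. A minimal sketch, assuming the ARM EABI convention (syscall number in r7, arguments starting at r0, result returned in r0). The table is only a small subset of the ARM syscall numbers, and syscall_name is a hypothetical helper for this example.

```python
# Small subset of the ARM EABI syscall numbers (socket-related calls only).
ARM_EABI_SYSCALLS = {
    281: "socket",
    283: "connect",
    296: "sendmsg",
    297: "recvmsg",
}

def syscall_name(r7):
    """Map the value placed in r7 before "svc 0" to a syscall name."""
    return ARM_EABI_SYSCALLS.get(r7, "unknown (%d)" % r7)

# "movw r7, #297 ; 0x129" from the wrapper at 0xb6fa4b20.
name = syscall_name(0x129)
print(name)

# recvmsg(sockfd, msg, flags) takes its arguments in r0, r1, r2.
# The caller executes "mov r2, #0" just before the bl, i.e. flags = 0,
# and the dump shows r2 = 0 and r1 holding a pointer-like value.
args = {"r0": "sockfd", "r1": "struct msghdr *", "r2": "flags"}
print(args)
```

This only confirms which syscall the wrapper issues. It says nothing about why PC ended up as zero after the call.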
By the way, I'm unable to explain the following mysteries:
* The value of register R1 is surely a struct msghdr*,
but the other registers are clobbered. The most suspicious
register is R0, which contains 0xffffffff; that is probably
the result of our mishandling of ARM_ORIG_r0.
* The following line in the second log:
pie: __sent ack msg: -1225390624 -1225390624 0
is totally incomprehensible.
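A small aid for reading the numbers above, offered as a sketch only: gdb and the pie log print some 32-bit values as signed decimals, which hides what they look like in hex. The helpers to_u32 and to_s32 are hypothetical names for this example. The conversion merely re-expresses the values and does not explain why the log contains them.

```python
def to_u32(n):
    """Reinterpret an integer as an unsigned 32-bit value."""
    return n & 0xffffffff

def to_s32(n):
    """Reinterpret an integer as a signed 32-bit value."""
    n &= 0xffffffff
    return n - (1 << 32) if n & 0x80000000 else n

# R0 from the register dump, read as a signed value.
r0_signed = to_s32(0xffffffff)
print(r0_signed)

# LR from the register dump: gdb shows it as hex and as a signed decimal.
lr_signed = to_s32(0xb6fa46bc)
print(lr_signed)

# The value that appears twice in the "__sent ack msg" log line.
ack_hex = hex(to_u32(-1225390624))
print(ack_hex)
```

Read this way, R0 is -1, and the logged value is 0xb6f605e0 when viewed as unsigned. Whether that number is actually a pointer is an assumption that the log alone does not settle.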
Could you please apply the attached patch and report
whether it helps with the spurious segfaults?
--
Sincerely yours,
Alexander Kartashov
Intern
Core team
www.parallels.com
Skype: aleksandr.kartashov
Email: alekskartashov at parallels.com
-------------- next part --------------
A non-text attachment was scrubbed...
Name: remove-arm-orig-r0.patch
Type: text/x-patch
Size: 723 bytes
Desc: not available
URL: <http://lists.openvz.org/pipermail/criu/attachments/20130612/6a85ffe3/attachment.bin>