[CRIU] CRIU segfaulting when restoring a process

Fri Aug 19 03:25:36 PDT 2016

On 08/19/2016 01:03 PM, Dmitry Safonov wrote:
> On 08/19/2016 11:16 AM, Nikolay Borisov wrote:
>>
>>
>> On 08/18/2016 06:13 PM, Dmitry Safonov wrote:
>>> On 08/18/2016 04:44 PM, Nikolay Borisov wrote:
>>>> Hello,
>>>
>>> Hi Nikolay,
>>>
>>>> I've built CRIU 2.5 from source + some patches which move around stuff
>>>> in the headers to facilitate compilation on centos 6.7 with external
>>>> glibc 2.19. My CRIU is built the following way:
>>>>
>>>> make -j8 USERCFLAGS="-I/opt/glibc-2.19/include/ -L/opt/glibc-2.19//lib/
>>>> -Wl,-dynamic-linker=/opt/glibc-2.19//lib/ld-linux-x86-64.so.2
>>>> -Wl,-rpath=/opt/glibc-2.19/lib/:/usr/lib64/:/lib64/"
>>>>
>>>> This way I can happily dump a simple process a la
>>>> https://criu.org/Simple_loop style. However, my problems begin when I
>>>> try to restore the process, since CRIU segfaults. Here is a restore
>>>> log as well as an strace from the restore process:
>>>>
>>>> http://sprunge.us/DcIh - restore.log
>>>> http://sprunge.us/CVBG - strace.log
>>>>
>>>> I'd be happy if you could shed some light on what might be causing the
>>>> problem. One thing I thought of is a difference between the way the
>>>> process being restored (bash) and criu are compiled. Here is a
>>>> comparison of how their libraries look, in case it matters:
>>>> http://paste.ubuntu.com/23067392/
>>>
>>> So, the problem seems to be in the restorer blob:
>>> Switching to the restorer was successful:
>>>> 23537 write(199999, "(00.188706)  23537: task_args:
>>> 0x20000\ntask_args->pid: 23537\ntask_args->nr_threads:
>>> 1\ntask_args->clo"..., 155) = 155
>>>> 23537 getpid()                          = 23537
>>>
>>> which is sys_getpid() in __export_restore_task().
>>> The fault address is a very strange one:
>>> 23537 --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR,
>>> si_addr=0x12024b48d48} ---
>>>
>>> So, the failure is somewhere between the getpid() and sigaction() calls
>>> in __export_restore_task() (criu/pie/restorer.c), as we would see
>>> sys_sigaction() if it had been called.
>>>
>>> Could you give the following diff a shot and paste the strace output
>>> from before the failure?
>>>
>>> --->8--->8--->8---8<---8<---8<---
>>> diff --git a/criu/pie/restorer.c b/criu/pie/restorer.c
>>> index 7cc735c96870..8503078a82d9 100644
>>> --- a/criu/pie/restorer.c
>>> +++ b/criu/pie/restorer.c
>>> @@ -1123,6 +1123,7 @@ long __export_restore_task(struct
>>> task_restore_args *args)
>>>      n_helpers = args->helpers_n;
>>>      zombies = args->zombies;
>>>      n_zombies = args->zombies_n;
>>> +    sys_getpid();
>>>      *args->breakpoint = rst_sigreturn;
>>>
>>>      ksigfillset(&act.rt_sa_mask);
>>>
>>> --->8--->8--->8---8<---8<---8<---
>>>
>>> I suspect that the problem is the *args pointer being garbage for some
>>> reason -- if we find another getpid() call in the strace log, that's not
>>> the reason and the fault is somewhere in ksigfillset() (which is
>>> unlikely).
>>>
>>> And let me think a while, why *args may have such strange junk inside
>>> (0x12024b48d48).
>>>
>>
>>
>> So here is an strace with your patch applied, it looks a bit different
>> indeed - http://sprunge.us/HBEV
> 
> Hmm, I don't see the second call to getpid(), but *args, which is in
> %rdi, looks quite normal (0x20000).
> 
>> I checked the disassembly my compiler produces for the
>> __export_restore_task and it indeed has a prologue, setting up the
>> stack. So that looks good indeed.
>>
>> Also, here is the register state at the time the crash occurs with your
>> patch applied:
>>
>> (gdb) info register
>> rax            0x23000    143360
>> rbx            0x20000    131072
>> rcx            0x12dad    77229
>> rdx            0x48000042b0058948    5188147057151805768
>> rsi            0x6fc8a0    7325856
>> rdi            0x20000    131072
>> rbp            0x5b16    0x5b16
>> rsp            0x1eec0    0x1eec0
>> r8             0x1    1
>> r9             0x1    1
>> r10            0x7fffbf9c4d70    140736408079728
>> r11            0x206    518
>> r12            0x1f070    127088
>> r13            0x703e20    7355936
>> r14            0x203c0    132032
>> r15            0x7fffffffde60    140737488346720
>> rip            0x10b27    0x10b27
>> eflags         0x10206    [ PF IF RF ]
>> cs             0x33    51
>> ss             0x2b    43
>> ds             0x0    0
>> es             0x0    0
>> fs             0x0    0
>> gs             0x0    0
> 
> So %rdi is fine, %rsp also; AFAICS, everything looks just fine.
> Could you provide the disassembly of __export_restore_task -- up to
> address 0xb27 for this binary?
> Like: $ objdump -dS criu/pie/restorer.built-in.o
> I believe it's a load from *args, but...

Does this help:  http://paste.ubuntu.com/23069854/ ?

> 
>> My theory here is that since CRIU is compiled with a non-standard glibc,
>> it is started by glibc 2.19's interpreter (ld-linux-x86-64.so.2), and
>> when it's time to restore the process, which is bash, it starts
>> executing it with the new interpreter. My bash doesn't work with that
>> interpreter and segfaults. Does this sound possible?
> 
> Well, it's a good theory, but it doesn't apply here, as the application
> wasn't fully restored. Your restoree process has not yet gained
> control and wasn't resumed. The failure is in the restore process
> itself, not in its result.
> So, I think there shouldn't be anything special about using an external
> glibc -- everything should C/R as normal.

One way to test this would be to create a simple C-based application,
link it against glibc 2.19, and then try to C/R that one. That would
settle whether glibc is the culprit.

> 
> The only concern is the restorer binary size -- for me the address is:
> 0000000000001520 <__export_restore_task>
> 
> And the size at run-time is:
> (00.035761)  29972: Found bootstrap VMA hint at: 0x10000 (needs ~104K)
> 
> While yours is one page smaller -- but that's likely fine.

Regarding your VM question - this is a simple CentOS 6.7 installation,
running on kernel 4.4 (criu check says that everything looks good), and
I've built criu with the following build script:
http://paste.ubuntu.com/23069868/