[CRIU] CRIU segfaulting when restoring a process
Dmitry Safonov
dsafonov at virtuozzo.com
Thu Aug 18 09:28:26 PDT 2016
On 08/18/2016 06:58 PM, Dmitry Safonov wrote:
> On 08/18/2016 06:13 PM, Dmitry Safonov wrote:
>> On 08/18/2016 04:44 PM, Nikolay Borisov wrote:
>>> Hello,
>>
>> Hi Nikolay,
>>
>>> I've built CRIU 2.5 from source + some patches which move around stuff
>>> in the headers to facilitate compilation on centos 6.7 with external
>>> glibc 2.19. My CRIU is built the following way:
>>>
>>> make -j8 USERCFLAGS="-I/opt/glibc-2.19/include/ -L/opt/glibc-2.19//lib/
>>> -Wl,-dynamic-linker=/opt/glibc-2.19//lib/ld-linux-x86-64.so.2 -Wl,-
>>> rpath=/opt/glibc-2.19/lib/:/usr/lib64/:/lib64/"
>>>
>>> This way I can happily dump a simple process a la
>>> https://criu.org/Simple_loop style. However my problems begin when I try
>>> to restore the process, since CRIU segfaults. Here is a restore log as
>>> well as strace from the restore process:
>>>
>>> http://sprunge.us/DcIh - restore.log
>>> http://sprunge.us/CVBG - strace.log
>>>
>>> I'd be happy if you could shed some light on what might be causing the
>>> problem. One thing I thought of is a difference between how the process
>>> being restored (bash) and criu are compiled. Here is a comparison of how
>>> their libraries look: http://paste.ubuntu.com/23067392/ (if it matters,
>>> of course).
>>
>> So, the problem seems to be in the restorer blob.
>> Switching to the restorer was successful:
>>> 23537 write(199999, "(00.188706) 23537: task_args:
>> 0x20000\ntask_args->pid: 23537\ntask_args->nr_threads:
>> 1\ntask_args->clo"..., 155) = 155
>>> 23537 getpid() = 23537
>>
>> which is sys_getpid() in __export_restore_task().
>> The fault address is a very strange one:
>> 23537 --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR,
>> si_addr=0x12024b48d48} ---
>>
>> So, the failure is somewhere between the getpid() and sigaction() calls
>> in __export_restore_task() (criu/pie/restorer.c), since we would see
>> sys_sigaction() in the log if it had been called.
>>
>> Could you give the following diff a shot and paste the strace output
>> from before the failure?
>>
>> --->8--->8--->8---8<---8<---8<---
>> diff --git a/criu/pie/restorer.c b/criu/pie/restorer.c
>> index 7cc735c96870..8503078a82d9 100644
>> --- a/criu/pie/restorer.c
>> +++ b/criu/pie/restorer.c
>> @@ -1123,6 +1123,7 @@ long __export_restore_task(struct
>> task_restore_args *args)
>> n_helpers = args->helpers_n;
>> zombies = args->zombies;
>> n_zombies = args->zombies_n;
>> + sys_getpid();
>> *args->breakpoint = rst_sigreturn;
>>
>> ksigfillset(&act.rt_sa_mask);
>>
>> --->8--->8--->8---8<---8<---8<---
>>
>> I suspect the problem is that the *args pointer is garbage for some
>> reason -- if we find another getpid() call in the strace log, that's not
>> the reason and the fault is somewhere in ksigfillset() (which is unlikely).
>>
>> And let me think for a while about why *args may have such strange junk
>> inside (0x12024b48d48).
>>
>
> Hmm, another idea: what may be wrong is the stack pointer.
> We're right after entering the restorer and the compiler hasn't yet
> accessed the stack even to form a stack frame.
> So, if the stack frame is corrupted, we can't save the result of getpid()
> and we get the same result.
> It would be worth providing the register state at the moment of the
> segfault. To do this, run:
> $ ulimit -c unlimited
> which allows core dump files to be saved,
> $ strace criu restore -vvvv #... the usual args
> $ gdb core.<pid>
> to open the saved core file (in the same directory) with gdb, then
> (gdb) info registers
> to print the register state at the moment of the crash.
>
So, my toolchain produces a stack frame on restorer entry, and I think
yours should too (there is nothing special, just a function frame):
0000000000000fa0 <__export_restore_thread>:
/*
* Threads restoration via sigreturn. Note it's locked
* routine and calls for unlock at the end.
*/
long __export_restore_thread(struct thread_restore_args *args)
{
fa0: 41 55 push %r13
fa2: 41 54 push %r12
fa4: 55 push %rbp
fa5: 53 push %rbx
fa6: 48 89 fb mov %rdi,%rbx
fa9: 48 83 ec 18 sub $0x18,%rsp
So I think the second theory is just BS.
Anyway, the register state may be helpful.
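
In case it helps when reading the strace output after applying the diff
above, here is a minimal sketch of two shell helpers. They are hypothetical
(not part of CRIU) and assume a log in the "PID syscall(...) = ret" format
shown earlier, with each record on a single line. The first counts the
getpid() calls a given pid made between the "task_args:" write and the
SIGSEGV; the second prints the faulting address.

```shell
# Hypothetical helpers for an strace log with a leading PID column,
# e.g. "23537 getpid() = 23537". Not part of CRIU.

# Count getpid() calls made by pid $2 in log $1 between the restorer's
# "task_args:" write and the first SIGSEGV delivered to that pid.
getpid_calls_before_segv() {
    awk -v pid="$2" '
        $1 == pid && /task_args:/  { n = 0; next }   # restorer entered
        $1 == pid && /--- SIGSEGV/ { exit }          # stop at the fault
        $1 == pid && $2 ~ /^getpid\(/ { n++ }
        END { print n + 0 }
    ' "$1"
}

# Print the si_addr value of the first signal record in log $1.
segv_addr() {
    sed -n 's/.*si_addr=\(0x[0-9a-fA-F]*\).*/\1/p' "$1" | head -n 1
}
```

With the patched restorer, "getpid_calls_before_segv strace.log 23537"
printing 1 would mean the crash happened before the added sys_getpid(),
while 2 would mean the args fields were read fine and the fault is later.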
--
Dmitry