[CRIU] CRIU segfaulting when restoring a process

Dmitry Safonov dsafonov at virtuozzo.com
Thu Aug 18 08:13:52 PDT 2016


On 08/18/2016 04:44 PM, Nikolay Borisov wrote:
> Hello,

Hi Nikolay,

> I've built CRIU 2.5 from source + some patches which move around stuff
> in the headers to facilitate compilation on centos 6.7 with external
> glibc 2.19. My CRIU is built the following way:
>
> make -j8 USERCFLAGS="-I/opt/glibc-2.19/include/ -L/opt/glibc-2.19//lib/
> -Wl,-dynamic-linker=/opt/glibc-2.19//lib/ld-linux-x86-64.so.2
> -Wl,-rpath=/opt/glibc-2.19/lib/:/usr/lib64/:/lib64/"
>
> This way I can happily dump a simple process a la
> https://criu.org/Simple_loop style. However my problems begin when I try
> to restore the process, since CRIU segfaults. Here is a restore log as
> well as strace from the restore process:
>
> http://sprunge.us/DcIh - restore.log
> http://sprunge.us/CVBG - strace.log
>
> I'd be happy if you could shed some light on what might be causing the
> problem. One thing I thought of is a difference between the way the
> process being restored (bash) and criu are compiled. Here is a comparison
> of how their libraries look: http://paste.ubuntu.com/23067392/ in case
> it matters.

So, the problem seems to be in the restorer blob.
Switching to the restorer was successful:
 > 23537 write(199999, "(00.188706)  23537: task_args: 0x20000\ntask_args->pid: 23537\ntask_args->nr_threads: 1\ntask_args->clo"..., 155) = 155
 > 23537 getpid()                          = 23537

which is sys_getpid() in __export_restore_task().
The fault address is a very strange one:
23537 --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x12024b48d48} ---

So, the failure is somewhere between the getpid() and sigaction() calls
in __export_restore_task() (criu/pie/restorer.c), as we would see
sys_sigaction() in the strace log if it had been called.
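
For reference, here is a minimal sketch of how the two fields in that
strace line can be read. It is not CRIU code: demo_segv_reason() and
demo_is_user_address() are hypothetical helpers, and the 2^47 limit
assumes x86-64 with 4-level paging. SEGV_MAPERR means that nothing is
mapped at the faulting address, and 0x12024b48d48 does lie in the user
half of the address space, so it looks like a wild pointer rather than a
permission problem on an existing mapping.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L	/* for SEGV_MAPERR / SEGV_ACCERR */
#endif
#include <signal.h>
#include <stdint.h>

/* Assumption: x86-64 with 4-level paging, user space ends below 2^47. */
#define DEMO_USER_SPACE_END (UINT64_C(1) << 47)

/* Hypothetical helper: describe the si_code of a SIGSEGV. */
static const char *demo_segv_reason(int si_code)
{
	switch (si_code) {
	case SEGV_MAPERR:
		return "address not mapped to object";
	case SEGV_ACCERR:
		return "invalid permissions for mapped object";
	default:
		return "other SIGSEGV cause";
	}
}

/* Hypothetical helper: is addr inside the assumed user address range? */
static int demo_is_user_address(uint64_t addr)
{
	return addr < DEMO_USER_SPACE_END;
}
```

Calling demo_is_user_address() with the si_addr from the log and
demo_segv_reason() with SEGV_MAPERR gives the reading described above.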

Could you give it a shot with the following diff applied and paste the
strace output leading up to the failure?

--->8--->8--->8---8<---8<---8<---
diff --git a/criu/pie/restorer.c b/criu/pie/restorer.c
index 7cc735c96870..8503078a82d9 100644
--- a/criu/pie/restorer.c
+++ b/criu/pie/restorer.c
@@ -1123,6 +1123,7 @@ long __export_restore_task(struct task_restore_args *args)
  	n_helpers = args->helpers_n;
  	zombies = args->zombies;
  	n_zombies = args->zombies_n;
+	sys_getpid();
  	*args->breakpoint = rst_sigreturn;

  	ksigfillset(&act.rt_sa_mask);

--->8--->8--->8---8<---8<---8<---

I suspect the problem is that the *args pointer is garbage for some
reason. If we find another getpid() call in the strace log, that's not
the reason and the fault is somewhere in ksigfillset() (which is unlikely).
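
The idea of using a harmless syscall as a progress marker can be
sketched outside of CRIU as follows. This is only an illustration under
assumptions: struct demo_args, trace_marker() and demo_restore() are
hypothetical names, and libc's syscall() stands in for the sys_getpid()
wrapper used in the diff. Every marker that shows up in the strace log
proves that the statements before it ran without faulting.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE		/* for syscall() on glibc */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical stand-in for the restorer arguments; not CRIU's layout. */
struct demo_args {
	long *breakpoint;
	int helpers_n;
};

/* Cheap syscall without side effects; strace shows it as getpid(). */
static long trace_marker(void)
{
	return syscall(SYS_getpid);
}

/* Hypothetical function with markers placed around the suspect store. */
static int demo_restore(struct demo_args *args)
{
	int n_helpers;

	assert(args != NULL);
	n_helpers = args->helpers_n;	/* read through args */
	trace_marker();			/* in strace: the read above survived */
	*args->breakpoint = 1;		/* store through a pointer kept in args */
	trace_marker();			/* in strace: the store survived too */
	return n_helpers;
}
```

Running such a program under strace and counting the getpid() lines
before the SIGSEGV narrows the fault down to the statements between two
markers.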

And let me think for a while about why *args may contain such strange
junk (0x12024b48d48).

-- 
              Dmitry

