[CRIU] Restarting ERESTART_RESTARTBLOCK
Andrei Vagin
avagin at virtuozzo.com
Tue Feb 6 21:10:35 MSK 2018
On Mon, Feb 05, 2018 at 02:04:59AM -0500, Rayson Ho wrote:
> I'm trying to debug a GO lang restart deadlock in the GO runtime, and
> I was narrowing down to the handling of ERESTART_RESTARTBLOCK.
Could you give the steps to reproduce the problem?
>
> According to the "Understanding the Linux Kernel" book:
>
> ============================================================================
> ... restart_block field in the current's thread_info structure with
> the address of a special service routine to be used when restarting,
> and returns -ERESTART_RESTARTBLOCK if interrupted. The
> sys_restart_syscall( ) service routine just executes the special
> nanosleep( )'s service routine, which adjusts the delay to consider
> the time elapsed between the invocation of the original system call
> and its restarting.
> ============================================================================
>
> My question is, it does not seem like CRIU dumps the whole thread_info
> structure in dump_thread_common(), so would the ERESTART_RESTARTBLOCK
> mechanism still be correct during restore?
Unfortunately, the restart block is stored inside the kernel and can't be
dumped or restored from user space.
We ran into this problem before and decided to return EINTR instead
of ERESTART_RESTARTBLOCK.
commit dd71cca58ada1b1b930c17201373cb8de569c97f
Author: Oleg Nesterov <oleg at redhat.com>
Date:   Thu Mar 26 14:10:32 2015 +0300

    dump/x86: sanitize the ERESTART_RESTARTBLOCK -> EINTR transition

    1. The -ERESTART_RESTARTBLOCK case in get_task_regs() depends on kernel
       internals too much, and for no reason. We shouldn't rely on fact that
       a) we are going to do sigreturn() and b) restore_sigcontext() always
       sets restart_block->fn = do_no_restart_syscall which returns -EINTR.

       Just change this code to enforce -EINTR after restore, this is what
       we actually want until we teach criu to handle ERESTART_RESTARTBLOCK.

    2. Add pr_warn() to make the potential bug-reports more understandable,
       a sane application should handle -EINTR correctly but this is not
       always the case.

    Signed-off-by: Oleg Nesterov <oleg at redhat.com>
    Acked-by: Cyrill Gorcunov <gorcunov at openvz.org>
    Acked-by: Andrew Vagin <avagin at parallels.com>
    Signed-off-by: Pavel Emelyanov <xemul at parallels.com>
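
To illustrate what "handle -EINTR correctly" means on the application side,
here is a minimal sketch of a sleep that survives being interrupted. It is
not CRIU code, and the helper name sleep_full() is made up for this example.
It assumes the application calls nanosleep() directly and just wants the
whole interval to elapse.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <time.h>

/*
 * Hypothetical helper (not part of CRIU): sleep for the whole requested
 * interval, retrying when nanosleep() fails with EINTR. After a restore
 * the interrupted call returns -EINTR instead of being restarted by the
 * kernel, so the application has to resume the sleep itself.
 */
static int sleep_full(struct timespec req)
{
	struct timespec rem;

	while (nanosleep(&req, &rem) == -1) {
		if (errno != EINTR)
			return -1;	/* real error, e.g. EINVAL */
		req = rem;		/* continue with the remaining time */
	}
	return 0;
}
```

An application that treats EINTR from a sleep or wait as a fatal error, or
ignores the remaining time, can misbehave after restore, which is what the
pr_warn() in the commit is meant to make visible.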
>
> Thanks,
> Rayson
>
> ==================================================
> Open Grid Scheduler - The Official Open Source Grid Engine
> http://gridscheduler.sourceforge.net/
> http://gridscheduler.sourceforge.net/GridEngine/GridEngineCloud.html