[CRIU] [PATCH] restore: Serialize access to last_pid
Cyrill Gorcunov
gorcunov at gmail.com
Sat Nov 16 10:09:55 MSK 2019
On Fri, Nov 15, 2019 at 11:52:50PM +0000, Dmitry Safonov wrote:
> On Fri, 15 Nov 2019 at 14:35, Cyrill Gorcunov <gorcunov at gmail.com> wrote:
> > When we do clone threads in a later stage of
> > restore procedure it may race with helpers
> > which do call clone_noasan by self.
> >
> > In particular when restoring mount points
> > we do use try_remount_writable which is called
> > via call_helper_process and produce own fork.
>
> try_remount_writable() is called from prepare_remaps() which is called only
> in root_prepare_shared(). Could you produce a graph for the race?
No. It is called from open_reg_by_id -> restore_one_task
>
> ===========================================
> THREAD 1 THREAD 2
> ------------------------------------ -------------------------------------
> ? ?
> try_remount_writable() clone_noasan()
>
> Hmm?
>
> [..]
> > To be honest I have no clue right now how to
> > provide some kind of test for this stuff.
>
> Fault-injection?
Nope. One need some additional sleeping/notification functions to reach the
moment where both clone-thread and clone_noasan called.
>
> [..]
> > criu/clone-noasan.c | 63 ++++++++++++++++++++++++++++++++++++-
>
> Please, keep only one function there - the file is compiled without sanitizer
> instrumentation, so it's better to avoid inflating the file with new code.
Well, could move this code to noather place, surely.
>
> [..]
> > @@ -23,9 +80,13 @@ int clone_noasan(int (*fn)(void *), int flags, void *arg)
> > {
> > void *stack_ptr = (void *)round_down((unsigned long)&stack_ptr - 1024, 16);
> > BUG_ON((flags & CLONE_VM) && !(flags & CLONE_VFORK));
> > + int ret;
> > /*
> > * Reserve some bytes for clone() internal needs
> > * and use as stack the address above this area.
> > */
> > - return clone(fn, stack_ptr, flags, arg);
> > + context_lock();
> > + ret = clone(fn, stack_ptr, flags, arg);
> > + context_unlock();
> > + return ret;
> > }
>
> I don't understand - you need to make an atomic section between
> write(last_pid_fd, ...) and clone(), not just during clone().
The helpers clone-calls do not use last pid, insted we pass last_pid
mutex into noasan context and then we have this
clone_noasan pie-code-thread
mutex_lock(last_pid) mutex_lock(last_pid)
clone RUN_CLONE
mutex_unlock(last_pid) mutex_unlock(last_pid)
More information about the CRIU
mailing list