[CRIU] [PATCH] restore: Serialize access to last_pid

Sat Nov 16 21:10:14 MSK 2019

On Fri, Nov 15, 2019 at 05:33:34PM +0300, Cyrill Gorcunov wrote:
> When we do clone threads in a later stage of
> restore procedure it may race with helpers
> which do call clone_noasan by self.
>
> In particular when restoring mount points
> we do use try_remount_writable which is called
> via call_helper_process and produce own fork.
>
> Thus to eliminate the race which happens that
> later lets allocate clone_noasan context via
> shared memory on restore and then once
> processes are forked we pass @last_pid_mutex
> to the clone_noasan engine and all subsequent
> calls to clone_noasan will be serialized with
> threads creation.
>
> To be honest I have no clue right now how to
> provide some kind of test for this stuff. Our
> QA team reported the issue as
>
>  | (07.398000) pie: 20768: Error (criu/pie/restorer.c:1823): Unable to create a thread 20770: 20775
>  | (07.398013) pie: 20768: Error (criu/pie/restorer.c:1828): Reread last_pid 20775
>
> where we wrote 20769 into last_pid inside pie thread
> restore , it failed and re-reading the last_pid returned
> 20775 instead of expecting value, which means there is
> other forks happen inbetween without taking last_pid
> lock.
>
> Also I'm in big doubt this architectural solution, don't
> like it much, it was rather a fast fix which address
> the problem. So any other ideas are welcome.
>
> Signed-off-by: Cyrill Gorcunov <gorcunov at gmail.com>
> ---
>  criu/clone-noasan.c         | 63 ++++++++++++++++++++++++++++++++++++-
>  criu/cr-restore.c           |  6 ++++
>  criu/include/clone-noasan.h |  5 +++
>  3 files changed, 73 insertions(+), 1 deletion(-)
>
> diff --git a/criu/clone-noasan.c b/criu/clone-noasan.c
> index 5ca280eb8386..627d924d5346 100644
> --- a/criu/clone-noasan.c
> +++ b/criu/clone-noasan.c
> @@ -1,8 +1,65 @@
>  #include <sched.h>
> +#include <sys/mman.h>
>  #include "common/compiler.h"
> +#include "clone-noasan.h"
>  #include "log.h"
>  #include "common/bug.h"
>
> +#undef LOG_PREFIX
> +#define LOG_PREFIX "clone_noasan: "
> +
> +static struct {
> +	mutex_t		op_mutex;
> +	mutex_t		*clone_mutex;
> +} *context;
> +
> +int clone_noasan_init(void)
> +{
> +	context = mmap(NULL, sizeof(*context), PROT_READ | PROT_WRITE,
> +		       MAP_ANONYMOUS | MAP_SHARED, -1, 0);
> +	if (context == MAP_FAILED) {
> +		pr_perror("Can't allocate context");
> +		return -1;
> +	}
> +
> +	mutex_init(&context->op_mutex);
> +	return 0;
> +}
> +
> +void clone_noasan_fini(void)
> +{
> +	munmap(context, sizeof(*context));
> +	context = NULL;
> +}
> +
> +static inline void context_lock(void)
> +{
> +	if (context && context->clone_mutex)
> +		mutex_lock(context->clone_mutex);
> +}
> +
> +static inline void context_unlock(void)
> +{

context_lock has to return whether a mutex has been locked or not:

locked = context_lock();
....
context_unlock(locked)l

otherwise you can call mutex_unlock for the unlocked mutex or unlock a
mutex which has been locked by someone else.

Actually, all this looks like overdesign...

I think we don't need context and another shared vma.

static struct *last_pid_mutex;

void clone_noasan_init(struct *last_pid_mu) {
	last_pid_mutex = last_pid_mu;
}

and then call clone_noasan_init before forking the root task. Hmm?