[CRIU] Re: [PATCH] pidns: remove recursion from free_pid_ns (v3)

Andrew Morton akpm at linux-foundation.org
Tue Oct 9 14:48:21 EDT 2012


On Sat,  6 Oct 2012 23:56:33 +0400
Andrew Vagin <avagin at openvz.org> wrote:

> Here is a stack trace of recursion:
> free_pid_ns(parent)
>   put_pid_ns(parent)
>     kref_put(&ns->kref, free_pid_ns);
>       free_pid_ns
> 
> This patch turns recursion into loops.
> 
> pidns can be nested many times, so with the recursive version
> a simple user space program can provoke a kernel panic
> by overflowing the kernel stack.

So we should backport this into earlier kernels.

> --- a/include/linux/kref.h
> +++ b/include/linux/kref.h
> @@ -95,6 +95,18 @@ static inline int kref_put(struct kref *kref, void (*release)(struct kref *kref)
>  	return kref_sub(kref, 1, release);
>  }
>  
> +/**
> + * __kref_put - decrement refcount for object without calling a release function.
> + * @kref: object.
> + *
> + * Decrement the refcount.
> + * Return 1 if the refcount dropped to zero, 0 otherwise.
> + */
> +static inline int __kref_put(struct kref *kref)
> +{
> +	return atomic_dec_and_test(&kref->refcount);
> +}

Greg might be interested in this.

It's a pretty specialised thing and perhaps it needs some stern words
in the description explaining when and why it should and shouldn't be
used.

I wonder if people might (ab)use this to avoid the "doesn't
have a release function" warning.

>  static inline int kref_put_mutex(struct kref *kref,
>  				 void (*release)(struct kref *kref),
>  				 struct mutex *lock)
> diff --git a/kernel/pid_namespace.c b/kernel/pid_namespace.c
> index 6144bab..b051fa6 100644
> --- a/kernel/pid_namespace.c
> +++ b/kernel/pid_namespace.c
> @@ -138,11 +138,19 @@ void free_pid_ns(struct kref *kref)
>  
>  	ns = container_of(kref, struct pid_namespace, kref);
>  
> -	parent = ns->parent;
> -	destroy_pid_namespace(ns);
> +	while (1) {
> +		parent = ns->parent;
> +		destroy_pid_namespace(ns);
>  
> -	if (parent != NULL)
> -		put_pid_ns(parent);
> +		if (parent == &init_pid_ns)
> +			break;
> +
> +		/* kref_put() cannot be used here: it would recurse into free_pid_ns() */
> +		if (__kref_put(&parent->kref) == 0)
> +			break;
> +
> +		ns = parent;
> +	}
>  }
>  
>  void zap_pid_ns_processes(struct pid_namespace *pid_ns)
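
The loop in the hunk above can be tried outside the kernel. What follows
is only a minimal user-space sketch, not the kernel code: struct ns,
root_ns, ref_dec_and_test(), ns_create(), ns_put() and
release_chain_demo() are all hypothetical names made up for
illustration, with root_ns standing in for init_pid_ns and
ref_dec_and_test() standing in for the proposed __kref_put(). It assumes
C11 atomics and leaves out locking and everything else that
destroy_pid_namespace() does.

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct pid_namespace. */
struct ns {
	atomic_int refcount;
	struct ns *parent;
};

static struct ns root_ns;	/* models init_pid_ns: never counted, never freed */
static int freed;		/* number of namespaces destroyed so far */

/* Models __kref_put(): returns 1 when the count drops to zero. */
static int ref_dec_and_test(struct ns *ns)
{
	return atomic_fetch_sub(&ns->refcount, 1) == 1;
}

/* New namespace with one reference for the caller; pins its parent. */
static struct ns *ns_create(struct ns *parent)
{
	struct ns *ns = malloc(sizeof(*ns));

	if (!ns)
		abort();
	atomic_init(&ns->refcount, 1);
	ns->parent = parent;
	if (parent != &root_ns)
		atomic_fetch_add(&parent->refcount, 1);
	return ns;
}

/* Drop one reference; walk up the parent chain in a loop, not by recursion. */
static void ns_put(struct ns *ns)
{
	if (ns == &root_ns || !ref_dec_and_test(ns))
		return;

	for (;;) {
		struct ns *parent = ns->parent;

		free(ns);
		freed++;

		if (parent == &root_ns)
			break;
		/* The parent is still referenced elsewhere: stop here. */
		if (!ref_dec_and_test(parent))
			break;
		ns = parent;
	}
}

/* Build a chain of `depth` nested namespaces (depth >= 1), then drop the leaf. */
static int release_chain_demo(int depth)
{
	struct ns *ns = ns_create(&root_ns);

	for (int i = 1; i < depth; i++) {
		struct ns *child = ns_create(ns);

		ns_put(ns);	/* drop the creator's ref; the child still pins it */
		ns = child;
	}

	freed = 0;
	ns_put(ns);	/* the last reference on the leaf releases the whole chain */
	return freed;
}
```

Calling release_chain_demo(depth) from a main() returns the number of
namespaces freed by the final ns_put(). Stack usage in ns_put() stays
constant however deep the chain is, which is the property the patch is
after.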


