[CRIU] Re: [PATCH] pidns: remove recursion from free_pid_ns (v3)

Greg KH greg at kroah.com
Wed Oct 10 03:49:23 EDT 2012


On Tue, Oct 09, 2012 at 12:08:31PM -0700, Andrew Morton wrote:
> On Tue, 9 Oct 2012 12:03:00 -0700
> Greg KH <greg at kroah.com> wrote:
> 
> > On Tue, Oct 09, 2012 at 11:48:21AM -0700, Andrew Morton wrote:
> > > On Sat,  6 Oct 2012 23:56:33 +0400
> > > Andrew Vagin <avagin at openvz.org> wrote:
> > > 
> > > > Here is a stack trace of recursion:
> > > > free_pid_ns(parent)
> > > >   put_pid_ns(parent)
> > > >     kref_put(&ns->kref, free_pid_ns);
> > > >       free_pid_ns
> > > > 
> > > > This patch turns recursion into loops.
> > > > 
> > > > pidns can be nested many times, so in case of recursion
> > > > a simple user space program can provoke a kernel panic
> > > > > by overflowing the kernel stack.
> > > 
> > > So we should backport this into earlier kernels.
> > > 
> > > > --- a/include/linux/kref.h
> > > > +++ b/include/linux/kref.h
> > > > @@ -95,6 +95,18 @@ static inline int kref_put(struct kref *kref, void (*release)(struct kref *kref)
> > > >  	return kref_sub(kref, 1, release);
> > > >  }
> > > >  
> > > > +/**
> > > > > + * __kref_put - decrement refcount for object.
> > > > + * @kref: object.
> > > > + *
> > > > + * Decrement the refcount.
> > > > + * Return 1 if refcount is zero.
> > > > + */
> > > > +static inline int __kref_put(struct kref *kref)
> > > > +{
> > > > +	return atomic_dec_and_test(&kref->refcount);
> > > > +}
> > > 
> > > Greg might be interested in this.
> > > 
> > > It's a pretty specialised thing and perhaps it needs some stern words
> > > in the description explaining when and why it should and shouldn't be
> > > used.
> > > 
> > > I wonder if people might (ab)use this to avoid the "doesn't
> > > have a release function" warning.
> > 
> > Yes they would, please don't do this at all.
> > 
> > In fact, why is it needed?  It doesn't solve anything (if it does,
> > something in the way the kref is being used is wrong.)
> > 
> 
> It's right there in the changelog.  The patch fixes deep
> kref_put->release->kref_put recursion by turning the operation for
> pidns into a loop.

But why would a kref release function ever decrement the same kref
again causing a loop in the first place?

That's what I was referring to.  This strongly sounds like a problem in
how the kref is being used, not in the kref code itself.

Is a kref even the correct thing here?

greg k-h
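
For readers following the thread, here is a minimal userspace sketch of the pattern under discussion. It is not the kernel code: `struct ns`, `ns_ref_dec_and_test`, `ns_put_recursive`, `ns_put_iterative` and `ns_chain` are hypothetical names, and a plain non-atomic int stands in for the kref. It assumes each nested object pins its parent, so the release path drops a reference on a different object of the same type rather than on the same kref. That is where the recursion depth comes from.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a nested namespace: each child pins its parent. */
struct ns {
	int refcount;		/* plain int, single-threaded sketch only */
	struct ns *parent;	/* NULL for the outermost object */
};

static int freed_count;	/* number of objects released so far */

/* Drop one reference; return 1 if it was the last one. */
static int ns_ref_dec_and_test(struct ns *ns)
{
	return --ns->refcount == 0;
}

/*
 * Recursive form: releasing a child drops the parent's reference,
 * which may release the parent, and so on up the chain. The call
 * depth is bounded only by the nesting depth.
 */
static void ns_put_recursive(struct ns *ns)
{
	if (ns && ns_ref_dec_and_test(ns)) {
		struct ns *parent = ns->parent;

		free(ns);
		freed_count++;
		ns_put_recursive(parent);
	}
}

/* Loop form: same effect, constant stack usage. */
static void ns_put_iterative(struct ns *ns)
{
	while (ns && ns_ref_dec_and_test(ns)) {
		struct ns *parent = ns->parent;

		free(ns);
		freed_count++;
		ns = parent;
	}
}

/*
 * Build a chain of the given depth and return the innermost object.
 * Every object starts with one reference: held by its child, or by
 * the caller in the case of the innermost one.
 */
static struct ns *ns_chain(int depth)
{
	struct ns *parent = NULL;

	for (int i = 0; i < depth; i++) {
		struct ns *ns = malloc(sizeof(*ns));

		if (!ns)
			abort();
		ns->refcount = 1;
		ns->parent = parent;
		parent = ns;
	}
	return parent;
}
```

Calling `ns_put_iterative(ns_chain(n))` releases all `n` objects without deepening the stack. The loop stops at the first ancestor that still has another holder, as the recursive form does.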

