[Devel] Re: [RFC v10][PATCH 08/13] Dump open file descriptors

Linus Torvalds torvalds at linux-foundation.org
Mon Dec 1 13:02:09 PST 2008



On Mon, 1 Dec 2008, Dave Hansen wrote:
> 
> Why is this done in two steps?  It first grabs a list of fd numbers
> which needs to be validated, then goes back and turns those into 'struct
> file's which it saves off.  Is there a problem with doing that
> fd->'struct file' conversion under the files->file_lock?

Umm, why do we even worry about this?

Wouldn't it be much better to make sure that all other threads are 
stopped before we snapshot, and if we cannot account for some thread (ie 
there's some elevated count in the fs/files/mm structures that we cannot 
see from the threads we've stopped), just refuse to dump.

There is no sane dump from a multi-threaded app that shares resources 
without that kind of serialization _anyway_, so why even try?

In other words: any races in dumping are fundamental _bugs_ in the dumping 
at a much higher level. There's absolutely no point in trying to make 
something like "dump open fd's" be race-free, because if there are other 
people that are actively accessing the 'files' structure concurrently, you 
had a much more fundamental bug in the first place!

So do things more like the core-dumping does: make sure that all other 
threads are quiescent first!

			Linus
_______________________________________________
Containers mailing list
Containers at lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers




More information about the Devel mailing list