[Devel] Re: [RFC v10][PATCH 08/13] Dump open file descriptors
Linus Torvalds
torvalds at linux-foundation.org
Mon Dec 1 13:02:09 PST 2008
On Mon, 1 Dec 2008, Dave Hansen wrote:
>
> Why is this done in two steps? It first grabs a list of fd numbers
> which needs to be validated, then goes back and turns those into 'struct
> file's which it saves off. Is there a problem with doing that
> fd->'struct file' conversion under the files->file_lock?

Umm, why do we even worry about this?

Wouldn't it be much better to make sure that all other threads are
stopped before we snapshot, and if we cannot account for some thread (i.e.
there's some elevated count in the fs/files/mm structures that we cannot
see from the threads we've stopped), just refuse to dump?

There is no sane dump from a multi-threaded app that shares resources
without that kind of serialization _anyway_, so why even try?

In other words: any races in dumping are fundamental _bugs_ in the dumping
at a much higher level. There's absolutely no point in trying to make
something like "dump open fds" race-free, because if other threads are
actively accessing the 'files' structure concurrently, you have a much
more fundamental bug in the first place!

So do things more like the core-dumping does: make sure that all other
threads are quiescent first!

Linus
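
The check proposed above (stop every thread, then refuse to dump if a shared structure has more references than the stopped threads can explain) can be illustrated with a small user-space model. This is only a sketch, not kernel code: `SharedResource`, `Task`, `unaccounted_resources`, and `can_snapshot` are hypothetical names invented for illustration, and the plain integer `refcount` merely stands in for the usage counts on the fs/files/mm structures.

```python
from dataclasses import dataclass, field


@dataclass
class SharedResource:
    """Hypothetical stand-in for a refcounted object (files, fs, or mm)."""
    name: str
    refcount: int  # total number of users holding a reference


@dataclass
class Task:
    """Hypothetical stand-in for one thread and the shared objects it uses."""
    tid: int
    stopped: bool
    resources: list = field(default_factory=list)


def unaccounted_resources(tasks):
    """Return names of resources with users not visible among stopped tasks."""
    seen = {}  # id(resource) -> [resource, references held by stopped tasks]
    for task in tasks:
        if not task.stopped:
            continue  # a running task cannot be trusted to hold still
        for res in task.resources:
            entry = seen.setdefault(id(res), [res, 0])
            entry[1] += 1
    # An elevated count means someone we did not stop can still touch it.
    return sorted(res.name for res, visible in seen.values()
                  if res.refcount > visible)


def can_snapshot(tasks):
    """Allow the dump only if every task is stopped and every count matches."""
    return all(t.stopped for t in tasks) and not unaccounted_resources(tasks)


if __name__ == "__main__":
    files = SharedResource("files", refcount=2)
    t1 = Task(1, stopped=True, resources=[files])
    t2 = Task(2, stopped=True, resources=[files])
    print(can_snapshot([t1, t2]))  # both users are stopped and accounted for

    files.refcount = 3  # a thread we never stopped also holds a reference
    print(can_snapshot([t1, t2]), unaccounted_resources([t1, t2]))
```

With this approach the dump code itself needs no fine-grained locking: either every user of a structure is known to be quiescent, or the snapshot is refused outright.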