[CRIU] [PATCH 10/17] proc/fd: In proc_readfd_common use fnext_task
Eric W. Biederman
ebiederm at xmission.com
Tue Aug 18 06:42:44 MSK 2020
Al Viro <viro at zeniv.linux.org.uk> writes:
> On Mon, Aug 17, 2020 at 05:04:18PM -0500, Eric W. Biederman wrote:
>> When discussing[1] exec and posix file locks it was realized that none
>> of the callers of get_files_struct fundamentally needed to call
>> get_files_struct, and that by switching them to helper functions
>> instead it will both simplify their code and remove unnecessary
>> increments of files_struct.count. Those unnecessary increments can
>> result in exec unnecessarily unsharing files_struct which breaking
>> posix locks, and it can result in fget_light having to fallback to
>> fget reducing system performance.
>>
>> Using fnext_task simplifies proc_readfd_common, by moving the checking
>> for the maximum file descritor into the generic code, and by
>> remvoing the need for capturing and releasing a reference on
>> files_struct.
>
> Yecchhh... So you keep locking/unlocking the damn thing for each
> descriptor?
For each descriptor found yes.
In practice there is a copy_to_user in between each fnext_task call which
limits what can be done.
The options are:
- Batching
- Adding a second count to files_struct for observers
- Or making exit_file/put_files_struct always put files_struct in
an rcu safe way.
My gut says if we are going to care enough about the readers in proc
to do something making the entire thing rcu safe is probably the way
to go.
I don't know the real world cost of modifing exit_files/put_task_struct
to put the files_struct in an rcu safe way, but it looks like the only
case that matters is when a processes is exiting, and so it probably
is in the noise.
Any objection to putting an rcu_head into files_struct so that it can be
rcu freed?
There might even be a way to split exit_files so they are all closed
in the current location, and then the data structure is freed
in delayed put_task_struct.
I am definitely willing to look at it. Do we think there would be enough
traffic on task_lock from /proc/<pid>/fd access to make it work doing?
Eric
More information about the CRIU
mailing list