[Devel] [PATCH v3] nfs: protect callback execution against per-net callback thread shutdown

Tue Nov 7 17:29:19 MSK 2017

Please, see my comments below.

07.11.2017 12:30, Kirill Tkhai пишет:
> On 07.11.2017 13:39, Stanislav Kinsburskiy wrote:
>> From: Stanislav Kinsburskiy <skinsbursky at parallels.com>
>>
>> Here is the race:
>>
>> CPU #0				CPU#1
>>
>> cleanup_mnt			nfs41_callback_svc (get xprt from the list)
>> nfs_callback_down		...
>> ...				...
>> svc_close_net			...
>> ...				...
>> svc_xprt_free			...
>> svc_bc_sock_free		bc_svc_process
>> kfree(xprt)			svc_process_common
>> 				rqstp->rq_xprt->xpt_ops (use after free)
>>
>> The problem is that per-net SUNRPC transports shutdown is done regardless
>> current callback execution. This is a race leading to transport use-after-free
>> in callback handler.
>> This patch fixes it in stright-forward way. I.e. it protects callback
>> execution with the same mutex used for per-net data creation and destruction.
>> Hopefully, it won't slow down NFS client significantly.
>>
>> https://jira.sw.ru/browse/PSBM-75751
>>
>> v3: Fix mutex deadlock, when shutdown callback waits for thread to exit (with
>> mutex taken), while thread wait for the mutex to take.
>> The idea is to simply check if thread has to exit, if mutex lock has failed.
>> This is a busy loop, but it shouldn't happend often and for long.
>>
>> Signed-off-by: Stanislav Kinsburskiy <skinsbursky at parallels.com>
>> ---
>>  fs/nfs/callback.c |    6 ++++++
>>  1 file changed, 6 insertions(+)
>>
>> diff --git a/fs/nfs/callback.c b/fs/nfs/callback.c
>> index 0beb275..bbb07e4 100644
>> --- a/fs/nfs/callback.c
>> +++ b/fs/nfs/callback.c
>> @@ -117,7 +117,11 @@ nfs41_callback_svc(void *vrqstp)
>>  		if (try_to_freeze())
>>  			continue;
>>  
>> +		if (!mutex_trylock(&nfs_callback_mutex))
>> +		       continue;
> 
> This looks like a busy loop (that especially bad, because mutex-owner may sleep
> at the moment we are looping).
> 

Well, yes. It a busy loop. And that's not good. But I have to mention, that this busy loop can happen only on umount request.

> It seems the solution may be to change nfs_callback_down() function.
> Can't we flush pending request in nfs_callback_down()?

Well, this already happens. The problem is in race between transport usage (unconditional) and its destruction (also unconditional).
IOW, there should be either reference counting or some critical section.
Looks like the former will need a way more code and thus more error-prone, the latter is uglier, but code should be simpler at first sight.

> Or just delete it from sv_cb_list without handling (does nfs proto allow that)?
> 

Well, this looks like a promising idea. But yet again, there should be some protection against transport usage in shutdown helper.
And it's not yet clear, how to implement it without significant code change...