[Devel] [PATCH] fs: sync: prevent possible deadlock in sync_collect_filesystems

Pavel Tikhomirov ptikhomirov at virtuozzo.com
Fri Mar 5 10:44:14 MSK 2021



On 3/5/21 10:43 AM, Pavel Tikhomirov wrote:
> 
> 
> On 3/4/21 6:59 PM, Alexander Mikhalitsyn wrote:
>> There are two problems:
>> 1. kmalloc should be called with GFP_NOWAIT to prevent sleep
>> under spinlock taken
>>
>> 2. spin_lock(&sb_lock) under spinlock(mnt_ns_list)
>> There we have a small probability of deadlock, to overcome
>> this let's use spin_trylock(&sb_lock) here
> Bug link is missing, Probably PSBM-126568?
> 
>>
>> Cc: Pavel Tikhomirov <ptikhomirov at virtuozzo.com>
>> Cc: Vasily Averin <vvs at virtuozzo.com>
>> Signed-off-by: Alexander Mikhalitsyn 
>> <alexander.mikhalitsyn at virtuozzo.com>
>> ---
>>   fs/sync.c | 11 +++++++++--
>>   1 file changed, 9 insertions(+), 2 deletions(-)
>>
>> diff --git a/fs/sync.c b/fs/sync.c
>> index 1cf0f0b38824..b2b72ac75615 100644
>> --- a/fs/sync.c
>> +++ b/fs/sync.c
>> @@ -150,7 +150,7 @@ static int sync_collect_filesystems(struct 
>> ve_struct *ve, struct list_head *sync
>>           if (sync_filesystem_collected(sync_list, mnt->mnt.mnt_sb))
>>               continue;
>> -        ss = kmalloc(sizeof(*ss), GFP_KERNEL);
>> +        ss = kmalloc(sizeof(*ss), GFP_NOWAIT);
> 
> I don't think it's a good thing to just replace all GFP_KERNEL to 
> GFP_NOWAIT, I'm not an expert but I would expect it to eat more memory 
> reserves. We can do it only if there is nothing else to do.
> 
> But we have an alternative here: Use mnt_list_next.
> 
> We need to hold lock_ns_list(mnt_ns); only when mnt is "cursor" all 
> other mounts in list can be removed from it only under 
> down_write(&namespace_sem) from which we are already protected here with 
> down_read(&namespace_sem). So we need to take lock_ns_list(mnt_ns) only 
> for iterating to next mnt entry over probably several "cursor" ones, you 
> can safely to this by using mnt_list_next() helper.

Note: this alternative aproach fixes both (1) and (2).

> 
>>           if (ss == NULL) {
>>               ret = -ENOMEM;
>>               break;
>> @@ -161,7 +161,14 @@ static int sync_collect_filesystems(struct 
>> ve_struct *ve, struct list_head *sync
>>            * alive. And it means, that we can safely increase it's usage
>>            * counter.
>>            */
>> -        spin_lock(&sb_lock);
>> +        if (unlikely(!spin_trylock(&sb_lock))) {
>> +            /*
>> +             * If we can't take spinlock then just skip that mount to 
>> prevent
>> +             * possible deadlocks here.
>> +             */
>> +            kfree(ss);
>> +            continue;
>> +        }
>>           ss->sb->s_count++;
>>           spin_unlock(&sb_lock);
>>           list_add_tail(&ss->list, sync_list);
>>
> 

-- 
Best regards, Tikhomirov Pavel
Software Developer, Virtuozzo.


More information about the Devel mailing list