[Devel] [PATCH rh7 v2] cgroup: rework reference acquisition for cgroup_find_inode

Pavel Tikhomirov ptikhomirov at virtuozzo.com
Thu Nov 12 18:39:08 MSK 2020



On 11/12/20 6:17 PM, Kirill Tkhai wrote:
> On 12.11.2020 18:08, Andrey Zhadchenko wrote:
>> On Thu, 12 Nov 2020 17:29:21 +0300
>> Kirill Tkhai <ktkhai at virtuozzo.com> wrote:
>>
>> Hi, Kirill,
>>
>>> Hi, Andrey,
>>>
>>> On 11.11.2020 10:32, Andrey Zhadchenko wrote:
>>>> Use more generic igrab instead of atomic inc. Move cgroup_hash_del
>>>> to eviction stage to avoid deadlock.
>>>>
>>>> Signed-off-by: Andrey Zhadchenko <andrey.zhadchenko at virtuozzo.com>
>>>> ---
>>>>
>>>> v2: adjusted function call order in cgroup_evict_inode to match
>>>> existing code
>>>>
>>>>   kernel/cgroup.c | 25 ++++++++-----------------
>>>>   1 file changed, 8 insertions(+), 17 deletions(-)
>>>>
>>>> diff --git a/kernel/cgroup.c b/kernel/cgroup.c
>>>> index 27d7a5e..8c2cef8 100644
>>>> --- a/kernel/cgroup.c
>>>> +++ b/kernel/cgroup.c
>>>> @@ -1522,21 +1522,10 @@ static struct inode
>>>> *cgroup_find_inode(unsigned long fh[2], char take_ref) struct inode
>>>> *ret = NULL;
>>>>   	spin_lock(&cgroup_inode_table_lock);
>>>> -	item = cgroup_find_item_no_lock(fh);
>>>>   
>>>> -	/*
>>>> -	 * If we need to increase refcount, we should be aware of
>>>> possible
>>>> -	 * deadlock. Another thread may have started deleting this
>>>> inode:
>>>> -	 * iput->iput_final->cgroup_delete_inode->cgroup_hash_del
>>>> -	 * If we just call igrab, it will try to take i_lock and
>>>> this will
>>>> -	 * result in deadlock, because deleting thread has already
>>>> taken it
>>>> -	 * and waits on cgroup_inode_table_lock to find inode in
>>>> hashtable.
>>>> -	 *
>>>> -	 * If i_count is zero, someone is deleting it -> skip.
>>>> -	 */
>>>> -	if (take_ref && item)
>>>> -		if (!atomic_inc_not_zero(&item->inode->i_count))
>>>> -			item = NULL;
>>>> +	item = cgroup_find_item_no_lock(fh);
>>>> +	if (item && take_ref && !igrab(item->inode))
>>>> +		item = NULL;
>>>
>>> Here you call igrab() under cgroup_inode_table_lock, so the check for
>>> (inode->i_state & I_FREEING|I_WILL_FREE) is done under that lock and
>>> i_lock.
>>>
>>> But clear_inode() sets "inode->i_state = I_FREEING | I_CLEAR" without
>>> any lock. This place does not look obvious. Isn't there some problem?
>>>
>>
>> iput_final() sets inode to I_FREEING under i_lock, release it and calls
>> evict(), which calls cgroup_evict_inode() via
>> inode->i_sb->s_op->evict().
> 
> What is about inodes in LRU?

I thought iput_final:drop is always true for cgroupfs. And there is no lru.

> 
> Say, iput_final() moves inode to LRU with zero counter. I_FREEING is not
> set there. Later prune_icache_sb() operates with that inode with zero counter.
> 
> Doesn't dispose_list()->evict()->cgroup_evict_inode() race with cgroup_find_inode()?
> 
>> If I get it right, this seems to be fine. As soon as iput_final starts
>> working igrab shouldn't be able to take ref on this inode.
>>
>>>>   
>>>>   	spin_unlock(&cgroup_inode_table_lock);
>>>>   
>>>> @@ -1634,15 +1623,17 @@ static const struct export_operations
>>>> cgroup_export_ops = { .fh_to_dentry	= cgroup_fh_to_dentry,
>>>>   };
>>>>   
>>>> -static int cgroup_delete_inode(struct inode *inode)
>>>> +static void cgroup_evict_inode(struct inode *inode)
>>>>   {
>>>> +	truncate_inode_pages_final(&inode->i_data);
>>>> +	clear_inode(inode);
>>>>   	cgroup_hash_del(inode);
>>>> -	return generic_delete_inode(inode);
>>>>   }
>>>>   
>>>>   static const struct super_operations cgroup_ops = {
>>>>   	.statfs = simple_statfs,
>>>> -	.drop_inode = cgroup_delete_inode,
>>>> +	.drop_inode = generic_delete_inode,
>>>> +	.evict_inode = cgroup_evict_inode,
>>>>   	.show_options = cgroup_show_options,
>>>>   #ifdef CONFIG_VE
>>>>   	.show_path = cgroup_show_path,
>>>>    
>>>
>>
> 

-- 
Best regards, Tikhomirov Pavel
Software Developer, Virtuozzo.


More information about the Devel mailing list