[Devel] [PATCH rh7 v2] cgroup: rework reference acquisition for cgroup_find_inode

Kirill Tkhai ktkhai at virtuozzo.com
Thu Nov 12 18:40:45 MSK 2020


On 12.11.2020 18:39, Pavel Tikhomirov wrote:
> 
> 
> On 11/12/20 6:17 PM, Kirill Tkhai wrote:
>> On 12.11.2020 18:08, Andrey Zhadchenko wrote:
>>> On Thu, 12 Nov 2020 17:29:21 +0300
>>> Kirill Tkhai <ktkhai at virtuozzo.com> wrote:
>>>
>>> Hi, Kirill,
>>>
>>>> Hi, Andrey,
>>>>
>>>> On 11.11.2020 10:32, Andrey Zhadchenko wrote:
>>>>> Use more generic igrab instead of atomic inc. Move cgroup_hash_del
>>>>> to eviction stage to avoid deadlock.
>>>>>
>>>>> Signed-off-by: Andrey Zhadchenko <andrey.zhadchenko at virtuozzo.com>
>>>>> ---
>>>>>
>>>>> v2: adjusted function call order in cgroup_evict_inode to match
>>>>> existing code
>>>>>
>>>>>   kernel/cgroup.c | 25 ++++++++-----------------
>>>>>   1 file changed, 8 insertions(+), 17 deletions(-)
>>>>>
>>>>> diff --git a/kernel/cgroup.c b/kernel/cgroup.c
>>>>> index 27d7a5e..8c2cef8 100644
>>>>> --- a/kernel/cgroup.c
>>>>> +++ b/kernel/cgroup.c
>>>>> @@ -1522,21 +1522,10 @@ static struct inode
>>>>> *cgroup_find_inode(unsigned long fh[2], char take_ref) struct inode
>>>>> *ret = NULL;
>>>>>       spin_lock(&cgroup_inode_table_lock);
>>>>> -    item = cgroup_find_item_no_lock(fh);
>>>>>   -    /*
>>>>> -     * If we need to increase refcount, we should be aware of
>>>>> possible
>>>>> -     * deadlock. Another thread may have started deleting this
>>>>> inode:
>>>>> -     * iput->iput_final->cgroup_delete_inode->cgroup_hash_del
>>>>> -     * If we just call igrab, it will try to take i_lock and
>>>>> this will
>>>>> -     * result in deadlock, because deleting thread has already
>>>>> taken it
>>>>> -     * and waits on cgroup_inode_table_lock to find inode in
>>>>> hashtable.
>>>>> -     *
>>>>> -     * If i_count is zero, someone is deleting it -> skip.
>>>>> -     */
>>>>> -    if (take_ref && item)
>>>>> -        if (!atomic_inc_not_zero(&item->inode->i_count))
>>>>> -            item = NULL;
>>>>> +    item = cgroup_find_item_no_lock(fh);
>>>>> +    if (item && take_ref && !igrab(item->inode))
>>>>> +        item = NULL;
>>>>
>>>> Here you call igrab() under cgroup_inode_table_lock, so the check for
>>>> (inode->i_state & I_FREEING|I_WILL_FREE) is done under that lock and
>>>> i_lock.
>>>>
>>>> But clear_inode() sets "inode->i_state = I_FREEING | I_CLEAR" without
>>>> any lock. This place does not look obvious. Isn't there some problem?
>>>>
>>>
>>> iput_final() sets inode to I_FREEING under i_lock, release it and calls
>>> evict(), which calls cgroup_evict_inode() via
>>> inode->i_sb->s_op->evict().
>>
>> What is about inodes in LRU?
> 
> I thought iput_final:drop is always true for cgroupfs. And there is no lru.

That is OK.
 
>>
>> Say, iput_final() moves inode to LRU with zero counter. I_FREEING is not
>> set there. Later prune_icache_sb() operates with that inode with zero counter.
>>
>> Doesn't dispose_list()->evict()->cgroup_evict_inode() race with cgroup_find_inode()?
>>
>>> If I get it right, this seems to be fine. As soon as iput_final starts
>>> working igrab shouldn't be able to take ref on this inode.
>>>
>>>>>         spin_unlock(&cgroup_inode_table_lock);
>>>>>   @@ -1634,15 +1623,17 @@ static const struct export_operations
>>>>> cgroup_export_ops = { .fh_to_dentry    = cgroup_fh_to_dentry,
>>>>>   };
>>>>>   -static int cgroup_delete_inode(struct inode *inode)
>>>>> +static void cgroup_evict_inode(struct inode *inode)
>>>>>   {
>>>>> +    truncate_inode_pages_final(&inode->i_data);
>>>>> +    clear_inode(inode);
>>>>>       cgroup_hash_del(inode);
>>>>> -    return generic_delete_inode(inode);
>>>>>   }
>>>>>     static const struct super_operations cgroup_ops = {
>>>>>       .statfs = simple_statfs,
>>>>> -    .drop_inode = cgroup_delete_inode,
>>>>> +    .drop_inode = generic_delete_inode,
>>>>> +    .evict_inode = cgroup_evict_inode,
>>>>>       .show_options = cgroup_show_options,
>>>>>   #ifdef CONFIG_VE
>>>>>       .show_path = cgroup_show_path,
>>>>>    
>>>>
>>>
>>
> 



More information about the Devel mailing list