[Devel] [RFC PATCH vz9 v6 20/62] dm-ploop: reduce BAT accesses on discard completion

Pavel Tikhomirov ptikhomirov at virtuozzo.com
Tue Jan 21 08:40:24 MSK 2025



On 1/20/25 21:33, Alexander Atanasov wrote:
> On 20.01.25 6:15, Pavel Tikhomirov wrote:
>>
>>
>> On 12/6/24 05:55, Alexander Atanasov wrote:
>>> From: Andrey Zhadchenko <andrey.zhadchenko at virtuozzo.com>
>>>
>>> Drop extra ploop_cluster_is_in_top_delta() as we are planning to
>>> access BAT anyway
>>>
>>> https://virtuozzo.atlassian.net/browse/VSTOR-91817
>>> Signed-off-by: Andrey Zhadchenko <andrey.zhadchenko at virtuozzo.com>
>>> ---
>>>   drivers/md/dm-ploop-map.c | 28 ++++++++++++----------------
>>>   1 file changed, 12 insertions(+), 16 deletions(-)
>>>
>>> diff --git a/drivers/md/dm-ploop-map.c b/drivers/md/dm-ploop-map.c
>>> index ad7ca7d43dfc..b00dd364072d 100644
>>> --- a/drivers/md/dm-ploop-map.c
>>> +++ b/drivers/md/dm-ploop-map.c
>>> @@ -711,12 +711,15 @@ static void ploop_complete_cow(struct ploop_cow *cow, blk_status_t bi_status)
>>>       kmem_cache_free(cow_cache, cow);
>>>   }
>>> -static void ploop_release_cluster(struct ploop *ploop, u32 clu)
>>> +static void ploop_piwb_discard_completed(struct ploop *ploop,
>>> +                     bool success, u32 clu, u32 new_dst_clu)
>>>   {
>>>       u32 id, *bat_entries, dst_clu;
>>>       struct md_page *md;
>>> +    u8 level;
>>> -    lockdep_assert_held(&ploop->bat_rwlock);
>>> +    if (new_dst_clu)
>>> +        return;
>>>       id = ploop_bat_clu_to_page_nr(clu);
>>>       md = ploop_md_page_find(ploop, id);
>>
>> Is this md the same as the md in the caller function 
>> ploop_advance_local_after_bat_wb?
> 
> It can be the same or different: it iterates over the clusters, and 
> the page may change along the way, so this needs a rewrite.
> Maybe pass md as an argument and check whether it is the same; if it is 
> not, take the lock, or something like that. I have to think about how to do it.

Also, we must not forget that if we end up with nested md->md_lock 
acquisitions, we have to watch out for ABBA deadlocks. E.g.: if md->md_lock B 
is taken under md->md_lock A in one thread and, vice versa, A is taken under B 
in another thread, we can deadlock.
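
To illustrate, here is a minimal userspace sketch of the usual way around 
ABBA: always take the two locks in one global order, no matter which one the 
caller names first. Everything in it is hypothetical: struct md_sketch, 
md_lock_pair() and run_abba_demo() are made-up names, pthread mutexes stand in 
for md->md_lock, and ordering by page id is only one possible choice. It is 
not a proposal for the actual dm-ploop code.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for struct md_page: only an id and a lock. */
struct md_sketch {
	unsigned int id;
	pthread_mutex_t lock;
};

/* Take both locks in ascending id order, whatever order the caller used. */
static void md_lock_pair(struct md_sketch *a, struct md_sketch *b)
{
	assert(a && b);
	if (a == b) {
		/* Same page: lock once, never twice. */
		pthread_mutex_lock(&a->lock);
		return;
	}
	if (a->id > b->id) {
		struct md_sketch *tmp = a;

		a = b;
		b = tmp;
	}
	pthread_mutex_lock(&a->lock);
	pthread_mutex_lock(&b->lock);
}

static void md_unlock_pair(struct md_sketch *a, struct md_sketch *b)
{
	pthread_mutex_unlock(&a->lock);
	if (a != b)
		pthread_mutex_unlock(&b->lock);
}

struct worker_arg {
	struct md_sketch *first, *second;
	long *counter;
	int iterations;
};

static void *worker(void *p)
{
	struct worker_arg *w = p;

	for (int i = 0; i < w->iterations; i++) {
		md_lock_pair(w->first, w->second);
		(*w->counter)++;	/* protected: both locks are held */
		md_unlock_pair(w->first, w->second);
	}
	return NULL;
}

/* Two threads name the pages in opposite order: (A, B) and (B, A). */
static long run_abba_demo(int iterations)
{
	struct md_sketch a = { .id = 1 }, b = { .id = 2 };
	long counter = 0;
	struct worker_arg t1 = { &a, &b, &counter, iterations };
	struct worker_arg t2 = { &b, &a, &counter, iterations };
	pthread_t th1, th2;

	pthread_mutex_init(&a.lock, NULL);
	pthread_mutex_init(&b.lock, NULL);
	pthread_create(&th1, NULL, worker, &t1);
	pthread_create(&th2, NULL, worker, &t2);
	pthread_join(th1, NULL);
	pthread_join(th2, NULL);
	pthread_mutex_destroy(&a.lock);
	pthread_mutex_destroy(&b.lock);
	return counter;
}
```

If md_lock_pair() locked in caller order instead, the two workers could each 
grab their first lock and then wait forever for the other one. In the kernel, 
nesting two locks of the same class would presumably also need a lockdep 
nesting annotation to avoid a false report.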

> 
>>
>>> @@ -726,22 +729,15 @@ static void ploop_release_cluster(struct ploop *ploop, u32 clu)
>>>       bat_entries = md->kmpage;
>>>       dst_clu = READ_ONCE(bat_entries[clu]);
>>> -    WRITE_ONCE(bat_entries[clu], BAT_ENTRY_NONE);
>>> -    WRITE_ONCE(md->bat_levels[clu], 0);
>>> -
>>> -    ploop_hole_set_bit(dst_clu, ploop);
>>> -}
>>> -
>>> -static void ploop_piwb_discard_completed(struct ploop *ploop,
>>> -                     bool success, u32 clu, u32 new_dst_clu)
>>> -{
>>> -    if (new_dst_clu)
>>> -        return;
>>> +    level = md->bat_levels[clu];
>>
>> If the answer to the previous comment is no, shouldn't we take 
>> md->md_lock here to make the use of md->bat_levels and md->kmpage 
>> atomic / consistent? In the next patch we introduce md->md_lock to 
>> "use it when accessing md->levels and md->page at the same time to 
>> protect readers against writers".
>>
>> If the answer is yes, shouldn't we add a lockdep check for md->md_lock?
> 
> If it comes as an argument, a lockdep check can be added, but if it is 
> different we will get a false alarm.
> 
>>
>>> -    if (ploop_cluster_is_in_top_delta(ploop, clu)) {
>>> +    if (!(dst_clu == BAT_ENTRY_NONE || level < ploop_top_level(ploop))) {
>>>           WARN_ON_ONCE(ploop->nr_deltas != 1);
>>> -        if (success)
>>> -            ploop_release_cluster(ploop, clu);
>>> +        if (success) {
>>> +            WRITE_ONCE(bat_entries[clu], BAT_ENTRY_NONE);
>>> +            WRITE_ONCE(md->bat_levels[clu], 0);
>>> +            ploop_hole_set_bit(dst_clu, ploop);
>>> +        }
>>>       }
>>>   }
>>
> 
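
As a side note on the new condition in the hunk above: by De Morgan it is the 
same as "dst_clu != BAT_ENTRY_NONE && level >= ploop_top_level(ploop)", which 
may read more easily. A small standalone sketch that checks the equivalence 
follows. SKETCH_BAT_ENTRY_NONE and its value are assumptions made for the 
example, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_BAT_ENTRY_NONE UINT32_MAX	/* assumed sentinel value */

/* The condition exactly as written in the patch. */
static bool in_top_delta_patch(uint32_t dst_clu, uint8_t level, uint8_t top)
{
	return !(dst_clu == SKETCH_BAT_ENTRY_NONE || level < top);
}

/* The same condition after applying De Morgan's law. */
static bool in_top_delta_plain(uint32_t dst_clu, uint8_t level, uint8_t top)
{
	return dst_clu != SKETCH_BAT_ENTRY_NONE && level >= top;
}

/* Compare both forms over a small grid; returns the number of cases checked. */
static int check_equivalence(void)
{
	const uint32_t clus[] = { 0, 7, SKETCH_BAT_ENTRY_NONE };
	int checked = 0;

	for (int i = 0; i < 3; i++)
		for (uint8_t level = 0; level < 4; level++)
			for (uint8_t top = 0; top < 4; top++) {
				assert(in_top_delta_patch(clus[i], level, top) ==
				       in_top_delta_plain(clus[i], level, top));
				checked++;
			}
	return checked;
}
```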

-- 
Best regards, Tikhomirov Pavel
Senior Software Developer, Virtuozzo.


