[Devel] [RFC PATCH vz9 v6 20/62] dm-ploop: reduce BAT accesses on discard completion
Alexander Atanasov
alexander.atanasov at virtuozzo.com
Mon Jan 20 23:08:57 MSK 2025
On 20.01.25 15:33, Alexander Atanasov wrote:
> On 20.01.25 6:15, Pavel Tikhomirov wrote:
>>
>>
>> On 12/6/24 05:55, Alexander Atanasov wrote:
>>> From: Andrey Zhadchenko <andrey.zhadchenko at virtuozzo.com>
>>>
>>> Drop extra ploop_cluster_is_in_top_delta() as we are planning to
>>> access BAT anyway
>>>
>>> https://virtuozzo.atlassian.net/browse/VSTOR-91817
>>> Signed-off-by: Andrey Zhadchenko <andrey.zhadchenko at virtuozzo.com>
>>> ---
>>> drivers/md/dm-ploop-map.c | 28 ++++++++++++----------------
>>> 1 file changed, 12 insertions(+), 16 deletions(-)
>>>
>>> diff --git a/drivers/md/dm-ploop-map.c b/drivers/md/dm-ploop-map.c
>>> index ad7ca7d43dfc..b00dd364072d 100644
>>> --- a/drivers/md/dm-ploop-map.c
>>> +++ b/drivers/md/dm-ploop-map.c
>>> @@ -711,12 +711,15 @@ static void ploop_complete_cow(struct ploop_cow
>>> *cow, blk_status_t bi_status)
>>> kmem_cache_free(cow_cache, cow);
>>> }
>>> -static void ploop_release_cluster(struct ploop *ploop, u32 clu)
>>> +static void ploop_piwb_discard_completed(struct ploop *ploop,
>>> + bool success, u32 clu, u32 new_dst_clu)
>>> {
>>> u32 id, *bat_entries, dst_clu;
>>> struct md_page *md;
>>> + u8 level;
>>> - lockdep_assert_held(&ploop->bat_rwlock);
>>> + if (new_dst_clu)
>>> + return;
>>> id = ploop_bat_clu_to_page_nr(clu);
>>> md = ploop_md_page_find(ploop, id);
>>
>> Is this md the same as the md in the caller function
>> ploop_advance_local_after_bat_wb?
>
> It can be the same or different: the caller iterates over the clusters
> and it is possible for the page to change, so this needs a rewrite.
> Maybe pass md as an argument and check whether it is the same; if it is
> not, take the lock or something like that. I have to think about how to do it.
After a deeper look: it is the same. i and off are limited to be within
one page, so md does not change. (I actually tested this by passing md
as md_in into ploop_piwb_discard_completed and adding a
WARN_ON(md != md_in).)
I am thinking of removing ploop_piwb_discard_completed.
Most of the init is duplicated; it boils down to:
	if (piwb->type == PIWB_TYPE_DISCARD) {
		u32 clu = i + off;
		u8 level = md->bat_levels[clu];
		u32 d_clu = READ_ONCE(bat_entries[clu]);
		if (success && !dst_clu[i] &&
		    !(d_clu == BAT_ENTRY_NONE || level < ploop_top_level(ploop))) {
			WARN_ON_ONCE(ploop->nr_deltas != 1);
			WRITE_ONCE(bat_entries[clu], BAT_ENTRY_NONE);
			WRITE_ONCE(md->bat_levels[clu], 0);
			ploop_hole_set_bit(d_clu, ploop);
		}
		continue;
	}
It will save a page lookup (and a function call) and make the code a bit
more readable. Another option I will explore is to split this into separate
code paths for alloc/discard/realoc instead of a single for loop with
conditions. This is WIP; it may be shortened further.
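
For illustration only, the inlined discard branch can be modelled in
userspace roughly as below. This is a sketch under stated assumptions, not
the driver code: every model_* name, the 8-entry page, the 64-bit hole
bitmap and the top_level field are hypothetical stand-ins, READ_ONCE/
WRITE_ONCE and all locking are left out, and the model indexes the
page-relative slot directly as a simplification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins; they do not mirror the real dm-ploop layout. */
#define MODEL_ENTRIES_PER_PAGE 8u
#define MODEL_BAT_ENTRY_NONE   UINT32_MAX

struct model_md_page {
	uint32_t bat_entries[MODEL_ENTRIES_PER_PAGE]; /* slot -> cluster in delta */
	uint8_t  bat_levels[MODEL_ENTRIES_PER_PAGE];  /* slot -> delta level */
};

struct model_ploop {
	uint64_t holes;     /* bit n set => cluster n is free again */
	uint8_t  top_level; /* level of the top delta */
};

/*
 * Walk slots first..last of a single metadata page after a discard
 * writeback. A slot is released only if the writeback succeeded, no new
 * destination was recorded for it, it is mapped, and it lives in the top
 * delta. Returns the number of slots released.
 */
static unsigned model_discard_completed(struct model_ploop *p,
					struct model_md_page *md,
					const uint32_t *new_dst,
					unsigned first, unsigned last,
					bool success)
{
	unsigned released = 0;

	for (unsigned i = first; i <= last; i++) {
		uint32_t d_clu = md->bat_entries[i];
		uint8_t level = md->bat_levels[i];

		if (!success || new_dst[i])
			continue;
		if (d_clu == MODEL_BAT_ENTRY_NONE || level < p->top_level)
			continue;

		assert(d_clu < 64); /* model limit of the 64-bit bitmap */
		md->bat_entries[i] = MODEL_BAT_ENTRY_NONE;
		md->bat_levels[i] = 0;
		p->holes |= UINT64_C(1) << d_clu;
		released++;
	}
	return released;
}
```

The point of the model is that one page pointer is resolved once by the
caller and reused for every slot, so no per-cluster page lookup is needed.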
>
>>
>>> @@ -726,22 +729,15 @@ static void ploop_release_cluster(struct ploop
>>> *ploop, u32 clu)
>>> bat_entries = md->kmpage;
>>> dst_clu = READ_ONCE(bat_entries[clu]);
>>> - WRITE_ONCE(bat_entries[clu], BAT_ENTRY_NONE);
>>> - WRITE_ONCE(md->bat_levels[clu], 0);
>>> -
>>> - ploop_hole_set_bit(dst_clu, ploop);
>>> -}
>>> -
>>> -static void ploop_piwb_discard_completed(struct ploop *ploop,
>>> - bool success, u32 clu, u32 new_dst_clu)
>>> -{
>>> - if (new_dst_clu)
>>> - return;
>>> + level = md->bat_levels[clu];
>>
>> If for previous comment the answer is no, should not we take
>> md->md_lock here to make the use of md->bat_levels and md->kmpage
>> atomic / consistent? In the next patch we introduce md->md_lock to
>> "use it when accessing md->levels and md->page at the same time to
>> protect readers against writers".
>>
>> If the answer is yes, should not we do a lockdep check for md->md_lock?
>
> If it comes in as an argument, a lockdep check can be added, but if it is
> a different page we will get a false alarm.
>
>>
>>> - if (ploop_cluster_is_in_top_delta(ploop, clu)) {
>>> + if (!(dst_clu == BAT_ENTRY_NONE || level <
>>> ploop_top_level(ploop))) {
>>> WARN_ON_ONCE(ploop->nr_deltas != 1);
>>> - if (success)
>>> - ploop_release_cluster(ploop, clu);
>>> + if (success) {
>>> + WRITE_ONCE(bat_entries[clu], BAT_ENTRY_NONE);
>>> + WRITE_ONCE(md->bat_levels[clu], 0);
>>> + ploop_hole_set_bit(dst_clu, ploop);
>>> + }
>>> }
>>> }
>>
>
--
Regards,
Alexander Atanasov