[Devel] [PATCH vz9] dm-ploop: fix requeuing of delayed pios at the end of BAT

Alexander Atanasov alexander.atanasov at virtuozzo.com
Fri Apr 25 13:12:03 MSK 2025


On 25.04.25 13:05, Pavel Tikhomirov wrote:
> 
> 
> On 4/25/25 13:35, Alexander Atanasov wrote:
>> Rework ploop_advance_local_after_bat_wb so that if we have delayed
>> update to the same page is really executed and don't endup in wait_list.
>                           ^ missing "it"
>> Due to releasing and taking locks there was a window that a new update
>> could go into the wait_list. Both due to WRITEBACK bit cleared after
>                                 ^ "both" does not sound right, maybe you 
> meant "also"?

We have two cases that a pio can get there for the same reason so both 
not also.


>> retaking the lock.
>>
>> To fix this take all pending pios and clear the bit under lock,
>> using the same schema as in other code that checks the bit.
>> After releasing the lock displatch delayed pios.
>                                 ^ excess 'l'
> 
> Reviewed-by: Pavel Tikhomirov <ptikhomirov at virtuozzo.com>
> 
> I would like explanation more specific:
> 
> Only place we can add to md_page->wait_llist is in 
> ploop_delay_if_md_busy if md_page->md_status has MD_UPDATING or 
> MD_WRITEBACK bits (under both md_lock and bat_lock). Previously we first 
> acquired elements from wait_llist and only then cleared MD_WRITEBACK 
> bit, thus having a race window where wait_llist can receive new delayed 
> elements after acquiring elements from it and before clearing 
> MD_WRITEBACK and those new elements will never be processed (until the 
> next writeback to the same metadata page).


I explain what problem we faced and how we fixed it, Why do i have to 
explain how code works - it is there to be read if one needs the details?

> 
> Putting bit-clear and elements-acquire under both locks will guarantee 
> it is atomicity relative to ploop_delay_if_md_busy.
> 
> One alternative case is when we delay due to MD_UPDATING bit set, and 
> there is no MD_WRITEBACK at the time. This is also covered though more 
> trickily. In this case the one who set MD_UPDATING bit, will call 
> ploop_add_dirty_for_wb, which will be processed later in 
> do_ploop_run_work -> ploop_submit_metadata_writeback and will eventually 
> lead to MD_WRITEBACK bit set and writeback submitted via 
> ploop_index_wb_submit. So it is guaranteed that wait_llist will always 
> be handled in this case.

This is again explaining how code works not what the patch does.
> 
>>
>> https://virtuozzo.atlassian.net/browse/VSTOR-102677
>> https://virtuozzo.atlassian.net/browse/VSTOR-103768
>> Signed-off-by: Alexander Atanasov <alexander.atanasov at virtuozzo.com>
>> ---
>>   drivers/md/dm-ploop-map.c | 10 +++++-----
>>   1 file changed, 5 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/md/dm-ploop-map.c b/drivers/md/dm-ploop-map.c
>> index 76544cdb476e..8ef0426e7468 100644
>> --- a/drivers/md/dm-ploop-map.c
>> +++ b/drivers/md/dm-ploop-map.c
>> @@ -884,11 +884,15 @@ static void 
>> ploop_advance_local_after_bat_wb(struct ploop *ploop,
>>   #endif
>>       WARN_ON_ONCE(!test_bit(MD_WRITEBACK, &md->status));
>> +
>> +    wait_llist_pending = llist_del_all(&md->wait_llist);
>> +
>> +    /* Clear flag at the end when all processing related is done */
>> +    clear_bit(MD_WRITEBACK, &md->status);
>>       spin_unlock(&md->md_lock);
>>       spin_unlock_irqrestore(&ploop->bat_lock, flags);
>>       kunmap_local(dst_clu);
>> -    wait_llist_pending = llist_del_all(&md->wait_llist);
>>       if (wait_llist_pending) {
>>           llist_for_each_safe(pos, t, wait_llist_pending) {
>>               pio = list_entry((struct list_head *)pos, typeof(*pio), 
>> list);
>> @@ -900,10 +904,6 @@ static void 
>> ploop_advance_local_after_bat_wb(struct ploop *ploop,
>>       if (!list_empty(&list))
>>           ploop_dispatch_pios(ploop, NULL, &list);
>> -    spin_lock_irqsave(&ploop->bat_lock, flags);
>> -    /* Clear flag at the end when all processing related is done */
>> -    clear_bit(MD_WRITEBACK, &md->status);
>> -    spin_unlock_irqrestore(&ploop->bat_lock, flags);
>>   }
>>   static void ploop_free_piwb(struct ploop_index_wb *piwb)
> 


-- 
Regards,
Alexander Atanasov


More information about the Devel mailing list