[Devel] [PATCH vz9] dm-ploop: fix requeuing of delayed pios at the end of BAT

Alexander Atanasov alexander.atanasov at virtuozzo.com
Fri Apr 25 13:32:39 MSK 2025


On 25.04.25 13:27, Pavel Tikhomirov wrote:
> 
> 
> On 4/25/25 18:12, Alexander Atanasov wrote:
>> On 25.04.25 13:05, Pavel Tikhomirov wrote:
>>>
>>>
>>> On 4/25/25 13:35, Alexander Atanasov wrote:
>>>> Rework ploop_advance_local_after_bat_wb so that if we have delayed
>>>> update to the same page is really executed and don't endup in 
>>>> wait_list.
>>>                           ^ missing "it"
>>>> Due to releasing and taking locks there was a window that a new update
>>>> could go into the wait_list. Both due to WRITEBACK bit cleared after
>>>                                 ^ "both" does not sound right, maybe 
>>> you meant "also"?
>>
>> We have two cases that a pio can get there for the same reason so both 
>> not also.
> 
> What cases when? It's not clear from the commit message.

Case 1:
 >>>> Rework ploop_advance_local_after_bat_wb so that if we have delayed
 >>>> update to the same page is really executed and don't endup in
 >>>> wait_list.
Case 2:
 >>>> Due to releasing and taking locks there was a window that a new update
 >>>> could go into the wait_list. Both due to WRITEBACK bit cleared after

And yes it must be "Both are" to be grammar compliant.


> 
>>
>>
>>>> retaking the lock.
>>>>
>>>> To fix this take all pending pios and clear the bit under lock,
>>>> using the same schema as in other code that checks the bit.
>>>> After releasing the lock displatch delayed pios.
>>>                                 ^ excess 'l'
>>>
>>> Reviewed-by: Pavel Tikhomirov <ptikhomirov at virtuozzo.com>
>>>
>>> I would like explanation more specific:
>>>
>>> Only place we can add to md_page->wait_llist is in 
>>> ploop_delay_if_md_busy if md_page->md_status has MD_UPDATING or 
>>> MD_WRITEBACK bits (under both md_lock and bat_lock). Previously we 
>>> first acquired elements from wait_llist and only then cleared 
>>> MD_WRITEBACK bit, thus having a race window where wait_llist can 
>>> receive new delayed elements after acquiring elements from it and 
>>> before clearing MD_WRITEBACK and those new elements will never be 
>>> processed (until the next writeback to the same metadata page).
>>
>>
>> I explain what problem we faced and how we fixed it, Why do i have to 
>> explain how code works - it is there to be read if one needs the details?
>>
>>>
>>> Putting bit-clear and elements-acquire under both locks will 
>>> guarantee it is atomicity relative to ploop_delay_if_md_busy.
>>>
>>> One alternative case is when we delay due to MD_UPDATING bit set, and 
>>> there is no MD_WRITEBACK at the time. This is also covered though 
>>> more trickily. In this case the one who set MD_UPDATING bit, will 
>>> call ploop_add_dirty_for_wb, which will be processed later in 
>>> do_ploop_run_work -> ploop_submit_metadata_writeback and will 
>>> eventually lead to MD_WRITEBACK bit set and writeback submitted via 
>>> ploop_index_wb_submit. So it is guaranteed that wait_llist will 
>>> always be handled in this case.
>>
>> This is again explaining how code works not what the patch does.
> 
> Strongly disagree. I believe the whole locking scheme should be 
> explained here, as it is the thing that we are fixing. There is no 
> description of it anywhere. But we should be able to verify that what is 
> written in code corresponds to how we intend it to work.

There is a description in the previous commits how locking works.
Here it is about how we fix a specific place. No need to repeat it.
Documentation yes, there is no documentation we have to use the source
until there is documentation.


> 
>>>
>>>>
>>>> https://virtuozzo.atlassian.net/browse/VSTOR-102677
>>>> https://virtuozzo.atlassian.net/browse/VSTOR-103768
>>>> Signed-off-by: Alexander Atanasov <alexander.atanasov at virtuozzo.com>
>>>> ---
>>>>   drivers/md/dm-ploop-map.c | 10 +++++-----
>>>>   1 file changed, 5 insertions(+), 5 deletions(-)
>>>>
>>>> diff --git a/drivers/md/dm-ploop-map.c b/drivers/md/dm-ploop-map.c
>>>> index 76544cdb476e..8ef0426e7468 100644
>>>> --- a/drivers/md/dm-ploop-map.c
>>>> +++ b/drivers/md/dm-ploop-map.c
>>>> @@ -884,11 +884,15 @@ static void 
>>>> ploop_advance_local_after_bat_wb(struct ploop *ploop,
>>>>   #endif
>>>>       WARN_ON_ONCE(!test_bit(MD_WRITEBACK, &md->status));
>>>> +
>>>> +    wait_llist_pending = llist_del_all(&md->wait_llist);
>>>> +
>>>> +    /* Clear flag at the end when all processing related is done */
>>>> +    clear_bit(MD_WRITEBACK, &md->status);
>>>>       spin_unlock(&md->md_lock);
>>>>       spin_unlock_irqrestore(&ploop->bat_lock, flags);
>>>>       kunmap_local(dst_clu);
>>>> -    wait_llist_pending = llist_del_all(&md->wait_llist);
>>>>       if (wait_llist_pending) {
>>>>           llist_for_each_safe(pos, t, wait_llist_pending) {
>>>>               pio = list_entry((struct list_head *)pos, 
>>>> typeof(*pio), list);
>>>> @@ -900,10 +904,6 @@ static void 
>>>> ploop_advance_local_after_bat_wb(struct ploop *ploop,
>>>>       if (!list_empty(&list))
>>>>           ploop_dispatch_pios(ploop, NULL, &list);
>>>> -    spin_lock_irqsave(&ploop->bat_lock, flags);
>>>> -    /* Clear flag at the end when all processing related is done */
>>>> -    clear_bit(MD_WRITEBACK, &md->status);
>>>> -    spin_unlock_irqrestore(&ploop->bat_lock, flags);
>>>>   }
>>>>   static void ploop_free_piwb(struct ploop_index_wb *piwb)
>>>
>>
>>
> 


-- 
Regards,
Alexander Atanasov


More information about the Devel mailing list