[Devel] [PATCH vz9] dm-ploop: fix requeuing of delayed pios at the end of BAT
Alexander Atanasov
alexander.atanasov at virtuozzo.com
Fri Apr 25 13:12:03 MSK 2025
On 25.04.25 13:05, Pavel Tikhomirov wrote:
>
>
> On 4/25/25 13:35, Alexander Atanasov wrote:
>> Rework ploop_advance_local_after_bat_wb so that if we have delayed
>> update to the same page is really executed and don't endup in wait_list.
> ^ missing "it"
>> Due to releasing and taking locks there was a window that a new update
>> could go into the wait_list. Both due to WRITEBACK bit cleared after
> ^ "both" does not sound right, maybe you
> meant "also"?
We have two cases that a pio can get there for the same reason so both
not also.
>> retaking the lock.
>>
>> To fix this take all pending pios and clear the bit under lock,
>> using the same schema as in other code that checks the bit.
>> After releasing the lock displatch delayed pios.
> ^ excess 'l'
>
> Reviewed-by: Pavel Tikhomirov <ptikhomirov at virtuozzo.com>
>
> I would like explanation more specific:
>
> Only place we can add to md_page->wait_llist is in
> ploop_delay_if_md_busy if md_page->md_status has MD_UPDATING or
> MD_WRITEBACK bits (under both md_lock and bat_lock). Previously we first
> acquired elements from wait_llist and only then cleared MD_WRITEBACK
> bit, thus having a race window where wait_llist can receive new delayed
> elements after acquiring elements from it and before clearing
> MD_WRITEBACK and those new elements will never be processed (until the
> next writeback to the same metadata page).
I explain what problem we faced and how we fixed it, Why do i have to
explain how code works - it is there to be read if one needs the details?
>
> Putting bit-clear and elements-acquire under both locks will guarantee
> it is atomicity relative to ploop_delay_if_md_busy.
>
> One alternative case is when we delay due to MD_UPDATING bit set, and
> there is no MD_WRITEBACK at the time. This is also covered though more
> trickily. In this case the one who set MD_UPDATING bit, will call
> ploop_add_dirty_for_wb, which will be processed later in
> do_ploop_run_work -> ploop_submit_metadata_writeback and will eventually
> lead to MD_WRITEBACK bit set and writeback submitted via
> ploop_index_wb_submit. So it is guaranteed that wait_llist will always
> be handled in this case.
This is again explaining how code works not what the patch does.
>
>>
>> https://virtuozzo.atlassian.net/browse/VSTOR-102677
>> https://virtuozzo.atlassian.net/browse/VSTOR-103768
>> Signed-off-by: Alexander Atanasov <alexander.atanasov at virtuozzo.com>
>> ---
>> drivers/md/dm-ploop-map.c | 10 +++++-----
>> 1 file changed, 5 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/md/dm-ploop-map.c b/drivers/md/dm-ploop-map.c
>> index 76544cdb476e..8ef0426e7468 100644
>> --- a/drivers/md/dm-ploop-map.c
>> +++ b/drivers/md/dm-ploop-map.c
>> @@ -884,11 +884,15 @@ static void
>> ploop_advance_local_after_bat_wb(struct ploop *ploop,
>> #endif
>> WARN_ON_ONCE(!test_bit(MD_WRITEBACK, &md->status));
>> +
>> + wait_llist_pending = llist_del_all(&md->wait_llist);
>> +
>> + /* Clear flag at the end when all processing related is done */
>> + clear_bit(MD_WRITEBACK, &md->status);
>> spin_unlock(&md->md_lock);
>> spin_unlock_irqrestore(&ploop->bat_lock, flags);
>> kunmap_local(dst_clu);
>> - wait_llist_pending = llist_del_all(&md->wait_llist);
>> if (wait_llist_pending) {
>> llist_for_each_safe(pos, t, wait_llist_pending) {
>> pio = list_entry((struct list_head *)pos, typeof(*pio),
>> list);
>> @@ -900,10 +904,6 @@ static void
>> ploop_advance_local_after_bat_wb(struct ploop *ploop,
>> if (!list_empty(&list))
>> ploop_dispatch_pios(ploop, NULL, &list);
>> - spin_lock_irqsave(&ploop->bat_lock, flags);
>> - /* Clear flag at the end when all processing related is done */
>> - clear_bit(MD_WRITEBACK, &md->status);
>> - spin_unlock_irqrestore(&ploop->bat_lock, flags);
>> }
>> static void ploop_free_piwb(struct ploop_index_wb *piwb)
>
--
Regards,
Alexander Atanasov
More information about the Devel
mailing list