[Devel] [PATCH vz9 1/2] dm-ploop: fix self-deadlock in ploop_prepare_reloc_index_wb()
Andrey Zhadchenko
andrey.zhadchenko at virtuozzo.com
Wed Jun 18 15:22:32 MSK 2025
For both patches:
Reviewed-by: Andrey Zhadchenko <andrey.zhadchenko at virtuozzo.com>
On 6/17/25 14:49, Konstantin Khorenko wrote:
> NMI watchdog: Watchdog detected hard LOCKUP on cpu 6
> RIP: 0010:native_queued_spin_lock_slowpath+0x20d/0x2b0
>
> Call Trace:
> dump_stack_lvl+0x57/0x81
> validate_chain.cold+0x157/0x16a
> __lock_acquire+0xbb1/0x1900
> lock_acquire+0x1da/0x640
> _raw_spin_lock_irqsave+0x43/0x90
> ploop_allocate_cluster+0x12f/0x8c0 [ploop]
> ploop_alloc_cluster.isra.0+0xf7/0x1f0 [ploop]
> ploop_prepare_reloc_index_wb+0x2ab/0x4e0 [ploop]
> ploop_grow_relocate_cluster+0x849/0xcc0 [ploop]
> ploop_process_resize_cmd+0x65/0x430 [ploop]
> ploop_resize+0x415/0x680 [ploop]
> ploop_message+0x420/0xc90 [ploop]
> target_message+0x453/0x5e0 [dm_mod]
> ctl_ioctl+0x41f/0x6a0 [dm_mod]
> dm_ctl_ioctl+0xa/0x20 [dm_mod]
> __x64_sys_ioctl+0x12b/0x1a0
> do_syscall_64+0x5c/0x90
> entry_SYSCALL_64_after_hwframe+0x77/0xe1
>
> ploop_prepare_reloc_index_wb() acquires ploop->bat_lock and calls
> (holding the lock) ploop_alloc_cluster(), which also acquires
> ploop->bat_lock => deadlock.
>
> ploop_prepare_reloc_index_wb
>   spin_lock_irq(&ploop->bat_lock);
>   ploop_alloc_cluster
>     ploop_allocate_cluster
>       spin_lock_irqsave(&ploop->bat_lock, flags);
>
> Let's move ploop_alloc_cluster() out of the bat_lock. This is safe because
> the MD_UPDATING bit, checked and set under the lock, protects us from
> parallel ploop_alloc_cluster() execution for a particular md_page.
> If the function is called in parallel for different md_pages, that is
> also OK: the new cluster is searched for in the bitmask under the
> bat_lock, so the same cluster will never be found and "allocated" in
> parallel.
>
> Fixes: 9caa1af11b0a ("dm-ploop: fix and rework md updates")
> https://virtuozzo.atlassian.net/browse/VSTOR-107975
>
> Signed-off-by: Konstantin Khorenko <khorenko at virtuozzo.com>
>
> Feature: dm-ploop: ploop target driver
> ---
> drivers/md/dm-ploop-map.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/md/dm-ploop-map.c b/drivers/md/dm-ploop-map.c
> index 35085a04ca5f..5a2ef5691405 100644
> --- a/drivers/md/dm-ploop-map.c
> +++ b/drivers/md/dm-ploop-map.c
> @@ -2746,6 +2746,7 @@ int ploop_prepare_reloc_index_wb(struct ploop *ploop,
> add_to_wblist = ploop_md_make_dirty(ploop, md);
>
> piwb = md->piwb;
> + spin_unlock_irq(&ploop->bat_lock);
>
> if (dst_clu) {
> /*
> @@ -2759,7 +2760,6 @@ int ploop_prepare_reloc_index_wb(struct ploop *ploop,
> if (err)
> goto out_reset;
> }
> - spin_unlock_irq(&ploop->bat_lock);
>
> *ret_md = md;
> *add_for_wb = add_to_wblist ? 1 : 0;
> @@ -2768,6 +2768,7 @@ int ploop_prepare_reloc_index_wb(struct ploop *ploop,
>
> out_reset:
> ploop_break_bat_update(ploop, md, piwb);
> + spin_lock_irq(&ploop->bat_lock);
> out_error:
> if (add_to_wblist)
> clear_bit(MD_DIRTY, &md->status);
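
For readers who want the locking pattern in isolation, here is a minimal
userspace sketch of the idea described in the commit message: claim the
page under the lock, drop the lock, then call an allocator that takes the
same lock itself. It is not the driver code. The pthread mutex stands in
for the bat_lock spinlock, and struct md_page, alloc_cluster(),
prepare_reloc() and run_demo() are hypothetical names made up for this
illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

#define NR_CLUSTERS 64

/* Hypothetical stand-ins for ploop state; not the driver's structures. */
static pthread_mutex_t bat_lock = PTHREAD_MUTEX_INITIALIZER;
static bool cluster_used[NR_CLUSTERS];	/* plays the role of the bitmask */

struct md_page {
	bool updating;	/* plays the role of the MD_UPDATING bit */
	int dst_clu;	/* cluster handed out to this page, -1 if none */
};

/* Takes bat_lock itself, like ploop_allocate_cluster() in the trace. */
static int alloc_cluster(void)
{
	int clu = -1;

	pthread_mutex_lock(&bat_lock);
	for (int i = 0; i < NR_CLUSTERS; i++) {
		if (!cluster_used[i]) {
			cluster_used[i] = true;	/* search and mark under the lock */
			clu = i;
			break;
		}
	}
	pthread_mutex_unlock(&bat_lock);
	return clu;
}

/* Claim the page under the lock, then allocate with the lock dropped. */
static int prepare_reloc(struct md_page *md)
{
	pthread_mutex_lock(&bat_lock);
	if (md->updating) {		/* someone else already owns this page */
		pthread_mutex_unlock(&bat_lock);
		return -1;
	}
	md->updating = true;
	pthread_mutex_unlock(&bat_lock);	/* must drop before alloc_cluster() */

	md->dst_clu = alloc_cluster();
	return md->dst_clu < 0 ? -1 : 0;
}

/* Two different pages get different clusters; a busy page is refused. */
static int run_demo(void)
{
	struct md_page a = { .updating = false, .dst_clu = -1 };
	struct md_page b = { .updating = false, .dst_clu = -1 };

	assert(prepare_reloc(&a) == 0);
	assert(prepare_reloc(&b) == 0);
	assert(a.dst_clu != b.dst_clu);
	assert(prepare_reloc(&a) == -1);	/* flag still set, no second alloc */
	return 0;
}
```

Calling alloc_cluster() before the unlock in prepare_reloc() would
reproduce the bug's shape: the same thread trying to take bat_lock twice.
The per-page flag keeps a second caller away from the same page, and the
search-and-mark under the lock keeps two callers from getting the same
cluster.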