[Devel] [PATCH RHEL9 COMMIT] dm-ploop: fix self-deadlock in ploop_prepare_reloc_index_wb()

Konstantin Khorenko khorenko at virtuozzo.com
Wed Jun 25 14:10:19 MSK 2025


The commit is pushed to "branch-rh9-5.14.0-427.55.1.vz9.82.x-ovz" and will appear at git at bitbucket.org:openvz/vzkernel.git
after rh9-5.14.0-427.55.1.el9
------>
commit bff8a89e58fa29adfa498b58977886e20e7aebb9
Author: Konstantin Khorenko <khorenko at virtuozzo.com>
Date:   Tue Jun 17 15:49:47 2025 +0300

    dm-ploop: fix self-deadlock in ploop_prepare_reloc_index_wb()
    
      NMI watchdog: Watchdog detected hard LOCKUP on cpu 6
      RIP: 0010:native_queued_spin_lock_slowpath+0x20d/0x2b0
    
      Call Trace:
       dump_stack_lvl+0x57/0x81
       validate_chain.cold+0x157/0x16a
       __lock_acquire+0xbb1/0x1900
       lock_acquire+0x1da/0x640
       _raw_spin_lock_irqsave+0x43/0x90
       ploop_allocate_cluster+0x12f/0x8c0 [ploop]
       ploop_alloc_cluster.isra.0+0xf7/0x1f0 [ploop]
       ploop_prepare_reloc_index_wb+0x2ab/0x4e0 [ploop]
       ploop_grow_relocate_cluster+0x849/0xcc0 [ploop]
       ploop_process_resize_cmd+0x65/0x430 [ploop]
       ploop_resize+0x415/0x680 [ploop]
       ploop_message+0x420/0xc90 [ploop]
       target_message+0x453/0x5e0 [dm_mod]
       ctl_ioctl+0x41f/0x6a0 [dm_mod]
       dm_ctl_ioctl+0xa/0x20 [dm_mod]
       __x64_sys_ioctl+0x12b/0x1a0
       do_syscall_64+0x5c/0x90
       entry_SYSCALL_64_after_hwframe+0x77/0xe1
    
    ploop_prepare_reloc_index_wb() acquires ploop->bat_lock and, while
    holding the lock, calls ploop_alloc_cluster(), which also acquires
    ploop->bat_lock => deadlock.
    
      ploop_prepare_reloc_index_wb
        spin_lock_irq(&ploop->bat_lock);
        ploop_alloc_cluster
          ploop_allocate_cluster
            spin_lock_irqsave(&ploop->bat_lock, flags);
    
    Let's move ploop_alloc_cluster() out of the bat_lock. This is safe because
    the MD_UPDATING bit, checked and set under the lock, protects us from
    parallel ploop_alloc_cluster() execution for a particular md_page.
    If the function is called in parallel for different md_pages, that is
    also OK: a new cluster is searched for in the bitmask under the bat_lock,
    so the same cluster will never be found and "allocated" in parallel.
    
    Fixes: 9caa1af11b0a ("dm-ploop: fix and rework md updates")
    https://virtuozzo.atlassian.net/browse/VSTOR-107975
    
    Signed-off-by: Konstantin Khorenko <khorenko at virtuozzo.com>
    Reviewed-by: Pavel Tikhomirov <ptikhomirov at virtuozzo.com>
    Reviewed-by: Andrey Zhadchenko <andrey.zhadchenko at virtuozzo.com>
    
    Feature: dm-ploop: ploop target driver
---
 drivers/md/dm-ploop-map.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/md/dm-ploop-map.c b/drivers/md/dm-ploop-map.c
index 62a43eaa531d5..e1a199f09c97f 100644
--- a/drivers/md/dm-ploop-map.c
+++ b/drivers/md/dm-ploop-map.c
@@ -2745,6 +2745,7 @@ int ploop_prepare_reloc_index_wb(struct ploop *ploop,
 	add_to_wblist = ploop_md_make_dirty(ploop, md);
 
 	piwb = md->piwb;
+	spin_unlock_irq(&ploop->bat_lock);
 
 	if (dst_clu) {
 		/*
@@ -2758,7 +2759,6 @@ int ploop_prepare_reloc_index_wb(struct ploop *ploop,
 		if (err)
 			goto out_reset;
 	}
-	spin_unlock_irq(&ploop->bat_lock);
 
 	*ret_md = md;
 	*add_for_wb = add_to_wblist ? 1 : 0;
@@ -2767,6 +2767,7 @@ int ploop_prepare_reloc_index_wb(struct ploop *ploop,
 
 out_reset:
 	ploop_break_bat_update(ploop, md, piwb);
+	spin_lock_irq(&ploop->bat_lock);
 out_error:
 	if (add_to_wblist)
 		clear_bit(MD_DIRTY, &md->status);
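
The lock ordering described in the commit message can be modeled in userspace. The following is only a minimal sketch under stated assumptions, not the driver code: `struct model` and all `model_*` functions are hypothetical names, a pthread error-checking mutex stands in for `ploop->bat_lock` (so a relock by the owner returns EDEADLK instead of spinning forever as a kernel spinlock would), a bool array stands in for the cluster bitmask, and a plain flag stands in for the MD_UPDATING bit. The driver's real lifecycle for MD_UPDATING is not modeled.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

#define NR_CLUSTERS 8

/* Hypothetical userspace stand-in; not the layout of struct ploop. */
struct model {
	pthread_mutex_t bat_lock;   /* plays the role of ploop->bat_lock */
	bool used[NR_CLUSTERS];     /* plays the role of the cluster bitmask */
	bool md_updating;           /* plays the role of the MD_UPDATING bit */
};

int model_init(struct model *m)
{
	pthread_mutexattr_t attr;
	int err;

	err = pthread_mutexattr_init(&attr);
	if (err)
		return -err;
	/* ERRORCHECK: relocking by the owning thread fails with EDEADLK. */
	err = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
	assert(err == 0);
	err = pthread_mutex_init(&m->bat_lock, &attr);
	pthread_mutexattr_destroy(&attr);

	for (int i = 0; i < NR_CLUSTERS; i++)
		m->used[i] = false;
	m->md_updating = false;
	return -err;
}

/* Takes bat_lock itself, like the allocator in the call trace. */
int model_alloc_cluster(struct model *m, int *clu)
{
	int err = pthread_mutex_lock(&m->bat_lock);

	if (err)
		return -err;            /* -EDEADLK if the caller holds the lock */

	err = -ENOSPC;
	for (int i = 0; i < NR_CLUSTERS; i++) {
		if (!m->used[i]) {      /* search and mark under the lock */
			m->used[i] = true;
			*clu = i;
			err = 0;
			break;
		}
	}
	pthread_mutex_unlock(&m->bat_lock);
	return err;
}

/* Pre-fix ordering: the allocator is called with bat_lock held. */
int model_prepare_buggy(struct model *m, int *clu)
{
	int err;

	pthread_mutex_lock(&m->bat_lock);
	err = model_alloc_cluster(m, clu);  /* relock attempt by the owner */
	pthread_mutex_unlock(&m->bat_lock);
	return err;
}

/* Post-fix ordering: set the flag under the lock, unlock, then allocate. */
int model_prepare_fixed(struct model *m, int *clu)
{
	int err;

	pthread_mutex_lock(&m->bat_lock);
	if (m->md_updating) {               /* another update is in flight */
		pthread_mutex_unlock(&m->bat_lock);
		return -EBUSY;
	}
	m->md_updating = true;
	pthread_mutex_unlock(&m->bat_lock);

	err = model_alloc_cluster(m, clu);  /* takes bat_lock on its own */

	/* The model clears the flag right away; the driver does not. */
	pthread_mutex_lock(&m->bat_lock);
	m->md_updating = false;
	pthread_mutex_unlock(&m->bat_lock);
	return err;
}
```

In this model, `model_prepare_buggy()` fails with `-EDEADLK` because the allocator tries to take a lock its caller already owns, while `model_prepare_fixed()` succeeds and successive calls hand out distinct clusters, since the search and marking happen under the lock.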

