[Devel] [PATCH v2] dm-ploop: fix discard writeback

Andrey Zhadchenko andrey.zhadchenko at virtuozzo.com
Thu Jun 19 12:16:23 MSK 2025


When doing simple discard tests, we encountered the following crash
with panic_on_warn:

[4421147.245184] ------------[ cut here ]------------
[4421147.245569] WARNING: CPU: 6 PID: 505054 at drivers/md/dm-ploop.h:392 ploop_advance_local_after_bat_wb+0x2ae/0x2d0 [ploop]
[4421147.246073] Modules linked in: loop tls dm_qcow2(E) vhost_net tap xt_physdev nft_meta_bridge tun dm_zero_rq vhost_blk vhost_vsock vhost vhost_iotlb ip6t_REJECT nf_reject_ipv6 xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 nft_compat nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nft_counter 8021q garp mrp nf_tables nfnetlink rpcrdma rfkill intel_rapl_msr intel_rapl_common intel_pmc_core intel_vsec pmt_telemetry pmt_class kvm_intel kvm iTCO_wdt irqbypass iTCO_vendor_support rapl vmw_vsock_virtio_transport vmw_vsock_virtio_transport_common virtio_console vsock i2c_i801 pcspkr lpc_ich i2c_smbus pvpanic_mmio pvpanic joydev nfsd br_netfilter veth overlay auth_rpcgss nfs_acl lockd grace bridge stp llc sunrpc binfmt_misc xfs libcrc32c sd_mod sr_mod cdrom t10_pi sg drm_ttm_helper ttm drm_kms_helper crct10dif_pclmul syscopyarea crc32_pclmul sysfillrect ahci sysimgblt libahci fb_sys_fops virtio_net ghash_clmulni_intel libata net_failover drm virtio_scsi failover
[4421147.246120]  serio_raw crc32c_intel fuse_kio_pcs rdma_cm iw_cm ib_cm ib_core fuse push_backup dm_tracking ploop dm_mod [last unloaded: dm_qcow2]
[4421147.249733] CPU: 6 PID: 505054 Comm: kworker/6:1 ve: / Kdump: loaded Tainted: G            E  X  -------  ---  5.14.0-427.44.1.ovz9.80.29 #1 ovz9.80.29
[4421147.250261] Hardware name: Virtuozzo KVM/Virtuozzo, BIOS 1.16.3-2.vz9.2 04/01/2014
[4421147.250651] Workqueue: dio/dm-1 iomap_dio_complete_work
[4421147.250914] RIP: 0010:ploop_advance_local_after_bat_wb+0x2ae/0x2d0 [ploop]
[4421147.251216] Code: 5b 5d 41 5c 41 5d 41 5e 41 5f c3 cc cc cc cc 48 8b b5 88 00 00 00 3b 8d 90 00 00 00 73 0c 89 cf f0 48 0f ab 3e e9 81 fe ff ff <0f> 0b e9 7a fe ff ff 0f 0b 0f 0b e9 22 ff ff ff 0f 0b e9 c9 fe ff
[4421147.251938] RSP: 0018:ffffc07dc8447d98 EFLAGS: 00010002
[4421147.252202] RAX: ffff9dc778699040 RBX: ffff9dc7413bd800 RCX: 0000000000000000
[4421147.252578] RDX: 0000000000000010 RSI: 0000000068746957 RDI: ffff9dcc5414c420
[4421147.252943] RBP: ffff9dcac2299000 R08: 0000000000000000 R09: ffff9dcac2299400
[4421147.253328] R10: 0000000000000001 R11: ffff9dc747460000 R12: ffffc07dc8447dd0
[4421147.253706] R13: ffffffffcedc7000 R14: 0000000000000074 R15: ffff9dc8e56aa380
[4421147.254071] FS:  0000000000000000(0000) GS:ffff9dce9fb80000(0000) knlGS:0000000000000000
[4421147.254468] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[4421147.254743] CR2: 00007ffd825d3000 CR3: 00000001052a4002 CR4: 0000000000372ee0
[4421147.255109] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[4421147.255474] DR3: 0000000000000000 DR6: 00000000ffff4ff0 DR7: 0000000000000400
[4421147.255848] Call Trace:
[4421147.256040]  <TASK>
[4421147.256225]  ? show_trace_log_lvl+0x1c4/0x2df
[4421147.256466]  ? show_trace_log_lvl+0x1c4/0x2df
[4421147.256712]  ? ploop_put_piwb+0x187/0x1f0 [ploop]
[4421147.256959]  ? ploop_advance_local_after_bat_wb+0x2ae/0x2d0 [ploop]
[4421147.257272]  ? __warn+0x81/0x110
[4421147.257485]  ? ploop_advance_local_after_bat_wb+0x2ae/0x2d0 [ploop]
[4421147.257775]  ? report_bug+0x10a/0x140
[4421147.257997]  ? handle_bug+0x3c/0x70
[4421147.258212]  ? exc_invalid_op+0x14/0x70
[4421147.258438]  ? asm_exc_invalid_op+0x16/0x20
[4421147.258678]  ? ploop_advance_local_after_bat_wb+0x2ae/0x2d0 [ploop]
[4421147.258962]  ? ploop_advance_local_after_bat_wb+0xcc/0x2d0 [ploop]
[4421147.259246]  ploop_put_piwb+0x187/0x1f0 [ploop]
[4421147.259493]  ploop_do_pio_endio+0x31/0x90 [ploop]
[4421147.259748]  process_one_work+0x1e2/0x3b0
[4421147.259980]  ? __pfx_worker_thread+0x10/0x10
[4421147.261238]  worker_thread+0x50/0x3a0
[4421147.261460]  ? __pfx_worker_thread+0x10/0x10
[4421147.261704]  kthread+0xdd/0x100
[4421147.261915]  ? __pfx_kthread+0x10/0x10
[4421147.262138]  ret_from_fork+0x29/0x50
[4421147.262360]  </TASK>
[4421147.262553] Kernel panic - not syncing: panic_on_warn set ...

The extra debug looks like that:

[ 3022.561985] got discard: pos 0, size 1048576
[ 3022.562373]		processing discard: clu 0 (idx 16), len 1048576
[ 3022.563182]			dropped clu 1752459607 (total 101) [page 0, clu 0 (i 16 + off -16)]

1752459607 -> 0x68746957 -> 57 69 74 68 ("With") which is a start of ploop
header. So we clearly missed the first metadata page handling.

Since ploop first metadata page contains 64bit header, we shift all
clusters in bat by this value when accessing them.
In ploop_advance_local_after_bat_wb() we iterate over the page either from
0 or 16, if it is a first page. Therefore:
 - i is the position of iterated cluster in the metadata page, so we should
use it for accessing/writing bat_entries and bat_levels.
 - i + off is the real cluster number
 - bat_entries[i] is the cluster we should punch from hole bitmap

Also the current code does out-of-bound access if the metadata page is
not the first, as it uses real cluster number as index when accessing
the bat page (which is only 4096 bytes long)

Fixes: da5f60147b62 ("dm-ploop: dm-ploop: simplify discard completion")
Signed-off-by: Andrey Zhadchenko <andrey.zhadchenko at virtuozzo.com>
---
v2: dropped erroneous ploop_hole_set_bit() change, expanded commit
message for a bit

 drivers/md/dm-ploop-map.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/drivers/md/dm-ploop-map.c b/drivers/md/dm-ploop-map.c
index 6dd94af5fd94..988bf00ed4be 100644
--- a/drivers/md/dm-ploop-map.c
+++ b/drivers/md/dm-ploop-map.c
@@ -853,16 +853,15 @@ static void ploop_advance_local_after_bat_wb(struct ploop *ploop,
 	for (; i < last; i++) {
 		if (piwb->type == PIWB_TYPE_DISCARD) {
 			/* discard completed */
-			u32 clu = i + off;
-			u8 level = md->bat_levels[clu];
-			u32 d_clu = READ_ONCE(bat_entries[clu]);
+			u8 level = md->bat_levels[i];
+			u32 d_clu = READ_ONCE(bat_entries[i]);
 
 			if (success && !dst_clu[i] &&
 			    (!(d_clu == BAT_ENTRY_NONE ||
 			    level < ploop_top_level(ploop)))) {
 				WARN_ON_ONCE(ploop->nr_deltas != 1);
-				WRITE_ONCE(bat_entries[clu], BAT_ENTRY_NONE);
-				WRITE_ONCE(md->bat_levels[clu], 0);
+				WRITE_ONCE(bat_entries[i], BAT_ENTRY_NONE);
+				WRITE_ONCE(md->bat_levels[i], 0);
 				ploop_hole_set_bit(d_clu, ploop);
 			}
 
-- 
2.43.5



More information about the Devel mailing list