[Devel] [PATCH VZ9 v2 0/6] ploop BAT optimizations
Alexander Atanasov
alexander.atanasov at virtuozzo.com
Wed Oct 9 18:35:49 MSK 2024
Hi,
I haven't went in deep but one thing that looks suspicious is in 6/6 patch
+ if (pio->bi_op & REQ_PREFLUSH && (pio->bi_op & REQ_OP_MASK) !=
REQ_OP_FLUSH) {
+ flush_pio = ploop_alloc_pio(ploop, GFP_NOIO);
+ if (!flush_pio) {
+ pio->bi_status = BLK_STS_RESOURCE;
+ ploop_pio_endio(pio);
+ return;
+ }
+
+ ploop_init_pio(ploop, REQ_OP_FLUSH, flush_pio);
+ flush_pio->endio_cb = ploop_preflush_endio;
+ flush_pio->endio_cb_data = pio;
+ flush_pio->queue_list_id = PLOOP_LIST_FLUSH;
+ flush_pio->free_on_endio = true;
+ }
+
write_lock_irq(&ploop->bat_rwlock);
list_for_each_entry(md, &ploop->wb_batch_list_prio, wb_link) {
md_inflight++;
- md->piwb->flush_pio = pio;
+ md->piwb->flush_pio = flush_pio;
}
one pio is allocted and then assigned to multiple md->piwb
notably : flush_pio->free_on_endio = true;
On 9.10.24 17:35, Denis V. Lunev wrote:
> On 10/9/24 16:33, Denis V. Lunev wrote:
>> On 10/9/24 14:17, Andrey Zhadchenko wrote:
>>> With this patchset, we start to check if the upcoming requests for ploop
>>> have REQ_FUA flag.
>>> If the flag is present:
>>> - IO submits immediately, completion waits for BAT update
>>> - BAT updates submit immediately
>>> - set IOCB_DSYNC for writes and BAT updates
>>> - BAT update does not trigger flush
>>> If the flag is absent:
>>> - IO completion does not wait for BAT update
>>> - all BAT updates are periodically submitted in a batch
>>> - BAT update does not trigger flush
>>>
>>> New parameter with metadata submit delay can be specified when setting
>>> up ploop table
>>>
>>> v2 changes:
>>> - drop flush after BAT updates even for FUA requests
>>> - set IOCB_DSYNC flag on writes triggered by FUA
>>> - complete all accumulated metadata on FLUSH requests
>>> - support REQ_PREFLUSH flag
>>>
>>> Andrey Zhadchenko (3):
>>> dm-ploop: do not flush BAT writes if request was not FUA
>>> dm-ploop: do not wait for BAT update for non-FUA requests
>>> dm-ploop: add delay for metadata writeback
>>>
>>>
>>> *** BLURB HERE ***
>>>
>>> Andrey Zhadchenko (6):
>>> dm-ploop: do not flush after metadata writes
>>> dm-ploop: set IOCB_DSYNC on all FUA requests
>>> dm-ploop: do not wait for BAT update for non-FUA requests
>>> dm-ploop: add delay for metadata writeback
>>> dm-ploop: submit all postponed metadata on REQ_OP_FLUSH
>>> dm-ploop: handle REQ_PREFLUSH
>>>
>>> drivers/md/dm-ploop-bat.c | 1 +
>>> drivers/md/dm-ploop-map.c | 154 +++++++++++++++++++++++++++--------
>>> drivers/md/dm-ploop-target.c | 13 +++
>>> drivers/md/dm-ploop.h | 9 ++
>>> 4 files changed, 143 insertions(+), 34 deletions(-)
>>>
>> these patches do not work at all over 32 GB ramdisk
>>
>> dev007 /mnt/ramdisk # ploop init -s 8G /mnt/ramdisk/root.hds
>> Creating delta /mnt/ramdisk/root.hds bs=2048 size=16777216 sectors v2
>> Adding snapshot {5fbaabe3-6958-40ff-92a7-860e329aab41}
>> Storing /mnt/ramdisk/DiskDescriptor.xml
>> Opening delta /mnt/ramdisk/root.hds
>> Opening delta /mnt/ramdisk/root.hds
>> Adding delta dev=/dev/mapper/ploop30904 img=/mnt/ramdisk/root.hds (rw)
>> Update inuse state /mnt/ramdisk/root.hds state=1953459801
>> Opening delta /mnt/ramdisk/root.hds
>> Running: partprobe /dev/mapper/ploop30904
>> Set untrusted /dev/mapper/ploop30904
>> Running: parted -s /dev/mapper/ploop30904 mklabel gpt
>> Running: sgdisk -n 1:2048:16775167 /dev/mapper/ploop30904
>> Warning: The kernel is still using the old partition table.
>> The new table will be used at the next reboot or after you
>> run partprobe(8) or kpartx(8)
>> The operation has completed successfully.
>> Running: partprobe /dev/mapper/ploop30904
>> Running: mkfs.xfs -b size=4096 /dev/mapper/ploop30904p1
>> meta-data=/dev/mapper/ploop30904p1 isize=512 agcount=4,
>> agsize=524160 blks
>> = sectsz=512 attr=2, projid32bit=1
>> = crc=1 finobt=1, sparse=1,
>> rmapbt=0
>> = reflink=1 bigtime=1 inobtcount=1
>> nrext64=0
>> data = bsize=4096 blocks=2096640, imaxpct=25
>> = sunit=0 swidth=0 blks
>> naming =version 2 bsize=4096 ascii-ci=0, ftype=1
>> log =internal log bsize=4096 blocks=16384, version=2
>> = sectsz=512 sunit=0 blks, lazy-count=1
>> realtime =none extsz=4096 blocks=0, rtextents=0
>> Discarding blocks...Done.
>> Metadata CRC error detected at 0x561dffbd10f0, xfs_agf block 0x1/0x200
>> Metadata CRC error detected at 0x561dffbd10f0, xfs_agf block
>> 0x3ffc01/0x200
>> Metadata CRC error detected at 0x561dffbd10f0, xfs_agf block
>> 0x7ff801/0x200
>> Metadata CRC error detected at 0x561dffbd10f0, xfs_agf block
>> 0xbff401/0x200
>> Metadata CRC error detected at 0x561dffbee040, xfs_agi block 0x2/0x200
>> Metadata CRC error detected at 0x561dffbee040, xfs_agi block
>> 0x3ffc02/0x200
>> Metadata CRC error detected at 0x561dffbee040, xfs_agi block
>> 0x7ff802/0x200
>> Metadata CRC error detected at 0x561dffbee040, xfs_agi block
>> 0xbff402/0x200
>> mkfs.xfs: Inode allocation failed [28 - No space left on device]
>> Error in run_prg_rc (util.c:309): Command mkfs.xfs -b size=4096
>> /dev/mapper/ploop30904p1 exited with code 1
>> Umount /dev/mapper/ploop30904
>> DM command: remove /dev/mapper/ploop30904p1
>> DM message: /dev/mapper/ploop30904 get_img_name 0
>> Opening delta /mnt/ramdisk/root.hds
>> DM command: suspend /dev/mapper/ploop30904
>> DM command: resume /dev/mapper/ploop30904
>> Unmounting device /dev/mapper/ploop30904
>> DM command: resume /dev/mapper/ploop30904
>> DM command: remove /dev/mapper/ploop30904
>> Clear inuse state
>> Dropping image /mnt/ramdisk/root.hds
>> dev007 /mnt/ramdisk #
>>
>> rmmod after that has been hanged.
>>
>> Den
> [ 351.210027]
> =============================================================================
> [ 351.210348] BUG ploop-pio (Tainted: G E X ------- --- ):
> Objects remaining in ploop-pio on __kmem_cache_shutdown()
> [ 351.210620]
> -----------------------------------------------------------------------------
>
> [ 351.210886] Slab 0x00000000afe69346 objects=36 used=3
> fp=0x0000000028f0e0af
> flags=0x17ffffc0010200(slab|head|node=0|zone=2|lastcpupid=0x1fffff)
> [ 351.211192] CPU: 2 PID: 15519 Comm: rmmod ve: / Kdump: loaded
> Tainted: G E X ------- --- 5.14.0-427.37.1.ovz9.78.1 #6
> ovz9.78.1
> [ 351.211564] Hardware name: Supermicro SYS-5019C-L/X11SCL-IF, BIOS 1.5
> 10/05/2020
> [ 351.211878] Call Trace:
> [ 351.212146] <TASK>
> [ 351.212452] dump_stack_lvl+0x34/0x48
> [ 351.212732] slab_err+0xa2/0x110
> [ 351.213006] __kmem_cache_shutdown+0x16a/0x310
> [ 351.213305] kmem_cache_destroy+0x51/0x160
> [ 351.213583] dm_ploop_exit+0x24/0x20d [ploop]
> [ 351.213867] __do_sys_delete_module.constprop.0+0x175/0x280
> [ 351.214148] ? syscall_trace_enter.constprop.0+0x126/0x1a0
> [ 351.214428] do_syscall_64+0x59/0x90
> [ 351.214703] ? __rseq_handle_notify_resume+0x26/0xb0
> [ 351.214980] ? exit_to_user_mode_loop+0xd0/0x130
> [ 351.215257] ? exit_to_user_mode_prepare+0xb6/0x100
> [ 351.215553] ? syscall_exit_to_user_mode+0x22/0x40
> [ 351.215830] ? do_syscall_64+0x69/0x90
> [ 351.216105] ? exc_page_fault+0x62/0x150
> [ 351.216414] entry_SYSCALL_64_after_hwframe+0x72/0xdc
> [ 351.216693] RIP: 0033:0x7f72bc83f5ab
> [ 351.216988] Code: 73 01 c3 48 8b 0d 75 a8 1b 00 f7 d8 64 89 01 48 83
> c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 b0 00 00 00 0f
> 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 45 a8 1b 00 f7 d8 64 89 01 48
> [ 351.217391] RSP: 002b:00007fff49e1e798 EFLAGS: 00000206 ORIG_RAX:
> 00000000000000b0
> [ 351.217707] RAX: ffffffffffffffda RBX: 00005580d76568f0 RCX:
> 00007f72bc83f5ab
> [ 351.217991] RDX: 000000000000000a RSI: 0000000000000800 RDI:
> 00005580d7656958
> [ 351.218313] RBP: 0000000000000000 R08: 0000000000000000 R09:
> 0000000000000000
> [ 351.218596] R10: 00007f72bc99eac0 R11: 0000000000000206 R12:
> 00007fff49e1e9f0
> [ 351.218881] R13: 00007fff49e1f070 R14: 00005580d76562a0 R15:
> 00005580d76568f0
> [ 351.219166] </TASK>
> [ 351.219453] Disabling lock debugging due to kernel taint
> [ 351.219453] Disabling lock debugging due to kernel taint
> [ 351.219732] Kernel panic - not syncing: panic_on_warn set ...
> [ 351.220012] CPU: 2 PID: 15519 Comm: rmmod ve: / Kdump: loaded
> Tainted: G B E X ------- --- 5.14.0-427.37.1.ovz9.78.1 #6
> ovz9.78.1
> [ 351.220345] Hardware name: Supermicro SYS-5019C-L/X11SCL-IF, BIOS 1.5
> 10/05/2020
> [ 351.220677] Call Trace:
> [ 351.220945] <TASK>
> [ 351.221230] dump_stack_lvl+0x34/0x48
> [ 351.221523] panic+0x107/0x2f7
> [ 351.221796] slab_err.cold+0xc/0xc
> [ 351.222073] __kmem_cache_shutdown+0x16a/0x310
> [ 351.222368] kmem_cache_destroy+0x51/0x160
> [ 351.222643] dm_ploop_exit+0x24/0x20d [ploop]
> [ 351.222921] __do_sys_delete_module.constprop.0+0x175/0x280
> [ 351.223216] ? syscall_trace_enter.constprop.0+0x126/0x1a0
> [ 351.223519] do_syscall_64+0x59/0x90
> [ 351.223793] ? __rseq_handle_notify_resume+0x26/0xb0
> [ 351.224070] ? exit_to_user_mode_loop+0xd0/0x130
> [ 351.224345] ? exit_to_user_mode_prepare+0xb6/0x100
> [ 351.224622] ? syscall_exit_to_user_mode+0x22/0x40
> [ 351.224897] ? do_syscall_64+0x69/0x90
> [ 351.225170] ? exc_page_fault+0x62/0x150
> [ 351.225460] entry_SYSCALL_64_after_hwframe+0x72/0xdc
> [ 351.225737] RIP: 0033:0x7f72bc83f5ab
> [ 351.226011] Code: 73 01 c3 48 8b 0d 75 a8 1b 00 f7 d8 64 89 01 48 83
> c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 b0 00 00 00 0f
> 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 45 a8 1b 00 f7 d8 64 89 01 48
> [ 351.226394] RSP: 002b:00007fff49e1e798 EFLAGS: 00000206 ORIG_RAX:
> 00000000000000b0
> [ 351.226709] RAX: ffffffffffffffda RBX: 00005580d76568f0 RCX:
> 00007f72bc83f5ab
> [ 351.226995] RDX: 000000000000000a RSI: 0000000000000800 RDI:
> 00005580d7656958
> [ 351.227316] RBP: 0000000000000000 R08: 0000000000000000 R09:
> 0000000000000000
> [ 351.227601] R10: 00007f72bc99eac0 R11: 0000000000000206 R12:
> 00007fff49e1e9f0
> [ 351.227887] R13: 00007fff49e1f070 R14: 00005580d76562a0 R15:
> 00005580d76568f0
> [ 351.228171] </TASK>
>
--
Regards,
Alexander Atanasov
More information about the Devel
mailing list