[Devel] [PATCH RH9 v2 00/10] vhost-blk: in-kernel accelerator for virtio-blk guests
Konstantin Khorenko
khorenko at virtuozzo.com
Thu Sep 15 20:04:16 MSK 2022
Pasha, Andrey,
your conversation is very positive and impressive,
for now I will apply version 2 of the patchset - just to unblock the userspace team to develop the
libvirt part,
but please don't stop, continue the review/rework:
I will just throw away the old code completely and take the whole new patchset version XXX once we get
the final revision.
Thank you.
--
Best regards,
Konstantin Khorenko,
Virtuozzo Linux Kernel Team
On 08.09.2022 17:32, Andrey Zhadchenko wrote:
> Although QEMU virtio-blk is quite fast, there is still some room for
> improvement. Disk latency can be reduced if we handle virtio-blk requests
> in the host kernel, so we avoid a lot of syscalls and context switches.
> The idea is quite simple - QEMU gives us a block device and we translate
> any incoming virtio requests into bios and push them into the bdev.
> The biggest disadvantage of this vhost-blk flavor is that it only works with
> the raw format. Luckily, Kirill Thai proposed a device mapper driver for the
> QCOW2 format to attach files as block devices: https://www.spinics.net/lists/kernel/msg4292965.html
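>
> In code terms, the per-request path looks roughly like this (an illustrative
> sketch, not the code from patch 1: the helper name is made up, and the
> two-argument bio_alloc() of pre-5.18 kernels is assumed):
>
>     /* Map one pinned guest page into a bio and feed it to the backing
>      * device.  A real implementation must also set bi_end_io to complete
>      * the virtio request and must handle multi-page payloads. */
>     static int sketch_submit_page(struct block_device *bdev,
>                                   struct page *page, unsigned int len,
>                                   unsigned int offset, sector_t sector,
>                                   bool write)
>     {
>             struct bio *bio = bio_alloc(GFP_KERNEL, 1);
>
>             if (!bio)
>                     return -ENOMEM;
>             bio_set_dev(bio, bdev);
>             bio->bi_iter.bi_sector = sector;
>             bio->bi_opf = write ? REQ_OP_WRITE : REQ_OP_READ;
>             bio_add_page(bio, page, len, offset);
>             submit_bio(bio);
>             return 0;
>     }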
>
> Also, by using kernel modules we can bypass the iothread limitation and finally scale
> block requests with CPUs for high-performance devices.
>
>
> There have already been several attempts to write vhost-blk:
>
> Asias' version: https://lkml.org/lkml/2012/12/1/174
> Badari's version: https://lwn.net/Articles/379864/
> Vitaly's version: https://lwn.net/Articles/770965/
>
> The main difference between them is the API used to access the backend file. The
> fastest one is Asias' version with the bio flavor. It is also the most reviewed and
> has the most features, so the vhost_blk module is partially based on it. Multiple
> virtqueue support was added and some places were reworked. Support for several
> vhost workers was also added.
>
> Test setup and results:
> fio --direct=1 --rw=randread --bs=4k --ioengine=libaio --iodepth=128
> QEMU drive options: cache=none
> filesystem: xfs
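> (A fuller command line would be along these lines; the job name, filename and
> runtime here are illustrative, they are not from the original run:
>     fio --name=bench --direct=1 --rw=randread --bs=4k --ioengine=libaio \
>         --iodepth=128 --filename=/dev/vdb --runtime=60 --time_based)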
>
> SSD:
>                | randread, IOPS | randwrite, IOPS |
> Host           | 95.8k          | 85.3k           |
> QEMU virtio    | 57.5k          | 79.4k           |
> QEMU vhost-blk | 95.6k          | 84.3k           |
>
> RAMDISK (vq == vcpu):
>                  | randread, IOPS | randwrite, IOPS |
> virtio, 1vcpu    | 123k           | 129k            |
> virtio, 2vcpu    | 253k (??)      | 250k (??)       |
> virtio, 4vcpu    | 158k           | 154k            |
> vhost-blk, 1vcpu | 110k           | 113k            |
> vhost-blk, 2vcpu | 247k           | 252k            |
> vhost-blk, 8vcpu | 497k           | 469k            | *single kernel thread
> vhost-blk, 8vcpu | 730k           | 701k            | *two kernel threads
>
> v2:
>
> patch 1/10
> - removed unused VHOST_BLK_VQ
> - reworked bio handling a bit: now all pages from a single iov are added to a
>   single bio instead of allocating one bio per page (see the sketch after
>   this changelog)
> - changed how the sector increment is calculated
> - check move_iovec() in vhost_blk_req_handle()
> - removed the snprintf check and check the return value of copy_to_iter
>   properly for VIRTIO_BLK_ID_BYTES requests
> - discard the vq request if vhost_blk_req_handle() returned a negative code
> - forbid changing a nonzero backend in vhost_blk_set_backend(). First of
>   all, QEMU sets the backend only once. Also, if we want to change the backend
>   while requests are already running, we need to be much more careful in
>   vhost_blk_handle_guest_kick() as it does not take any references. If
>   userspace wants to change the backend that badly, it can always reset the
>   device.
> - removed EXPERIMENTAL from Kconfig
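>
> The reworked bio mapping from patch 1 looks roughly like the snippet below (a
> sketch under assumed variable names; allocation failures and request
> completion are omitted):
>
>     /* Pack the pages backing one iov into as few bios as possible: a new
>      * bio is chained only when bio_add_page() refuses another page,
>      * instead of the old one-bio-per-page scheme.  Note how the sector
>      * is advanced by the bytes submitted so far. */
>     unsigned int off = first_page_offset, done = 0;
>     int i = 0;
>
>     while (i < npages) {
>             struct bio *bio = bio_alloc(GFP_KERNEL, npages - i);
>
>             bio_set_dev(bio, bdev);
>             bio->bi_iter.bi_sector = sector + (done >> SECTOR_SHIFT);
>             bio->bi_opf = opf;
>
>             while (i < npages) {
>                     unsigned int l = min_t(unsigned int, len - done,
>                                            PAGE_SIZE - off);
>
>                     if (bio_add_page(bio, pages[i], l, off) != l)
>                             break;  /* bio full: submit, chain a new one */
>                     done += l;
>                     off = 0;
>                     i++;
>             }
>             submit_bio(bio);
>     }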
>
> patch 3/10
> - don't bother with checking dev->workers[0].worker since dev->nworkers
> will always contain 0 in this case
>
> patch 6/10
> - Make the code do what the docs suggest. Previously, the ioctl-supplied new
>   number of workers was treated as an amount to be added. Use the new
>   number as a ceiling instead and add workers up to that number.
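>
> The intended userspace semantics are thus idempotent, roughly (the ioctl
> name below is a placeholder, not necessarily the one patch 6 introduces):
>
>     /* Ask vhost for a total of two workers; since the argument is a
>      * ceiling, repeating the call with the same value adds nothing. */
>     int want = 2;
>
>     if (ioctl(vhost_fd, VHOST_SET_NWORKERS, &want) < 0)
>             perror("VHOST_SET_NWORKERS");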
>
>
> https://jira.sw.ru/browse/PSBM-139414
> Andrey Zhadchenko (10):
> drivers/vhost: vhost-blk accelerator for virtio-blk guests
> drivers/vhost: use array to store workers
> drivers/vhost: adjust vhost to flush all workers
> drivers/vhost: rework attaching cgroups to be worker aware
> drivers/vhost: rework worker creation
> drivers/vhost: add ioctl to increase the number of workers
> drivers/vhost: assign workers to virtqueues
> drivers/vhost: add API to queue work at virtqueue worker
> drivers/vhost: allow polls to be bound to workers via vqs
> drivers/vhost: queue vhost_blk works at vq workers
>
> drivers/vhost/Kconfig | 12 +
> drivers/vhost/Makefile | 3 +
> drivers/vhost/blk.c | 828 +++++++++++++++++++++++++++++++++++++
> drivers/vhost/vhost.c | 252 ++++++++---
> drivers/vhost/vhost.h | 21 +-
> include/uapi/linux/vhost.h | 14 +
> 6 files changed, 1068 insertions(+), 62 deletions(-)
> create mode 100644 drivers/vhost/blk.c
>