[Devel] [PATCH] Fix Issue: When VirtIO Backend providing VIRTIO_BLK_F_MQ feature, The file system of the front-end OS fails to be mounted.

Konstantin Khorenko khorenko at virtuozzo.com
Tue Jun 18 14:43:56 MSK 2024


Hi Lynch,

thank you very much for your reply and for willingness of making the global open source better. :)

 From your replay i clearly understand that you have tried to increase the number of segments
and that has boosted the performance and that it did not work properly before your fix.

But why this particular patch fixes the problem?

What was the exact root cause of the problem? (not high level "front-end OS did not boot", but on the 
low level - what caused the problem?)

As the patch effectively preserves the original req->sector of the request - how did it matter?

Is there a possibility that a single request is handled in parallel somewhere where req->sector is 
checked?

Or is the same request is handled sequentially and the second+ usages suffer from screwed req->sector 
value?

Can you please show the call traces?

Thank you!

--
Best regards,

Konstantin Khorenko,
Virtuozzo Linux Kernel Team

On 17.06.2024 07:51, Lynch wu wrote:
> Hi Konstantin:
>
>   Thanks for your reply.
> virtio_blk.c is the front-end driver code of the virtio block, which has the following code:
> /* We need to know how many segments before we allocate. */
> err = virtio_cread_feature(vdev, VIRTIO_BLK_F_SEG_MAX,
>   struct virtio_blk_config, seg_max,
>   &sg_elems);
>
> /* We need at least one SG element, whatever they say. */
> if (err || !sg_elems)
> sg_elems = 1;
>
> Eventually, the max number of segments will be configured through the blk_queue_max_segments interface.
> When I modify the value of the sg_elems due to performance reasons, such as 128, I will find that the
> front-end operating system cannot be started normally, which is manifested as the Android file system
> cannot be mounted normally.
>
> You can do a simple test to modify the default value of sg_elems virtio_blk.c in your company's
> virtualization solution, as follows:
> vzkernel$ git diff drivers/block/virtio_blk.c
> diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
> index fd086179f980..0a216c9cb563 100644
> --- a/drivers/block/virtio_blk.c
> +++ b/drivers/block/virtio_blk.c
> @@ -719,7 +719,7 @@ static int virtblk_probe(struct virtio_device *vdev)
>
>         /* We need at least one SG element, whatever they say. */
>         if (err || !sg_elems)
> -               sg_elems = 1;
> +               sg_elems = 2;
>
>         /* Prevent integer overflows and honor max vq size */
>         sg_elems = min_t(u32, sg_elems, VIRTIO_BLK_MAX_SG_ELEMS - 2);
>
> At this point, the expected test symptom is that the front-end OS does not boot properly.
>
> In our virtualization solution, we used your company's blk.c as the back-end driver of Virtio Block,
> found the above problems, and made problem fixes, and at the same time, the performance of Virtio
> Block has also been improved, so we want to give back to the open source community.
>
> Konstantin Khorenko <khorenko at virtuozzo.com> 于2024年6月14日周五 19:23写道:
>
>     Hi Lynch, Andrey,
>
>     thank you for the patch, but can you please describe the problem it fixes in a bit more details?
>     i see that the patch preserves original req->sector, but why/how that becomes important in case
>     VIRTIO_BLK_F_MQ feature is set?
>
>     Thank you.
>
>     --
>     Best regards,
>
>     Konstantin Khorenko,
>     Virtuozzo Linux Kernel Team
>
>     On 17.05.2024 11:09, Andrey Zhadchenko wrote:
>     > Hi
>     >
>     > Thank you for the patch.
>     > vhost-blk didn't spark enough interest to be reviewed and merged into
>     > the upstream and the code is not present here.
>     > I have forwarded your patch to relevant openvz kernel mailing list.
>     >
>     > On 5/17/24 07:34, Lynch wrote:
>     >> ---
>     >>    drivers/vhost/blk.c | 6 ++++--
>     >>    1 file changed, 4 insertions(+), 2 deletions(-)
>     >>
>     >> diff --git a/drivers/vhost/blk.c b/drivers/vhost/blk.c
>     >> index 44fbf253e773..0e946d9dfc33 100644
>     >> --- a/drivers/vhost/blk.c
>     >> +++ b/drivers/vhost/blk.c
>     >> @@ -251,6 +251,7 @@ static int vhost_blk_bio_make(struct vhost_blk_req *req,
>     >>      struct page **pages, *page;
>     >>      struct bio *bio = NULL;
>     >>      int bio_nr = 0;
>     >> +    sector_t sector_tmp;
>     >>
>     >>      if (unlikely(req->bi_opf == REQ_OP_FLUSH))
>     >>              return vhost_blk_bio_make_simple(req, bdev);
>     >> @@ -270,6 +271,7 @@ static int vhost_blk_bio_make(struct vhost_blk_req *req,
>     >>              req->bio = req->inline_bio;
>     >>      }
>     >>
>     >> +    sector_tmp = req->sector;
>     >>      req->iov_nr = 0;
>     >>      for (i = 0; i < iov_nr; i++) {
>     >>              int pages_nr = iov_num_pages(&iov[i]);
>     >> @@ -302,7 +304,7 @@ static int vhost_blk_bio_make(struct vhost_blk_req *req,
>     >>                              bio = bio_alloc(GFP_KERNEL, pages_nr_total);
>     >>                              if (!bio)
>     >>                                      goto fail;
>     >> - bio->bi_iter.bi_sector  = req->sector;
>     >> + bio->bi_iter.bi_sector  = sector_tmp;
>     >>                              bio_set_dev(bio, bdev);
>     >>                              bio->bi_private = req;
>     >>                              bio->bi_end_io  = vhost_blk_req_done;
>     >> @@ -314,7 +316,7 @@ static int vhost_blk_bio_make(struct vhost_blk_req *req,
>     >>                      iov_len         -= len;
>     >>
>     >>                      pos = (iov_base & VHOST_BLK_SECTOR_MASK) + iov_len;
>     >> -                    req->sector += pos >> VHOST_BLK_SECTOR_BITS;
>     >> +                    sector_tmp += pos >> VHOST_BLK_SECTOR_BITS;
>     >>              }
>     >>
>     >>              pages += pages_nr;
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openvz.org/pipermail/devel/attachments/20240618/1fda9124/attachment-0001.html>


More information about the Devel mailing list