[Devel] [PATCH RHEL9 COMMIT] ms/block: add plug while submitting IO

Konstantin Khorenko khorenko at virtuozzo.com
Thu Jan 9 15:51:45 MSK 2025


The commit is pushed to "branch-rh9-5.14.0-427.44.1.vz9.80.x-ovz" and will appear at git at bitbucket.org:openvz/vzkernel.git
after rh9-5.14.0-427.44.1.vz9.80.3
------>
commit ee8584fda4df0ffcf5c56d3bd1b09e05b1a6eac7
Author: Yu Kuai <yukuai3 at huawei.com>
Date:   Wed Jan 8 12:59:59 2025 +0800

    ms/block: add plug while submitting IO
    
    This way, if the caller doesn't use a plug, as in __blkdev_direct_IO_simple()
    and __blkdev_direct_IO_async(), the block layer can still benefit from
    caching nsec time in the plug.
    
    Signed-off-by: Yu Kuai <yukuai3 at huawei.com>
    Link: https://lore.kernel.org/r/20240509123825.3225207-1-yukuai1@huaweicloud.com
    Signed-off-by: Jens Axboe <axboe at kernel.dk>
    
    +++
    block: fix lost bio for plug enabled bio based device
    
    With the following two conditions, bio will be lost:
    
    1) blk plug is not enabled, for example, __blkdev_direct_IO_simple() and
    __blkdev_direct_IO_async();
    2) bio plug is enabled, for example write IO for raid1/raid10 while
    bitmap is enabled;
    
    The root cause is that blk_finish_plug() adds the bio to
    current->bio_list, but that bio is never handled:
    
    __submit_bio_noacct
     current->bio_list = bio_list_on_stack;
     blk_start_plug
    
     do {
      dm_submit_bio
       md_handle_request
        raid10_write_request
         -> generate new bio for underlying disks
         raid1_add_bio_to_plug -> bio is added to plug
     } while ((bio = bio_list_pop(&bio_list_on_stack[0])))
     -> previous bio are all handled
    
     blk_finish_plug
      raid10_unplug
       raid1_submit_write
        submit_bio_noacct
         if (current->bio_list)
          bio_list_add(&current->bio_list[0], bio)
          -> add new bio
    
     current->bio_list = NULL
     -> new bio is lost
    
    Fix the problem by moving the plug into the while loop, so that
    current->bio_list will still be handled after blk_finish_plug().
    
    By the way, enabling the plug for raid1/raid10 in this case also avoids
    deferring IO handling to the daemon thread, which should also improve IO
    performance.
    
    Fixes: 060406c61c7c ("block: add plug while submitting IO")
    Reported-by: Changhui Zhong <czhong at redhat.com>
    Closes: https://lore.kernel.org/all/CAGVVp+Xsmzy2G9YuEatfMT6qv1M--YdOCQ0g7z7OVmcTbBxQAg@mail.gmail.com/
    Signed-off-by: Yu Kuai <yukuai3 at huawei.com>
    Tested-by: Changhui Zhong <czhong at redhat.com>
    Link: https://lore.kernel.org/r/20240521200308.983986-1-yukuai1@huaweicloud.com
    Signed-off-by: Jens Axboe <axboe at kernel.dk>
    
    We see a performance decrease between the vz9.40.9 and vz9.78.5 kernels,
    due to patch [1] being ported via RHEL. Patch [1] basically removes the
    plug when submitting a bio on this path:
    
      +-> blkdev_write_iter
        +-> blk_start_plug # removed in [1]
        +-> __generic_file_write_iter
          +-> generic_file_direct_write
            +-> blkdev_direct_IO
              +-> __blkdev_direct_IO_async
                +-> submit_bio
                  +-> submit_bio_noacct
                    +-> submit_bio_noacct_nocheck
                      +-> __submit_bio
                        +-> blk_start_plug # added back in [2] and [3]
                        +-> blk_mq_submit_bio
    
    There are already two mainstream patches that bring this plug back,
    as the block layer can benefit from it, so let's port them; with them
    applied, the performance degradation disappears.
    
    As fixup patch [3] basically reverts [2], let's merge them together to
    simplify further rebases of those patches.
    
    Fixes: 712c7364655f6 ("block: don't plug in blkdev_write_iter") [1]
    (cherry picked from commit 060406c61c7cb4bbd82a02d179decca9c9bb3443) [2]
    (cherry picked from commit 9a42891c35d50a8472b42c61256867b4dfcc1941) [3]
    https://virtuozzo.atlassian.net/browse/VSTOR-94335
    Signed-off-by: Pavel Tikhomirov <ptikhomirov at virtuozzo.com>
    
    Feature: fix ms/fs
---
 block/blk-core.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/block/blk-core.c b/block/blk-core.c
index 6abad29dd501..12a0e48d0112 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -592,9 +592,14 @@ static inline blk_status_t blk_check_zone_append(struct request_queue *q,
 
 static void __submit_bio(struct bio *bio)
 {
+	/* If plug is not used, add new plug here to cache nsecs time. */
+	struct blk_plug plug;
+
 	if (unlikely(!blk_crypto_bio_prep(&bio)))
 		return;
 
+	blk_start_plug(&plug);
+
 	if (!bio->bi_bdev->bd_has_submit_bio) {
 		blk_mq_submit_bio(bio);
 	} else if (likely(bio_queue_enter(bio) == 0)) {
@@ -603,6 +608,8 @@ static void __submit_bio(struct bio *bio)
 		disk->fops->submit_bio(bio);
 		blk_queue_exit(disk->queue);
 	}
+
+	blk_finish_plug(&plug);
 }
 
 /*
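
Note: the lost-bio scenario described in the commit message can be illustrated
with a small user-space toy model. This is only a sketch under stated
assumptions, not kernel code: every toy_* name, the integer ids, and the
"id + 100" child convention are hypothetical stand-ins for bios,
current->bio_list and the raid plug callback. It compares a plug wrapped
around the whole submission loop with a plug started and finished inside
each iteration.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX 16

/* Hypothetical FIFO standing in for current->bio_list. */
struct toy_list {
	int items[TOY_MAX];
	size_t head, tail;
};

/* Hypothetical plug: holds ids deferred until the plug is finished. */
struct toy_plug {
	int deferred[TOY_MAX];
	size_t n;
};

static struct toy_list *toy_current_list;	/* models current->bio_list */
static int toy_handled;				/* ids that reached the "disk" */

static void toy_list_add(struct toy_list *l, int id)
{
	assert(l->tail < TOY_MAX);
	l->items[l->tail++] = id;
}

static int toy_list_pop(struct toy_list *l, int *id)
{
	if (l->head == l->tail)
		return 0;
	*id = l->items[l->head++];
	return 1;
}

static void toy_start_plug(struct toy_plug *p)
{
	p->n = 0;
}

/* Models the unplug callback resubmitting: queue on the list if one is set. */
static void toy_finish_plug(struct toy_plug *p)
{
	for (size_t i = 0; i < p->n; i++) {
		if (toy_current_list)
			toy_list_add(toy_current_list, p->deferred[i]);
		else
			toy_handled++;
	}
	p->n = 0;
}

/* A parent id (< 100) defers one child into the plug; a child completes. */
static void toy_handle(int id, struct toy_plug *p)
{
	if (id < 100) {
		assert(p->n < TOY_MAX);
		p->deferred[p->n++] = id + 100;
	} else {
		toy_handled++;
	}
}

/* Plug around the whole loop: the child is queued after the list is drained. */
static int toy_submit_plug_outside_loop(int id)
{
	struct toy_list on_stack = {0};
	struct toy_plug plug;
	int cur = id;

	toy_handled = 0;
	toy_current_list = &on_stack;
	toy_start_plug(&plug);
	do {
		toy_handle(cur, &plug);
	} while (toy_list_pop(&on_stack, &cur));
	toy_finish_plug(&plug);		/* child lands on on_stack too late */
	toy_current_list = NULL;	/* child is lost */
	return toy_handled;
}

/* Plug inside the loop: the child is queued before the next pop. */
static int toy_submit_plug_inside_loop(int id)
{
	struct toy_list on_stack = {0};
	struct toy_plug plug;
	int cur = id;

	toy_handled = 0;
	toy_current_list = &on_stack;
	do {
		toy_start_plug(&plug);
		toy_handle(cur, &plug);
		toy_finish_plug(&plug);
	} while (toy_list_pop(&on_stack, &cur));
	toy_current_list = NULL;
	return toy_handled;
}
```

In this model, toy_submit_plug_outside_loop(1) completes no child because the
deferred id is queued only after the loop has exited, while
toy_submit_plug_inside_loop(1) completes exactly one, since the loop picks the
child up on its next iteration.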

