[Devel] [PATCH RHEL7 COMMIT] bcache: Fix crashes of bcache used with raid1 #PSBM-106785

Vasily Averin vvs at virtuozzo.com
Tue Sep 15 11:54:59 MSK 2020


The commit is pushed to "branch-rh7-3.10.0-1127.18.2.vz7.163.x-ovz" and will appear at https://src.openvz.org/scm/ovz/vzkernel.git
after rh7-3.10.0-1127.18.2.vz7.163.21
------>
commit 7c1c0cf9bf43ac7e880c9aa71adeca4b100c9fb5
Author: Andrey Ryabinin <aryabinin at virtuozzo.com>
Date:   Tue Sep 15 11:54:59 2020 +0300

    bcache: Fix crashes of bcache used with raid1 #PSBM-106785
    
    When bcache is built on top of raid1 devices, the following
    warning is triggered:
    
     WARNING: CPU: 2 PID: 8138 at include/linux/bio.h:559 raid1_write_request+0x994/0xba0 [raid1]
     Call Trace:
      dump_stack+0x19/0x1b
      __warn+0xd8/0x100
      warn_slowpath_null+0x1d/0x20
      raid1_write_request+0x994/0xba0 [raid1]
      raid1_make_request+0x8a/0x5b0 [raid1]
      md_handle_request+0xd0/0x150
      md_make_request+0x79/0x190
      generic_make_request+0x147/0x380
      bch_generic_make_request_hack+0x2a/0xc0 [bcache]
      bch_generic_make_request+0x3d/0x190 [bcache]
      write_dirty+0x7e/0x110 [bcache]
      process_one_work+0x185/0x440
      worker_thread+0x126/0x3c0
      kthread+0xd1/0xe0
      ret_from_fork_nospec_begin+0x21/0x21
    
    It is immediately followed by a crash:
     kernel BUG at drivers/md/bcache/closure.c:53!
     Call Trace:
      dirty_endio+0x28/0x30 [bcache]
      bio_endio+0x8c/0x130
      call_bio_endio+0x2f/0x40 [raid1]
      raid_end_bio_io+0x2e/0x90 [raid1]
      r1_bio_write_done+0x35/0x50 [raid1]
      raid1_end_write_request+0x118/0x2f0 [raid1]
      bio_endio+0x8c/0x130
      blk_update_request+0x90/0x370
      blk_mq_end_request+0x1a/0x90
      virtblk_request_done+0x3f/0x70 [virtio_blk]
      __blk_mq_complete_request_remote+0x19/0x20
      flush_smp_call_function_queue+0x63/0x130
      generic_smp_call_function_single_interrupt+0x13/0x30
      smp_call_function_single_interrupt+0x2d/0x40
      call_function_single_interrupt+0x16a/0x170
    
    This happens because bcache does not allocate and initialize the
    'bio_aux' structure that the raid1 device needs. Add 'bio_aux' to the
    'dirty_io' struct and initialize it along with the 'bio' in
    dirty_init() to fix this.
    
    https://jira.sw.ru/browse/PSBM-106785
    Signed-off-by: Andrey Ryabinin <aryabinin at virtuozzo.com>
---
 drivers/md/bcache/writeback.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/md/bcache/writeback.c b/drivers/md/bcache/writeback.c
index 841f049..c2bda70 100644
--- a/drivers/md/bcache/writeback.c
+++ b/drivers/md/bcache/writeback.c
@@ -17,6 +17,7 @@ static void read_dirty(struct closure *);
 struct dirty_io {
 	struct closure		cl;
 	struct cached_dev	*dc;
+	struct bio_aux		bio_aux;
 	struct bio		bio;
 };
 
@@ -122,6 +123,7 @@ static void dirty_init(struct keybuf_key *w)
 	bio->bi_max_vecs	= DIV_ROUND_UP(KEY_SIZE(&w->key), PAGE_SECTORS);
 	bio->bi_private		= w;
 	bio->bi_io_vec		= bio->bi_inline_vecs;
+	bio_init_aux(&io->bio, &io->bio_aux);
 	bch_bio_map(bio, NULL);
 }
 
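For readers without the RHEL7 tree at hand, the pattern used by the fix can
be sketched in userspace C. This is only an illustration under stated
assumptions: every mock_* name below is hypothetical, the struct layouts are
simplified stand-ins rather than the kernel definitions, and the "remaining"
counter merely models the kind of per-bio state a stacked driver such as
raid1 might expect to find. The point it shows is that the auxiliary storage
is embedded next to the bio and attached at init time, so a lower layer
never sees a NULL pointer.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins; NOT the kernel's struct bio/bio_aux. */
struct mock_bio_aux {
	int remaining;			/* models a per-bio completion counter */
};

struct mock_bio {
	struct mock_bio_aux *aux;	/* stays NULL unless explicitly attached */
	void *private;
};

/* Mirrors the shape of dirty_io after the patch: aux storage beside the bio. */
struct mock_dirty_io {
	struct mock_bio_aux	bio_aux;
	struct mock_bio		bio;
};

/* Models the role of bio_init_aux(): initialize aux and attach it to the bio. */
static void mock_bio_init_aux(struct mock_bio *bio, struct mock_bio_aux *aux)
{
	aux->remaining = 1;
	bio->aux = aux;
}

/*
 * Models a lower layer that needs the aux data.
 * Returns -1 where a real driver would presumably warn about a missing aux.
 */
static int mock_inc_remaining(struct mock_bio *bio)
{
	if (!bio->aux)
		return -1;
	bio->aux->remaining++;
	return 0;
}

/* Models dirty_init() with the fix applied: bio and aux are set up together. */
static void mock_dirty_init(struct mock_dirty_io *io, void *private)
{
	io->bio.aux = NULL;
	io->bio.private = private;
	mock_bio_init_aux(&io->bio, &io->bio_aux);
}
```

In this model, calling mock_inc_remaining() on a bio prepared by
mock_dirty_init() succeeds, while the same call on a bio whose aux pointer
was never attached takes the error path, which corresponds to the situation
before the patch.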

