[Devel] [PATCH RHEL7 COMMIT] Revert "blk-mq: fix corruption with direct issue"

Thu Jun 6 15:07:15 MSK 2019

The commit is pushed to "branch-rh7-3.10.0-957.12.2.vz7.96.x-ovz" and will appear at https://src.openvz.org/scm/ovz/vzkernel.git
after rh7-3.10.0-957.12.2.vz7.96.12
------>
commit 8b064bb4874d701f744611eb8c9f6fade8d61929
Author: Konstantin Khorenko <khorenko at virtuozzo.com>
Date:   Mon Jun 3 13:14:09 2019 +0300

    Revert "blk-mq: fix corruption with direct issue"
    
    This reverts RedHat patch
    - [block] blk-mq: fix corruption with direct issue (Ming Lei) [1670511
      1656654]
    
    https://pmc.acronis.com/browse/VSTOR-23501
    
    Signed-off-by: Konstantin Khorenko <khorenko at virtuozzo.com>
    
    Original comment:
    
     From 9f622f766107b7aac938dc5229adaf7c1340b692 Mon Sep 17 00:00:00 2001
     From: Ming Lei <ming.lei at redhat.com>
     Date: Sat, 8 Dec 2018 13:07:11 +0100
     Subject: [PATCH 10493/10502] [block] blk-mq: fix corruption with direct issue
     Message-id: <20181208130712.10869-2-ming.lei at redhat.com>
     Patchwork-id: 234763
     O-Subject: [RHEL7.7 1/2] blk-mq: fix corruption with direct issue
     Bugzilla: 1656654
     Z-Bugzilla: 1670511
     RH-Acked-by: Jeff Moyer <jmoyer at redhat.com>
     RH-Acked-by: Xiao Ni <xni at redhat.com>
     RH-Acked-by: Gopal Tiwari <gtiwari at redhat.com>
     From: Jens Axboe <axboe at kernel.dk>
     BZ: 1656654
     commit ffe81d45322cc3cb140f0db080a4727ea284661e
     Author: Jens Axboe <axboe at kernel.dk>
     Date:   Tue Dec 4 20:06:48 2018 -0700
         blk-mq: fix corruption with direct issue
         If we attempt a direct issue to a SCSI device, and it returns BUSY, then
         we queue the request up normally. However, the SCSI layer may have
         already setup SG tables etc for this particular command. If we later
         merge with this request, then the old tables are no longer valid. Once
         we issue the IO, we only read/write the original part of the request,
         not the new state of it.
         This causes data corruption, and is most often noticed with the file
         system complaining about the just read data being invalid:
         [  235.934465] EXT4-fs error (device sda1): ext4_iget:4831: inode #7142: comm dpkg-query: bad extra_isize 24937 (inode size 256)
         because most of it is garbage...
         This doesn't happen from the normal issue path, as we will simply defer
         the request to the hardware queue dispatch list if we fail. Once it's on
         the dispatch list, we never merge with it.
         Fix this from the direct issue path by flagging the request as
         REQ_NOMERGE so we don't change the size of it before issue.
         See also:
           https://bugzilla.kernel.org/show_bug.cgi?id=201685
         Tested-by: Guenter Roeck <linux at roeck-us.net>
         Fixes: 6ce3dd6eec1 ("blk-mq: issue directly if hw queue isn't busy in case of 'none'")
         Cc: stable at vger.kernel.org
         Signed-off-by: Jens Axboe <axboe at kernel.dk>
     There is conflict because RHEL7 uses BLK_MQ_RQ_QUEUE_BUSY & BLK_MQ_RQ_QUEUE_DEV_BUSY,
     instead of BLK_STS_RESOURCE & BLK_STS_DEV_RESOURCE for describing queueing rq result.
     So solve it.
     Signed-off-by: Ming Lei <ming.lei at redhat.com>
     Signed-off-by: Jan Stancek <jstancek at redhat.com>
    
     https://access.redhat.com/labs/rhcb/RHEL-7.6/kernel-3.10.0-957.10.1.el7/patches/10493-block-blk-mq-fix-corruption-with-direct-issue.patch
---
 block/blk-mq.c | 26 +-------------------------
 1 file changed, 1 insertion(+), 25 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 0166c2a2b346..3dd6f1d72cee 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1695,15 +1695,6 @@ static int __blk_mq_issue_directly(struct blk_mq_hw_ctx *hctx, struct request *r
 		break;
 	case BLK_MQ_RQ_QUEUE_BUSY:
 	case BLK_MQ_RQ_QUEUE_DEV_BUSY:
-		/*
-		 * If direct dispatch fails, we cannot allow any merging on
-		 * this IO. Drivers (like SCSI) may have set up permanent state
-		 * for this request, like SG tables and mappings, and if we
-		 * merge to it later on then we'll still only do IO to the
-		 * original part.
-		 */
-		rq->cmd_flags |= REQ_NOMERGE;
-
 		blk_mq_update_dispatch_busy(hctx, true);
 		__blk_mq_requeue_request(rq);
 		break;
@@ -1715,18 +1706,6 @@ static int __blk_mq_issue_directly(struct blk_mq_hw_ctx *hctx, struct request *r
 	return ret;
 }
 
-/*
- * Don't allow direct dispatch of anything but regular reads/writes,
- * as some of the other commands can potentially share request space
- * with data we need for the IO scheduler. If we attempt a direct dispatch
- * on those and fail, we can't safely add it to the scheduler afterwards
- * without potentially overwriting data that the driver has already written.
- */
-static bool blk_rq_can_direct_dispatch(struct request *rq)
-{
-	return req_op(rq) == REQ_OP_READ || req_op(rq) == REQ_OP_WRITE;
-}
-
 static int __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
 						struct request *rq,
 						bool bypass_insert)
@@ -1747,7 +1726,7 @@ static int __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
 		goto insert;
 	}
 
-	if (!blk_rq_can_direct_dispatch(rq) || (q->elevator && !bypass_insert))
+	if (q->elevator && !bypass_insert)
 		goto insert;
 
 	if (!blk_mq_get_dispatch_budget(hctx))
@@ -1806,9 +1785,6 @@ void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx,
 		struct request *rq = list_first_entry(list, struct request,
 				queuelist);
 
-		if (!blk_rq_can_direct_dispatch(rq))
-			break;
-
 		list_del_init(&rq->queuelist);
 		ret = blk_mq_request_issue_directly(rq);
 		if (ret != BLK_MQ_RQ_QUEUE_OK) {