[Devel] [PATCH RHEL7 COMMIT] scsi: rollback to reset request in a request timer handler
Konstantin Khorenko
khorenko at virtuozzo.com
Thu Mar 28 11:44:55 MSK 2019
The commit is pushed to "branch-rh7-3.10.0-957.10.1.vz7.85.x-ovz" and will appear at https://src.openvz.org/scm/ovz/vzkernel.git
after rh7-3.10.0-957.10.1.vz7.85.4
------>
commit 47d0b4aca0b0c7827ca1d928d7db759d7db55db1
Author: Denis Plotnikov <dplotnikov at virtuozzo.com>
Date: Thu Mar 28 11:44:53 2019 +0300
scsi: rollback to reset request in a request timer handler
There is a race condition with long-running requests:
Each request has a timer. When the timer fires, its handler sets
REQ_ATOM_COMPLETE and clears it after finishing. The request completion
checks REQ_ATOM_COMPLETE and, if it is set, returns without doing anything
and never executes again, assuming that the request needs no further
attention.
Thus, if the request completion starts executing while the timer handler is
in progress, it just returns. The timer handler then clears the complete
flag, and the request stays in the system forever, with the timer handler
running again and again and merely rearming itself.
As a result, the system ends up with a hung request which may block user
processes, leaving them in D state.
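The interleaving described above can be illustrated with a small user-space
model. This is only a sketch: struct model_request and every function name
below are hypothetical stand-ins, not kernel code. The atom_complete field
plays the role of REQ_ATOM_COMPLETE, and the race is replayed sequentially in
one fixed order instead of with real threads.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a block request; not the kernel's struct. */
struct model_request {
    bool atom_complete;  /* models the REQ_ATOM_COMPLETE bit */
    bool completed;      /* true once the completion really ran */
    int  timer_rearms;   /* how many times the timer rearmed itself */
};

/* Models a test-and-set of the complete bit; returns the previous value. */
static bool mark_complete(struct model_request *rq)
{
    bool was_set = rq->atom_complete;
    rq->atom_complete = true;
    return was_set;
}

/* Completion path: backs off if the bit is already set, and is not retried. */
static void model_complete(struct model_request *rq)
{
    if (mark_complete(rq))
        return;
    rq->completed = true;
}

/* Timer handler, first half: try to claim the request. */
static bool model_timeout_begin(struct model_request *rq)
{
    return !mark_complete(rq);
}

/* Timer handler, second half: clear the bit and rearm the timer. */
static void model_timeout_end_reset(struct model_request *rq)
{
    rq->atom_complete = false;
    rq->timer_rearms++;
}

/* Replays the racy order; timer_rounds is expected to be >= 1. */
struct model_request simulate_race(int timer_rounds)
{
    struct model_request rq = { false, false, 0 };

    /* Round 1: the timer handler claims the request first... */
    bool claimed = model_timeout_begin(&rq);
    /* ...and the completion arrives while the handler is still running. */
    model_complete(&rq);
    if (claimed)
        model_timeout_end_reset(&rq);

    /* Every later expiry finds the bit clear and simply rearms again. */
    for (int i = 1; i < timer_rounds; i++)
        if (model_timeout_begin(&rq))
            model_timeout_end_reset(&rq);
    return rq;
}

/* Replays the harmless order: the completion wins, the timer backs off. */
struct model_request simulate_no_race(void)
{
    struct model_request rq = { false, false, 0 };

    model_complete(&rq);
    if (model_timeout_begin(&rq))
        model_timeout_end_reset(&rq);
    return rq;
}
```

In this model, simulate_race() leaves the request with completed still false
no matter how many timer rounds are replayed, while simulate_no_race()
completes it and the timer never rearms.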
Fix the problem by reverting the patch:
commit e72c9a2a67a6400c8ef3d01d4c461dbbbfa0e1f0
Author: Paolo Bonzini <pbonzini at redhat.com>
Date: Wed Jun 21 16:35:46 2017 +0200
scsi: virtio_scsi: let host do exception handling
virtio_scsi tries to do exception handling after the default 30 seconds
timeout expires. However, it's better to let the host control the
timeout, otherwise with a heavy I/O load it is likely that an abort will
also timeout. This leads to fatal errors like filesystems going
offline.
Disable the 'sd' timeout and allow the host to do exception handling,
following the precedent of the storvsc driver.
Hannes has a proposal to introduce timeouts in virtio, but this provides
an immediate solution for stable kernels too.
Reported-by: Douglas Miller <dougmill at linux.vnet.ibm.com>
Cc: "James E.J. Bottomley" <jejb at linux.vnet.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen at oracle.com>
Cc: Hannes Reinecke <hare at suse.de>
Cc: linux-scsi at vger.kernel.org
Cc: stable at vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini at redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen at oracle.com>
https://jira.sw.ru/browse/PSBM-92312
Signed-off-by: Denis Plotnikov <dplotnikov at virtuozzo.com>
---
drivers/scsi/virtio_scsi.c | 12 ------------
1 file changed, 12 deletions(-)
diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
index 744db2a91a1b..0e3230286473 100644
--- a/drivers/scsi/virtio_scsi.c
+++ b/drivers/scsi/virtio_scsi.c
@@ -740,16 +740,6 @@ static void virtscsi_target_destroy(struct scsi_target *starget)
kfree(tgt);
}
-/*
- * The host guarantees to respond to each command, although I/O
- * latencies might be higher than on bare metal. Reset the timer
- * unconditionally to give the host a chance to perform EH.
- */
-static enum blk_eh_timer_return virtscsi_eh_timed_out(struct scsi_cmnd *scmnd)
-{
- return BLK_EH_RESET_TIMER;
-}
-
static struct scsi_host_template virtscsi_host_template_single = {
.module = THIS_MODULE,
.name = "Virtio SCSI HBA",
@@ -758,7 +748,6 @@ static struct scsi_host_template virtscsi_host_template_single = {
.queuecommand = virtscsi_queuecommand_single,
.eh_abort_handler = virtscsi_abort,
.eh_device_reset_handler = virtscsi_device_reset,
- .eh_timed_out = virtscsi_eh_timed_out,
.slave_alloc = virtscsi_device_alloc,
.can_queue = 1024,
@@ -776,7 +765,6 @@ static struct scsi_host_template virtscsi_host_template_multi = {
.queuecommand = virtscsi_queuecommand_multi,
.eh_abort_handler = virtscsi_abort,
.eh_device_reset_handler = virtscsi_device_reset,
- .eh_timed_out = virtscsi_eh_timed_out,
.can_queue = 1024,
.dma_boundary = UINT_MAX,