[Devel] [PATCH RH7 09/12] rq-qos: fix missed wake-ups in rq_qos_throttle

Pavel Tikhomirov ptikhomirov at virtuozzo.com
Thu Sep 29 14:30:09 MSK 2022


From: Josef Bacik <josef at toxicpanda.com>

We saw a hang in production with WBT where there was only one waiter in
the throttle path and no outstanding IO.  This is because of the
has_sleepers optimization that is used to make sure we don't steal an
inflight counter for new submitters when there are people already on the
list.

There is a race window between our check to see if the waitqueue has any
waiters (this is done locklessly) and the time we actually add ourselves
to the waitqueue.  If the queue drains in that window we'll go to sleep
and never be woken up, because nobody is doing IO to wake us up.

Fix this by checking whether the waitqueue has a single sleeper on the
list after we add ourselves; that way we have an up-to-date view of the
list.
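
The interleaving described above can be replayed in a small user-space
model.  This is only a sketch under stated assumptions, not the kernel
code: struct toy_waitq, toy_has_sleeper(), toy_has_single_sleeper() and
submitter_would_hang() are hypothetical names, the waitqueue is reduced
to an entry count, and the race window is collapsed into its net effect
(the older waiter leaves and all IO completes).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a wait queue: only the entry count matters. */
struct toy_waitq {
	int nr_entries;
};

/* Models the lockless "any waiters?" check done before queueing. */
static bool toy_has_sleeper(const struct toy_waitq *wq)
{
	return wq->nr_entries > 0;
}

/* Models wq_has_single_sleeper(): exactly one entry on the list. */
static bool toy_has_single_sleeper(const struct toy_waitq *wq)
{
	return wq->nr_entries == 1;
}

/*
 * Replays the racy interleaving for one submitter.  Returns true if the
 * submitter would go to sleep with no IO left to wake it up.
 */
static bool submitter_would_hang(bool recheck_after_enqueue)
{
	struct toy_waitq wq = { .nr_entries = 1 };	/* one older waiter */
	const int limit = 1;				/* assumed inflight limit */
	int inflight = 1;				/* limit is reached */
	bool has_sleeper;

	/* 1. Lockless check: the older waiter is still queued. */
	has_sleeper = toy_has_sleeper(&wq);
	assert(has_sleeper);

	/* 2. Race window: the older waiter leaves and all IO completes. */
	wq.nr_entries--;
	inflight--;

	/* 3. We add ourselves to the queue. */
	wq.nr_entries++;

	/* 4. The fix: refresh has_sleeper now that we are on the list. */
	if (recheck_after_enqueue)
		has_sleeper = !toy_has_single_sleeper(&wq);

	/* 5. Only a submitter with nobody ahead of it may take a token. */
	if (!has_sleeper && inflight < limit) {
		inflight++;
		wq.nr_entries--;	/* leave the queue again */
		return false;
	}

	/* We sleep; inflight is 0, so no completion will wake us. */
	return true;
}
```

In this model submitter_would_hang(false) returns true, because the
stale has_sleeper value keeps the submitter from taking a free token,
while submitter_would_hang(true) returns false.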

Reviewed-by: Oleg Nesterov <oleg at redhat.com>
Signed-off-by: Josef Bacik <josef at toxicpanda.com>
Signed-off-by: Jens Axboe <axboe at kernel.dk>

Changes when porting to vz7:
- the original patch modifies block/blk-rq-qos.c:rq_qos_wait, but in vz7
  the corresponding hunk is in block/blk-wbt.c:__wbt_wait

https://jira.sw.ru/browse/PSBM-141883
(cherry picked from commit 545fbd0775bafcefc8f7bc844291bd13c44b7fdc)
Signed-off-by: Pavel Tikhomirov <ptikhomirov at virtuozzo.com>
---
 block/blk-wbt.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/block/blk-wbt.c b/block/blk-wbt.c
index 49d11e089c97..5477c3ffe7a7 100644
--- a/block/blk-wbt.c
+++ b/block/blk-wbt.c
@@ -571,6 +571,7 @@ static void __wbt_wait(struct rq_wb *rwb, enum wbt_flags wb_acct,
 		return;
 
 	prepare_to_wait_exclusive(&rqw->wait, &data.wq, TASK_UNINTERRUPTIBLE);
+	has_sleeper = !wq_has_single_sleeper(&rqw->wait);
 	do {
 		if (data.got_token)
 			break;
-- 
2.37.1


