[Devel] [PATCH RHEL7 COMMIT] ms/rq-qos: fix missed wake-ups in rq_qos_throttle try two
Konstantin Khorenko
khorenko at virtuozzo.com
Fri Mar 29 20:54:09 MSK 2024
The commit is pushed to "branch-rh7-3.10.0-1160.108.1.vz7.221.x-ovz" and will appear at https://src.openvz.org/scm/ovz/vzkernel.git
after rh7-3.10.0-1160.108.1.vz7.221.2
------>
commit 83b166fdd5d4a8b07598f3ee0e632a32a25d82b8
Author: Jan Kara <jack at suse.cz>
Date: Mon Jun 7 13:26:13 2021 +0200
ms/rq-qos: fix missed wake-ups in rq_qos_throttle try two
Commit 545fbd0775ba ("rq-qos: fix missed wake-ups in rq_qos_throttle")
tried to fix a problem that a process could be sleeping in rq_qos_wait()
without anyone to wake it up. However the fix is not complete and the
following can still happen:
CPU1 (waiter1) CPU2 (waiter2) CPU3 (waker)
rq_qos_wait() rq_qos_wait()
acquire_inflight_cb() -> fails
acquire_inflight_cb() -> fails
completes IOs, inflight
decreased
prepare_to_wait_exclusive()
prepare_to_wait_exclusive()
has_sleeper = !wq_has_single_sleeper() -> true as there are two sleepers
has_sleeper = !wq_has_single_sleeper() -> true
io_schedule() io_schedule()
Deadlock as now there's nobody to wakeup the two waiters. The logic
automatically blocking when there are already sleepers is really subtle
and the only way to make it work reliably is that we check whether there
are some waiters in the queue when adding ourselves there. That way, we
are guaranteed that at least the first process to enter the wait queue
will recheck the waiting condition before going to sleep and thus
guarantee forward progress.
mFixes: 545fbd0775ba ("rq-qos: fix missed wake-ups in rq_qos_throttle")
CC: stable at vger.kernel.org
Signed-off-by: Jan Kara <jack at suse.cz>
Link: https://lore.kernel.org/r/20210607112613.25344-1-jack@suse.cz
Signed-off-by: Jens Axboe <axboe at kernel.dk>
Porting to vz7 notices:
- wait_queue_head in mainstream, but still wait_queue_head_t in vz7
- use waitqueue_active() in vz7 instead of straight list_empty() check in ms
https://virtuozzo.atlassian.net/browse/PSBM-153598
(cherry picked from commit 11c7aa0ddea8611007768d3e6b58d45dc60a19e1)
Ported-by: Konstantin Khorenko <khorenko at virtuozzo.com>
---
block/blk-wbt.c | 4 ++--
include/linux/wait.h | 2 +-
kernel/wait.c | 9 +++++++--
3 files changed, 10 insertions(+), 5 deletions(-)
diff --git a/block/blk-wbt.c b/block/blk-wbt.c
index 099678cd0d04..b9d9e1d68a40 100644
--- a/block/blk-wbt.c
+++ b/block/blk-wbt.c
@@ -571,8 +571,8 @@ static void __wbt_wait(struct rq_wb *rwb, enum wbt_flags wb_acct,
if (!has_sleeper && rq_wait_inc_below(rqw, get_limit(rwb, rw)))
return;
- prepare_to_wait_exclusive(&rqw->wait, &data.wq, TASK_UNINTERRUPTIBLE);
- has_sleeper = !wq_has_single_sleeper(&rqw->wait);
+ has_sleeper = !prepare_to_wait_exclusive(&rqw->wait, &data.wq,
+ TASK_UNINTERRUPTIBLE);
do {
/* The memory barrier in set_task_state saves us here. */
if (data.got_token)
diff --git a/include/linux/wait.h b/include/linux/wait.h
index 12075edebfd6..3583ceb21a78 100644
--- a/include/linux/wait.h
+++ b/include/linux/wait.h
@@ -1044,7 +1044,7 @@ extern long interruptible_sleep_on_timeout(wait_queue_head_t *q,
* Waitqueues which are removed from the waitqueue_head at wakeup time
*/
void prepare_to_wait(wait_queue_head_t *q, wait_queue_t *wait, int state);
-void prepare_to_wait_exclusive(wait_queue_head_t *q, wait_queue_t *wait, int state);
+bool prepare_to_wait_exclusive(wait_queue_head_t *q, wait_queue_t *wait, int state);
void finish_wait(wait_queue_head_t *q, wait_queue_t *wait);
void abort_exclusive_wait(wait_queue_head_t *q, wait_queue_t *wait,
unsigned int mode, void *key);
diff --git a/kernel/wait.c b/kernel/wait.c
index 15c1d0892c71..5303bb6de60b 100644
--- a/kernel/wait.c
+++ b/kernel/wait.c
@@ -79,17 +79,22 @@ prepare_to_wait(wait_queue_head_t *q, wait_queue_t *wait, int state)
}
EXPORT_SYMBOL(prepare_to_wait);
-void
+/* Returns true if we are the first waiter in the queue, false otherwise. */
+bool
prepare_to_wait_exclusive(wait_queue_head_t *q, wait_queue_t *wait, int state)
{
unsigned long flags;
+ bool was_empty = false;
wait->flags |= WQ_FLAG_EXCLUSIVE;
spin_lock_irqsave(&q->lock, flags);
- if (list_empty(&wait->task_list))
+ if (list_empty(&wait->task_list)) {
+ was_empty = !waitqueue_active(q);
__add_wait_queue_tail(q, wait);
+ }
set_current_state(state);
spin_unlock_irqrestore(&q->lock, flags);
+ return was_empty;
}
EXPORT_SYMBOL(prepare_to_wait_exclusive);
More information about the Devel
mailing list