[Devel] [PATCH RHEL7 COMMIT] sched: fix cfs_rq::nr_iowait accounting
Konstantin Khorenko
khorenko at virtuozzo.com
Thu May 30 17:39:10 MSK 2019
The commit is pushed to "branch-rh7-3.10.0-957.12.2.vz7.96.x-ovz" and will appear at https://src.openvz.org/scm/ovz/vzkernel.git
after rh7-3.10.0-957.12.2.vz7.96.9
------>
commit 4a84e35cab24b5088a1f0acdb7d1391da212e8fc
Author: Jan Dakinevich <jan.dakinevich at virtuozzo.com>
Date: Thu May 30 17:39:08 2019 +0300
sched: fix cfs_rq::nr_iowait accounting
After the recent RedHat rebase (b6be9ae "rh7: import RHEL7 kernel-3.10.0-957.12.2.el7"),
the following sequence:
update_stats_dequeue()
  dequeue_sleeper()
    cfs_rq->nr_iowait++
is called only conditionally, so cfs_rq::nr_iowait is incremented only when
schedstat_enabled() is true.
However, this counter is expected to be maintained independently of the rest
of the scheduler statistics gathering. To fix it, move the cfs_rq::nr_iowait
increment out of the schedstat_enabled() check.
Fixes: 4bf14016e ("sched: Account cfs_rq::nr_iowait")
https://jira.sw.ru/browse/PSBM-93850
Signed-off-by: Jan Dakinevich <jan.dakinevich at virtuozzo.com>
Reviewed-by: Kirill Tkhai <ktkhai at virtuozzo.com>
Reviewed-by: Konstantin Khorenko <khorenko at virtuozzo.com>
khorenko@ note: after this patch "nr_iowait" is accounted properly until disk
io limits are set for a Container and throttling is activated. Given that
"nr_iowait" is currently always broken, let's apply the current patch and
rework "nr_iowait" accounting later to honor the throttle code.
At the moment throttle_cfs_rq() increments nr_iowait (in dequeue_entity())
while unthrottle_cfs_rq() does not decrement it in enqueue_entity().
---
kernel/sched/core.c | 5 ++++-
kernel/sched/fair.c | 9 +++++++--
2 files changed, 11 insertions(+), 3 deletions(-)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 3e7c8004dafd..a9cc1fe90981 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2021,8 +2021,11 @@ static void try_to_wake_up_local(struct task_struct *p)
 	if (!(p->state & TASK_NORMAL))
 		goto out;
 
-	if (!task_on_rq_queued(p))
+	if (!task_on_rq_queued(p)) {
+		if (p->in_iowait && p->sched_class->nr_iowait_dec)
+			p->sched_class->nr_iowait_dec(p);
 		ttwu_activate(rq, p, ENQUEUE_WAKEUP);
+	}
 
 	ttwu_do_wakeup(rq, p, 0);
 	if (schedstat_enabled())
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 9a8f859c7f3e..39453200ff89 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -882,8 +882,6 @@ static void dequeue_sleeper(struct cfs_rq *cfs_rq, struct sched_entity *se)
 			se->statistics->sleep_start = rq_clock(rq_of(cfs_rq));
 		if (tsk->state & TASK_UNINTERRUPTIBLE)
 			se->statistics->block_start = rq_clock(rq_of(cfs_rq));
-		if (tsk->in_iowait)
-			cfs_rq->nr_iowait++;
 	} else if (!cfs_rq_throttled(group_cfs_rq(se))) {
 		if (group_cfs_rq(se)->nr_iowait)
 			se->statistics->block_start = rq_clock(rq_of(cfs_rq));
@@ -3266,6 +3264,13 @@ dequeue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
 	if (schedstat_enabled())
 		update_stats_dequeue(cfs_rq, se, flags);
 
+	if ((flags & DEQUEUE_SLEEP) && entity_is_task(se)) {
+		struct task_struct *tsk = task_of(se);
+
+		if (tsk->in_iowait)
+			cfs_rq->nr_iowait++;
+	}
+
 	clear_buddies(cfs_rq, se);
 
 	if (cfs_rq->prev == se)