[Devel] [PATCH vz9 16/27] sched: show CPU stats for a cgroup in cpu.proc.stat file
Nikita Yushchenko
nikita.yushchenko at virtuozzo.com
Wed Oct 6 11:57:31 MSK 2021
From: Evgenii Shatokhin <eshatokhin at virtuozzo.com>
To implement its policies, vcmmd needs stats for each CPU core used by a
given container or VM, similar to what /proc/stat shows for the system
as a whole. The VZ8 kernel already has the VZCTL_GET_CPU_STAT ioctl to
fetch CPU stats, but it seems to return only total CPU times rather than
per-core ones, which is not enough here.
In VZ7, part of commit 33cf55658533 ("sched: use cpuacct->cpustat for showing
cpu stats") added a "cpu.proc.stat" file for each cgroup with the "cpu"
subsystem for that purpose. Data from both the "cpu" and "cpuacct" subsystems
were needed, but it was assumed that these subsystems were always mounted
together, so a cgroup could have either both or neither.
This patch adds support for "cpu.proc.stat" to VZ8, building on top of the
cpu_cgroup_proc_stat() machinery already ported here in commit
90368f957e01 ("ve/sched/stat: Introduce functions to calculate vcpustat data").
As in VZ7, both "cpu" and "cpuacct" are needed. For consistency with VZ7, the
file belongs to the "cpu" subsystem, so it looks up the "cpuacct" state
through the cgroup.
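For illustration only, a userspace consumer such as vcmmd might parse the
per-core lines roughly as sketched below. This assumes that cpu.proc.stat
uses the same "cpuN user nice system idle ..." line layout as /proc/stat;
the commit message only says the data is "similar", so the layout is an
assumption. struct cpu_times and parse_cpu_line() are hypothetical names,
not part of this patch.

```c
#include <stdio.h>
#include <string.h>

/* First four /proc/stat columns, in USER_HZ ticks (assumed layout). */
struct cpu_times {
	int cpu;	/* core index, or -1 for the aggregate "cpu" line */
	unsigned long long user, nice, system, idle;
};

/*
 * Parse one "cpu ..." or "cpuN ..." line.
 * Returns 0 on success, -1 if the line is not a CPU line.
 */
static int parse_cpu_line(const char *line, struct cpu_times *t)
{
	if (strncmp(line, "cpu", 3) != 0)
		return -1;

	if (line[3] == ' ') {
		/* Aggregate line: no core index follows "cpu". */
		t->cpu = -1;
		if (sscanf(line + 3, "%llu %llu %llu %llu",
			   &t->user, &t->nice, &t->system, &t->idle) != 4)
			return -1;
		return 0;
	}

	/* Per-core line: the core index follows "cpu" directly. */
	if (sscanf(line + 3, "%d %llu %llu %llu %llu",
		   &t->cpu, &t->user, &t->nice, &t->system, &t->idle) != 5)
		return -1;
	return 0;
}
```

A caller would open the cgroup's cpu.proc.stat file, read it line by line
with fgets() and pass each line to parse_cpu_line(), skipping lines for
which it returns -1.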
rcu_read_lock/unlock and css_get/put are probably not needed here (the
file that belongs to the cgroup is open at the moment, so the cgroup cannot
go away, and neither can the "cpu" subsystem). However, they are kept to
make code analysis tools happier and to cover a theoretical scenario where
the "cpuacct" subsystem is somehow used independently of the "cpu" subsystem.
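The ordering described above (look up under the read-side lock, take a
reference before unlocking, use, then drop the reference) can be modeled in
plain userspace C as follows. This is a single-threaded sketch, not kernel
code: every model_* name is hypothetical, the lock is a simple depth counter
and the refcount is a plain int, so it shows only the order of operations
and the balance of get/put, not real concurrency safety.

```c
#include <stddef.h>

/* Hypothetical stand-in for a css; only the refcount is modeled. */
struct model_css {
	int refcnt;
};

enum { MODEL_CPU = 0, MODEL_CPUACCT = 1, MODEL_NR = 2 };

/* Hypothetical stand-in for cgroup->subsys[]; NULL means "not attached". */
struct model_cgroup {
	struct model_css *subsys[MODEL_NR];
};

static int model_lock_depth;	/* models read-side lock nesting */

static void model_read_lock(void)   { model_lock_depth++; }
static void model_read_unlock(void) { model_lock_depth--; }
static void model_css_get(struct model_css *css) { css->refcnt++; }
static void model_css_put(struct model_css *css) { css->refcnt--; }

/*
 * Returns 0 on success, -1 if "cpuacct" is not attached (the patch
 * returns -ENOENT in that case). *refs_during_use records the refcount
 * seen while the object is in use.
 */
static int model_proc_stat_show(struct model_cgroup *cg, int *refs_during_use)
{
	struct model_css *acct;

	model_read_lock();
	acct = cg->subsys[MODEL_CPUACCT];
	if (!acct) {
		model_read_unlock();
		return -1;
	}
	model_css_get(acct);	/* pin before leaving the read-side section */
	model_read_unlock();

	/* Stands in for the cpu_cgroup_proc_stat() call in the patch. */
	*refs_during_use = acct->refcnt;

	model_css_put(acct);	/* drop the reference taken above */
	return 0;
}
```

On both the success and the error path the lock depth returns to zero and
the reference count returns to its initial value, which is the property the
real function also has to preserve.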
https://jira.sw.ru/browse/PSBM-101155
Signed-off-by: Evgenii Shatokhin <eshatokhin at virtuozzo.com>
Reviewed-by: Konstantin Khorenko <khorenko at virtuozzo.com>
(cherry-picked from vz8 commit f79d2766b90a ("sched: show CPU stats for
a cgroup in cpu.proc.stat file"))
Signed-off-by: Nikita Yushchenko <nikita.yushchenko at virtuozzo.com>
---
kernel/sched/core.c | 6 ++++++
kernel/sched/cpuacct.c | 30 ++++++++++++++++++++++++++++++
2 files changed, 36 insertions(+)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 0402b26b92b3..e96fab5b93ec 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -10592,6 +10592,8 @@ int cpu_cgroup_proc_loadavg(struct cgroup_subsys_state *css,
return 0;
}
+int cpu_cgroup_proc_stat_show(struct seq_file *sf, void *v);
+
static struct cftype cpu_legacy_files[] = {
#ifdef CONFIG_FAIR_GROUP_SCHED
{
@@ -10659,6 +10661,10 @@ static struct cftype cpu_legacy_files[] = {
.write = cpu_uclamp_max_write,
},
#endif
+ {
+ .name = "proc.stat",
+ .seq_show = cpu_cgroup_proc_stat_show,
+ },
{ } /* Terminate */
};
diff --git a/kernel/sched/cpuacct.c b/kernel/sched/cpuacct.c
index 5d7b02253e2b..489211d78f84 100644
--- a/kernel/sched/cpuacct.c
+++ b/kernel/sched/cpuacct.c
@@ -785,3 +785,33 @@ int cpu_cgroup_get_stat(struct cgroup_subsys_state *cpu_css,
return 0;
}
+
+int cpu_cgroup_proc_stat_show(struct seq_file *sf, void *v)
+{
+ struct cgroup_subsys_state *cpu_css = seq_css(sf);
+ struct cgroup_subsys_state *cpuacct_css;
+ int ret;
+
+ /*
+ * The cgroup the file is associated with should not disappear from
+ * under us (the file is open, after all). Still, it won't hurt to
+ * use RCU read-side lock as cgroup->subsys[] might need it.
+ */
+ rcu_read_lock();
+ /*
+ * Data from both 'cpu' and 'cpuacct' subsystems are needed. These
+ * subsystems are often used together, but let us check if 'cpuacct'
+ * is available for the cgroup, just in case.
+ */
+ cpuacct_css = rcu_dereference(cpu_css->cgroup->subsys[cpuacct_cgrp_id]);
+ if (!cpuacct_css) {
+ rcu_read_unlock();
+ return -ENOENT;
+ }
+ css_get(cpuacct_css);
+ rcu_read_unlock();
+
+ ret = cpu_cgroup_proc_stat(cpu_css, cpuacct_css, sf);
+ css_put(cpuacct_css);
+ return ret;
+}
--
2.30.2