[Devel] [PATCH RH8] sched: show CPU stats for a cgroup in cpu.proc.stat file

Evgenii Shatokhin eshatokhin at virtuozzo.com
Thu Jul 8 14:02:10 MSK 2021


To implement its policies, vcmmd needs stats for each CPU core used by a
given container or VM, similar to what /proc/stat shows for the system
as a whole. The VZ8 kernel already has the VZCTL_GET_CPU_STAT ioctl to
fetch CPU stats, but it seems to return only total CPU times rather than
per-core ones, which is not enough here.

In VZ7, part of commit 33cf55658533 ("sched: use cpuacct->cpustat for showing
cpu stats") added "cpu.proc.stat" file for each cgroup with "cpu" subsystem
for that purpose. Data from both "cpu" and "cpuacct" subsystems were needed,
but it was assumed that these subsystems were always mounted together, so
a cgroup could have either both or none.

This patch adds support for "cpu.proc.stat" to VZ8, building on top of
cpu_cgroup_proc_stat() machinery, already ported here in commit
90368f957e01 ("ve/sched/stat: Introduce functions to calculate vcpustat data").
Same as in VZ7, both "cpu" and "cpuacct" are needed. The file belongs to the
"cpu" subsystem, for consistency with VZ7, so its show handler looks up the
"cpuacct" css through the cgroup.
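
For illustration only, here is a minimal user-space sketch of how a
consumer such as vcmmd might parse the per-core lines. It assumes that
"cpu.proc.stat" follows the /proc/stat line layout ("cpuN user nice
system idle ..."), which this message implies but does not spell out.
struct cpu_times and parse_cpu_line() are hypothetical names, not part
of this patch or of vcmmd.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical holder for the first four time fields of a CPU line. */
struct cpu_times {
	int cpu;	/* core index, or -1 for the aggregate "cpu" line */
	unsigned long long user, nice, system, idle;
};

/*
 * Parse one line in the assumed /proc/stat layout.
 * Returns 0 on success, -1 if this is not a CPU line or if it has
 * fewer than four time fields.
 */
static int parse_cpu_line(const char *line, struct cpu_times *out)
{
	int n;

	if (strncmp(line, "cpu", 3) != 0)
		return -1;

	if (line[3] == ' ') {
		/* Aggregate line: "cpu  user nice system idle ..." */
		out->cpu = -1;
		n = sscanf(line + 3, "%llu %llu %llu %llu",
			   &out->user, &out->nice, &out->system, &out->idle);
		return n == 4 ? 0 : -1;
	}

	/* Per-core line: "cpuN user nice system idle ..." */
	if (!isdigit((unsigned char)line[3]))
		return -1;
	n = sscanf(line + 3, "%d %llu %llu %llu %llu", &out->cpu,
		   &out->user, &out->nice, &out->system, &out->idle);
	return n == 5 ? 0 : -1;
}
```

A caller would read the file line by line (for example with fgets())
and pass each line to parse_cpu_line(). The location of the file
depends on where the "cpu" cgroup hierarchy is mounted, which is not
covered here.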

rcu_read_lock/unlock and css_get/put are probably not needed here: the
file that belongs to the cgroup is open at the moment, so neither the
cgroup nor the "cpu" subsystem can go away. However, they are kept to
make code analysis tools happier and to cover a theoretical scenario
where the "cpuacct" subsystem is somehow used independently of the "cpu"
subsystem.

https://jira.sw.ru/browse/PSBM-101155

Signed-off-by: Evgenii Shatokhin <eshatokhin at virtuozzo.com>
---
 kernel/sched/core.c    |  6 ++++++
 kernel/sched/cpuacct.c | 30 ++++++++++++++++++++++++++++++
 2 files changed, 36 insertions(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index c2880cf6cf60..bdd3217c5cc8 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -7398,6 +7398,8 @@ int cpu_cgroup_proc_loadavg(struct cgroup_subsys_state *css,
 	return 0;
 }
 
+int cpu_cgroup_proc_stat_show(struct seq_file *sf, void *v);
+
 static struct cftype cpu_legacy_files[] = {
 #ifdef CONFIG_FAIR_GROUP_SCHED
 	{
@@ -7446,6 +7448,10 @@ static struct cftype cpu_legacy_files[] = {
 		.write_u64 = cpu_rt_period_write_uint,
 	},
 #endif
+	{
+		.name = "proc.stat",
+		.seq_show = cpu_cgroup_proc_stat_show,
+	},
 	{ }	/* Terminate */
 };
 
diff --git a/kernel/sched/cpuacct.c b/kernel/sched/cpuacct.c
index 33b6987700c2..a1522c878472 100644
--- a/kernel/sched/cpuacct.c
+++ b/kernel/sched/cpuacct.c
@@ -779,3 +779,33 @@ int cpu_cgroup_get_stat(struct cgroup_subsys_state *cpu_css,
 
 	return 0;
 }
+
+int cpu_cgroup_proc_stat_show(struct seq_file *sf, void *v)
+{
+	struct cgroup_subsys_state *cpu_css = seq_css(sf);
+	struct cgroup_subsys_state *cpuacct_css;
+	int ret;
+
+	/*
+	 * The cgroup the file is associated with should not disappear from
+	 * under us (the file is open, after all). Still, it won't hurt to
+	 * use RCU read-side lock as cgroup->subsys[] might need it.
+	 */
+	rcu_read_lock();
+	/*
+	 * Data from both 'cpu' and 'cpuacct' subsystems are needed. These
+	 * subsystems are often used together, but let us check if 'cpuacct'
+	 * is available for the cgroup, just in case.
+	 */
+	cpuacct_css = rcu_dereference(cpu_css->cgroup->subsys[cpuacct_cgrp_id]);
+	if (!cpuacct_css) {
+		rcu_read_unlock();
+		return -ENOENT;
+	}
+	css_get(cpuacct_css);
+	rcu_read_unlock();
+
+	ret = cpu_cgroup_proc_stat(cpu_css, cpuacct_css, sf);
+	css_put(cpuacct_css);
+	return ret;
+}
-- 
2.29.0


