[Devel] [PATCH RH8] sched: show CPU stats for a cgroup in cpu.proc.stat file
Konstantin Khorenko
khorenko at virtuozzo.com
Mon Jul 12 15:34:11 MSK 2021
On 07/08/2021 02:02 PM, Evgenii Shatokhin wrote:
> To implement its policies, vcmmd needs stats for each CPU core used by a
> given container or VM, similar to what /proc/stat shows for the system
> as a whole. The VZ8 kernel already has the VZCTL_GET_CPU_STAT ioctl to fetch
> CPU stats, but it seems to return only total CPU times rather than
> per-core ones, which is not enough here.
>
> In VZ7, part of commit 33cf55658533 ("sched: use cpuacct->cpustat for showing
> cpu stats") added "cpu.proc.stat" file for each cgroup with "cpu" subsystem
> for that purpose. Data from both "cpu" and "cpuacct" subsystems were needed,
> but it was assumed that these subsystems were always mounted together, so
> a cgroup could have either both or none.
>
> This patch adds support for "cpu.proc.stat" to VZ8, building on top of
> cpu_cgroup_proc_stat() machinery, already ported here in commit
> 90368f957e01 ("ve/sched/stat: Introduce functions to calculate vcpustat data").
> Same as in VZ7, both "cpu" and "cpuacct" are needed. The file belongs to "cpu"
> subsystem, for consistency with VZ7, so it gets "cpuacct" from the cgroup.
>
> rcu_read_lock/unlock and css_get/put are probably not needed here (the
> file that belongs to the cgroup is open at the moment, so the cgroup cannot
> go away, and neither can the "cpu" subsystem). However, they are kept to
> keep code analysis tools happy and to cover a theoretical scenario where
> the "cpuacct" subsystem is somehow used independently of the "cpu" subsystem.
>
> https://jira.sw.ru/browse/PSBM-101155
>
> Signed-off-by: Evgenii Shatokhin <eshatokhin at virtuozzo.com>
Reviewed-by: Konstantin Khorenko <khorenko at virtuozzo.com>
> ---
> kernel/sched/core.c | 6 ++++++
> kernel/sched/cpuacct.c | 30 ++++++++++++++++++++++++++++++
> 2 files changed, 36 insertions(+)
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index c2880cf6cf60..bdd3217c5cc8 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -7398,6 +7398,8 @@ int cpu_cgroup_proc_loadavg(struct cgroup_subsys_state *css,
> return 0;
> }
>
> +int cpu_cgroup_proc_stat_show(struct seq_file *sf, void *v);
> +
> static struct cftype cpu_legacy_files[] = {
> #ifdef CONFIG_FAIR_GROUP_SCHED
> {
> @@ -7446,6 +7448,10 @@ static struct cftype cpu_legacy_files[] = {
> .write_u64 = cpu_rt_period_write_uint,
> },
> #endif
> +	{
> +		.name = "proc.stat",
> +		.seq_show = cpu_cgroup_proc_stat_show,
> +	},
> { } /* Terminate */
> };
>
> diff --git a/kernel/sched/cpuacct.c b/kernel/sched/cpuacct.c
> index 33b6987700c2..a1522c878472 100644
> --- a/kernel/sched/cpuacct.c
> +++ b/kernel/sched/cpuacct.c
> @@ -779,3 +779,33 @@ int cpu_cgroup_get_stat(struct cgroup_subsys_state *cpu_css,
>
> return 0;
> }
> +
> +int cpu_cgroup_proc_stat_show(struct seq_file *sf, void *v)
> +{
> +	struct cgroup_subsys_state *cpu_css = seq_css(sf);
> +	struct cgroup_subsys_state *cpuacct_css;
> +	int ret;
> +
> +	/*
> +	 * The cgroup the file is associated with should not disappear from
> +	 * under us (the file is open, after all). Still, it won't hurt to
> +	 * use RCU read-side lock as cgroup->subsys[] might need it.
> +	 */
> +	rcu_read_lock();
> +	/*
> +	 * Data from both 'cpu' and 'cpuacct' subsystems are needed. These
> +	 * subsystems are often used together, but let us check if 'cpuacct'
> +	 * is available for the cgroup, just in case.
> +	 */
> +	cpuacct_css = rcu_dereference(cpu_css->cgroup->subsys[cpuacct_cgrp_id]);
> +	if (!cpuacct_css) {
> +		rcu_read_unlock();
> +		return -ENOENT;
> +	}
> +	css_get(cpuacct_css);
> +	rcu_read_unlock();
> +
> +	ret = cpu_cgroup_proc_stat(cpu_css, cpuacct_css, sf);
> +	css_put(cpuacct_css);
> +	return ret;
> +}
>
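For illustration only, a user-space consumer of the new file might parse its
per-core lines roughly as sketched below. This assumes that cpu.proc.stat
uses the same "cpuN user nice system idle iowait irq softirq steal ..."
layout as /proc/stat, which the commit message suggests but the patch itself
does not show. The struct cpu_times type and the parse_cpu_line() helper are
hypothetical names, not part of the patch or of vcmmd.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical container for one "cpu"/"cpuN" line (times in clock ticks). */
struct cpu_times {
	int cpu;	/* core index, or -1 for the aggregate "cpu" line */
	unsigned long long user, nice, system, idle;
	unsigned long long iowait, irq, softirq, steal;
};

/*
 * Parse one line in /proc/stat format (assumed to match cpu.proc.stat).
 * Returns 0 on success, -1 if the line is not a usable "cpu"/"cpuN" line.
 */
static int parse_cpu_line(const char *line, struct cpu_times *t)
{
	int n = 0;

	assert(line && t);
	memset(t, 0, sizeof(*t));

	if (strncmp(line, "cpu", 3) != 0)
		return -1;
	line += 3;

	if (*line == ' ') {
		t->cpu = -1;	/* aggregate line: "cpu  ..." */
	} else {
		if (sscanf(line, "%d%n", &t->cpu, &n) != 1)
			return -1;
		line += n;
	}

	/* Fields past "idle" may be absent; they stay zero in that case. */
	if (sscanf(line, "%llu %llu %llu %llu %llu %llu %llu %llu",
		   &t->user, &t->nice, &t->system, &t->idle,
		   &t->iowait, &t->irq, &t->softirq, &t->steal) < 4)
		return -1;

	return 0;
}
```

A caller would read the file line by line (for example with fgets()) from
the cgroup's directory in the "cpu" hierarchy and pass each line to the
helper, skipping lines for which it returns -1.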