[Devel] [PATCH RH8] sched: show CPU stats for a cgroup in cpu.proc.stat file

Konstantin Khorenko khorenko at virtuozzo.com
Mon Jul 12 15:34:11 MSK 2021


On 07/08/2021 02:02 PM, Evgenii Shatokhin wrote:
> To implement its policies, vcmmd needs stats for each CPU core used by a
> given container or VM, similar to what /proc/stat shows for the system
> as a whole. The VZ8 kernel already has the VZCTL_GET_CPU_STAT ioctl to
> fetch CPU stats, but it seems to return only total CPU times rather than
> per-core ones, which is not enough here.
>
> In VZ7, part of commit 33cf55658533 ("sched: use cpuacct->cpustat for showing
> cpu stats") added a "cpu.proc.stat" file to each cgroup with the "cpu"
> subsystem for that purpose. Data from both the "cpu" and "cpuacct" subsystems
> were needed, but it was assumed that these subsystems were always mounted
> together, so a cgroup could have either both or neither.
>
> This patch adds support for "cpu.proc.stat" to VZ8, building on top of the
> cpu_cgroup_proc_stat() machinery already ported here in commit
> 90368f957e01 ("ve/sched/stat: Introduce functions to calculate vcpustat data").
> Same as in VZ7, both "cpu" and "cpuacct" are needed. The file belongs to the
> "cpu" subsystem, for consistency with VZ7, so it looks up the "cpuacct" css
> through the cgroup.
>
> rcu_read_lock/unlock and css_get/put are probably not needed here (the
> file that belongs to the cgroup is open at the moment, so the cgroup cannot
> go away, and neither can the "cpu" subsystem). However, they are kept to
> make code analysis tools happier and to cover a theoretical scenario where
> the "cpuacct" subsystem is somehow used independently of the "cpu" subsystem.
>
> https://jira.sw.ru/browse/PSBM-101155
>
> Signed-off-by: Evgenii Shatokhin <eshatokhin at virtuozzo.com>

Reviewed-by: Konstantin Khorenko <khorenko at virtuozzo.com>

> ---
>  kernel/sched/core.c    |  6 ++++++
>  kernel/sched/cpuacct.c | 30 ++++++++++++++++++++++++++++++
>  2 files changed, 36 insertions(+)
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index c2880cf6cf60..bdd3217c5cc8 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -7398,6 +7398,8 @@ int cpu_cgroup_proc_loadavg(struct cgroup_subsys_state *css,
>  	return 0;
>  }
>
> +int cpu_cgroup_proc_stat_show(struct seq_file *sf, void *v);
> +
>  static struct cftype cpu_legacy_files[] = {
>  #ifdef CONFIG_FAIR_GROUP_SCHED
>  	{
> @@ -7446,6 +7448,10 @@ static struct cftype cpu_legacy_files[] = {
>  		.write_u64 = cpu_rt_period_write_uint,
>  	},
>  #endif
> +	{
> +		.name = "proc.stat",
> +		.seq_show = cpu_cgroup_proc_stat_show,
> +	},
>  	{ }	/* Terminate */
>  };
>
> diff --git a/kernel/sched/cpuacct.c b/kernel/sched/cpuacct.c
> index 33b6987700c2..a1522c878472 100644
> --- a/kernel/sched/cpuacct.c
> +++ b/kernel/sched/cpuacct.c
> @@ -779,3 +779,33 @@ int cpu_cgroup_get_stat(struct cgroup_subsys_state *cpu_css,
>
>  	return 0;
>  }
> +
> +int cpu_cgroup_proc_stat_show(struct seq_file *sf, void *v)
> +{
> +	struct cgroup_subsys_state *cpu_css = seq_css(sf);
> +	struct cgroup_subsys_state *cpuacct_css;
> +	int ret;
> +
> +	/*
> +	 * The cgroup the file is associated with should not disappear from
> +	 * under us (the file is open, after all). Still, it won't hurt to
> +	 * use RCU read-side lock as cgroup->subsys[] might need it.
> +	 */
> +	rcu_read_lock();
> +	/*
> +	 * Data from both 'cpu' and 'cpuacct' subsystems are needed. These
> +	 * subsystems are often used together, but let us check if 'cpuacct'
> +	 * is available for the cgroup, just in case.
> +	 */
> +	cpuacct_css = rcu_dereference(cpu_css->cgroup->subsys[cpuacct_cgrp_id]);
> +	if (!cpuacct_css) {
> +		rcu_read_unlock();
> +		return -ENOENT;
> +	}
> +	css_get(cpuacct_css);
> +	rcu_read_unlock();
> +
> +	ret = cpu_cgroup_proc_stat(cpu_css, cpuacct_css, sf);
> +	css_put(cpuacct_css);
> +	return ret;
> +}
>
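
For illustration only, here is a minimal userspace sketch of how a consumer
such as vcmmd might pick the per-core lines out of the new file. It is not
part of the patch. It assumes that cpu.proc.stat uses the same line layout as
/proc/stat ("cpuN user nice system idle ..."), which the description above
suggests but does not spell out. struct cpu_times and parse_cpu_line() are
hypothetical names introduced for this example.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical holder for the first four fields of one per-CPU line. */
struct cpu_times {
	int cpu;	/* core index taken from the "cpuN" prefix */
	unsigned long long user, nice, system, idle;
};

/*
 * Parse one line in the assumed /proc/stat layout, for example
 * "cpu1 10 20 30 40 ...". Returns 0 on success and -1 for anything
 * that is not a per-CPU line, including the aggregate "cpu " line,
 * which carries no core index.
 */
static int parse_cpu_line(const char *line, struct cpu_times *out)
{
	assert(line != NULL && out != NULL);

	/* Require a digit right after "cpu" so the aggregate line is skipped. */
	if (strncmp(line, "cpu", 3) != 0 || !isdigit((unsigned char)line[3]))
		return -1;

	if (sscanf(line, "cpu%d %llu %llu %llu %llu",
		   &out->cpu, &out->user, &out->nice,
		   &out->system, &out->idle) != 5)
		return -1;

	return 0;
}
```

A caller would read the file line by line with fgets() and pass each line to
parse_cpu_line(), ignoring the lines for which it returns -1. The location of
the file depends on where the "cpu" hierarchy is mounted, so no path is
hard-coded here.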

