[Devel] [PATCH RH7] cgroup: switch from cgroup_mutex to cgroup_root_mutex in proc_cgroup_show
Pavel Tikhomirov
ptikhomirov at virtuozzo.com
Wed Jun 1 16:33:25 MSK 2022
Please add the comment below to the commit message when merging.
On 01.06.2022 11:29, Pavel Tikhomirov wrote:
> We saw systemd reading /proc/<pid>/cgroup quite a lot, and we also see
> that cgroup_mutex is contended enough to make its waiters wait for
> several seconds.
>
> Assuming that cgroup_root_mutex is less contended, since roots rarely
> change, let's switch /proc/<pid>/cgroup to it.
Why can we do this?
- the roots list is protected by both cgroup_mutex and cgroup_root_mutex
- without cgroup_mutex, ve_hide_cgroups() can race with the current task
  being moved to another ve cgroup, but that's not too bad
- for a root on the roots list (i.e. not the "rootnode" root),
  root->subsys_list is also protected by both mutexes
- __task_cgroup_from_root() iterates over task->cgroups->cg_links while
  holding css_set_lock, which should be safe, and to protect the found
  cgroup from disappearing after we release css_set_lock, we take
  rcu_read_lock() (see the sketch after this list).
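
To make the resulting lock nesting concrete, here is the reader-side
pattern distilled from the proc_cgroup_show() hunk in the patch below
(a sketch only; tsk, root, buf and retval as in the patch):

	mutex_lock(&cgroup_root_mutex);	/* roots list + root->subsys_list stable */
	for_each_active_root(root) {
		struct cgroup *cgrp;

		rcu_read_lock();	/* keeps cgrp valid after css_set_lock is dropped */
		cgrp = __task_cgroup_from_root(tsk, root, true);
		retval = cgroup_path_ve(cgrp, buf, PAGE_SIZE);
		rcu_read_unlock();
		/* ... */
	}
	mutex_unlock(&cgroup_root_mutex);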
>
> https://jira.sw.ru/browse/PSBM-139206
>
> Signed-off-by: Pavel Tikhomirov <ptikhomirov at virtuozzo.com>
> ---
> Let's build rpm for it, test it on customer test nodes and see if it
> helps first.
> ---
> kernel/cgroup.c | 25 +++++++++++++++++++------
> 1 file changed, 19 insertions(+), 6 deletions(-)
>
> diff --git a/kernel/cgroup.c b/kernel/cgroup.c
> index 64f4227b2305..1f5d80130070 100644
> --- a/kernel/cgroup.c
> +++ b/kernel/cgroup.c
> @@ -705,13 +705,14 @@ static inline bool css_has_host_cgroups(struct css_set *css_set)
> * Return the cgroup for "task" from the given hierarchy. Must be
> * called with cgroup_mutex held.
> */
> -static struct cgroup *task_cgroup_from_root(struct task_struct *task,
> - struct cgroupfs_root *root)
> +static struct cgroup *__task_cgroup_from_root(struct task_struct *task,
> + struct cgroupfs_root *root,
> + int nocgmtx)
> {
> struct css_set *css;
> struct cgroup *res = NULL;
>
> - BUG_ON(!mutex_is_locked(&cgroup_mutex));
> + BUG_ON(!nocgmtx && !mutex_is_locked(&cgroup_mutex));
> read_lock(&css_set_lock);
> /*
> * No need to lock the task - since we hold cgroup_mutex the
> @@ -736,6 +737,12 @@ static struct cgroup *task_cgroup_from_root(struct task_struct *task,
> return res;
> }
>
> +static struct cgroup *task_cgroup_from_root(struct task_struct *task,
> + struct cgroupfs_root *root)
> +{
> + return __task_cgroup_from_root(task, root, false);
> +}
> +
> /*
> * There is one global cgroup mutex. We also require taking
> * task_lock() when dereferencing a task's cgroup subsys pointers.
> @@ -5578,7 +5585,8 @@ int proc_cgroup_show(struct seq_file *m, void *v)
>
> retval = 0;
>
> - mutex_lock(&cgroup_mutex);
> + /* without cgroup_mutex we must protect roots list */
> + mutex_lock(&cgroup_root_mutex);
>
> for_each_active_root(root) {
> struct cgroup_subsys *ss;
> @@ -5594,8 +5602,13 @@ int proc_cgroup_show(struct seq_file *m, void *v)
> seq_printf(m, "%sname=%s", count ? "," : "",
> root->name);
> seq_putc(m, ':');
> - cgrp = task_cgroup_from_root(tsk, root);
> +
> + /* without cgroup_mutex we take rcu to make cgrp valid */
> + rcu_read_lock();
> + cgrp = __task_cgroup_from_root(tsk, root, true);
> retval = cgroup_path_ve(cgrp, buf, PAGE_SIZE);
> + rcu_read_unlock();
> +
> if (retval < 0)
> goto out_unlock;
> seq_puts(m, buf);
> @@ -5603,7 +5616,7 @@ int proc_cgroup_show(struct seq_file *m, void *v)
> }
>
> out_unlock:
> - mutex_unlock(&cgroup_mutex);
> + mutex_unlock(&cgroup_root_mutex);
> put_task_struct(tsk);
> out_free:
> kfree(buf);
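
As a possible follow-up (not part of this patch, just a sketch): with
CONFIG_PROVE_RCU the relaxed path could assert its own locking instead
of skipping the check entirely, e.g.:

	if (nocgmtx)
		/* callers passing nocgmtx=true must be in an RCU read-side
		 * critical section; rcu_read_lock_held() is only meaningful
		 * under CONFIG_PROVE_RCU */
		WARN_ON_ONCE(!rcu_read_lock_held());
	else
		BUG_ON(!mutex_is_locked(&cgroup_mutex));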
--
Best regards, Tikhomirov Pavel
Software Developer, Virtuozzo.