[Devel] [PATCH v2 vz9] ve: cgroup -- don't use atomic lock in the sleepable context

Cyrill Gorcunov gorcunov at gmail.com
Tue Nov 9 13:07:10 MSK 2021


We should not keep @css_set_lock spinlock while removing group files
which leads to

 | BUG: sleeping function called from invalid context at kernel/locking/mutex.c:938

This is because cgroup_rm_file() function takes @kernfs_mutex itself.
Since we're keeping the reference to cset in VE itself we don't even
need to take spinlock at all, because cset won't disappear until last
put_ve() call.

https://jira.sw.ru/browse/PSBM-135460

Signed-off-by: Cyrill Gorcunov <gorcunov at gmail.com>
---
 kernel/cgroup/cgroup.c |   13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

Index: vzkernel/kernel/cgroup/cgroup.c
===================================================================
--- vzkernel.orig/kernel/cgroup/cgroup.c
+++ vzkernel/kernel/cgroup/cgroup.c
@@ -2152,9 +2152,9 @@ void cgroup_unmark_ve_roots(struct ve_st
 	struct cftype *cft;
 
 	cft = get_cftype_by_name(CGROUP_FILENAME_RELEASE_AGENT);
+	BUG_ON(!cft || cft->file_offset);
 
 	mutex_lock(&cgroup_mutex);
-	spin_lock_irq(&css_set_lock);
 
 	/*
 	 * We can safely use ve->ve_ns without rcu_read_lock here, as we are
@@ -2163,19 +2163,26 @@ void cgroup_unmark_ve_roots(struct ve_st
 	 */
 	cset = rcu_dereference_protected(ve->ve_ns,
 			lockdep_is_held(&ve->op_sem))->cgroup_ns->root_cset;
+	BUG_ON(!cset);
 
+	/*
+	 * Traversing @cgrp_links without @css_set_lock is safe here because
+	 * we ourself are keeping the reference to the @cset so it won't
+	 * disappear until the last reference get dropped upon ve destruction
+	 * via put_ve() call.
+	 */
 	list_for_each_entry(link, &cset->cgrp_links, cgrp_link) {
 		cgrp = link->cgrp;
 
 		if (!is_virtualized_cgroup(cgrp))
 			continue;
 
-		cgroup_rm_file(cgrp, cft);
+		if (!cgroup_is_dead(cgrp))
+			cgroup_rm_file(cgrp, cft);
 		rcu_assign_pointer(cgrp->ve_owner, NULL);
 		clear_bit(CGRP_VE_ROOT, &cgrp->flags);
 	}
 
-	spin_unlock_irq(&css_set_lock);
 	mutex_unlock(&cgroup_mutex);
 	/* ve_owner == NULL will be visible */
 	synchronize_rcu();


More information about the Devel mailing list