[Devel] [PATCH RH9 3/6] ve/cgroup: Implement per-ve workqueue

Pavel Tikhomirov ptikhomirov at virtuozzo.com
Mon Oct 18 15:50:07 MSK 2021


From: Valeriy Vdovin <valeriy.vdovin at virtuozzo.com>

The per-ve workqueue is started during ve cgroup start, before
ve->is_running is set to true, and stopped during ve cgroup stop, after
ve->is_running is set to false; the stop is guarded via synchronize_rcu.
Stopping also implies waiting for all queued but not yet finished works
to complete.

So a user of ve->wq should take rcu_read_lock and, under the lock, check
ve->is_running; if ve->is_running is true, the user can safely queue_work
on the workqueue and then release the lock.
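
The intended usage pattern can be sketched as follows (a minimal
illustration only; my_queue_on_ve is a hypothetical caller, not part of
this patch):

```c
/*
 * Hypothetical caller: queue a work item on ve->wq following the
 * rcu + is_running protocol described above.
 */
static void my_queue_on_ve(struct ve_struct *ve, struct work_struct *work)
{
	rcu_read_lock();
	if (ve->is_running)
		/* wq cannot be destroyed until we drop the rcu read lock */
		queue_work(ve->wq, work);
	rcu_read_unlock();
}
```

The synchronize_rcu call in ve_stop_ns ensures that once is_running is
observed as false, no new work can be queued before the workqueue is
destroyed.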

We need the per-ve workqueue for the per-ve release-agent
implementation. The release-agent code (in the next patches) will use
call_usermodehelper_ve from a per-ve workqueue worker to run the
release-agent binary in the actual ve context; this is why the
workqueue is started after umh is started and stopped before the umh
kthread is stopped.

Short history of the patch:

https://jira.sw.ru/browse/PSBM-83887
Signed-off-by: Valeriy Vdovin <valeriy.vdovin at virtuozzo.com>
Reviewed-by: Kirill Tkhai <ktkhai at virtuozzo.com>
(cherry picked from vz7 commit 0293870666c4f96bd56f612d94f560626c76e2fd)
https://jira.sw.ru/browse/PSBM-108270
(cherry picked from vz8 commit 58b7b3bb335b2bafca9caccfee8c435242baa664)
+++
cgroup/ve: Do not run release_agent on non-running ve
+++
cgroup/ve: Move ve workqueue stop to ve_stop_ns()
(cherry picked from vz8 commit 1481d130afa32eecffe28b34fd00b57e11d2666f)

vz9 changes:
- merge per-ve workqueue fixes from 1481d130afa3
- add comment to wq field about synchronization using rcu and is_running

https://jira.sw.ru/browse/PSBM-134002
Signed-off-by: Pavel Tikhomirov <ptikhomirov at virtuozzo.com>
---
 include/linux/ve.h |  3 +++
 kernel/ve/ve.c     | 32 ++++++++++++++++++++++++++++++++
 2 files changed, 35 insertions(+)

diff --git a/include/linux/ve.h b/include/linux/ve.h
index d3499853e6dd..474863746a48 100644
--- a/include/linux/ve.h
+++ b/include/linux/ve.h
@@ -104,6 +104,9 @@ struct ve_struct {
 	unsigned long		aio_nr;
 	unsigned long		aio_max_nr;
 #endif
+	/* Should take rcu_read_lock and check ve->is_running before queue */
+	struct workqueue_struct	*wq;
+
 	struct vfsmount		*devtmpfs_mnt;
 };
 
diff --git a/kernel/ve/ve.c b/kernel/ve/ve.c
index 827c6089a4c3..30ace4a07568 100644
--- a/kernel/ve/ve.c
+++ b/kernel/ve/ve.c
@@ -460,6 +460,22 @@ static int ve_start_kthreadd(struct ve_struct *ve)
 	return err;
 }
 
+static int ve_workqueue_start(struct ve_struct *ve)
+{
+	ve->wq = alloc_workqueue("ve_wq_%s", WQ_SYSFS|WQ_FREEZABLE|WQ_UNBOUND,
+				 8, ve->ve_name);
+
+	if (!ve->wq)
+		return -ENOMEM;
+	return 0;
+}
+
+static void ve_workqueue_stop(struct ve_struct *ve)
+{
+	destroy_workqueue(ve->wq);
+	ve->wq = NULL;
+}
+
 /* under ve->op_sem write-lock */
 static int ve_start_container(struct ve_struct *ve)
 {
@@ -504,6 +520,10 @@ static int ve_start_container(struct ve_struct *ve)
 	if (err)
 		goto err_umh;
 
+	err = ve_workqueue_start(ve);
+	if (err)
+		goto err_workqueue;
+
 	err = ve_hook_iterate_init(VE_SS_CHAIN, ve);
 	if (err < 0)
 		goto err_iterate;
@@ -523,6 +543,8 @@ static int ve_start_container(struct ve_struct *ve)
 err_mark_ve:
 	ve_hook_iterate_fini(VE_SS_CHAIN, ve);
 err_iterate:
+	ve_workqueue_stop(ve);
+err_workqueue:
 	ve_stop_umh(ve);
 err_umh:
 	ve_stop_kthreadd(ve);
@@ -552,6 +574,14 @@ void ve_stop_ns(struct pid_namespace *pid_ns)
 	 * ve_mutex works as barrier for ve_can_attach().
 	 */
 	ve->is_running = 0;
+	synchronize_rcu();
+
+	/*
+	 * release_agent works on top of umh_worker, so we must make sure, that
+	 * ve workqueue is stopped first.
+	 */
+	ve_workqueue_stop(ve);
+
 	/*
 	 * Neither it can be in pseudosuper state
 	 * anymore, setup it again if needed.
@@ -1531,6 +1561,8 @@ static int __init ve_subsys_init(void)
 {
 	ve_cachep = KMEM_CACHE_USERCOPY(ve_struct, SLAB_PANIC, core_pattern);
 	list_add(&ve0.ve_list, &ve_list_head);
+	ve0.wq = alloc_workqueue("ve0_wq", WQ_FREEZABLE|WQ_UNBOUND, 8);
+	BUG_ON(!ve0.wq);
 	return 0;
 }
 late_initcall(ve_subsys_init);
-- 
2.31.1
