<html>

<head>

<meta http-equiv="Content-Type" content="text/html; charset=koi8-r">

<style type="text/css" style="display:none;"> P {margin-top:0;margin-bottom:0;} </style>

</head>

<body dir="ltr">

<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">

It seems the original patch mail didn't go through.</div>

<div id="appendonsend"></div>

<hr style="display:inline-block;width:98%" tabindex="-1">

<div id="divRplyFwdMsg" dir="ltr"><font face="Calibri, sans-serif" style="font-size:11pt" color="#000000"><b>From:</b> Konstantin Khorenko &lt;khorenko@virtuozzo.com&gt;<br>

<b>Sent:</b> October 28, 2025 19:15<br>

<b>To:</b> Pavel Tikhomirov &lt;ptikhomirov@virtuozzo.com&gt;; Aleksei Oladko &lt;aleksey.oladko@virtuozzo.com&gt;<br>

<b>Cc:</b> devel@openvz.org &lt;devel@openvz.org&gt;<br>

<b>Subject:</b> Re: [Devel] [RESEND PATCH vz10 1/2] ve/kthread: fix race when work can be added to stopped kthread worker #VSTOR-106887</font>

<div>&nbsp;</div>

</div>

<div class="BodyFragment"><font size="2"><span style="font-size:11pt;">

<div class="PlainText">And where is the original patch mail?<br>

<br>

I don't see it in the list:<br>

<a href="https://lists.openvz.org/pipermail/devel/2025-October/date.html">https://lists.openvz.org/pipermail/devel/2025-October/date.html</a><br>

<br>

--<br>

Best regards,<br>

<br>

Konstantin Khorenko,<br>

Virtuozzo Linux Kernel Team<br>

<br>

On 10/21/25 12:25, Pavel Tikhomirov wrote:<br>

&gt; Reviewed-by: Pavel Tikhomirov &lt;ptikhomirov@virtuozzo.com&gt;<br>

&gt; <br>

&gt; (for both patches, will merge it tomorrow)<br>

&gt; <br>

&gt; On 10/21/25 17:18, Aleksei Oladko wrote:<br>

&gt;&gt; This patch reintroduces the feature from commit 6d43ed1, which was reverted<br>

&gt;&gt; due to a scenario where a hanging user-mode-helper in one container could<br>

&gt;&gt; block the startup of other containers.<br>

&gt;&gt;<br>

&gt;&gt; The race can be reproduced with the steps below:<br>

&gt;&gt;<br>

&gt;&gt; 1) Add this test patch to increase the probability of the race:<br>

&gt;&gt;<br>

&gt;&gt; kernel/umh.c<br>

&gt;&gt; @@ -455,6 +456,8 @@ int call_usermodehelper_exec_ve(struct ve_struct *ve,<br>

&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; sub_info-&gt;queue = call_usermodehelper_queue_ve;<br>

&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; kthread_init_work(&amp;sub_info-&gt;ve_work,<br>

&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; call_usermodehelper_exec_work_ve);<br>

&gt;&gt; +&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; while (VE_IS_RUNNING(ve))<br>

&gt;&gt; +&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; cond_resched();<br>

&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; } else {<br>

&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; sub_info-&gt;queue = call_usermodehelper_queue;<br>

&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; INIT_WORK(&amp;sub_info-&gt;work, call_usermodehelper_exec_work);<br>

&gt;&gt;<br>

&gt;&gt; 2) Set core_pattern to the pipe type inside the CT:<br>

&gt;&gt;<br>

&gt;&gt; echo &quot;|/bin/dd of=/vz/coredumps/core-%e-%t.%p&quot; &gt; /proc/sys/kernel/core_pattern<br>

&gt;&gt;<br>

&gt;&gt; 3) Generate high CPU load on all cores using tools like stress-ng<br>

&gt;&gt;<br>

&gt;&gt; stress-ng --futex $(nproc) --timeout 60s<br>

&gt;&gt;<br>

&gt;&gt; 4) Produce a segfault inside the container, then try to stop the<br>

&gt;&gt; container by killing its init.<br>

&gt;&gt;<br>

&gt;&gt; The coredump creates a &quot;dd&quot; work and adds it to ve_umh_worker, which is<br>

&gt;&gt; already stopped and will never handle this work, so the work hangs<br>

&gt;&gt; forever and the container never stops:<br>

&gt;&gt;<br>

&gt;&gt; [&lt;0&gt;] call_usermodehelper_exec+0x168/0x1a0<br>

&gt;&gt; [&lt;0&gt;] call_usermodehelper_exec_ve+0x96/0xe0<br>

&gt;&gt; [&lt;0&gt;] do_coredump+0x60f/0xf40<br>

&gt;&gt; [&lt;0&gt;] get_signal+0x834/0x960<br>

&gt;&gt; [&lt;0&gt;] arch_do_signal_or_restart+0x29/0xf0<br>

&gt;&gt; [&lt;0&gt;] irqentry_exit_to_user_mode+0x12e/0x1a0<br>

&gt;&gt; [&lt;0&gt;] asm_exc_page_fault+0x26/0x30<br>

&gt;&gt;<br>

&gt;&gt; Fix is:<br>

&gt;&gt;<br>

&gt;&gt; 1) Before calling call_usermodehelper_exec for a ve, check that<br>

&gt;&gt; the ve is running, and queue the work only if it is.<br>

&gt;&gt;<br>

&gt;&gt; 2) Add separate helper counters for each ve.<br>

&gt;&gt;<br>

&gt;&gt; 3) In ve_stop_ns, after setting the VE_STATE_STOPPING state,<br>

&gt;&gt; wait for the running helpers count to reach 0 before stopping umh.<br>

&gt;&gt;<br>

&gt;&gt; There are 2 cases:<br>

&gt;&gt;<br>

&gt;&gt; If the call_usermodehelper_exec_ve thread reaches call_usermodehelper_exec<br>

&gt;&gt; after ve_stop_ns started wait_khelpers, it will already see that<br>

&gt;&gt; the ve is not in the running state and will not queue the work.<br>

&gt;&gt;<br>

&gt;&gt; If the call_usermodehelper_exec_ve thread reaches call_usermodehelper_exec<br>

&gt;&gt; before the VE_STATE_STOPPING state is set for the ve, then ve_stop_ns will<br>

&gt;&gt; wait in wait_khelpers until all works have been queued. After that all<br>

&gt;&gt; queued works will be processed in kthread_flush_worker.<br>

&gt;&gt;<br>

&gt;&gt; <a href="https://virtuozzo.atlassian.net/browse/VSTOR-106887">https://virtuozzo.atlassian.net/browse/VSTOR-106887</a><br>

&gt;&gt;<br>

&gt;&gt; Signed-off-by: Aleksei Oladko &lt;aleksey.oladko@virtuozzo.com&gt;<br>

&gt;&gt; ---<br>

&gt;&gt;&nbsp;&nbsp;&nbsp; include/linux/umh.h |&nbsp; 2 ++<br>

&gt;&gt;&nbsp;&nbsp;&nbsp; include/linux/ve.h&nbsp; |&nbsp; 2 ++<br>

&gt;&gt;&nbsp;&nbsp;&nbsp; kernel/umh.c&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; | 27 ++++++++++++++++++++++++++-<br>

&gt;&gt;&nbsp;&nbsp;&nbsp; kernel/ve/ve.c&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; |&nbsp; 5 +++++<br>

&gt;&gt;&nbsp;&nbsp;&nbsp; 4 files changed, 35 insertions(+), 1 deletion(-)<br>

&gt;&gt;<br>

&gt;&gt; diff --git a/include/linux/umh.h b/include/linux/umh.h<br>

&gt;&gt; index 5647f1e64e39..35fc4023df74 100644<br>

&gt;&gt; --- a/include/linux/umh.h<br>

&gt;&gt; +++ b/include/linux/umh.h<br>

&gt;&gt; @@ -56,6 +56,8 @@ extern int<br>

&gt;&gt;&nbsp;&nbsp;&nbsp; call_usermodehelper_exec_ve(struct ve_struct *ve,<br>

&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; struct subprocess_info *info, int wait);<br>

&gt;&gt;&nbsp;&nbsp;&nbsp; <br>

&gt;&gt; +extern void wait_khelpers(struct ve_struct *ve);<br>

&gt;&gt; +<br>

&gt;&gt;&nbsp;&nbsp;&nbsp; #else /* CONFIG_VE */<br>

&gt;&gt;&nbsp;&nbsp;&nbsp; <br>

&gt;&gt;&nbsp;&nbsp;&nbsp; #define call_usermodehelper_ve(ve, ...)&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; \<br>

&gt;&gt; diff --git a/include/linux/ve.h b/include/linux/ve.h<br>

&gt;&gt; index e944132f972f..37562dff25aa 100644<br>

&gt;&gt; --- a/include/linux/ve.h<br>

&gt;&gt; +++ b/include/linux/ve.h<br>

&gt;&gt; @@ -93,6 +93,8 @@ struct ve_struct {<br>

&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; struct kthread_worker&nbsp;&nbsp; *kthreadd_worker;<br>

&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; struct task_struct&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; *kthreadd_task;<br>

&gt;&gt;&nbsp;&nbsp;&nbsp; <br>

&gt;&gt; +&nbsp;&nbsp;&nbsp; atomic_t&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; umh_running_helpers;<br>

&gt;&gt; +&nbsp;&nbsp;&nbsp; struct wait_queue_head&nbsp; umh_helpers_waitq;<br>

&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; struct kthread_worker&nbsp;&nbsp; umh_worker;<br>

&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; struct task_struct&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; *umh_task;<br>

&gt;&gt;&nbsp;&nbsp;&nbsp; <br>

&gt;&gt; diff --git a/kernel/umh.c b/kernel/umh.c<br>

&gt;&gt; index bd49c108eb90..699403efe382 100644<br>

&gt;&gt; --- a/kernel/umh.c<br>

&gt;&gt; +++ b/kernel/umh.c<br>

&gt;&gt; @@ -447,9 +447,30 @@ static void call_usermodehelper_exec_work_ve(struct kthread_work *work)<br>

&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; }<br>

&gt;&gt;&nbsp;&nbsp;&nbsp; }<br>

&gt;&gt;&nbsp;&nbsp;&nbsp; <br>

&gt;&gt; +static void umh_running_helpers_inc(struct ve_struct *ve)<br>

&gt;&gt; +{<br>

&gt;&gt; +&nbsp;&nbsp;&nbsp; atomic_inc(&amp;ve-&gt;umh_running_helpers);<br>

&gt;&gt; +&nbsp;&nbsp;&nbsp; smp_mb__after_atomic();<br>

&gt;&gt; +}<br>

&gt;&gt; +<br>

&gt;&gt; +static void umh_running_helpers_dec(struct ve_struct *ve)<br>

&gt;&gt; +{<br>

&gt;&gt; +&nbsp;&nbsp;&nbsp; if (atomic_dec_and_test(&amp;ve-&gt;umh_running_helpers))<br>

&gt;&gt; +&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; wake_up(&amp;ve-&gt;umh_helpers_waitq);<br>

&gt;&gt; +}<br>

&gt;&gt; +<br>

&gt;&gt; +void wait_khelpers(struct ve_struct *ve)<br>

&gt;&gt; +{<br>

&gt;&gt; +&nbsp;&nbsp;&nbsp; wait_event(ve-&gt;umh_helpers_waitq,<br>

&gt;&gt; +&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; atomic_read(&amp;ve-&gt;umh_running_helpers) == 0);<br>

&gt;&gt; +}<br>

&gt;&gt; +EXPORT_SYMBOL(wait_khelpers);<br>

&gt;&gt; +<br>

&gt;&gt;&nbsp;&nbsp;&nbsp; int call_usermodehelper_exec_ve(struct ve_struct *ve,<br>

&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; struct subprocess_info *sub_info, int wait)<br>

&gt;&gt;&nbsp;&nbsp;&nbsp; {<br>

&gt;&gt; +&nbsp;&nbsp;&nbsp; int ret = -EBUSY;<br>

&gt;&gt; +<br>

&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; if (!ve_is_super(ve)) {<br>

&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; sub_info-&gt;ve = ve;<br>

&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; sub_info-&gt;queue = call_usermodehelper_queue_ve;<br>

&gt;&gt; @@ -460,7 +481,11 @@ int call_usermodehelper_exec_ve(struct ve_struct *ve,<br>

&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; INIT_WORK(&amp;sub_info-&gt;work, call_usermodehelper_exec_work);<br>

&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; }<br>

&gt;&gt;&nbsp;&nbsp;&nbsp; <br>

&gt;&gt; -&nbsp;&nbsp;&nbsp; return call_usermodehelper_exec(sub_info, wait);<br>

&gt;&gt; +&nbsp;&nbsp;&nbsp; umh_running_helpers_inc(ve);<br>

&gt;&gt; +&nbsp;&nbsp;&nbsp; if (VE_IS_RUNNING(ve))<br>

&gt;&gt; +&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; ret = call_usermodehelper_exec(sub_info, wait);<br>

&gt;&gt; +&nbsp;&nbsp;&nbsp; umh_running_helpers_dec(ve);<br>

&gt;&gt; +&nbsp;&nbsp;&nbsp; return ret;<br>

&gt;&gt;&nbsp;&nbsp;&nbsp; }<br>

&gt;&gt;&nbsp;&nbsp;&nbsp; EXPORT_SYMBOL(call_usermodehelper_exec_ve);<br>

&gt;&gt;&nbsp;&nbsp;&nbsp; <br>

&gt;&gt; diff --git a/kernel/ve/ve.c b/kernel/ve/ve.c<br>

&gt;&gt; index cb77f7b7e4cd..d5dc15942ab5 100644<br>

&gt;&gt; --- a/kernel/ve/ve.c<br>

&gt;&gt; +++ b/kernel/ve/ve.c<br>

&gt;&gt; @@ -88,6 +88,8 @@ struct ve_struct ve0 = {<br>

&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; .nd_neigh_nr&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; = ATOMIC_INIT(0),<br>

&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; .mnt_nr&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; = ATOMIC_INIT(0),<br>

&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; .meminfo_val&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; = VE_MEMINFO_SYSTEM,<br>

&gt;&gt; +&nbsp;&nbsp;&nbsp; .umh_running_helpers&nbsp;&nbsp;&nbsp; = ATOMIC_INIT(0),<br>

&gt;&gt; +&nbsp;&nbsp;&nbsp; .umh_helpers_waitq&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; = __WAIT_QUEUE_HEAD_INITIALIZER(ve0.umh_helpers_waitq),<br>

&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; .vdso_64&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; = (struct vdso_image*)&amp;vdso_image_64,<br>

&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; .vdso_32&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; = (struct vdso_image*)&amp;vdso_image_32,<br>

&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; .release_list_lock&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; = __SPIN_LOCK_UNLOCKED(<br>

&gt;&gt; @@ -480,6 +482,8 @@ static int ve_start_umh(struct ve_struct *ve)<br>

&gt;&gt;&nbsp;&nbsp;&nbsp; {<br>

&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; struct task_struct *task;<br>

&gt;&gt;&nbsp;&nbsp;&nbsp; <br>

&gt;&gt; +&nbsp;&nbsp;&nbsp; atomic_set(&amp;ve-&gt;umh_running_helpers, 0);<br>

&gt;&gt; +&nbsp;&nbsp;&nbsp; init_waitqueue_head(&amp;ve-&gt;umh_helpers_waitq);<br>

&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; kthread_init_worker(&amp;ve-&gt;umh_worker);<br>

&gt;&gt;&nbsp;&nbsp;&nbsp; <br>

&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; task = kthread_create_on_node_ve_flags(ve, 0, kthread_worker_fn,<br>

&gt;&gt; @@ -814,6 +818,7 @@ void ve_stop_ns(struct pid_namespace *pid_ns)<br>

&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; */<br>

&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; up_write(&amp;ve-&gt;op_sem);<br>

&gt;&gt;&nbsp;&nbsp;&nbsp; <br>

&gt;&gt; +&nbsp;&nbsp;&nbsp; wait_khelpers(ve);<br>

&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; /*<br>

&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; * release_agent works on top of umh_worker, so we must make sure, that<br>

&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; * ve workqueue is stopped first.<br>

&gt; <br>

<br>

</div>

</span></font></div>

</body>

</html>