[CRIU] [PATCH v2 0/3] Create pid_ns helpers as children of criu main task

Kirill Tkhai ktkhai at virtuozzo.com
Mon Jul 31 11:56:13 MSK 2017


On 20.07.2017 21:17, Andrei Vagin wrote:
> On Thu, Jul 20, 2017 at 10:51:32AM -0700, Andrei Vagin wrote:
>> On Fri, Jul 14, 2017 at 03:40:05PM +0300, Kirill Tkhai wrote:
>>> This patchset makes pid_ns helpers children
>>> of the criu main task. The main goal is to make the
>>> fail path of restore more stable. Currently we kill the
>>> helpers from usernsd, but it is possible that some task
>>> gets killed without releasing the usernsd mutex.
>>> In that case we may never destroy the pid_ns helpers
>>> and may hang. Creating them as children of the criu main
>>> task solves this problem: we just kill and wait for
>>> them directly.
>>>
>>
>> BTW: maybe we need to set PR_SET_PDEATHSIG for them? criu may
>> segfault too.
> 
> The idea of creating helpers in usernsd and destroying them from criu
> looks strange. We want to be sure that all processes will be killed in
> case criu crashes. Does this patch solve this problem?

Killing the helpers is not the problem, because they die once the init
of their pid namespace has died. The real reason is a bit more subtle.
Pid ns helpers are created as children of a single process (currently
the parent is usernsd; after the patch it will be the criu main task),
which makes them easier to order and to watch. We should be able to
detect the problem situation when a helper dies accidentally, and to
reap it in a sane way. If we don't reap it, the pid namespace reapers
won't die, as they wait for all tasks of the pid namespace.
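
To illustrate what "detect an accidental death and reap it" means, here
is a minimal standalone sketch. It is not the criu code: the names
watched_helper, helper_lost, on_sigchld() and demo_detect_helper_exit()
are made up for this example, and a plain fork() stands in for the real
helper creation. The parent installs a SIGCHLD handler, reaps the child
there and records that a watched helper has gone away.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical bookkeeping: pid of the single watched helper, 0 if none. */
static volatile sig_atomic_t watched_helper;
/* Set by the handler when the watched helper exits on its own. */
static volatile sig_atomic_t helper_lost;

static void on_sigchld(int sig)
{
	int saved_errno = errno;
	int status;
	pid_t pid;

	(void)sig;
	/* Reap every exited child without blocking. */
	while ((pid = waitpid(-1, &status, WNOHANG)) > 0) {
		if (watched_helper && pid == watched_helper) {
			watched_helper = 0;
			helper_lost = 1;
		}
	}
	errno = saved_errno;
}

/* Returns 1 if the unexpected exit was noticed, -1 on error. */
int demo_detect_helper_exit(void)
{
	struct sigaction sa = { .sa_handler = on_sigchld, .sa_flags = SA_RESTART };
	sigset_t block, old, wait_mask;
	pid_t pid;

	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGCHLD, &sa, NULL) < 0)
		return -1;

	/* Block SIGCHLD so the handler cannot run before the pid is recorded. */
	sigemptyset(&block);
	sigaddset(&block, SIGCHLD);
	if (sigprocmask(SIG_BLOCK, &block, &old) < 0)
		return -1;

	pid = fork();
	if (pid < 0) {
		sigprocmask(SIG_SETMASK, &old, NULL);
		return -1;
	}
	if (pid == 0)
		_exit(1); /* stand-in for a helper that dies accidentally */

	watched_helper = pid;

	/* Atomically unblock SIGCHLD and sleep until the handler reports. */
	wait_mask = old;
	sigdelset(&wait_mask, SIGCHLD);
	while (!helper_lost)
		sigsuspend(&wait_mask);

	sigprocmask(SIG_SETMASK, &old, NULL);
	return 1;
}
```

This only works because the helper is a child of the watching process:
an autoreaped or foreign task would give the parent neither the SIGCHLD
nor the exit status.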

So, summarizing the above:

1) We can't make the helpers autoreaped, because then problem situations
would go unnoticed; so we create them with SIGCHLD and handle an
unexpected exit in the signal handler in the standard way.

2) We kill the pid ns helpers if restore fails, and we wait for them,
so that the other tasks can be reaped in the standard way. If we do not
wait for the pid ns helpers first, the pid namespace init tasks won't be
able to become zombies and be reaped.

3) The patch makes pid ns helpers children of the criu root task, so
that we can kill them and wait for them directly from the criu task.
It's just the ordinary alpine strategy: a construction is safer when it
has fewer unstable components. The patch removes the unstable
components entirely.
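
The "kill and wait directly" part of the points above can be sketched as
follows. This is only an illustration under stated assumptions, not the
code from the patch: start_helper(), destroy_helper() and
destroy_all_helpers() are hypothetical names, and the helper here is an
ordinary forked child that sleeps until it is killed.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical: start a helper as a direct child; it sleeps until killed. */
pid_t start_helper(void)
{
	pid_t pid = fork();

	if (pid == 0) {
		for (;;)
			pause();
	}
	return pid; /* child pid in the parent, -1 on error */
}

/*
 * Kill one helper and reap it. Returns 0 if it was reaped after dying
 * from SIGKILL, 1 if it was reaped but had ended some other way, and
 * -1 on error.
 */
int destroy_helper(pid_t pid)
{
	int status;

	if (kill(pid, SIGKILL) < 0 && errno != ESRCH)
		return -1;

	/* The helper is our own child, so waitpid() works without a proxy. */
	while (waitpid(pid, &status, 0) < 0) {
		if (errno != EINTR)
			return -1;
	}
	return (WIFSIGNALED(status) && WTERMSIG(status) == SIGKILL) ? 0 : 1;
}

/* Fail path: destroy every helper; returns 0 only if all were killed and reaped. */
int destroy_all_helpers(const pid_t *pids, int n)
{
	int i, ret = 0;

	for (i = 0; i < n; i++) {
		if (destroy_helper(pids[i]) != 0)
			ret = -1;
	}
	return ret;
}
```

No lock and no intermediate process is involved in the fail path here,
so a task that dies while holding a mutex cannot prevent the cleanup.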

>>
>>> https://travis-ci.org/tkhai/criu/builds/253543048
>>>
>>> v2: Make the code more compact
>>>
>>> ---
>>>
>>> Kirill Tkhai (3):
>>>       pid_ns: Extract functionality of exit of pid_ns helper in function
>>>       utils: Add sys_clone_unified()
>>>       ns: Make pid_ns helpers as children of criu main process
>>>
>>>
>>>  criu/cr-restore.c         |    4 ++++
>>>  criu/include/namespaces.h |    1 +
>>>  criu/include/util.h       |    3 +++
>>>  criu/namespaces.c         |   49 ++++++++++++++-------------------------------
>>>  criu/util.c               |   25 ++++++++++++++---------
>>>  test/zdtm/lib/test.c      |   23 +++++++++++++--------
>>>  6 files changed, 52 insertions(+), 53 deletions(-)
>>>
>>> --
>>> Signed-off-by: Kirill Tkhai <ktkhai at virtuozzo.com>