[CRIU] [PATCH] restore: define root_as_sibling before using it
Pavel Emelyanov
xemul at parallels.com
Wed Sep 10 03:18:31 PDT 2014
On 09/10/2014 12:42 AM, Tycho Andersen wrote:
> On Tue, Sep 09, 2014 at 11:58:00PM +0400, Pavel Emelyanov wrote:
>> On 09/09/2014 11:44 PM, Tycho Andersen wrote:
>>> On Tue, Sep 09, 2014 at 11:23:47PM +0400, Pavel Emelyanov wrote:
>>>> On 09/09/2014 10:15 PM, Tycho Andersen wrote:
>>>>> root_as_sibling is used in criu_signals_setup(), but was only defined later
>>>>> (when forking the root task for the first time). This meant that the
>>>>> SA_NOCLDSTOP was never masked off, which meant SIGCHLD was never delivered
>>>>> after ptracing the root task. Thus, when the a child of the root task died
>>>>> (e.g. from cr_system), the root task sat in PTRACE_STOP, and the restore task
>>>>> never PTRACE_CONT'd, resulting in a deadlock.
>>>>>
>>>>> We also drop the pdeath_sig constraint from setting root_as_sibling when in
>>>>> --restore-detached mode; in --restore-detached we /always/ need to have
>>>>> root_as_sibling, but we only need to clone the parent if pdeath_sig is set and
>>>>> we want to restore the task as alive.
>>>>>
>>>>> v2: re-work the condition for CLONE_PARENT
>>>>>
>>>>> Signed-off-by: Tycho Andersen <tycho.andersen at canonical.com>
>>>>> ---
>>>>> cr-restore.c | 30 ++++++++++++++++++++----------
>>>>> 1 file changed, 20 insertions(+), 10 deletions(-)
>>>>>
>>>>> diff --git a/cr-restore.c b/cr-restore.c
>>>>> index 2735d0d..fe6c798 100644
>>>>> --- a/cr-restore.c
>>>>> +++ b/cr-restore.c
>>>>> @@ -956,25 +956,21 @@ struct cr_clone_arg {
>>>>> static void maybe_clone_parent(struct pstree_item *item,
>>>>> struct cr_clone_arg *ca)
>>>>> {
>>>>> + /*
>>>>> + * zdtm runs in kernel 3.11, which has the problem described below. We
>>>>> + * avoid this by including the pdeath_sig test. Once users/zdtm migrate
>>>>> + * off of 3.11, this condition can be simplified to just test
>>>>> + * root_as_sibling.
>>>>> + */
>>>>> if (opts.swrk_restore ||
>>>>> (opts.restore_detach && ca->core->thread_core->pdeath_sig)) {
>>>>> /*
>>>>> - * This means we're called from lib's criu_restore_child().
>>>>> - * In that case create the root task as the child one to+
>>>>> - * the caller. This is the only way to correctly restore the
>>>>> - * pdeath_sig of the root task. But also looks nice.
>>>>> - *
>>>>> - * Alternatively, if we are --restore-detached, a similar trick is
>>>>> - * needed to correctly restore pdeath_sig and prevent processes from
>>>>> - * dying once restored.
>>>>> - *
>>>>> * There were a problem in kernel 3.11 -- CLONE_PARENT can't be
>>>>> * set together with CLONE_NEWPID, which has been solved in further
>>>>> * versions of the kernels, but we treat 3.11 as a base, so at
>>>>> * least warn a user about potential problems.
>>>>> */
>>>>> item->rst->clone_flags |= CLONE_PARENT;
>>>>> - root_as_sibling = 1;
>>>>> if (item->rst->clone_flags & CLONE_NEWPID)
>>>>> pr_warn("Set CLONE_PARENT | CLONE_NEWPID but it might cause restore problem,"
>>>>> "because not all kernels support such clone flags combinations!\n");
>>>>> @@ -1792,6 +1788,20 @@ int cr_restore_tasks(void)
>>>>> {
>>>>> int ret = -1;
>>>>>
>>>>> + if (opts.swrk_restore || opts.restore_detach) {
>>>>> + /*
>>>>> + * This means we're called from lib's criu_restore_child().
>>>>> + * In that case create the root task as the child one to+
>>>>> + * the caller. This is the only way to correctly restore the
>>>>> + * pdeath_sig of the root task. But also looks nice.
>>>>> + *
>>>>> + * Alternatively, if we are --restore-detached, a similar trick is
>>>>> + * needed to correctly restore pdeath_sig and prevent processes from
>>>>> + * dying once restored.
>>>>> + */
>>>>> + root_as_sibling = 1;
>>>>
>>>> The root_as_sibling and item->rst->clone_flags above should come
>>>> in sync, they don't work one w/o another. Thus the whole complex
>>>> logic about whether or not to do it should sit here and the
>>>> maybe_clone_parent() should just check the root_as_sibling.
>>>
>>> But isn't it wrong to --restore-detach and not set root_as_sibling?
>>
>> If we --restore-detach and don't set this variable the root task
>> would be born as criu's child, then criu would exit and the fresh
>> task would get re-parented to init. Why is it wrong from your
>> point of view?
>
> I guess I don't think either way is particularly wrong, but I think
> it should be one or the other always, not creating the process tree
> with some parent based on some internal state of the dumped process,
> since that is "hard" to predict.
More information about the CRIU
mailing list