[CRIU] [PATCH] restore: handle exit code of the unlock network script

Andrey Wagin avagin at gmail.com
Tue Mar 25 06:52:54 PDT 2014


2014-03-25 17:45 GMT+04:00 Pavel Emelyanov <xemul at parallels.com>:
> On 03/25/2014 05:41 PM, Andrew Vagin wrote:
>> On Tue, Mar 25, 2014 at 05:23:07PM +0400, Pavel Emelyanov wrote:
>>> On 03/25/2014 05:13 PM, Andrew Vagin wrote:
>>>> On Tue, Mar 25, 2014 at 05:06:53PM +0400, Pavel Emelyanov wrote:
>>>>> On 03/25/2014 12:41 PM, Andrew Vagin wrote:
>>>>>> On Tue, Mar 25, 2014 at 02:27:33AM +0400, Pavel Emelyanov wrote:
>>>>>>> On 03/24/2014 03:07 PM, Andrey Vagin wrote:
>>>>>>>> When we are migrating processes from one host to another host,
>>>>>>>> we need to know the moment, when processes can be killed on the source
>>>>>>>> host.
>>>>>>>> If a migration script is killed (segv, exception, etc), the process tree
>>>>>>>> must not live on both nodes and we need to reduce the chance of
>>>>>>>> killing processes.
>>>>>>>
>>>>>>> I didn't quite get why the existing scheme used by p.haul is flawed.
>>>>>>> Can you draw a two-sided diagram of source-destination interaction
>>>>>>> and show where the problem is and how you propose to solve it?
>>>>>>
>>>>>> source                            destination
>>>>>> criu dump
>>>>>> post-dump
>>>>>>                           criu restore
>>>>>>                           network unlock
>>>>>>                           post-restore
>>>>>>                           kill p.haul before receiving cr_rpc.RESTORE
>>>>>> resume
>>>>>>
>>>>>> In this case both hosts will have alive process trees...
>>>>>>
>>>>>> And I want to move post-restore before network_unlock, because we can't
>>>>>> fail after unlocking network.
>>>>>
>>>>> OK, but this patch does something different.
>>>>
>>>> No, it doesn't. It doesn't move post-restore, it will be done in another
>>>> patch. But network_unlock is a line after which the tree can't be
>>>> resumed on the source host.
>>>>
>>>
>>> OK, so this is preparatory.
>>> Show me the resulting 2-sided diagram you want to achieve
>>
>> source                                destination
>> criu dump
>> post-dump
>>                               criu restore
>>                               network unlock
>>               <--- kill processes
>> exit from post_dump
>>               [    window     ]
>>                               exit from network_unlock
>>                               resume the process tree
>>
>> In this scheme you can kill p.haul at any moment, but the process tree
>> will be resumed on only one side. And we have a small window during
>> which the tree will not be resumed at all.
>>
>>> or send the full set.
>>
>> I want to understand that I have missed nothing before doing anything
>> else.
>
> You said that you want to move post-restore before network-unlock,
> but it's not in the diagram above. Probably this.

You are trying to troll me. post_dump will not be interesting for
us if it is called before network_unlock. post_dump was put in
its current place by mistake.

        /*
         * -------------------------------------------------------------
         * Below this line nothing can fail, because network is unlocked
         */

        ret = restore_switch_stage(CR_STATE_RESTORE_CREDS);
        BUG_ON(ret);

        timing_stop(TIME_RESTORE);

        ret = run_scripts("post-restore");
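
To illustrate the idea in the subject line, here is a minimal standalone sketch of treating the exit code of the unlock script as the last point where restore may fail. It is not the patch itself and not CRIU code: run_hook() and finish_restore() are hypothetical names, and system() is used here only to keep the example short.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>

/*
 * Hypothetical helper (not CRIU's run_scripts()): run one hook command
 * through the shell and map its wait status to 0 or -1.
 */
static int run_hook(const char *cmd)
{
	int status;

	assert(cmd != NULL);

	status = system(cmd);
	if (status == -1)
		return -1;	/* could not create or wait for the child */
	if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
		return -1;	/* killed by a signal or non-zero exit code */

	return 0;
}

/*
 * Hypothetical tail of a restore: returns 0 only if the unlock hook
 * succeeded, i.e. only then may the process tree be resumed here.
 */
static int finish_restore(const char *unlock_cmd)
{
	if (run_hook(unlock_cmd) < 0) {
		fprintf(stderr, "network-unlock failed, not resuming\n");
		return -1;
	}

	/* Below this point nothing is allowed to fail. */
	return 0;
}
```

With this shape, a failing unlock script stops the destination from resuming the tree, so the source side can still be resumed.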


>
>> Thanks.
>> .
>>
>
>

