[CRIU] [PATCH] restore: handle exit code of the unlock network script

Pavel Emelyanov xemul at parallels.com
Tue Mar 25 06:45:32 PDT 2014


On 03/25/2014 05:41 PM, Andrew Vagin wrote:
> On Tue, Mar 25, 2014 at 05:23:07PM +0400, Pavel Emelyanov wrote:
>> On 03/25/2014 05:13 PM, Andrew Vagin wrote:
>>> On Tue, Mar 25, 2014 at 05:06:53PM +0400, Pavel Emelyanov wrote:
>>>> On 03/25/2014 12:41 PM, Andrew Vagin wrote:
>>>>> On Tue, Mar 25, 2014 at 02:27:33AM +0400, Pavel Emelyanov wrote:
>>>>>> On 03/24/2014 03:07 PM, Andrey Vagin wrote:
>>>>>>> When we are migrating processes from one host to another host,
>>>>>>> we need to know the moment, when processes can be killed on the source
>>>>>>> host.
>>>>>>> If a migration script is killed (segv, exception, etc), the process tree
>>>>>>> must not live on both nodes and we need to reduce the chance of
>>>>>>> killing processes.
>>>>>>
>>>>>> I didn't quite get why the existing scheme used by p.haul is flawed.
>>>>>> Can you draw a two-sided diagram of source-destination interaction
>>>>>> and show where the problem is and how you propose to solve it?
>>>>>
>>>>> source				destination
>>>>> criu dump
>>>>> post-dump
>>>>> 				criu restore
>>>>> 				network unlock
>>>>> 				post-restore
>>>>> 				kill p.haul before receiving cr_rpc.RESTORE
>>>>> resume
>>>>>
>>>>> In this case both hosts will have alive process trees...
>>>>>
>>>>> And I want to move post-restore before network_unlock, because we can't
>>>>> fail after unlocking network.
>>>>
>>>> OK, but this patch does something different.
>>>
>>> No, it doesn't. It doesn't move post-restore, it will be done in another
>>> patch. But network_unlock is a line after which the tree can't be
>>> resumed on the source host.
>>>
>>
>> OK, so this is preparatory.
>> Show me the resulting 2-sided diagram you want to achieve
> 
> source				destination
> criu dump
> post-dump
> 				criu restore
> 				network unlock
> 		<--- kill processes
> exit from post_dump
> 		[    window     ]
> 				exit from network_unlock
> 				resume the process tree
> 
> In this scheme you can kill p.haul at any moment, but the process tree
> will be resumed on only one side. And we have a small window when the
> tree will not be resumed at all.
> 
>> or send the full set.
> 
> I want to understand that I have missed nothing before doing anything
> else.

You said that you want to move post-restore before network-unlock,
but it's not in the diagram above. Probably this is what you missed.
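
For illustration only, the ordering under discussion (post-restore first,
then network-unlock, with the exit code of network-unlock deciding whether
the tree is resumed on the destination) could be sketched as below. The
function finish_restore and the run_stage callback are hypothetical names
made up for this sketch, not CRIU or p.haul code; only the stage names
come from the thread.

```python
# Hypothetical sketch: finish_restore and run_stage are illustrative
# names, not CRIU's or p.haul's actual API.
POST_RESTORE = "post-restore"
NETWORK_UNLOCK = "network-unlock"


def finish_restore(run_stage):
    """Return True if the destination may resume the process tree.

    run_stage(name) runs one script stage and returns its exit code.
    """
    # post-restore runs first: a failure here is still recoverable,
    # because the source has not killed its processes yet.
    if run_stage(POST_RESTORE) != 0:
        return False
    # network-unlock is the line after which the source can no longer
    # resume, so a non-zero exit code means the tree is not resumed here.
    if run_stage(NETWORK_UNLOCK) != 0:
        return False
    return True


def all_ok(name):
    return 0


def failing_unlock(name):
    return 1 if name == NETWORK_UNLOCK else 0


print(finish_restore(all_ok))
print(finish_restore(failing_unlock))
```

With this ordering the only stage that can fail after the point of no
return is network-unlock itself, which matches the "small window" in the
diagram.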

> Thanks.
> .
> 



