[CRIU] [PATCH] restore: handle exit code of the unlock network script

Andrew Vagin avagin at parallels.com
Tue Mar 25 06:41:19 PDT 2014


On Tue, Mar 25, 2014 at 05:23:07PM +0400, Pavel Emelyanov wrote:
> On 03/25/2014 05:13 PM, Andrew Vagin wrote:
> > On Tue, Mar 25, 2014 at 05:06:53PM +0400, Pavel Emelyanov wrote:
> >> On 03/25/2014 12:41 PM, Andrew Vagin wrote:
> >>> On Tue, Mar 25, 2014 at 02:27:33AM +0400, Pavel Emelyanov wrote:
> >>>> On 03/24/2014 03:07 PM, Andrey Vagin wrote:
> >>>>> When we are migrating processes from one host to another host,
> >>>>> we need to know the moment, when processes can be killed on the source
> >>>>> host.
> >>>>> If a migration script is killed (segv, exception, etc), the process tree
> >>>>> must not live on both nodes and we need to reduce the chance of
> >>>>> killing processes.
> >>>>
> >>>> I didn't quite get why the existing scheme used by p.haul is flawed.
> >>>> Can you draw a two-sided diagram of source-destination interaction
> >>>> and show where the problem is and how you propose to solve it?
> >>>
> >>> source				destination
> >>> criu dump
> >>> post-dump
> >>> 				criu restore
> >>> 				network unlock
> >>> 				post-restore
> >>> 				kill p.haul before receiving cr_rpc.RESTORE
> >>> resume
> >>>
> >>> In this case both hosts will have alive process trees...
> >>>
> >>> And I want to move post-restore before network_unlock, because we can't
> >>> fail after unlocking network.
> >>
> >> OK, but this patch does something different.
> > 
> > No, it doesn't. It doesn't move post-restore, it will be done in another
> > patch. But network_unlock is a line after which the tree can't be
> > resumed on the source host.
> > 
> 
> OK, so this is preparatory.
> Show me the resulting 2-sided diagram you want to achieve

source				destination
criu dump
post-dump
				criu restore
				network unlock
		<--- kill processes
exit from post_dump
		[    window     ]
				exit from network_unlock
				resume the process tree

In this scheme you can kill p.haul at any moment, and the process tree
will still be resumed on only one side. There is a small window in
which the tree will not be resumed at all.
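
As a rough sketch of the diagram above (not p.haul or CRIU code): the
step names and the resumed_on() helper are hypothetical, and the model
assumes that a failure before the "kill processes" message lets the
source resume while the restore on the destination is aborted.

```python
from enum import IntEnum


class Step(IntEnum):
    # Steps of the proposed scheme, in the order shown in the diagram.
    CRIU_DUMP = 0
    POST_DUMP = 1            # source blocks inside the post-dump script
    CRIU_RESTORE = 2
    NETWORK_UNLOCK = 3       # destination enters the network-unlock script
    KILL_PROCESSES = 4       # destination tells the source to kill its tree
    EXIT_NETWORK_UNLOCK = 5  # network-unlock script returns success
    RESUME = 6               # tree is resumed on the destination


def resumed_on(last_completed):
    """Return the side that ends up with a running tree if the migration
    tool is killed right after `last_completed` (None: before any step)."""
    if last_completed is None or last_completed < Step.KILL_PROCESSES:
        # Assumption: the source tree was never killed, so it can resume,
        # and the destination aborts the restore.
        return "source"
    if last_completed < Step.EXIT_NETWORK_UNLOCK:
        # The window: source tree is gone, destination has not resumed yet.
        return "neither"
    return "destination"


if __name__ == "__main__":
    for step in Step:
        print(f"{step.name:20} -> {resumed_on(step)}")
```

Whatever step the tool dies after, the result is "source", "neither" or
"destination", never both.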

> or send the full set.

I want to make sure that I have not missed anything before doing
anything else.

Thanks.

