[CRIU] [PATCH] restore: handle exit code of the unlock network script
Andrew Vagin
avagin at parallels.com
Tue Mar 25 06:41:19 PDT 2014
On Tue, Mar 25, 2014 at 05:23:07PM +0400, Pavel Emelyanov wrote:
> On 03/25/2014 05:13 PM, Andrew Vagin wrote:
> > On Tue, Mar 25, 2014 at 05:06:53PM +0400, Pavel Emelyanov wrote:
> >> On 03/25/2014 12:41 PM, Andrew Vagin wrote:
> >>> On Tue, Mar 25, 2014 at 02:27:33AM +0400, Pavel Emelyanov wrote:
> >>>> On 03/24/2014 03:07 PM, Andrey Vagin wrote:
> >>>>> When we are migrating processes from one host to another,
> >>>>> we need to know the moment when the processes can be killed on the
> >>>>> source host.
> >>>>> If a migration script is killed (segv, exception, etc.), the process
> >>>>> tree must not live on both nodes, and we need to reduce the chance of
> >>>>> killing processes.
> >>>>
> >>>> I didn't quite get why the existing scheme used by p.haul is flawed.
> >>>> Can you draw a two-sided diagram of source-destination interaction
> >>>> and show where the problem is and how you propose to solve it?
> >>>
> >>> source                          destination
> >>> criu dump
> >>> post-dump
> >>>                                 criu restore
> >>>                                 network unlock
> >>>                                 post-restore
> >>> kill p.haul before receiving cr_rpc.RESTORE
> >>> resume
> >>>
> >>> In this case both hosts will have alive process trees...
> >>>
> >>> And I want to move post-restore before network_unlock, because we can't
> >>> fail after unlocking the network.
> >>
> >> OK, but this patch does something different.
> >
> > No, it doesn't. It doesn't move post-restore; that will be done in another
> > patch. But network_unlock is the line after which the tree can't be
> > resumed on the source host.
> >
>
> OK, so this is preparatory.
> Show me the resulting 2-sided diagram you want to achieve
source                          destination
criu dump
post-dump
                                criu restore
                                network unlock
           <--- kill processes
exit from post_dump
           [ window ]
                                exit from network_unlock
                                resume the process tree
In this scheme you can kill p.haul at any moment, and the process tree
will still be resumed on only one side. We do have a small window in which
the tree will not be resumed at all.
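
To make the invariant concrete, here is a minimal sketch in Python of how
I read the scheme above. It is only a model: resumed_on() and its two
arguments are hypothetical names, not CRIU or p.haul code, and it assumes
that the source tree stays resumable until "kill processes" and that the
destination resumes only if the network unlock script exits with 0.

```python
def resumed_on(source_killed, unlock_exit_code):
    """Return the set of hosts where the process tree may run again.

    source_killed: True once the source processes have been killed
        (the "<--- kill processes" step in the diagram).
    unlock_exit_code: exit code of the network unlock script on the
        destination, or None if it never finished (p.haul was killed).
    """
    hosts = set()
    if not source_killed:
        # The source tree is still intact, so it can be resumed there.
        # The destination must not resume in this case.
        hosts.add("source")
    elif unlock_exit_code == 0:
        # The source tree is gone and the unlock script succeeded,
        # so the destination is the only place left to resume.
        hosts.add("destination")
    # Otherwise we are in the window: source killed, unlock script
    # not finished or failed, so the tree is resumed nowhere.
    return hosts


# Every combination yields at most one host.
for killed in (False, True):
    for code in (None, 0, 1):
        assert len(resumed_on(killed, code)) <= 1
```

The case resumed_on(True, None) is the window from the diagram: the result
is empty, so the tree is lost but never duplicated.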
> or send the full set.
I want to make sure that I have missed nothing before doing anything
else.
Thanks.