Fwd: Re: [CRIU] Signalling processes before CRIU/after unCRIU

Andrew Vagin avagin at parallels.com
Wed Oct 10 13:27:50 EDT 2012


On Wed, Oct 10, 2012 at 06:26:59PM +0400, Alex/AT wrote:
> > When we dump a container should we make sure systemd knows how to talk
> > to the rest of the zoo? This sounds ... strange.
> Yep. And determining the "parent" may be non-trivial too. It is probably
> best left to the user to signal the "parent" process.
> 
> When we suspend a large tree from the parent, every child process
> should be signalled before suspension, and we should wait for each one
> to respond (or fail to respond). Each process may be execution-halted
> as soon as it responds, and then all the non-responding processes in
> the tree should be execution-halted at once.
> 
> > Frankly, I don't want to re-invent TCP for such a simple case.
> It's not even needed. You were thinking about a library mechanism, and
> that is the best way in terms of predictability.

It is needed, because it's the easiest way to spread this functionality.

Nobody wants to link an additional library, because it is unreliable
and insecure, particularly if it does not use a standard mechanism
(like dbus).

I think we should look at the dbus documentation. If we find nothing, we
can ask for advice on a dbus mailing list. And if all attempts fail,
we can start to create our own mechanism.

> 
> So it comes down to a simple scheme.
> 
> Suspension:
> 
> 1. Check whether each process in the tree (starting from the
> user-specified process) has a "START SUSPEND" callback. If not, postpone
> its suspension until all processes that have callbacks are halted (step 3).
> 2. For each process with the callback:
> 2A. Flag the process as "IN SUSPEND".
> 2B. Enter the "START SUSPEND" callback. Its return code may specify
> whether the process wants to do some housekeeping or be postponed to
> step 3.
> 2C. Wait for the process to call the "COMPLETE" callback. On the
> "COMPLETE" call, execution-halt the process.
> 2D. Processes that have not responded within a given (user-specified)
> time go to step 3.
> 3. Execution-halt and suspend each process postponed in steps 1, 2B or 2D.
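
To make the ordering in the suspension steps above concrete, here is a
minimal in-memory sketch. It does not touch real processes and is not CRIU
code: the names Proc, Reply, suspend_tree and completes_after are all
hypothetical, and the moment a process calls "COMPLETE" is simulated by a
plain number rather than by real callbacks.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class Reply(Enum):
    """Hypothetical return codes of the "START SUSPEND" callback (step 2B)."""
    HOUSEKEEP = "housekeep"  # process will housekeep, then call "COMPLETE"
    POSTPONE = "postpone"    # process asks to be halted in step 3


@dataclass
class Proc:
    pid: int
    # Hypothetical "START SUSPEND" callback; None means the process has none.
    start_suspend: Optional[Callable[[], Reply]] = None
    # Simulated seconds until the process calls "COMPLETE"; None means never.
    completes_after: Optional[float] = None
    in_suspend: bool = False
    halted: bool = False


def suspend_tree(procs, timeout):
    """Halt every process per steps 1-3 and return the pids in halt order."""
    responders = []  # processes that call "COMPLETE" within the timeout
    postponed = []   # processes left for step 3
    for p in procs:
        if p.start_suspend is None:
            postponed.append(p)        # step 1: no callback, postpone
            continue
        p.in_suspend = True            # step 2A
        reply = p.start_suspend()      # step 2B
        if reply is Reply.POSTPONE:
            postponed.append(p)
        elif p.completes_after is not None and p.completes_after <= timeout:
            responders.append(p)       # step 2C: will be halted on "COMPLETE"
        else:
            postponed.append(p)        # step 2D: timed out
    halt_order = []
    # Step 2C: each responder is halted at the moment it calls "COMPLETE".
    for p in sorted(responders, key=lambda r: r.completes_after):
        p.halted = True
        halt_order.append(p.pid)
    # Step 3: everything postponed is halted afterwards, in one go.
    for p in postponed:
        p.halted = True
        halt_order.append(p.pid)
    return halt_order
```

With five processes, where pid 1 has no callback, pids 2 and 5 call
"COMPLETE" after 0.5 and 0.1 seconds, pid 3 never responds and pid 4 asks
to be postponed, suspend_tree(procs, timeout=1.0) halts them in the order
5, 2, 1, 3, 4.
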
> 
> Unsuspension:
> 
> 1. For all the processes:
> 1A. Restore the process and clear its "IN SUSPEND" flag.
> 1B. Execution-start it. This returns from the "COMPLETE" call in every
> process that was halted inside the "COMPLETE" callback.
> 
> "COMPLETE" calls from processes not marked as "IN SUSPEND" should be
> ignored. Such calls may be made by processes that timed out while
> housekeeping in step 2D and were suspended in the middle of housekeeping.
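
The rule about stale "COMPLETE" calls can be sketched the same way. This
is again only an in-memory model under assumed names (Proc, on_complete
and unsuspend are hypothetical, not an existing API): the restore clears
the "IN SUSPEND" flag first, so a late "COMPLETE" from a process that was
dumped in the middle of housekeeping is ignored instead of halting it
again.

```python
from dataclasses import dataclass


@dataclass
class Proc:
    pid: int
    in_suspend: bool = False
    halted: bool = True  # state of a process that has just been restored


def on_complete(proc):
    """Handle a "COMPLETE" call; return True if the call was acted on."""
    if not proc.in_suspend:
        # Stale call: the process timed out in step 2D and was suspended
        # while still housekeeping, so the call arrives only after restore.
        return False
    proc.halted = True  # normal case during suspension (step 2C)
    return True


def unsuspend(procs):
    """Steps 1A and 1B of the unsuspension sequence."""
    for p in procs:
        p.in_suspend = False  # 1A: restore and clear the flag
    for p in procs:
        p.halted = False      # 1B: execution-start
```

After unsuspend([p]), a call to on_complete(p) returns False and leaves p
running, while the same call on a process still flagged "IN SUSPEND" halts
it.
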
> 
> 
> -- 
> Regards,
> Alexey Asemov

