[Devel] Re: [PATCH 0/9] OpenVZ kernel based checkpointing/restart

Louis Rilling Louis.Rilling at kerlabs.com
Thu Oct 30 11:14:18 PDT 2008


On Thu, Oct 30, 2008 at 01:45:25PM -0400, Oren Laadan wrote:
> 
> 
> Louis Rilling wrote:
> > In Kerrighed this is kernel-based, and will remain kernel-based because we
> > checkpoint a distributed task tree, and want to restart it as much as possible
> > with the same distribution. The distributed protocol used for restart is
> > currently too fragile and complex to rely on customized user-space
> > implementations. That said, if someone brings very good arguments in favor of
> > userspace implementations, we might consider changing this.
> 
> Zap also has distributed checkpoint which does not require strict
> kernel-side ordering. Do you need that because you do SSI ?

Yes. Tasks from different nodes have parent-children, session leader, etc.
relationships, and the distributed management of struct pid lifecycle is a bit
touchy too. By the way, splitting the checkpoint image in one file for each task
helps us a lot to make restart parallel, because it is more efficient for the file
system to handle parallel reads of different files from different nodes than
parallel reads on a single file descriptor from different nodes.
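
The per-task layout described above can be sketched in user space as follows.
This is only an illustration, not Kerrighed code: the task_<pid>.img naming,
the helper names and the use of threads to stand in for cluster nodes are all
assumptions made for the example.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def image_path(directory, pid):
    # Hypothetical naming scheme: one image file per checkpointed task.
    return os.path.join(directory, f"task_{pid}.img")


def write_images(directory, states):
    # states maps pid -> opaque bytes standing in for the saved task state.
    for pid, blob in states.items():
        with open(image_path(directory, pid), "wb") as f:
            f.write(blob)


def read_image(directory, pid):
    # Each reader opens its own file, so no file offset is shared.
    with open(image_path(directory, pid), "rb") as f:
        return pid, f.read()


def parallel_restore(directory, pids, workers=4):
    # Threads stand in for the nodes that would each read their own tasks.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(lambda pid: read_image(directory, pid), pids))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        write_images(d, {101: b"state-101", 102: b"state-102"})
        print(parallel_restore(d, [101, 102]))
```

With a single image file, the readers would instead have to coordinate offsets
within one stream, which is the case the paragraph above argues against.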

> 
> > 
> > Without taking distributed restart into account, I also tend to prefer
> > kernel-based, mainly for two (not so strong) reasons:
> > 1) this prevents userspace from doing weird things, like changing the task tree
> > and leaving the kernel to detect it and deal with the mess this creates (think about
> > two threads being restarted in separate processes that do not even share their
> > parents). But one can argue that userspace can change the checkpoint image as
> > well, so that the kernel must check for such weird things anyway.
> 
> I don't really buy this argument. First, as you say, user can change
> the checkpoint image file. Second, you can verify in the kernel that
> the real relationships of the processes match those specified (and
> expected from) the image file. That's pretty straightforward.
> 
> > 2) restart will be more efficient with respect to shared objects.
> 
> Can you elaborate on this ?  In what sense "more efficient" ?
> 
> Note that the topic in question is not whether to do the entire restart
> from user space (and I argue that most work should be done in the kernel),
> but rather whether process creation (and only that) should be done in
> kernel or user space.

See my answer to Dave.
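
For reference, the check Oren mentions above (comparing the real process
relationships with those recorded in the image) can be sketched like this. The
record layout, pid -> (parent pid, session id), and the function name are
assumptions for the example; neither Zap nor Kerrighed is claimed to store or
check relationships this way.

```python
def verify_task_tree(expected, actual):
    """Compare recorded relationships with the ones actually observed.

    Both arguments map pid -> (ppid, sid). Returns a list of
    (pid, reason) tuples; an empty list means the trees match.
    """
    mismatches = []
    for pid, (ppid, sid) in sorted(expected.items()):
        if pid not in actual:
            mismatches.append((pid, "missing"))
            continue
        actual_ppid, actual_sid = actual[pid]
        if actual_ppid != ppid:
            mismatches.append((pid, "parent"))
        if actual_sid != sid:
            mismatches.append((pid, "session"))
    # Tasks that exist but were never described by the image are suspicious too.
    for pid in sorted(set(actual) - set(expected)):
        mismatches.append((pid, "unexpected"))
    return mismatches


if __name__ == "__main__":
    image = {10: (1, 10), 11: (10, 10), 12: (10, 10)}
    # Task 12 was re-parented to task 11 by a misbehaving restart tool.
    observed = {10: (1, 10), 11: (10, 10), 12: (11, 10)}
    print(verify_task_tree(image, observed))
```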

> 
> Quick thoughts of pros/cons of each approach are:
> 
> user space:
> 
> + re-use existing api (fork)
> + easier to debug
> + will allow 'handmade' resource restart: it was mentioned before that
>   one may want to reattach stdout to a different place after restart; a
>   user-based restart of processes can make this much easier (e.g. the
>   user process can create the alternative resources, give them to the
>   kernel and only then call sys_restart)
> + arch-independent code
> 
> - a bit slower than in kernel space
> - requires a clone-with-specific-pid syscall or interface
> 
> kernel space:
> 
> + a bit easier to control everything
> + a bit faster than user space
> + no need for user-visible interface for clone-with-...
> 
> - arch-dependent code
> - needs special code to fight 'fork-bomb'
> 
> So, I'm not convinced, and I even think there may be room for both, for
> the time being. I volunteer to support the user-space alternative while
> we make up our minds.

Yes, I hope that investigating both approaches will give us stronger arguments.
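
To make the user-space option from the list above concrete, here is a minimal
sketch of recreating a task tree with plain fork(). The tree description, the
logical task names and the pipe-based reporting are assumptions for the
example. The new processes get fresh pids: restoring the original ones would
need the clone-with-specific-pid interface listed as a drawback above, which
this sketch does not attempt.

```python
import os


def spawn_tree(tree, node, report_fd):
    # Report this task's logical name, pid and parent pid (one short line).
    os.write(report_fd, f"{node} {os.getpid()} {os.getppid()}\n".encode())
    children = []
    for child in tree.get(node, []):
        pid = os.fork()
        if pid == 0:
            try:
                spawn_tree(tree, child, report_fd)
            finally:
                os._exit(0)  # never return into the caller's code path
        children.append(pid)
    # Stay alive until all children are done so their getppid() is stable.
    for pid in children:
        os.waitpid(pid, 0)


def rebuild(tree, root):
    """Fork the tree rooted at `root`; return name -> (pid, ppid)."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        try:
            os.close(read_fd)
            spawn_tree(tree, root, write_fd)
        finally:
            os._exit(0)
    os.close(write_fd)
    with os.fdopen(read_fd) as reports:
        lines = reports.read().splitlines()  # EOF once every task has exited
    os.waitpid(pid, 0)
    result = {}
    for line in lines:
        name, task_pid, parent_pid = line.split()
        result[name] = (int(task_pid), int(parent_pid))
    return result


if __name__ == "__main__":
    print(rebuild({"leader": ["a", "b"], "a": ["c"]}, "leader"))
```

Each task reports its own parent, so the result can be compared against the
intended tree after the fact.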

Louis

-- 
Dr Louis Rilling			Kerlabs
Skype: louis.rilling			Batiment Germanium
Phone: (+33|0) 6 80 89 08 23		80 avenue des Buttes de Coesmes
http://www.kerlabs.com/			35700 Rennes