[Devel] Re: [PATCH 0/9] OpenVZ kernel based checkpointing/restart

Louis Rilling Louis.Rilling at kerlabs.com
Thu Oct 30 11:01:33 PDT 2008


On Thu, Oct 30, 2008 at 10:08:44AM -0700, Dave Hansen wrote:
> On Thu, 2008-10-30 at 12:47 +0100, Louis Rilling wrote:
> > 1) this prevents userspace from doing weird things, like changing the task tree
> > and let the kernel detect it and deal with the mess this creates (think about
> > two threads being restarted in separate processes that do not even share their
> > parents). But one can argue that userspace can change the checkpoint image as
> > well, so that the kernel must check for such weird things anyway.
> 
> To me, this is one of the strongest arguments out there for doing
> restart as much as possible with existing user<->kernel APIs.  Having
> the kernel detect and clean up userspace's messes is not going to work.
> We might as well just do things in the kernel rather than do that.
> 
> What we *should* do is leverage all of the existing APIs that we already
> have instead of creating completely new code paths into which my butter
> fingers can introduce new kernel bugs.
> 
> > 2) restart will be more efficient with respect to shared objects.
> 
> Can you quantify this?  Which objects?  How much more efficient?

Quantify? No. I expect that investigating both approaches will show us numbers.
Unless Oren already has some?

Which objects? I think that two kinds will especially matter: objects usually
shared only inside a thread group (mm_struct, fs_struct, files_struct,
signal_struct and sighand_struct), and individual file descriptors. The point is
to avoid creating new structures only to destroy them right away because the
restarted task actually shares them with a previously restarted one.

Concerning individual file descriptors, limiting the number of files that the
restarting task has open before it calls sys_restart() may avoid these useless
creations/destructions (actually the "useless" work mainly consists in managing
ref counts, since open files are shared after fork()).

Concerning thread-shared structures, it is probably easy for userspace to guess
which clone flags to use when restarting threads, but
1) kernel-space will have to check that the sharing is correct anyway, and
2) kernel-space will have to fix it anyway if structures are not shared in an
obvious manner between tasks (think about A creating B with shared files_struct,
B creating C with shared files_struct, B unsharing its files_struct, and then
checkpoint: A and C end up sharing a files_struct that B, the only link between
them in the task tree, no longer uses).
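
To make the A/B/C case concrete, here is a small userspace model in Python. It
is only a sketch: Task, clone(), unshare_files() and the integer ids standing in
for files_struct are all hypothetical names invented for the illustration, not
kernel or checkpoint/restart APIs. It replays the history above, then derives
clone flags the "obvious" way (share with the parent iff parent and child share
at checkpoint time) and shows that the restarted sharing differs from the
checkpointed one.

```python
import itertools

# Hypothetical stand-in for files_struct identity: a fresh integer per object.
_next_id = itertools.count(1)


class Task:
    """Toy task: a name, a parent, and the id of the files_struct it uses."""

    def __init__(self, name, parent=None, files=None):
        self.name = name
        self.parent = parent
        self.files = files if files is not None else next(_next_id)


def clone(parent, name, share_files):
    """Model clone(): with share_files (think CLONE_FILES) reuse the parent's object."""
    return Task(name, parent, parent.files if share_files else None)


def unshare_files(task):
    """Model unshare(CLONE_FILES): the task gets a private object."""
    task.files = next(_next_id)


def checkpointed_scenario():
    """A creates B (shared), B creates C (shared), B unshares, then checkpoint."""
    a = Task("A")
    b = clone(a, "B", share_files=True)
    c = clone(b, "C", share_files=True)
    unshare_files(b)
    return [a, b, c]


def naive_clone_flags(tasks):
    """Guess the flag per child: share iff it shares with its parent at checkpoint."""
    return {t.name: t.files == t.parent.files for t in tasks if t.parent}


def naive_restart(tasks, flags):
    """Rebuild the tree (parents first) using only the guessed flags."""
    restored = {}
    for t in tasks:
        if t.parent is None:
            restored[t.name] = Task(t.name)
        else:
            restored[t.name] = clone(restored[t.parent.name], t.name, flags[t.name])
    return list(restored.values())


def sharing_groups(tasks):
    """Group task names by the files_struct they use."""
    groups = {}
    for t in tasks:
        groups.setdefault(t.files, []).append(t.name)
    return sorted(groups.values())


if __name__ == "__main__":
    saved = checkpointed_scenario()
    flags = naive_clone_flags(saved)
    print("checkpointed:", sharing_groups(saved))
    print("guessed flags:", flags)
    print("restarted:", sharing_groups(naive_restart(saved, flags)))
```

In this model the tree-derived flags cannot express "C shares with its
grandparent but not with its parent", so something else (here, by assumption,
the kernel side of restart) still has to compare the result against the image
and re-attach C to A's object.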

So, with a userspace implementation, useless structures will be created anyway,
and optimizing the common cases (regular threads) just duplicates the kernel's
work of checking which shared structure to use for each task to restart.
With a kernel-space implementation, all useless creations can be avoided, and no
duplicate work is needed.
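
The difference can be sketched as a simple count. The Python model below is an
assumption-laden illustration, not a measurement: tasks are given as
(name, parent, object id) tuples in restart order, the function names are made
up, and "userspace style" is assumed to mean "create a fresh object unless the
task shares with its parent, then let the kernel replace it if the checkpointed
object was already rebuilt". It counts objects only and says nothing about how
much time a useless creation actually costs.

```python
def userspace_style(tasks):
    """Return (created, useless) when sharing is guessed from the parent only.

    tasks: list of (name, parent_name_or_None, object_id), parents first.
    """
    rebuilt = set()   # checkpointed object ids that already exist again
    obj_of = {}       # task name -> checkpointed object id
    created = useless = 0
    for name, parent, obj in tasks:
        shares_with_parent = parent is not None and obj_of[parent] == obj
        if not shares_with_parent:
            created += 1              # a fresh object comes with the new task
            if obj in rebuilt:
                useless += 1          # it is thrown away for the existing one
        rebuilt.add(obj)
        obj_of[name] = obj
    return created, useless


def kernel_style(tasks):
    """Return (created, useless) when each distinct object is built exactly once."""
    return len({obj for _, _, obj in tasks}), 0


# Regular threads: everything is shared with the parent, both styles agree.
threads = [("T1", None, 1), ("T2", "T1", 1), ("T3", "T1", 1)]

# The A/B/C case: C shares with A, not with its parent B.
abc = [("A", None, 1), ("B", "A", 2), ("C", "B", 1)]

if __name__ == "__main__":
    for label, tasks in (("threads", threads), ("A/B/C", abc)):
        print(label, "userspace:", userspace_style(tasks),
              "kernel:", kernel_style(tasks))
```

Under these assumptions the two styles only diverge on the non-obvious sharing
patterns, which is exactly where the extra create/destroy pair shows up.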

That said, numbers may show us that useless creations are not so
time-consuming, but we won't know until we see them...

Louis

-- 
Dr Louis Rilling			Kerlabs
Skype: louis.rilling			Batiment Germanium
Phone: (+33|0) 6 80 89 08 23		80 avenue des Buttes de Coesmes
http://www.kerlabs.com/			35700 Rennes