[Devel] Re: [RFC v16][PATCH 23/43] c/r: restart multiple processes

Oren Laadan orenl at cs.columbia.edu
Wed May 27 14:38:46 PDT 2009



Alexey Dobriyan wrote:
> On Wed, May 27, 2009 at 01:32:49PM -0400, Oren Laadan wrote:
>> Restarting of multiple processes expects all restarting tasks to call
>> sys_restart(). Once inside the system call, each task will restart
>> itself in the same order in which the tasks were saved. The internals of the
>> syscall will take care of in-kernel synchronization between tasks.
>>
>> This patch does _not_ create the task tree in the kernel. Instead it
>> assumes that all tasks are created in some way and then invoke the
>> restart syscall. You can use the userspace mktree.c program to do
>> that.
>>
>> The init task (*) has a special role: it allocates the restart context
>> (ctx), and coordinates the operation. In particular, it first waits
>> until all participating tasks enter the kernel, and provides them with the
>> common restart context. Once everyone is ready, it begins to restart
>> itself.
>>
>> In contrast, the other tasks enter the kernel, locate the init task (*)
>> and grab its restart context, and then wait for their turn to restore.
>>
>> When a task (init or not) completes its restart, it hands the control
>> over to the next in line, by waking that task.
>>
>> An array of pids (the one saved during the checkpoint) is used to
>> synchronize the operation. The first task in the array is the init
>> task (*). The restart context (ctx) maintains a "current position" in
>> the array, which indicates which task is currently active. Once the
>> currently active task completes its own restart, it increments that
>> position and wakes up the next task.
>>
>> Restart assumes that userspace provides meaningful data, otherwise
>> it's garbage-in-garbage-out. In this case, the syscall may block
>> indefinitely, but in TASK_INTERRUPTIBLE, so the user can ctrl-c or
>> otherwise kill the stray restarting tasks.
>>
>> In terms of security, restart runs as the user that invokes it, so it
>> will not allow a user to do more than is otherwise permitted by the
>> usual system semantics and policy.
>>
>> Currently we ignore threads and zombies
> 
> Let's discuss threads and zombies.
> 
> 1. Will a zombie end up in an image?

Zombies will be mentioned in the hierarchy description, and will
have very little state saved (e.g. exit status, parent).

> 2. If yes, how will it be restored? Will it be forked, call restart(2)
>    and then somehow be zombified inside the kernel?

(Not part of this patchset, but it will soon be added to ckpt-v16-dev.)
A zombie will be restarted as a normal process, will restore the bare
minimum needed, and will then call do_exit(). It will have to ensure
that there are no side effects on (i.e. no signals sent to) its parent
or children.
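
As a rough userspace illustration of that idea (not the kernel code, and
not part of the patchset): a task is forked like any other, "restores"
only its saved exit status, and then exits without being reaped, so it
sits in the process table as a zombie. The helper names make_zombie()
and peek_zombie_status() are made up for this sketch, _exit() stands in
for the in-kernel do_exit(), and nothing here suppresses the SIGCHLD
that the parent would normally receive.

```c
#define _XOPEN_SOURCE 700
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: fork a child that immediately exits with the
 * exit status "saved" at checkpoint time. Until the parent reaps it,
 * the child remains a zombie. Returns the child's pid, or -1 on error.
 */
pid_t make_zombie(int saved_status)
{
    pid_t pid = fork();

    if (pid == 0)
        _exit(saved_status);    /* userspace stand-in for do_exit() */
    return pid;
}

/*
 * Hypothetical helper: wait until the child has exited and read its
 * exit status WITHOUT reaping it (WNOWAIT), so it stays a zombie.
 * Returns the exit status, or -1 on error or abnormal termination.
 */
int peek_zombie_status(pid_t pid)
{
    siginfo_t info;

    if (waitid(P_PID, (id_t)pid, &info, WEXITED | WNOWAIT) != 0)
        return -1;
    if (info.si_code != CLD_EXITED)
        return -1;
    return info.si_status;
}
```

A caller would use make_zombie(status) for each to-be-zombie in the
hierarchy and leave the reaping to whichever restored parent eventually
calls wait().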

> 3. How will a thread group be restored? Will every thread be CLONE_THREAD'ed?
>    What about exited thread group leaders: will they be forked, and then
>    the thread group CLONE_THREAD'ed?

First, user space creates the entire tree hierarchy, including
zombies. Then each task calls sys_restart(). Inside, the tasks are
coordinated to restore their state one after the other. Eventually,
the to-be-zombies, whether thread group leaders or not, will call
do_exit() and zombify themselves.
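
The one-after-the-other coordination can be pictured with a small
userspace simulation. This is only a sketch under assumptions: it uses
pthreads instead of real tasks inside sys_restart(), and the names
restart_ctx, active_pos, restart_task() and run_restart_sim() are
invented here. Each task waits until the shared "current position"
reaches its own slot, does its (fake) restore, then advances the
position and wakes the others. The real code wakes only the next task;
the sketch uses a broadcast for simplicity.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

#define NR_TASKS 4

/* Hypothetical stand-in for the shared restart context (ctx). */
struct restart_ctx {
    pthread_mutex_t lock;
    pthread_cond_t turn;
    int active_pos;         /* slot of the task whose turn it is */
    int nr_done;            /* number of tasks that finished restoring */
    int order[NR_TASKS];    /* slots in the order they actually ran */
};

struct task_arg {
    struct restart_ctx *ctx;
    int pos;                /* this task's slot in the saved pid array */
};

static void *restart_task(void *p)
{
    struct task_arg *arg = p;
    struct restart_ctx *ctx = arg->ctx;

    pthread_mutex_lock(&ctx->lock);
    /* Wait for our turn. */
    while (ctx->active_pos != arg->pos)
        pthread_cond_wait(&ctx->turn, &ctx->lock);

    /* "Restore our own state": here we only record that we ran. */
    assert(ctx->nr_done < NR_TASKS);
    ctx->order[ctx->nr_done++] = arg->pos;

    /* Hand control to the next task in line. */
    ctx->active_pos++;
    pthread_cond_broadcast(&ctx->turn);
    pthread_mutex_unlock(&ctx->lock);
    return NULL;
}

/* Runs the simulation; returns how many tasks completed. */
int run_restart_sim(int out_order[NR_TASKS])
{
    struct restart_ctx ctx;
    struct task_arg args[NR_TASKS];
    pthread_t tid[NR_TASKS];
    int i;

    pthread_mutex_init(&ctx.lock, NULL);
    pthread_cond_init(&ctx.turn, NULL);
    ctx.active_pos = 0;
    ctx.nr_done = 0;

    /* Start tasks in reverse, to show that start order is irrelevant. */
    for (i = NR_TASKS - 1; i >= 0; i--) {
        args[i].ctx = &ctx;
        args[i].pos = i;
        if (pthread_create(&tid[i], NULL, restart_task, &args[i]) != 0) {
            fprintf(stderr, "pthread_create failed\n");
            exit(EXIT_FAILURE);
        }
    }
    for (i = 0; i < NR_TASKS; i++)
        pthread_join(tid[i], NULL);
    for (i = 0; i < NR_TASKS; i++)
        out_order[i] = ctx.order[i];

    pthread_cond_destroy(&ctx.turn);
    pthread_mutex_destroy(&ctx.lock);
    return ctx.nr_done;
}
```

Even though the tasks are started last-to-first, they record themselves
in slot order, which is the property the restart relies on.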

Take a look at mktree.c (part of the user tools). It's already done
there using CLONE_THREAD. The reason I wrote that threads aren't well
supported is that I think that, in full-container mode, the link count
won't work correctly. Other than that, threads should work as long
as you don't play with "partial" sharing (e.g. only CLONE_FS).

Oren.

