[Devel] Re: [RFC][v6][PATCH 9/9]: Document clone_with_pids() syscall
Randy Dunlap
randy.dunlap at oracle.com
Thu Sep 10 08:26:59 PDT 2009
On Wed, 9 Sep 2009 23:14:13 -0700 Sukadev Bhattiprolu wrote:
>
> Subject: [RFC][v6][PATCH 9/9]: Document clone_with_pids() syscall
>
> This gives a brief overview of the clone_with_pids() system call. We should
> eventually describe more details either in clone(2) or in a new man page.
>
> Signed-off-by: Sukadev Bhattiprolu <sukadev at vnet.linux.ibm.com>
> ---
> Documentation/clone-with-pids | 58 ++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 58 insertions(+)
>
> Index: linux-2.6/Documentation/clone-with-pids
> ===================================================================
> --- /dev/null 1970-01-01 00:00:00.000000000 +0000
> +++ linux-2.6/Documentation/clone-with-pids 2009-09-09 21:53:30.000000000 -0700
> @@ -0,0 +1,58 @@
> +
> +struct pid_set {
> + unsigned int num_pids;
> + pid_t pids[];
> +};
> +
> +clone_with_pids(int flags, void *child_stack_base, int *parent_tid_ptr,
> + int *child_tid_ptr, NULL, struct pid_set *pid_setp)
> +
> + The clone_with_pids() system call is identical to clone(), except
> + that it allows the user to specify a pid for the child process
> + in each of the child processes' pid name spaces.
> +
namespaces. {as below}
> + This system call is meant to be used when restarting an application
> + from an earlier checkpoint. When restarting the application, the
> + processes in the application must get the same pids they had at the
> + time of the checkpoint.
> +
> + The 'pid_setp' parameter defines a set of pids to use, one for each
> + pid-namespace of the child process. The order pids in '->pids[]'
order of pids
> + corresponds to the nesting order of pid-namespaces, with ->pids[0]
> + corresponding to the init_pid_ns.
> +
> + If a pid in the ->pids list is 0, the kernel will assign the next
> + available pid in the pid namespace, for the process.
> +
> + If a pid in the ->pids[] list is non-zero, the kernel tries to assign
> + the specified pid in that namespace. If that pid is already in use
> + by another process, the system call fails with -EBUSY.
> +
> + On success, the system call returns the pid of the child process in
> + the parent's active pid namespace.
> +
> + On failure, clone_with_pids() returns -1 and sets 'errno' to one of
> + following values (the child process is not created).
> +
> + EPERM Caller does not have the SYS_ADMIN privilege needed to excute
execute
> + this call.
> +
> + EINVAL The number of pids specified in 'pid_set.num_pids' exceeds
> + the current nesting level of parent process
> +
> + EBUSY A requested 'pid' is in use by another process in that name
> + space.
> +
> +Example:
> +
> + struct pid_set pid_set = { 3, {0, 99, 177} };
> + void *child_stack = malloc(STACKSIZE);
> +
> + /* set up child_stack, like with clone() */
> + rc = clone_with_pids(clone_flags, child_stack, NULL, NULL, NULL, &pid_set);
> +
> + if (rc < 0) {
> + perror("clone_with_pids()");
> + exit(1);
> + }
What happens when one of the pids is busy, say the last one in the
example above (177)? Are the first 2 children already cloned,
or are all pids checked for availability before cloning?
If the latter, is there a race there?
And what value is returned?
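
To make the question concrete, here is a small user-space model of the
semantics as the text describes them. It is only a sketch, not the kernel
code from this series: toy_ns, toy_next_free(), toy_clone_with_pids(),
MAX_LEVELS and MAX_PID are all made-up names, the parent is assumed to live
in the outermost namespace, and the "check every level first, then commit"
behaviour is my assumption about what the document ought to promise. EAGAIN
for a full namespace is also an assumption; the document does not list it.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

#define MAX_LEVELS 4    /* hypothetical cap on namespace nesting */
#define MAX_PID    256  /* hypothetical, tiny pid space for the model */

/* One toy pid namespace: which pids are taken, and the last one handed out. */
struct toy_ns {
    unsigned char used[MAX_PID];
    int last;
};

/* Next free pid after ns->last, wrapping within 1..MAX_PID-1; -1 if full. */
int toy_next_free(const struct toy_ns *ns)
{
    for (int i = 1; i < MAX_PID; i++) {
        int pid = (ns->last + i - 1) % (MAX_PID - 1) + 1;
        if (!ns->used[pid])
            return pid;
    }
    return -1;
}

/*
 * nslist[0] models init_pid_ns; 'levels' is the child's nesting depth.
 * Returns the child's pid in nslist[0], or -1 with errno set.
 * Assumption: every request is validated before any pid is reserved,
 * so a failure at the last level leaves the outer levels untouched.
 */
int toy_clone_with_pids(struct toy_ns *nslist, unsigned int levels,
                        const int *pids, unsigned int num_pids)
{
    int chosen[MAX_LEVELS];

    if (levels == 0 || levels > MAX_LEVELS || num_pids > levels) {
        errno = EINVAL;         /* more pids than nesting levels */
        return -1;
    }

    /* Pass 1: pick a pid for each level without modifying anything. */
    for (unsigned int i = 0; i < levels; i++) {
        int want = i < num_pids ? pids[i] : 0;

        if (want < 0 || want >= MAX_PID) {
            errno = EINVAL;
            return -1;
        }
        if (want == 0) {
            want = toy_next_free(&nslist[i]);   /* 0 means "next available" */
            if (want < 0) {
                errno = EAGAIN;                 /* assumed error code */
                return -1;
            }
        } else if (nslist[i].used[want]) {
            errno = EBUSY;                      /* requested pid is taken */
            return -1;
        }
        chosen[i] = want;
    }

    /* Pass 2: commit. A real implementation would need a lock held across
     * both passes, or reserve-and-roll-back, to close the race. */
    for (unsigned int i = 0; i < levels; i++) {
        assert(!nslist[i].used[chosen[i]]);
        nslist[i].used[chosen[i]] = 1;
        if (i >= num_pids || pids[i] == 0)
            nslist[i].last = chosen[i];
    }
    return chosen[0];
}
```

With { 0, 99, 177 } and 177 already taken in the innermost namespace, this
model returns -1 with errno set to EBUSY and reserves nothing at the two
outer levels. Whether clone_with_pids() gives the same guarantee is what
the document should spell out.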
---
~Randy
LPC 2009, Sept. 23-25, Portland, Oregon
http://linuxplumbersconf.org/2009/
_______________________________________________
Containers mailing list
Containers at lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers