[Devel] Re: [RFC][v3][PATCH 7/7] Define clone_with_pids syscall
Oren Laadan
orenl at cs.columbia.edu
Mon Jun 1 09:54:10 PDT 2009
Serge E. Hallyn wrote:
> Quoting Oren Laadan (orenl at cs.columbia.edu):
>> The two issues are related, and this is intentional. The idea
>> was that when you use CLONE_NEWPID, you imply a new nesting level
>> and a pid==1 there. So you are not supposed to specify the pid
>> of that new level.
>>
>> IOW, the parent specify the pid's of the child from the _parent's_
>> level and up (of the desired depth). CLONE_NEWPID creates a new
>> pidns level below the parent's, and that is not covered in the
>> array of pids.
>>
>> By allocation an extra slot and forcing it to be 0, we ensure that
>> the case of CLONE_NEWPID is covered correctly. Clearly, if this
>> flag isn't set, then the extra slot is redundant (but doesn't hurt).
>
> Ok, I see - I guess I don't mind those semantics. So:
>
>>>> + j = knum_pids - unum_pids;
>>> j = 1, so we copy the 3 pids in the right place.
>>>
>>>> + rc = copy_from_user(&target_pids[j], pid_set.target_pids, size);
>>>> + if (rc) {
>>>> + rc = -EFAULT;
>>>> + goto out_free;
>>>> + }
>>>> +
>>>> + return target_pids;
>>>
>>> For the second one, we have a parent task
>>>
>>> level no | pid
>>> 0 5009
>>> 1 1000
>>> 2 49
>>>
>>> calling clone_with_pid with CLONE_NEWPID and {1001,50,1} to produce:
>>>
>>> level no | pid
>>> 0 5010
>>> 1 1001
>>> 2 50
>>> 3 1
>>>
>>> So the numbers in your code become:
>>>
>>>> + unum_pids = pid_set.num_pids;
>>> unum_pids = 3
>> This is a "bug" of the parent. The parent should specify the pids
>> from the parent's level only and up, and not include the new level
>> below that will be created. (After all, it will have to be 1).
>>
>> So unum_pids = 3 will not do what you want; instead it will try to
>> create the process:
>>
>> 0 1001
>> 1 50
>> 2 1
>> 3 1
>>
>> And will fail, of course, because pid==1 at level 2 is already
>> taken.
>>
>> Instead, parent should say use {1001, 50}.
>
> Ok, but then we have the task:
>
> level no | pid
> 0 5009
> 1 1000
> 2 49
>
> calling clone(CLONE_NEWPID) with unum_pids = 2, so
>
>>>> + knum_pids = task_pid(current)->level + 1;
>>> knum_pids = 2 + 1 = 3
>>>
>>>> + target_pids = kzalloc((knum_pids + 1) * sizeof(pid_t), GFP_KERNEL);
>>> target_pids gets room for 4 pids
>>>
>>>> + j = knum_pids - unum_pids;
>
> j = 3 - 2 = 1, so we copy 1001 into pid[1] and
> 50 into pid[2], with 0 in pid[0] and pid[3]
>
> Looks good. Thanks for indulging me :)
>
> Acked-by: Serge Hallyn <serue at us.ibm.com>
>
> One last thought - should there be an explicit check to make sure that
> if CLONE_NEWPID, then at the end pid[knum_pids+1] = 0? Or is that
> there and I just missed it?
the wonders of kzalloc() ...
Oren.
_______________________________________________
Containers mailing list
Containers at lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers
More information about the Devel
mailing list