[Devel] Re: [PATCH 1/1] RFC: taking a crack at targeted capabilities
Eric W. Biederman
ebiederm at xmission.com
Wed Jan 6 12:57:30 PST 2010
"Serge E. Hallyn" <serue at us.ibm.com> writes:
> Quoting Eric W. Biederman (ebiederm at xmission.com):
>> "Serge E. Hallyn" <serue at us.ibm.com> writes:
>>
>> > So I was thinking about how to safely but incrementally introduce
>> > targeted capabilities - which we decided was a prereq to making VFS
>> > handle user namespaces - and the following seemed doable. My main
>> > motivations were (in order):
>> >
>> > 1. don't make any unconverted capable() checks unsafe
>> > 2. minimize performance impact on non-container case
>> > 3. minimize performance impact on containers
>>
>> My motivation is a bit different. I would like to get to the
>> unprivileged creation of new namespaces. It looks like this gets us
>> 90% of the way there, with only potential uid confusion issues left.
>
> Yup, that was actually what I was thinking about last night when I decided
> to give it a shot :) IMO, my patch + a dummy version of user_namespaces
> for vfs (done in a clean way that can be an incremental step toward full
> vfs userns support - which I haven't yet thought through) is enough to
> give you safe fully unprivileged containers. Now with the API I have,
> you'd have a program with either setuid-root or cap_sys_admin,cap_setpcap=pe
> which does the prctl and the unshares, but it would theoretically be safe
> to hand that program to unprivileged users.
Yes.
>> I still need to handle getting all caps after creation but otherwise I
>> think I have a good starter patch that achieves all of your goals.
>
> Well in my patch we don't need to clear out the bounding set, or set
> SETUID_NOROOT - so running a setuid root program or becoming root should
> still give you capabilities! They'll just be targeted at your container.
>
> I really think this is what you need.
Yes. So far things don't look too hard. What I meant is that after
CLONE_NEWUSER you should become uid 0 with a full set of capabilities in
a new user namespace. Those capabilities aren't good for anything outside
that namespace because they are user namespace relative.

I believe we have a bug today where the new uid 0 does not have a full set
of capabilities, but it is hidden because only uid 0 can unshare
the user namespace.
>> Of course kill_permission needs the checks you have suggested as well.
>
> Ok, I can't look at your patch in detail right now and don't quite get
> where you're going with a quick glance, so will look in closer detail
> later. Will also think about a way to get "just-enough" vfs userns
> support to completely give you what you need for privileged users in
> unprivileged containers.
Sounds good. That uid 0 problem is particularly interesting, because half
the world is owned by uid 0.
As for my patch, the heart of it is the cap_capable implementation.
The rest is just the obvious consequences of adding a user_namespace
parameter to security->capable().
int cap_capable(struct task_struct *tsk, const struct cred *cred,
		struct user_namespace *targ_ns, int cap, int audit)
{
	for (;;) {
		/* Do we have the necessary capabilities? */
		if (targ_ns == cred->user->user_ns)
			return cap_raised(cred->cap_effective, cap) ? 0 : -EPERM;

		/* The creator of the user namespace has all caps. */
		if (targ_ns->creator == cred->user)
			return 0;

		/* Have we tried all of the parent namespaces? */
		if (targ_ns == &init_user_ns)
			return -EPERM;

		/* If you have the capability in a parent user ns you have it
		 * in all of the child user namespaces as well, so see
		 * if this process has the capability in the parent user
		 * namespace.
		 */
		targ_ns = targ_ns->creator->user_ns;
	}
}
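
To make the walk up the namespace tree concrete, here is a minimal
userspace sketch of the same loop. It is a model, not kernel code: struct
user, model_cap_capable(), MODEL_CAP_KILL and demo() are hypothetical
names, the capability set is reduced to a plain 64-bit mask, and the tsk
and audit parameters are dropped because the loop above does not use them.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

struct user_namespace;

/* Hypothetical stand-in for the kernel's per-user structure. */
struct user {
	struct user_namespace *user_ns;	/* namespace the user lives in */
};

struct user_namespace {
	struct user *creator;		/* user that created this namespace */
};

struct cred {
	struct user *user;
	uint64_t cap_effective;		/* simplified: one bit per capability */
};

enum { MODEL_CAP_KILL = 5 };		/* illustrative capability number */

/* The initial namespace and its root user refer to each other. */
static struct user_namespace init_user_ns;
static struct user root_user = { .user_ns = &init_user_ns };
static struct user_namespace init_user_ns = { .creator = &root_user };

/* Same three checks, in the same order, as the loop in the patch. */
static int model_cap_capable(const struct cred *cred,
			     struct user_namespace *targ_ns, int cap)
{
	for (;;) {
		/* Same namespace: the effective set decides. */
		if (targ_ns == cred->user->user_ns)
			return ((cred->cap_effective >> cap) & 1) ? 0 : -EPERM;

		/* The creator of the target namespace has all caps in it. */
		if (targ_ns->creator == cred->user)
			return 0;

		/* Reached the top without a match. */
		if (targ_ns == &init_user_ns)
			return -EPERM;

		/* Retry against the parent of the target namespace. */
		targ_ns = targ_ns->creator->user_ns;
	}
}

/* Host user alice creates ns1; ns1's root creates the nested ns2. */
static void demo(void)
{
	struct user alice = { .user_ns = &init_user_ns };
	struct user_namespace ns1 = { .creator = &alice };
	struct user ns1_root = { .user_ns = &ns1 };
	struct user_namespace ns2 = { .creator = &ns1_root };

	struct cred host_root = { .user = &root_user, .cap_effective = UINT64_MAX };
	struct cred alice_cred = { .user = &alice, .cap_effective = 0 };
	struct cred ns1_cred = { .user = &ns1_root, .cap_effective = UINT64_MAX };

	/* Root inside ns1 is capable there, but not against the host. */
	assert(model_cap_capable(&ns1_cred, &ns1, MODEL_CAP_KILL) == 0);
	assert(model_cap_capable(&ns1_cred, &init_user_ns, MODEL_CAP_KILL) == -EPERM);

	/* alice has no caps on the host, yet owns ns1 and, via it, ns2. */
	assert(model_cap_capable(&alice_cred, &init_user_ns, MODEL_CAP_KILL) == -EPERM);
	assert(model_cap_capable(&alice_cred, &ns1, MODEL_CAP_KILL) == 0);
	assert(model_cap_capable(&alice_cred, &ns2, MODEL_CAP_KILL) == 0);

	/* Host root's caps carry down into every descendant namespace. */
	assert(model_cap_capable(&host_root, &ns2, MODEL_CAP_KILL) == 0);
}
```

Calling demo() from a main() exercises the three exits of the loop: a
same-namespace capability check, the creator shortcut, and -EPERM once
the walk reaches init_user_ns without a match.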
Eric