[CRIU] [PATCH] kernel: reduce required permission for prctl_set_mm
Kees Cook
keescook at chromium.org
Wed Feb 12 13:50:35 PST 2014
On Wed, Feb 12, 2014 at 1:32 PM, Andrew Morton
<akpm at linux-foundation.org> wrote:
> On Wed, 12 Feb 2014 19:40:11 +0400 Andrey Vagin <avagin at openvz.org> wrote:
>
>> Currently prctl_set_mm requires the global CAP_SYS_RESOURCE;
>> this patch reduces the requirement to CAP_SYS_RESOURCE in the current
>> namespace.
>>
>> When we restore a task we need to set up text, data and data heap sizes
>> from userspace to the values a task had at checkpoint time.
>>
>> Currently we cannot restore these parameters if a task lives in
>> a non-root user namespace, because it has no capabilities in the
>> parent namespace.
>>
>> prctl_set_mm() changes parameters of the current task and doesn't affect
>> other tasks.
>>
>> This patch affects the RLIMIT_DATA limit, because consumption is
>> calculated relative to mm->end_data, mm->start_data, mm->start_brk.
>
> I can't for the life of me work out what you were trying to say here.
> Please fix and resend this paragraph?
>
>>       rlim = rlimit(RLIMIT_DATA);
>>       if (rlim < RLIM_INFINITY && (brk - mm->start_brk) +
>>                       (mm->end_data - mm->start_data) > rlim)
>>               goto out;
>>
>> This limit affects calls to brk() and sbrk(), but it doesn't affect
>> mmap. So I think requiring CAP_SYS_RESOURCE in the current
>> namespace is enough for this limit.
>>
>> ...
>>
>> Cc: security at kernel.org
>
> That list is for reporting kernel security bugs.
>
>>
>> --- a/kernel/sys.c
>> +++ b/kernel/sys.c
>> @@ -1701,7 +1701,7 @@ static int prctl_set_mm(int opt, unsigned long addr,
>>       if (arg5 || (arg4 && opt != PR_SET_MM_AUXV))
>>               return -EINVAL;
>>
>> -     if (!capable(CAP_SYS_RESOURCE))
>> +     if (!ns_capable(current_user_ns(), CAP_SYS_RESOURCE))
>>               return -EPERM;
>>
>>       if (opt == PR_SET_MM_EXE_FILE)
>
> This looks harmless.

I want to be convinced of this, but weakening this cap check seems
like an easy way for a process to trivially hide itself from the real
root user. It can change its exe file link, and dodge RLIMIT_DATA by
changing the brk addresses. The whole reason this cap check was there
was to stop that kind of thing. Limiting it to a namespace isn't great,
since USER_NS means unprivileged processes can enter a new NS as the
NS root user.
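
To make the RLIMIT_DATA concern concrete, here is a minimal userspace
sketch of the check quoted above. It is a model only, not kernel code:
struct mm_model, MODEL_RLIM_INFINITY and brk_allowed() are hypothetical
names invented for illustration, and the arithmetic simply mirrors the
quoted condition.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the three mm fields used by the quoted check. */
struct mm_model {
	unsigned long start_data;
	unsigned long end_data;
	unsigned long start_brk;
};

/* Hypothetical stand-in for RLIM_INFINITY in this model. */
#define MODEL_RLIM_INFINITY (~0UL)

/*
 * Mirrors the quoted condition: a brk request is refused when the heap
 * size (brk - start_brk) plus the data segment size exceeds a finite rlim.
 */
static bool brk_allowed(const struct mm_model *mm, unsigned long brk,
			unsigned long rlim)
{
	if (rlim < MODEL_RLIM_INFINITY &&
	    (brk - mm->start_brk) + (mm->end_data - mm->start_data) > rlim)
		return false;
	return true;
}
```

In this model, with start_data = 0x1000, end_data = 0x3000,
start_brk = 0x10000 and a limit of 0x4000, a request for brk = 0x13000
is refused (0x3000 + 0x2000 > 0x4000). If start_brk is first moved up to
0x12000, which is the kind of change PR_SET_MM_START_BRK allows, the same
request passes (0x1000 + 0x2000 <= 0x4000) even though the process is
using just as much memory.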
-Kees
--
Kees Cook
Chrome OS Security