[CRIU] [PATCH] kernel: reduce required permission for prctl_set_mm

Eric W. Biederman ebiederm at xmission.com
Wed Feb 12 15:14:07 PST 2014


Andrey Vagin <avagin at openvz.org> writes:

> Currently prctl_set_mm requires the global CAP_SYS_RESOURCE,
> this patch reduce requiremence to CAP_SYS_RESOURCE in the current
> namespace.
>
> When we restore a task we need to set up text, data and data heap sizes
> from userspace to the values a task had at checkpoint time.
>
> Currently we can not restore these parameters, if a task lives in
> a non-root user name space, because it has no capabilities in the
> parent namespace.
>
> prctl_set_mm() changes parameters of the current task and doesn't affect
> other tasks.
>
> This patch affects the RLIMIT_DATA limit, because a consumtiuon is
> calculated relatively to mm->end_data, mm->start_data, mm->start_brk.
>
> rlim = rlimit(RLIMIT_DATA);
> if (rlim < RLIM_INFINITY && (brk - mm->start_brk) +
> 		(mm->end_data - mm->start_data) > rlim)
> 	goto out;
>
> This limit affects calls to brk() and sbrk(), but it doesn't affect
> mmap. So I think requirement of CAP_SYS_RESOURCE in the current
> namespace is enough for this limit.

Ick.  No.

You do not have an argument for reducing the capable call here to
ns_capable.  ns_capable(current_user_ns(), CAP_SYS_RESOURCE) does not
currently allow anything.  If ns_capable(current_user_ns(),
CAP_SYS_RESOURCE) were to allow things there would still need to be a
check for a root setable maximum which is not present in this patch.

Either you have an argument for completely removing the capability check
or your reasoning is broken.

Reading through the code and reading through brk I an fairly confident
that your reasoning is broken.

The rlimit test needs to be when any of start_brk, end_data, or
start_data are changed, and that test is most definitely not performed.

Checks for enforcing the stack_size are completely missing.

It does look like with care we can remove or make much more precise the
capable checks from in prctl_set_mm but this patch definitely does not
take that needed care.

Nacked-by: "Eric W. Biederman" <ebiederm at xmission.com>

> Cc: Andrew Morton <akpm at linux-foundation.org>
> Cc: Oleg Nesterov <oleg at redhat.com>
> Cc: Robin Holt <holt at sgi.com>
> Cc: Al Viro <viro at zeniv.linux.org.uk>
> Cc: Kees Cook <keescook at chromium.org>
> Cc: "Eric W. Biederman" <ebiederm at xmission.com>
> Cc: Chen Gang <gang.chen at asianux.com>
> Cc: Stephen Rothwell <sfr at canb.auug.org.au>
> Cc: Pavel Emelyanov <xemul at parallels.com>
> Cc: Aditya Kali <adityakali at google.com>
> Cc: security at kernel.org
> Signed-off-by: Andrey Vagin <avagin at openvz.org>
> ---
>  kernel/sys.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/sys.c b/kernel/sys.c
> index c0a58be..6f36fb3 100644
> --- a/kernel/sys.c
> +++ b/kernel/sys.c
> @@ -1701,7 +1701,7 @@ static int prctl_set_mm(int opt, unsigned long addr,
>  	if (arg5 || (arg4 && opt != PR_SET_MM_AUXV))
>  		return -EINVAL;
>  
> -	if (!capable(CAP_SYS_RESOURCE))
> +	if (!ns_capable(current_user_ns(), CAP_SYS_RESOURCE))
>  		return -EPERM;
>  
>  	if (opt == PR_SET_MM_EXE_FILE)


More information about the CRIU mailing list