[CRIU] [PATCH 2/3] capabilities: add a secure bit to allow changing a task exe link
Serge E. Hallyn
serge at hallyn.com
Mon Feb 17 20:53:39 PST 2014
Quoting Andrey Vagin (avagin at openvz.org):
> When we restore a task we need to restore its exe link from userspace to
> the values the task had at checkpoint time.
>
> Currently this operation requires the global CAP_SYS_RESOURCE, which is
> always absent in a non-root user namespace.
>
> So this patch introduces a new secure bit which:
> * can be set only if a task has the global CAP_SYS_RESOURCE
> * is inherited by child processes
> * is preserved when a task moves into another userns
> * allows a task to change its exe link even if it doesn't have CAP_SYS_RESOURCE
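As a reading aid, the four properties listed in the patch description can be modelled in plain userspace C. This is only a sketch under stated assumptions: `struct task_model` and the `model_*` helpers are hypothetical names invented for illustration, not kernel code, and the model reduces a task's credentials to its securebits plus a single flag for "holds CAP_SYS_RESOURCE in the initial user namespace". The bit number comes from the securebits.h hunk of the patch.

```c
#include <stdbool.h>

/* Bit position taken from the patch's securebits.h hunk. */
#define SECURE_SET_EXE_FILE 6
#define SECBIT_SET_EXE_FILE (1u << SECURE_SET_EXE_FILE)

/* Hypothetical stand-in for the parts of a task's credentials that matter here. */
struct task_model {
    unsigned securebits;
    bool global_cap_sys_resource; /* CAP_SYS_RESOURCE in the initial userns */
};

/* Property 1: the bit can be newly set only with the global capability. */
static bool model_set_bit(struct task_model *t)
{
    if (!(t->securebits & SECBIT_SET_EXE_FILE) && !t->global_cap_sys_resource)
        return false;
    t->securebits |= SECBIT_SET_EXE_FILE;
    return true;
}

/* Property 2: a child starts with a copy of the parent's credentials. */
static struct task_model model_fork(const struct task_model *parent)
{
    return *parent;
}

/*
 * Property 3: entering a new userns resets securebits to the default
 * (assumed to be 0 here) but carries this one bit over. The global
 * capability is modelled as lost, as the patch description states.
 */
static struct task_model model_enter_userns(const struct task_model *t)
{
    struct task_model n;

    n.securebits = t->securebits & SECBIT_SET_EXE_FILE;
    n.global_cap_sys_resource = false;
    return n;
}

/* Property 4: the exe link may be changed with the capability or with the bit. */
static bool model_may_set_exe_file(const struct task_model *t)
{
    return t->global_cap_sys_resource ||
           (t->securebits & SECBIT_SET_EXE_FILE) != 0;
}
```

In this model, a task that sets the bit while privileged, forks, and then enters a new user namespace still passes `model_may_set_exe_file()`, whereas a task that starts without the global capability can neither set the bit nor change the exe link.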

I'm late to this party anyway, but fwiw I don't like this use
of securebits. It also seems to prevent c/r in a nested
container anyway so wouldn't seem to suffice.

But I assume I don't really need to argue it as it appears Pavel
and Eric are looking into a better all-around design.

> Cc: Andrew Morton <akpm at linux-foundation.org>
> Cc: Oleg Nesterov <oleg at redhat.com>
> Cc: Al Viro <viro at zeniv.linux.org.uk>
> Cc: Kees Cook <keescook at chromium.org>
> Cc: "Eric W. Biederman" <ebiederm at xmission.com>
> Cc: Stephen Rothwell <sfr at canb.auug.org.au>
> Cc: Pavel Emelyanov <xemul at parallels.com>
> Cc: Aditya Kali <adityakali at google.com>
> Signed-off-by: Andrey Vagin <avagin at openvz.org>
> ---
> include/uapi/linux/securebits.h | 12 +++++++++++-
> kernel/sys.c | 5 +++++
> kernel/user_namespace.c | 3 ++-
> security/commoncap.c | 7 +++++++
> 4 files changed, 25 insertions(+), 2 deletions(-)
>
> diff --git a/include/uapi/linux/securebits.h b/include/uapi/linux/securebits.h
> index 985aac9..c99803b 100644
> --- a/include/uapi/linux/securebits.h
> +++ b/include/uapi/linux/securebits.h
> @@ -43,9 +43,19 @@
> #define SECBIT_KEEP_CAPS (issecure_mask(SECURE_KEEP_CAPS))
> #define SECBIT_KEEP_CAPS_LOCKED (issecure_mask(SECURE_KEEP_CAPS_LOCKED))
>
> +/* When set, a process can do PR_SET_MM_EXE_FILE even if it doesn't
> + * have CAP_SYS_RESOURCE. Setting of this bit requires CAP_SYS_RESOURCE.
> + * This bit is not dropped when a task moves in another userns. */
> +#define SECURE_SET_EXE_FILE 6
> +#define SECURE_SET_EXE_FILE_LOCKED 7 /* make bit-6 immutable */
> +
> +#define SECBIT_SET_EXE_FILE (issecure_mask(SECURE_SET_EXE_FILE))
> +#define SECBIT_SET_EXE_FILE_LOCKED (issecure_mask(SECURE_SET_EXE_FILE_LOCKED))
> +
> #define SECURE_ALL_BITS (issecure_mask(SECURE_NOROOT) | \
> issecure_mask(SECURE_NO_SETUID_FIXUP) | \
> - issecure_mask(SECURE_KEEP_CAPS))
> + issecure_mask(SECURE_KEEP_CAPS) | \
> + issecure_mask(SECURE_SET_EXE_FILE))
> #define SECURE_ALL_LOCKS (SECURE_ALL_BITS << 1)
>
> #endif /* _UAPI_LINUX_SECUREBITS_H */
> diff --git a/kernel/sys.c b/kernel/sys.c
> index 939370c..2f0925d 100644
> --- a/kernel/sys.c
> +++ b/kernel/sys.c
> @@ -18,6 +18,7 @@
> #include <linux/kernel.h>
> #include <linux/workqueue.h>
> #include <linux/capability.h>
> +#include <linux/securebits.h>
> #include <linux/device.h>
> #include <linux/key.h>
> #include <linux/times.h>
> @@ -1714,6 +1715,10 @@ static int prctl_set_mm(int opt, unsigned long addr,
> if (rlimit(RLIMIT_STACK) < RLIM_INFINITY)
> return -EPERM;
> break;
> + case PR_SET_MM_EXE_FILE:
> + if (!issecure(SECURE_SET_EXE_FILE))
> + return -EPERM;
> + break;
> default:
> return -EPERM;
> }
> diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c
> index 240fb62..59584fe 100644
> --- a/kernel/user_namespace.c
> +++ b/kernel/user_namespace.c
> @@ -34,7 +34,8 @@ static void set_cred_user_ns(struct cred *cred, struct user_namespace *user_ns)
> /* Start with the same capabilities as init but useless for doing
> * anything as the capabilities are bound to the new user namespace.
> */
> - cred->securebits = SECUREBITS_DEFAULT;
> + cred->securebits = SECUREBITS_DEFAULT |
> + (cred->securebits & SECBIT_SET_EXE_FILE);
> cred->cap_inheritable = CAP_EMPTY_SET;
> cred->cap_permitted = CAP_FULL_SET;
> cred->cap_effective = CAP_FULL_SET;
> diff --git a/security/commoncap.c b/security/commoncap.c
> index b9d613e..eda1eb8 100644
> --- a/security/commoncap.c
> +++ b/security/commoncap.c
> @@ -907,6 +907,13 @@ int cap_task_prctl(int option, unsigned long arg2, unsigned long arg3,
> )
> /* cannot change a locked bit */
> goto error;
> +
> + /* Setting SECURE_SET_EXE_FILE requires CAP_SYS_RESOURCE */
> + if ((arg2 & SECBIT_SET_EXE_FILE) &&
> + !(new->securebits & SECBIT_SET_EXE_FILE) &&
> + !capable(CAP_SYS_RESOURCE))
> + goto error;
> +
> new->securebits = arg2;
> goto changed;
>
> --
> 1.8.5.3
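For readers following the securebits.h and commoncap.c hunks, the sketch below recomputes the resulting masks and models the PR_SET_SECUREBITS permission decision in userspace. It is a hedged illustration, not kernel code: `model_set_securebits_ok()` is a hypothetical helper, the two capability checks are reduced to boolean parameters, and the three pre-existing checks (locked bits, cleared locks, unknown bits) are written from my reading of `cap_task_prctl()` and should be treated as an assumption. The mask macro uses an unsigned literal here, which differs cosmetically from the kernel header.

```c
#include <stdbool.h>

#define issecure_mask(X) (1u << (X))

/* Existing bit numbers from securebits.h, plus the two added by the patch. */
#define SECURE_NOROOT              0
#define SECURE_NO_SETUID_FIXUP     2
#define SECURE_KEEP_CAPS           4
#define SECURE_SET_EXE_FILE        6
#define SECURE_SET_EXE_FILE_LOCKED 7

#define SECBIT_SET_EXE_FILE        (issecure_mask(SECURE_SET_EXE_FILE))
#define SECBIT_SET_EXE_FILE_LOCKED (issecure_mask(SECURE_SET_EXE_FILE_LOCKED))

#define SECURE_ALL_BITS  (issecure_mask(SECURE_NOROOT) | \
                          issecure_mask(SECURE_NO_SETUID_FIXUP) | \
                          issecure_mask(SECURE_KEEP_CAPS) | \
                          issecure_mask(SECURE_SET_EXE_FILE))
#define SECURE_ALL_LOCKS (SECURE_ALL_BITS << 1)

/*
 * Hypothetical model of the PR_SET_SECUREBITS decision.
 * cur: the task's current securebits; req: the requested value (arg2).
 */
static bool model_set_securebits_ok(unsigned cur, unsigned req,
                                    bool cap_setpcap, bool cap_sys_resource)
{
    if (((cur & SECURE_ALL_LOCKS) >> 1) & (cur ^ req))
        return false; /* would change a locked bit */
    if (cur & SECURE_ALL_LOCKS & ~req)
        return false; /* would clear a lock */
    if (req & ~(SECURE_ALL_LOCKS | SECURE_ALL_BITS))
        return false; /* unknown bit requested */
    if (!cap_setpcap)
        return false; /* changing securebits needs CAP_SETPCAP */

    /* Check added by the patch: newly setting bit 6 needs CAP_SYS_RESOURCE. */
    if ((req & SECBIT_SET_EXE_FILE) &&
        !(cur & SECBIT_SET_EXE_FILE) &&
        !cap_sys_resource)
        return false;

    return true;
}
```

With the patch applied, `SECURE_ALL_BITS` covers the even bits 0, 2, 4 and 6 and `SECURE_ALL_LOCKS` the odd bits 1, 3, 5 and 7. A task that already holds the bit may keep it in a later PR_SET_SECUREBITS call without CAP_SYS_RESOURCE, since the added check only fires when the bit is newly set.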