[Devel] [PATCH RH7 00/22] Port: user namespace owned mounts

Pavel Tikhomirov ptikhomirov at virtuozzo.com
Tue Mar 15 02:11:25 PDT 2016


Questions from xemul@ and pros/cons about nexessity of these patchset:

1) Do we able to remount external mounts superblock readonly(which 
should not be permitted)?

VZ6: we check mnt->owner in ve_remount_allowed - so we can not remount.
VZ7: we CAN remount, if have dev in ve->devmnt_list, as we do not have 
mnt->owner as in VZ6(mainstream way now is to prohibit remount from 
unpriviledged mounts at all).

In VZ7 "Port: user namespace owned mounts" should help with that problem.

2) Why it is bed to also gain capabilities on host, in contrary to just 
getting out from CT's userns to host's one?(VZ7)

a) With setns we can get out of all other namespaces with:
nsenter  --mount=/proc/1/ns/mnt
if we have CAP_SYS_ADMIN and we used some imaginary exploit to get out 
of both pid and userns. I don't know the way to do it without capability.

b) With root userns and CAP_SYS_PTRACE and other needed caps, one can 
attach to all ptraceable tasks on node. - Not sure if it is very bad.

c) all ns_capable checks will succeed in root userns e.g. with CAP_KILL 
exploit will be able to kill everybody's processes.

3) How to escape VZCT so that we do not have caps in init_user_ns?(VZ7)

a)In commit_creds/override_creds/revert_creds we change task->cred. If 
attacker can somehow pass there credentials of non-root user from host. 
Thus having task->cred->user_ns set to init_user_ns, but becoming 
unprivileged.

b)Function set_cred_user_ns is not the case as anyway changes caps to 
CAP_FULL_SET, and where seem no other place where user_ns is changed on 
cred.

It seem that is not an easy task to escape without getting caps, but do 
we sure that at some time that would not happen?

Additionally:

CONs:
a) These is not yet mainstream ebiderman@ can easily decide to rewrite 
it from scratch.

PROs:
a)(VZ7) These patch set protects non-root users on host from being able 
to gain all capabilities. - Not so important as we have only admin on host.

b) These patch set will allow users with CAP_FSETID not drop suid on 
write. Now behavior differs in VZ6 and VZ7 CT as we moved to user 
namespaces in VZ7:
(Have CAP_FSETID in both cases.)

VZ6CT# touch test; chmod u+s test; echo "change" > test; ls -l test; cat 
test
-rwSr--r-- 1 root root 7 Mar 15 10:00 test
change

VZ7CT# touch test; chmod u+s test; echo "change" > test; ls -l test; cat 
test
-rwxrwxrwx 1 root root 7 Mar 15 10:01 test
change

(VZ7) - only in VZ7 as VZ6 don't have mainstream user-namespaces for 
containers.

On 03/10/2016 07:31 PM, Pavel Tikhomirov wrote:
> We need it as secure way to provide privileged access to mounts
> in containers. For instance setting suid bit, security.capability
> xattr, allowing remount in CT.
>
> Those patches are not all in MS now, but actually "[PATCH v4 0/7]
> Initial support for user namespace owned mounts" patch series is
> partially in Eric W. Biederman's testing tree, here:
> git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace.git for-testing
>
> Other patches are Acked and have stayed without any feedback since
> December 2015, see "[PATCH v2 00/19] Support fuse mounts in user
> namespaces".
>
> https://jira.sw.ru/browse/PSBM-43294
> https://jira.sw.ru/browse/PSBM-43267
>
> Major changes when porting:
>
> 1) dropped patches for selinux and other LSM as we do not
> support them yet
> 2) dropped patch "new/fs: Update posix_acl support to handle
> user namespace mounts" as that will need porting >=8 additional
> patches from mainstream and now I do not know about any real
> need of per s_user_ns posix ACLs in Virtuozzo, please correct
> me if I'm wrong.
> 3) dropped fuse specific patches
> 4) port several needed patches from MS and add some fixes - migth
> need to send them upstream.
>
> Now we mount ploop disk to container root before userns for CT has
> been created so root mount will have wrong s_user_ns on sb, for
> testing purpose I include last patch which allows to change s_user_ns
> on remount, but that is not safe, need to think how we can add it to
> right userns from the begining.
>
> Pavel Tikhomirov (22):
>    ms/fs/super.c: fix WARN on alloc_super() fail path
>    ebiederm/fs: Add user namesapace member to struct super_block
>    fs: fix a posible leak of allocated superblock
>    ms/mnt: Only change user settable mount flags in remount
>    ms/mnt: Move the test for MNT_LOCK_READONLY from change_mount_flags
>      into do_remount
>    ms/mnt: Correct permission checks in do_remount
>    ebiederm/userns: Simpilify MNT_NODEV handling.
>    ebiederm/fs: Limit file caps to the user namespace of the super block
>    port/block_dev: Support checking inode permissions in lookup_bdev()
>    port/block_dev: Check permissions towards block device inode when
>      mounting
>    port/fs: Treat foreign mounts as nosuid
>    fs: remove excess check for in_userns
>    port/userns: Replace in_userns with current_in_userns
>    port/fs: Check for invalid i_uid in may_follow_link()
>    port/cred: Reject inodes with invalid ids in set_create_file_as()
>    port/fs: Refuse uid/gid changes which don't map into s_user_ns
>    port/fs: Ensure the mounter of a filesystem is privileged towards its
>      inodes
>    port/fs: Don't remove suid for CAP_FSETID in s_user_ns
>    ms/fs: Add a missing permission check to do_umount
>    port/fs: Allow superblock owner to access do_remount_sb()
>    port/capabilities: Allow privileged user in s_user_ns to set
>      security.* xattrs
>    draft/ext4: add option to set userns of superblock
>
>   drivers/md/dm-table.c          |  2 +-
>   drivers/mtd/mtdsuper.c         |  2 +-
>   fs/attr.c                      | 11 ++++++
>   fs/block_dev.c                 | 20 ++++++++---
>   fs/exec.c                      |  2 +-
>   fs/ext4/super.c                | 68 +++++++++++++++++++++++++++++++++----
>   fs/inode.c                     |  6 +++-
>   fs/namei.c                     | 11 ++++--
>   fs/namespace.c                 | 76 ++++++++++++++++++++++++++++++++++--------
>   fs/proc/namespaces.c           |  2 ++
>   fs/proc/root.c                 |  3 +-
>   fs/quota/quota.c               |  2 +-
>   fs/super.c                     | 46 +++++++++++++++++++++----
>   include/linux/fs.h             | 15 ++++++++-
>   include/linux/mount.h          | 10 +++++-
>   include/linux/uidgid.h         | 10 ++++++
>   include/linux/user_namespace.h |  6 ++++
>   kernel/capability.c            | 13 +++++---
>   kernel/cred.c                  |  2 ++
>   kernel/user_namespace.c        | 14 ++++++++
>   security/commoncap.c           | 14 +++++---
>   security/selinux/hooks.c       |  2 +-
>   22 files changed, 286 insertions(+), 51 deletions(-)
>

-- 
Best regards, Tikhomirov Pavel
Software Developer, Virtuozzo.


More information about the Devel mailing list