[Devel] [PATCH RH8] ve/fs/namespace: fix allowing submounts in non-init userns

Pavel Tikhomirov ptikhomirov at virtuozzo.com
Fri Jun 25 13:47:24 MSK 2021


From: Konstantin Khorenko <khorenko at virtuozzo.com>

When doing an nfs4 mount inside a container with something like:

  mount -t nfs4 $NODEIP:/root/build/criu /mnt

we can see that, because the source path ("/root/build/criu") is several
directories deep, several submounts are created.

Adding perf probes on vfs_submount that print
mountpoint->d_sb->s_user_ns and mountpoint->d_iname, we see:

crash > p &init_user_ns
$2 = (struct user_namespace *) 0xffffffff9644efc0

1) First submount created has mountpoint dentry "root" and ve userns:
mount.nfs4 ...:         probe:vfs_submount: (ffffffff95a970e0)
user_ns=0xffff8b6d6e86a000 dentry="root"

2) Second submount created has mountpoint dentry "build" from first
submount and init userns of host:
mount.nfs4 ...:         probe:vfs_submount: (ffffffff95a970e0)
user_ns=0xffffffff9644efc0 dentry="build"

So on the first step we have the ve userns and on the second the init
userns. Comparing against only the init userns or only the ve userns
would not work, because both of them can occur. So the easy solution here
is to disable the check completely, like we do in vz7.
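
The reasoning above can be illustrated with a small userspace model. This
is only a sketch and not kernel code: struct user_namespace is a stand-in
type, ve_user_ns is a hypothetical container userns, and check_passes()
and mount_succeeds() are illustrative helpers, with a NULL "allowed"
pointer modelling the disabled check.

```c
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for the kernel structure (illustration only). */
struct user_namespace { const char *label; };

struct user_namespace init_user_ns = { "init" };
struct user_namespace ve_user_ns   = { "ve" };  /* hypothetical CT userns */

/*
 * Models a check of the form
 *     if (mountpoint->d_sb->s_user_ns != allowed) return ERR_PTR(-EPERM);
 * allowed == NULL models the check being compiled out.
 */
bool check_passes(const struct user_namespace *sb_ns,
		  const struct user_namespace *allowed)
{
	return allowed == NULL || sb_ns == allowed;
}

/*
 * Replays the two submounts from the trace: "root" is seen with the
 * ve userns, "build" with the init userns. The mount succeeds only if
 * every submount passes the check.
 */
bool mount_succeeds(const struct user_namespace *allowed)
{
	const struct user_namespace *seen[] = { &ve_user_ns, &init_user_ns };
	size_t i;

	for (i = 0; i < sizeof(seen) / sizeof(seen[0]); i++)
		if (!check_passes(seen[i], allowed))
			return false;
	return true;
}
```

In this model mount_succeeds(&init_user_ns) fails on "root",
mount_succeeds(&ve_user_ns) fails on "build", and only
mount_succeeds(NULL) lets both submounts through.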

Note: this patch allows nfs4 mounts in containers, which lets us avoid
the nfs3 rpcbind non-dumpable socket migration problems, since nfs now
mounts in v4 mode by default.

https://jira.sw.ru/browse/PSBM-102629
Fixes: 81a2b734416d ("ve/fs/namespace: allow submounts in non-init userns")
Signed-off-by: Pavel Tikhomirov <ptikhomirov at virtuozzo.com>
---
 fs/namespace.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/fs/namespace.c b/fs/namespace.c
index 75aa3ae9585e..321a79198aac 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -1017,13 +1017,14 @@ struct vfsmount *
 vfs_submount(const struct dentry *mountpoint, struct file_system_type *type,
 	     const char *name, void *data)
 {
+#if 0
 	/* Until it is worked out how to pass the user namespace
 	 * through from the parent mount to the submount don't support
 	 * unprivileged mounts with submounts.
 	 */
 	/* Simple NFS mount inside a Container brings us here, so if we want to
-	 * enable NFS inside a Container (read - in CT root userns), we have
-	 * to soften the check.
+	 * enable NFS inside a Container (read - in non-init userns), we have
+	 * to omit the check.
 	 *
 	 *  SyS_mount
 	 *   do_mount
@@ -1044,8 +1045,9 @@ vfs_submount(const struct dentry *mountpoint, struct file_system_type *type,
 	 *		    nfs_do_submount
 	 *		     vfs_submount
 	 */
-	if (mountpoint->d_sb->s_user_ns != ve_init_user_ns())
+	if (mountpoint->d_sb->s_user_ns != &init_user_ns)
 		return ERR_PTR(-EPERM);
+#endif
 
 	return vfs_kern_mount(type, SB_SUBMOUNT, name, data);
 }
-- 
2.31.1
