[Devel] [PATCH RH9] ve/cgroup: hide non-virtualized cgroups in container

Pavel Tikhomirov ptikhomirov at virtuozzo.com
Fri Oct 22 16:16:52 MSK 2021


On container (ve) start, the "virtualized" (is_virtualized_cgroup() ==
true) root cgroups of the container are checked to ensure that each
container has its own non-intersecting set of those cgroup directories.

We don't check all cgroups because new empty named cgroups can be
created on the host at any moment and vzctl can't control that, so
vzctl creates its own cgroups for the container only in a predefined
set of "virtualized" cgroups.

Non-"virtualized" cgroups are not checked, so the container may still
sit in the host root cgroup there, and thus we should not show them in
the container.

So we need to prohibit mounting anything except "virtualized" cgroups
in the container. Let's also mangle non-"virtualized" cgroups in
/proc/self/cgroup and /proc/cgroups.

https://jira.sw.ru/browse/PSBM-134994

Signed-off-by: Pavel Tikhomirov <ptikhomirov at virtuozzo.com>
---
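Note for reviewers: the decision made by ve_hide_cgroups() after this
patch can be modelled in user space as below. This is only a sketch to
make the truth table easy to check, not kernel code. The struct names,
the model_* helpers and the controller bit positions are hypothetical
stand-ins; the real ids are generated by the kernel, and the real checks
for debug and misc are compiled in only when those controllers are
enabled.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical bit positions; the kernel generates the real *_cgrp_id. */
enum { CPU_ID, MEMORY_ID, DEBUG_ID, MISC_ID, VE_ID };

/* Stand-in for the fields of struct cgroup_root used by the patch. */
struct model_root {
	bool is_v2;               /* models root == &cgrp_dfl_root */
	unsigned int subsys_mask; /* controllers attached to the hierarchy */
	const char *name;         /* hierarchy name, "" if unnamed */
};

/* Stand-in for the two ve_struct properties that are consulted. */
struct model_ve {
	bool is_super;       /* models ve_is_super(ve), i.e. the host */
	bool is_pseudosuper; /* models ve->is_pseudosuper, e.g. criu */
};

/* Mirrors the order of checks in is_virtualized_cgroot(). */
static bool model_is_virtualized(const struct model_root *r)
{
	if (r->is_v2)
		return false;
	if (r->subsys_mask & (1u << DEBUG_ID))
		return false;
	if (r->subsys_mask & (1u << MISC_ID))
		return false;
	if (r->subsys_mask)
		return true;
	/* Named hierarchy without controllers: only "systemd" counts. */
	return strcmp(r->name, "systemd") == 0;
}

/* Mirrors ve_hide_cgroups(): true means "hide from this context". */
static bool model_hide(const struct model_ve *ve, const struct model_root *r)
{
	if (ve->is_super || ve->is_pseudosuper)
		return false;
	return (r->subsys_mask & (1u << VE_ID)) ||
	       !model_is_virtualized(r);
}
```

With this model, a plain container context hides a named hierarchy such
as "namedcgroup", the ve hierarchy and the v2 root, but keeps "systemd"
and ordinary controller hierarchies visible; the host and pseudosuper
contexts hide nothing.
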
 kernel/cgroup/cgroup-v1.c |  3 +++
 kernel/cgroup/cgroup.c    | 32 +++++++++++++++++++++-----------
 2 files changed, 24 insertions(+), 11 deletions(-)

diff --git a/kernel/cgroup/cgroup-v1.c b/kernel/cgroup/cgroup-v1.c
index fe781bda5962..d58faf071e2c 100644
--- a/kernel/cgroup/cgroup-v1.c
+++ b/kernel/cgroup/cgroup-v1.c
@@ -1318,6 +1318,9 @@ int cgroup1_get_tree(struct fs_context *fc)
 
 	mutex_unlock(&cgroup_mutex);
 
+	if (!ret && ve_hide_cgroups(ctx->root))
+		ret = -EPERM;
+
 	if (!ret)
 		ret = cgroup_do_get_tree(fc);
 
diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index 396c0dc98b64..83fa33063a94 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -2002,30 +2002,35 @@ struct ve_struct *get_curr_ve(void)
  * do "mount -t cgroup cgroup -onone,name=namedcgroup /mnt", and this should
  * not break containers.
  */
-static inline bool is_virtualized_cgroup(struct cgroup *cgrp)
+static inline bool is_virtualized_cgroot(struct cgroup_root *cgroot)
 {
 	/* Cgroup v2 */
-	if (cgrp->root == &cgrp_dfl_root)
+	if (cgroot == &cgrp_dfl_root)
 		return false;
 
 #if IS_ENABLED(CONFIG_CGROUP_DEBUG)
-	if (cgrp->subsys[debug_cgrp_id])
+	if (cgroot->subsys_mask & (1 << debug_cgrp_id))
 		return false;
 #endif
 #if IS_ENABLED(CONFIG_CGROUP_MISC)
-	if (cgrp->subsys[misc_cgrp_id])
+	if (cgroot->subsys_mask & (1 << misc_cgrp_id))
 		return false;
 #endif
 
-	if (cgrp->root->subsys_mask)
+	if (cgroot->subsys_mask)
 		return true;
 
-	if (!strcmp(cgrp->root->name, "systemd"))
+	if (!strcmp(cgroot->name, "systemd"))
 		return true;
 
 	return false;
 }
 
+static inline bool is_virtualized_cgroup(struct cgroup *cgrp)
+{
+	return is_virtualized_cgroot(cgrp->root);
+}
+
 /*
  * Iterate all cgroups in a given css_set and for all obligatory Virtuozzo
  * container cgroups check that container has its own cgroup subdirectory:
@@ -2410,7 +2415,11 @@ static int cgroup_get_tree(struct fs_context *fc)
 	cgroup_get_live(&cgrp_dfl_root.cgrp);
 	ctx->root = &cgrp_dfl_root;
 
-	ret = cgroup_do_get_tree(fc);
+	if (ve_hide_cgroups(ctx->root))
+		ret = -EPERM;
+	else
+		ret = cgroup_do_get_tree(fc);
+
 	if (!ret)
 		apply_cgroup_root_flags(ctx->flags);
 	return ret;
@@ -6214,11 +6223,12 @@ int ve_hide_cgroups(struct cgroup_root *root)
 	unsigned long hidden_mask = (1UL << ve_cgrp_id);
 
 	/*
-	 * Hide ve cgroup in CT for docker,
-	 * still showing it to pseudosuper (criu)
+	 * Hide ve cgroup in CT for docker, still showing it to pseudosuper
+	 * (criu), and also hide non-virtualized cgroups.
 	 */
-	return !ve_is_super(ve) && !ve->is_pseudosuper
-		&& (root->subsys_mask & hidden_mask);
+	return !ve_is_super(ve) && !ve->is_pseudosuper &&
+	       ((root->subsys_mask & hidden_mask) ||
+		!is_virtualized_cgroot(root));
 }
 #endif
 
-- 
2.31.1


