[Devel] [PATCH RHEL7 COMMIT] cgroup: Mangle cgroups root from inside of VE view

Konstantin Khorenko khorenko at virtuozzo.com
Fri May 29 05:50:55 PDT 2015


The commit is pushed to "branch-rh7-3.10.0-123.1.2-ovz" and will appear at https://src.openvz.org/scm/ovz/vzkernel.git
after rh7-3.10.0-123.1.2.vz7.5.7
------>
commit e5f41176c3b8653ff237f1fd695d608a176aa46b
Author: Cyrill Gorcunov <gorcunov at odin.com>
Date:   Fri May 29 16:50:55 2015 +0400

    cgroup: Mangle cgroups root from inside of VE view
    
    We bind-mount cgroups into the container, so if a container
    has, say, CTID=200, then the @cgroups and @mountinfo output
    will contain /200 as the root. This makes Docker look for the
    corresponding directory inside /sys/fs/cgroup/<controller>,
    which of course is not present because the hierarchy has been
    bind-mounted from the node (note that we can't bind-mount into
    <controller>/<container> here because that confuses the
    container's systemd instance and it gets stuck on boot).
    
    Thus we simply mangle the root here, so that when one accesses
    @cgroups or @mountinfo the kernel shows '/' instead of $ctid,
    which makes both Docker and systemd happy.
    
    https://jira.sw.ru/browse/PSBM-33757
    
    Signed-off-by: Cyrill Gorcunov <gorcunov at virtuozzo.com>
    Reviewed-by: Vladimir Davydov <vdavydov at parallels.com>
    
    CC: Konstantin Khorenko <khorenko at virtuozzo.com>
    CC: Pavel Emelyanov <xemul at virtuozzo.com>
    CC: Andrey Vagin <avagin at virtuozzo.com>
    
    kgorkunov@: we have 2 variants:
    
    1) Don't hide the top level, then
    
       - if we mount as /sys/fs/cgroup/memory/200, then systemd
         hangs on start
    
       - if we mount as /sys/fs/cgroup/memory/, then systemd starts,
         but Docker inside a CT starts complaining because the /200
         paths are present in /proc/pid/cgroup and /proc/pid/mountinfo,
         while /sys/fs/cgroup/memory does not show them.
    
    2) Hide the top level, then in the cgroups|mountinfo output we'll
       see only "/", and both systemd and Docker work fine.
---
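Note: to illustrate why the reported root matters, here is a minimal
userspace sketch of the kind of lookup described in the commit message.
It is not Docker's actual code; cgroup_dir_for() is a hypothetical
helper that joins a controller mount point with the path a tool would
read from /proc/<pid>/cgroup.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not taken from Docker or the kernel): join a
 * controller mount point with the cgroup path reported for a task.
 * Returns 0 on success, -1 if the result does not fit in out.
 */
static int cgroup_dir_for(const char *mount_point, const char *cg_path,
			  char *out, size_t outlen)
{
	int n;

	assert(mount_point && cg_path && out && outlen > 0);

	/* A cgroup path of "/" means "the mount point itself". */
	if (strcmp(cg_path, "/") == 0)
		n = snprintf(out, outlen, "%s", mount_point);
	else
		n = snprintf(out, outlen, "%s%s", mount_point, cg_path);

	return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

With a reported path of "/200" the helper yields
/sys/fs/cgroup/memory/200, which does not exist inside the container
because the bind mount already starts at that directory on the node.
With a reported path of "/" it yields /sys/fs/cgroup/memory, which
does exist.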
 kernel/cgroup.c | 29 +++++++++++++++++++++++++++++
 1 file changed, 29 insertions(+)

diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 2e40430..6c87800 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -1386,10 +1386,24 @@ static int cgroup_remount(struct super_block *sb, int *flags, char *data)
 	return ret;
 }
 
+#ifdef CONFIG_VE
+int cgroup_show_path(struct seq_file *m, struct dentry *dentry)
+{
+	if (!ve_is_super(get_exec_env()))
+		seq_puts(m, "/");
+	else
+		seq_dentry(m, dentry, " \t\n\\");
+	return 0;
+}
+#endif
+
 static const struct super_operations cgroup_ops = {
 	.statfs = simple_statfs,
 	.drop_inode = generic_delete_inode,
 	.show_options = cgroup_show_options,
+#ifdef CONFIG_VE
+	.show_path = cgroup_show_path,
+#endif
 	.remount_fs = cgroup_remount,
 };
 
@@ -1807,6 +1821,21 @@ int cgroup_path(const struct cgroup *cgrp, char *buf, int buflen)
 		return 0;
 	}
 
+#ifdef CONFIG_VE
+	/*
+	 * Containers cgroups are bind-mounted from node
+	 * so they are like '/' from inside, thus we have
+	 * to mangle cgroup path output.
+	 */
+	if (!ve_is_super(get_exec_env())) {
+		if (cgrp->parent && !cgrp->parent->parent) {
+			if (strlcpy(buf, "/", buflen) >= buflen)
+				return -ENAMETOOLONG;
+			return 0;
+		}
+	}
+#endif
+
 	start = buf + buflen - 1;
 	*start = '\0';
 



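The check added to cgroup_path() in the patch above can be modelled in
userspace to see which cgroups it affects. This is only a sketch under
stated assumptions: struct cg_node is a hypothetical, simplified
stand-in for the kernel's struct cgroup, the in_container flag stands
in for !ve_is_super(get_exec_env()), and snprintf() replaces the
kernel's strlcpy().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the kernel's struct cgroup. */
struct cg_node {
	const struct cg_node *parent;	/* NULL for the hierarchy root */
	const char *name;
};

/*
 * Model of the check added to cgroup_path(). Unlike the kernel code,
 * which returns 0 after writing "/", this sketch returns:
 *   1              if buf was filled with the mangled path "/",
 *   0              if the caller should build the path as usual,
 *   -ENAMETOOLONG  if "/" plus the terminator does not fit in buf.
 */
static int mangle_ve_root(const struct cg_node *cg, bool in_container,
			  char *buf, size_t buflen)
{
	assert(cg && buf);

	if (!in_container)
		return 0;

	/* Only a direct child of the hierarchy root is rewritten. */
	if (!(cg->parent && !cg->parent->parent))
		return 0;

	/* snprintf() stands in for strlcpy(), which libc may not provide. */
	if ((size_t)snprintf(buf, buflen, "/") >= buflen)
		return -ENAMETOOLONG;
	return 1;
}
```

Under this model, a root -> "200" -> "docker" chain gives "/" for the
"200" node when queried from inside the container, while the root
itself, the nested "docker" node, and any query from the node (host)
fall through to normal path building.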