[Devel] [RFC rhel7] Disabling mounting cgroups from inside of container

Cyrill Gorcunov gorcunov at virtuozzo.com
Sat Jan 16 14:12:25 PST 2016


On Sat, Jan 16, 2016 at 10:45:55PM +0100, Stanislav Kinsburskiy wrote:
> 
> On Jan 16, 2016, 9:51 PM, Cyrill Gorcunov <gorcunov at virtuozzo.com> wrote:
> >
> > On Sat, Jan 16, 2016 at 09:32:39PM +0100, Stanislav Kinsburskiy wrote:
> > > Hi, 
> > > 
> > > What is the reason behind this proposal?
> >
> > 1) Fix the restore problem introduced with your commit 
> 
> Could you elaborate a bit on the problem?

Have you looked at the patches? ;) I tried to describe the
details there ;)

 | Still, the current design of the restore procedure is that we create
 | cgroups from inside libvzctl on ve0, then move ourselves
 | into veX and proceed with the restore from there.
 | 
 | CRIU in turn restores cgroups by remounting them from the veX context
 | (strictly speaking, CRIU works this way so that it can restore
 |  not only Virtuozzo-based containers but general containers as well).

Basically, with commit "ve/cgroup: add VE mark to each user cgroup name
on mount" we can no longer proceed with the restore. The sequence is:

 - libvzctl creates cgroups and moves itself into veX
 - calls "restore" scripts
  - restore script calls criu
   - criu reaches the stage where it needs to restore cgroups and
     - creates the so-called cgroup yard: just a directory where
       all controllers are going to be mounted and the values
       restored from the image

but here the kernel finds that the root for the cgroup already exists
and doesn't allow us to continue the restore

59052 access(".criu.cgyard.OppSAB/net_cls", F_OK) = -1 ENOENT (No such file or directory)
59052 write(1023, "(00.031574) cg: \tMaking controller dir .criu.cgyard.OppSAB/net_cls (net_cls)\n", 77) = 77
59052 mkdir(".criu.cgyard.OppSAB/net_cls", 0700) = 0
59052 mount("none", ".criu.cgyard.OppSAB/net_cls", "cgroup", 0, "net_cls") = -1 EBUSY (Device or resource busy)
59052 write(1023, "(00.032026) Error (cgroup.c:1281): cg: \tCan't mount controller dir .criu.cgyard.OppSAB/net_cls: Device or resource busy\n", 120) = 120
59052 exit_group(1)                     = ?
59052 +++ exited with 1 +++
59047 <... wait4 resumed> [{WIFEXITED(s) && WEXITSTATUS(s) == 1}], 0, NULL) = 59052
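
For illustration, the failing step in the trace above can be sketched roughly
as follows. This is only a minimal sketch in Python, not CRIU's actual code
(which is C); the names libc_mount, mount_controller and mount_fn are made up
for the example. It mirrors the traced calls: check for the controller
directory, create it with mode 0700, then mount a cgroup filesystem with the
controller name as mount data, and report EBUSY as the case hit here.

```python
import ctypes
import ctypes.util
import errno
import os


def libc_mount(source, target, fstype, flags, data):
    """Thin wrapper over mount(2). Needs CAP_SYS_ADMIN; raises OSError on failure."""
    libc = ctypes.CDLL(ctypes.util.find_library("c"), use_errno=True)
    rc = libc.mount(source.encode(), target.encode(), fstype.encode(),
                    ctypes.c_ulong(flags), data.encode())
    if rc != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err), target)


def mount_controller(yard, controller, mount_fn=libc_mount):
    """Hypothetical helper mirroring the traced access/mkdir/mount sequence.

    mount_fn is injectable so the logic can be exercised without root.
    Returns "mounted" on success and "busy" when mount fails with EBUSY.
    """
    path = os.path.join(yard, controller)
    if not os.access(path, os.F_OK):       # access(..., F_OK) = -1 ENOENT
        os.mkdir(path, 0o700)              # mkdir(..., 0700) = 0
    try:
        mount_fn("none", path, "cgroup", 0, controller)
    except OSError as exc:
        if exc.errno == errno.EBUSY:       # the failure seen in the trace
            return "busy"
        raise
    return "mounted"
```

Run as root inside veX, mount_controller(".criu.cgyard.XXXXXX", "net_cls")
would be expected to return "busy" in the situation described here; the yard
directory name is a placeholder.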

There was also an idea to NOT create cgroups inside libvzctl when we're
restoring containers, but I'm not sure about it. Since I believe restricting
cgroup mounting inside containers is needed anyway, pseudosuper might solve
all the problems, I hope.

> 
> > 2) Performance or uncontrollable mount of cgroups from 
> >    inside of container is _really_ a huge problem affecting 
> >    the node. Until there is a strong reason to allow mounting 
> >    we should disable it. 
> >
> 
> It sounds like forbidding cgroup mounting is a way to protect against a "cgroups bomb". Is it?

Yes. Allowing arbitrary mounting of cgroups inside a container is a pain in the a**

> 
> > > The only thing you mentioned which is not fixed is the performance issue.
> > > If so, then it's not a sufficient reason from my POV, because we are losing generic functionality.
> > > I suspect that there are programs which use cgroups for their internal needs.
> > > What will we do with them if cgroup mounts are forbidden?
> >
> > I don't know of any that require their own mounting. IIRC Docker was able to
> > work if cgroup mounting is disabled and all cgroups are already
> > preconfigured (but this should be double-checked). Note that we're
> > talking about _mounting_, because you can still create new nested
> > cgroups.
> 
> Yeah, probably not so many programs do so.
> But forbidding such functionality in a container looks very aggressive to me.

I would take the reverse approach: lock everything down and relax the
requirements only where really needed.
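
To make the mounting vs. nesting distinction concrete: with a cgroup v1
hierarchy that is already mounted, a child cgroup is created by a plain
mkdir inside it, with no mount call involved. A minimal sketch in Python,
where create_nested_cgroup and the example paths are made up for
illustration:

```python
import os


def create_nested_cgroup(parent_dir, name):
    """Create a child cgroup under an already-mounted cgroup directory.

    On a real cgroup filesystem the kernel populates the new directory
    with control files; on an ordinary filesystem this is just a mkdir.
    """
    path = os.path.join(parent_dir, name)
    os.mkdir(path, 0o755)
    return path
```

For example, create_nested_cgroup("/sys/fs/cgroup/net_cls", "myapp") would
create a nested group, assuming the controller is premounted at that path
and the caller has write access to it.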

Stas, let's continue talking on Monday, I'll be out tomorrow most probably.

	Cyrill

