[Devel] What does glommer think about kmem cgroup ?
Glauber Costa
glommer at parallels.com
Thu Oct 13 08:50:29 PDT 2011
Hi guys,
So, linuxcon is approaching. To help make our discussions more
productive, I sketched a basic prototype of a kmem cgroup that can
control the size of the dentry cache. I am sending the code here so you
guys can have an idea, but keep in mind this is a *sketch*. This is my
view of how our controller *could be*, not necessarily what it *should
be*. All your input is more than welcome.
Let me first explain a bit of my approach: (there are some comments
inline as well)
* So far it only works with the slab (you will see that something
similar can be done for at least the slub). Since most of us are
concerned mostly with memory abuse (I think), for simplicity I neglected
the initial memory allocated for the arrays. We only bill pages when
cache_grow is called to allocate more of them.
* I avoid resorting to the shrinkers, trying to free the slab pages
themselves whenever possible.
* We don't limit the size of all caches. They have to register
themselves explicitly (and in this PoC, I am using the dentry cache as
an example; see the sketch after this list).
* The object is billed to whoever touched it first. Other policies are
of course possible.
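To make that a bit more concrete, here is a minimal sketch of how the
registration and billing could look. This is not the attached patch:
kmem_cgroup_register_cache(), task_kmem_cgroup() and all the kcg_*
helpers are placeholder names I am using purely for illustration, and
the exact hook point inside cache_grow() is assumed.

/*
 * Illustrative sketch only -- the kmem_cgroup_* and kcg_* names below
 * are placeholders, not the interface from basic-code.patch.
 */

/* Opt-in: only caches that register themselves are accounted.  In this
 * PoC the dentry cache would do this right after kmem_cache_create()
 * in dcache_init(). */
void __init dcache_enable_kmem_accounting(struct kmem_cache *dentry_cache)
{
	kmem_cgroup_register_cache(dentry_cache);
}

/*
 * Billing happens at grow time only: the initial arrays are never
 * charged, and the pages for a new slab are billed to the cgroup of
 * the task that forced cache_grow(), i.e. whoever touched it first.
 */
int kmem_cgroup_charge_slab(struct kmem_cache *cachep, unsigned int nr_pages)
{
	struct kmem_cgroup *kcg = task_kmem_cgroup(current);

	if (!kmem_cgroup_cache_registered(cachep))
		return 0;			/* unaccounted cache: no charge */

	if (kcg_would_exceed_hard_limit(kcg, nr_pages))
		return -ENOMEM;			/* hard limit: fail the allocation */

	if (kcg_would_exceed_soft_limit(kcg, nr_pages))
		kcg_reap_empty_slabs(kcg, cachep);	/* free slab pages directly,
							   without calling the shrinkers */

	kcg_charge(kcg, nr_pages);
	return 0;
}

The point of structuring it this way is that an unregistered cache pays
nothing beyond one branch, and the soft limit gives us a chance to hand
back empty slab pages before we ever have to fail an allocation at the
hard limit.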
What I am *not* concerned about in this PoC: (left for future work, if
needed)
- unified user memory / kernel memory reclaim
- changes to the shrinkers.
- changes to the limit once it is already in place
- per-cgroup display in /proc/slabinfo
- task movement
- a whole lot of other stuff.
* Hey glommer, do you have numbers?
Yes, I have 8 numbers. And since 8 is also a number, then I have 9 numbers.
So what I did was to run "find /" on a freshly booted system (my
laptop). I just ran each iteration once, so nothing scientific. I kept
halving the limits until the allocations started to fail, which happened
more or less around a 256K hard limit. find is also not a workload that
pins the dentries in memory for very long; other kinds of workloads
will display different results here...
Base: (non-patched kernel)
real 0m16.091s
user 0m0.567s
sys 0m6.649s
Patched kernel, root cgroup (unlimited; max used mem: 22Mb)
real 0m15.853s
user 0m0.511s
sys 0m6.417s
16Mb/4Mb (HardLimit/SoftLimit)
real 0m16.596s
user 0m0.560s
sys 0m6.947s
8Mb/4Mb
real 0m16.975s
user 0m0.568s
sys 0m7.047s
4Mb/2Mb
real 0m16.713s
user 0m0.554s
sys 0m7.022s
2Mb/1Mb
real 0m17.001s
user 0m0.544s
sys 0m7.118s
1Mb/512K
real 0m16.671s
user 0m0.530s
sys 0m7.067s
512k/256k
real 0m17.395s
user 0m0.567s
sys 0m7.179s
So, what these initial numbers tell us is that the performance
penalty for the root cgroup is not expected to be that bad. When the
limits start to be hit, a penalty is incurred, but it stays within
expectations.
Attached: basic-code.patch
URL: <http://lists.openvz.org/pipermail/devel/attachments/20111013/22e2dedf/attachment-0001.ksh>