[Devel] Re: [Vserver] VServer vs OpenVZ
Kirill Korotaev
dev at sw.ru
Mon Dec 12 12:22:15 PST 2005
>>> 1) "Fair scheduling" - as far as I can tell the VZ "fair scheduler"
>>> does nothing the VServer QoS/Limit system does. If anything, the VZ
>>> fair scheduler is not yet O(1) which is a big negative. VServer is
>>> built on standard kernel and therefore uses the O(1) scheduler (an
>>> absolute must when you have so many processes running on a single
>>> kernel)
>>
>> this is not true! The fair scheduler in the current implementation on
>> the 2.6 kernel is O(1) and doesn't depend on the number of processes at
>> all! And we are working on improving it much more and implementing some
>> additional features in it.
>
> Great, why not provide packages against latest Virtuozzo (with modules,
> vzfs, etc) for better real world testing? What numbers are you seeing
> in regards to load, based on my estimates with 3000 procs and an avg of
> 3 running on a server with a load average of 3 should drop to much
> lower. What kind of numbers do you see in your tests?
1. Virtuozzo 3.0 with the O(1) scheduler will be released very soon. OpenVZ
already has it, so you can test it right now.
2. I can't understand your statement about 3000 procs, an avg of 3, etc.
What estimates do you mean? Can you describe them in more detail?
>>> 3) Disk/memory sharing - OpenVZ has nothing. Virtuozzo uses an
>>> overlay fs "vzfs". The templates are good for an enterprise
>>> environment, but really prove useless in a hosting environment. vzfs
>>> is overlay and therefore suffers from double caching (it caches both
>>> files in /vz/private (backing) and /vz/root (mount)).
>>
>> not sure what you mean... memory caching?! it is not true again then...
> The kernel caches based on inode number. If you modified the caching
> part of the module then I may be incorrect in my thinking. Take example:
>
> # ls -ai /vz/private/1/root/bin/ls
> 41361462 /vz/private/1/root/bin/ls
> # ls -ai /vz/template/redhat-as3-minimal/coreutils-4.5.3-26/bin/ls
> 1998864 /vz/template/redhat-as3-minimal/coreutils-4.5.3-26/bin/ls
> # ls -ai /vz/root/1/bin/ls
> 41361462 /vz/root/1/bin/ls
1. The kernel doesn't cache anything based on inode numbers. Inode numbers
are just magic IDs, nothing more. Internally everything is much more
complex.
2. Inode numbers can differ with vzfs, but the data cached under
these inodes is the same, i.e. no double caching happens with vzfs. This
is the main purpose of VZFS and it is required for scalability: to
have only one instance of the data in memory.
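
To illustrate why a bare inode number says little, here is a minimal Python sketch. The helper names `file_identity` and `is_same_file` are made up for this example. From userspace a file is identified by the pair of device number and inode number, so equal inode numbers on two different mounts do not by themselves mean the same file, and different numbers only mean different files on an ordinary filesystem. The check cannot show whether vzfs shares cached data between two inodes, since that happens inside the kernel.

```python
import os


def file_identity(path):
    """Return the (device, inode) pair that identifies a file to userspace."""
    st = os.stat(path)
    return (st.st_dev, st.st_ino)


def is_same_file(path_a, path_b):
    """True when both paths name the same inode on the same filesystem.

    Hard links to one file satisfy this; a copy of the file does not.
    """
    return file_identity(path_a) == file_identity(path_b)
```

It could be called, for instance, as `is_same_file("/vz/private/1/root/bin/ls", "/vz/root/1/bin/ls")` on the paths from the listing above.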
> The kernel will cache both inodes 41361462 and 1998864. Knowing that,
> when I look at my host servers with 8GB of RAM and see 4GB being used
> for cache/buffers I get angry.
Looks like you are misinterpreting the /proc/meminfo output. Let me explain.
/proc/meminfo shows you the amount of memory used for caching files and
for buffers, both of which are reclaimable. This means that:
- cached memory is not a waste of your HW memory, it's a temporary
cache of disk files. The bigger the better.
- since it's reclaimable, it's not memory which is pinned down; it is
freed on demand when some application or the kernel really needs memory
for its own use. I.e. it means that you have 4GB of _FREE_ RAM which the
kernel _temporarily_ uses for caches. Why are you angry about it?! I would
be happy in this situation :)
- as I wrote before, the data is cached in memory only once, so in your
example with /bin/ls it takes only ~68k of data memory in caches +
dentry/inode caches (internal kernel structures). And the raw figures in
/proc/meminfo don't let you tell whether it was cached once,
twice or more times in memory. It's just 4GB of RAM which is used for
caches; there is no other information here.
- in practice vzfs saves you 40-70% of VPS memory compared to OpenVZ
when VPSs are based on the same template. Of course, the hungrier a VPS
is, the smaller the relative gain vzfs provides for it.
So your comment about better scalability of OpenVZ than Virtuozzo looks
wrong to me...
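
The reading of /proc/meminfo described above can be sketched in a few lines of Python. This is only an illustration: the function names and the sample figures (an 8GB host with about 4GB in cache) are invented, and adding `Buffers` and `Cached` to `MemFree` gives a rough upper bound, since dirty or mapped pages have to be written back or unmapped before they can be reclaimed.

```python
# Hypothetical sample in the /proc/meminfo format (values are in kB).
SAMPLE_MEMINFO = """\
MemTotal:      8388608 kB
MemFree:        524288 kB
Buffers:        262144 kB
Cached:        3932160 kB
"""


def parse_meminfo(text):
    """Return a dict mapping /proc/meminfo field names to integer values."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def reclaimable_summary(fields):
    """Split memory into truly free and cache that can be freed on demand."""
    free_kb = fields.get("MemFree", 0)
    cache_kb = fields.get("Buffers", 0) + fields.get("Cached", 0)
    return {
        "free_kb": free_kb,
        "cache_kb": cache_kb,
        # Rough upper bound on what applications could still obtain.
        "available_estimate_kb": free_kb + cache_kb,
    }
```

On a live system one would pass the contents of /proc/meminfo instead of the sample, e.g. `reclaimable_summary(parse_meminfo(open("/proc/meminfo").read()))`. Note that nothing in these figures tells you how many times a given file is cached.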
> vzfs appears to be a standard unionfs
> with support for CoW to those who do not see the source. You ignored
> responding to how VServer does it which results in using a patched
> kernel to have special CoW links without a union mount. The links are
> based on a hard link architecture resulting in 1 inode. My comments
> on vunify and vzcache speeds were also ignored.
It was not actually ignored; sorry that I didn't reply to it before
and made you think so.
We immediately started investigating your report. I believe the vunify
tool is more like vzpkglink, which is also fast enough. But it will be
checked more thoroughly.
Most likely the whole vzcache will be reworked. I really appreciate such
reports and will probably come back to you with more questions about it.
>> RSS is good, yeah, but there are lots of DoS attacks possible if you
>> limit RSS only. No lowmem, no TCP bufs, etc... I personally know many
>> ways of DoSing other resources, but if you don't care about security
>> this is probably ok.
>>
> It does RSS and VM limiting with no guarantees. It also does locked
> pages, sockets, etc. The argument of who has more structures to limit
> is actually rather pointless now as VServer could take the OpenVZ limits
> to see what they can limit and decide which they want to implement. That
> is only a matter of time. I'm sure Herbert has seen output of
> /proc/user_beancounters before OpenVZ was even released and didn't see a
> reason for some of the limits. What I was pointing out was differences
> currently. A very minor advantage of VServer if they virtualize the
> meminfo structure to reflect memory/swap total/usage based on the RSS/VM
> limits.
meminfo will be fixed soon, I suppose.
Kirill