[Users] Curious about ploop performance results.

Kirill Korotaev dev at parallels.com
Sun May 4 23:20:43 PDT 2014


Oh, SSDs have their own effects ))
E.g. compression and garbage collection (which can periodically decrease performance many times over), etc. So always analyze that. Performance is not easy and not black and white.

Sent from my iPhone

On 05 May 2014, at 4:44, "jjs - mainphrame" <jjs at mainphrame.com> wrote:

Interesting - I'll check out FIO.

I wondered about the possibility of disk placement, and created the 3rd and 4th CTs in the reverse order of the first two, but regardless, the 2 ploop-based CTs performed better than the 2 simfs-based CTs (they are all ext4 underneath). The CTs are contained in a partition at the end of the disk, which somewhat limits the effect of disk placement. I suppose a better test would be to eliminate the variable of rotational media by using SSDs for future testing.

J J




On Sat, May 3, 2014 at 11:50 PM, Kirill Korotaev <dev at parallels.com> wrote:
Forget about iozone - it benchmarks cached I/O and small data sets, so essentially it measures memory/syscall speeds. On larger data sets it measures a mix of RAM and real I/O, and a lot depends on the previous cache state.

Fio is a better tool.
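For example (a minimal sketch, assuming fio is installed; the job name, test file path and sizes here are illustrative, not values from this thread), a direct-I/O random-write run against a file on the filesystem under test could look like:

# 4k random writes, direct I/O to bypass the page cache,
# 1 GB working set on the filesystem being measured
fio --name=randwrite-test --filename=/local/fio-test.dat \
    --rw=randwrite --bs=4k --size=1G \
    --ioengine=libaio --direct=1 --runtime=60 --time_based

The --direct=1 flag avoids the cached-I/O problem described above for iozone.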

Actually the most likely explanation for your effect is non-uniform disk performance across the platter. For rotational media, performance at the beginning of a block device is roughly twice as fast as at the end (the rotational speed is the same, but the linear velocity, and therefore throughput, is higher on the outer tracks, which map to the start of the device).
So you can verify that by dumping extent info with dumpe2fs. Accurate benchmarking would require a small localized partition for both tests to make sure performance can't vary due to this effect.
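As a quick way to check that (a sketch using standard ext4 tools; /local/test and /dev/sda4 are only examples based on the mounts quoted later in this thread), you can look at where the benchmark file's extents actually sit on the device:

# physical (on-disk) extent offsets of the benchmark file
filefrag -v /local/test
# total block count and other metadata for the partition
dumpe2fs -h /dev/sda4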

Sent from my iPhone

On 04 May 2014, at 1:06, "jjs - mainphrame" <jjs at mainphrame.com> wrote:

I did some benchmarks on newly created CTs with iozone, and the results were probably more in line with what you'd expect.

The simfs-based CT was about 5% faster on write, and the ploop-based CT was about 5% faster on re-write, read, and re-read. The results are repeatable.


Regards,

J J


On Sat, May 3, 2014 at 11:53 AM, jjs - mainphrame <jjs at mainphrame.com> wrote:
I am continuing to do testing as time allows. Last night I ran sysbench fileio tests, and again, the ploop CT yielded better performance than either the simfs CT or the vzhost. It wasn't as drastic a difference as the dbench results, but the difference was there. I'll continue in this vein with freshly created CTs. The machine was just built a few days ago; it's quiescent, doing nothing except hosting a few vanilla CTs.

As for the rules of thumb, I can tell you that the results are 100% repeatable. But explainable, ah, that's the thing. Still working on that.

Regards,

J J



On Sat, May 3, 2014 at 11:31 AM, Kir Kolyshkin <kir at openvz.org> wrote:
On 05/02/2014 04:38 PM, jjs - mainphrame wrote:
Thanks Kir, the /dev/zero makes sense I suppose. I tried with /dev/random but that blocks pretty quickly - /dev/urandom is better, but still seems to be a bottleneck.

You can use a real file on tmpfs.

Also, in general, there are many factors that influence test results: cron jobs and other activity (say, network traffic) that runs periodically or sporadically and spoils your results, the cache state (you need to use vm.drop_caches, or better yet, reboot between tests), and the physical place on disk where your data lands (rotating HDDs tend to be faster at the first sectors than at the last, so ideally you should test on a clean, freshly formatted filesystem). There is much more to it; there can be other factors, too. The rule of thumb is that results need to be reproducible and explainable.
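For reference, the usual way to drop the page cache between runs looks like this (run as root; a generic sketch, not a command from this thread):

# flush dirty data, then drop the page cache, dentries and inodes
sync
echo 3 > /proc/sys/vm/drop_caches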

Kir.



As for the dbench results, I'd love to hear what results others obtain from the same test, and/or any other testing approaches that would give a more "acceptable" answer.

Regards,

J J



On Fri, May 2, 2014 at 4:01 PM, Kir Kolyshkin <kir at openvz.org> wrote:
On 05/02/2014 03:00 PM, jjs - mainphrame wrote:
Just for kicks, here are the data from the tests (they were run on a rather modest old machine).

[image attachment: chart of the test data]


Here are the raw dbench data:


#clients        vzhost                  simfs CT        ploop CT
---------------------------------------------------------------------
1               11.1297MB/sec       9.96657MB/sec       19.7214MB/sec
2               12.2936MB/sec       14.3138MB/sec       23.5628MB/sec
4               17.8909MB/sec       16.0859MB/sec       45.1936MB/sec
8               25.8332MB/sec       22.9195MB/sec       84.2607MB/sec
16              32.1436MB/sec       28.921MB/sec        155.207MB/sec
32              35.5809MB/sec       32.1429MB/sec       206.571MB/sec
64              34.3609MB/sec       29.9307MB/sec       221.119MB/sec

Well, I can't explain this, but there's probably something wrong with the test.



Here is the script used to invoke dbench:

#!/bin/bash
# Run dbench with an increasing number of clients, saving each
# run's output to a per-host, per-client-count file.
HOST=`uname -n`
WD=/tmp
FILE=/usr/share/dbench/client.txt

for i in 1 2 4 8 16 32 64
do
    dbench -D $WD -c $FILE $i &>dbench-${HOST}-${i}
done

Here are the dd commands and outputs:

OPENVZ HOST
----------------
[root at vzhost ~]# dd bs=1M count=512 if=/dev/zero of=test conv=fdatasync
512+0 records in
512+0 records out
536870912 bytes (537 MB) copied, 11.813 s, 45.4 MB/s
[root at vzhost ~]# df -T
Filesystem     Type  1K-blocks    Used Available Use% Mounted on
/dev/sda2      ext4   20642428 2390620  17203232  13% /
tmpfs          tmpfs    952008       0    952008   0% /dev/shm
/dev/sda1      ext2     482922   68436    389552  15% /boot
/dev/sda4      ext4   51633780 3631524  45379332   8% /local
[root at vzhost ~]#


PLOOP CT
----------------
root at vz101:~# dd bs=1M count=512 if=/dev/zero of=test conv=fdatasync
512+0 records in
512+0 records out
536870912 bytes (537 MB) copied, 2.50071 s, 215 MB/s

This one I can explain :)

This is caused by a ploop optimization that was enabled in the kernel recently. If a data block is all zeroes, it is not written to disk (the same thing as sparse files, just for ploop).

So you need to test it with some real data (anything that is not all zeroes).
I am not sure how fast /dev/urandom is, but that is one of the options.
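One way to combine that with the tmpfs suggestion above (a sketch; the staging file name and the 512 MB size are illustrative, and it assumes tmpfs has enough free space):

# stage non-zero data in tmpfs first, so /dev/urandom speed
# does not limit the actual write test
dd if=/dev/urandom of=/dev/shm/rand.img bs=1M count=512
# then write that data to the filesystem under test, syncing at the end
dd if=/dev/shm/rand.img of=test bs=1M conv=fdatasync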


root at vz101:~# df -T
Filesystem        Type     1K-blocks    Used Available Use% Mounted on
/dev/ploop11054p1 ext4       4539600 1529316   2804928  36% /
none              devtmpfs    262144       4    262140   1% /dev
none              tmpfs        52432      52     52380   1% /run
none              tmpfs         5120       0      5120   0% /run/lock
none              tmpfs       262144       0    262144   0% /run/shm
root at vz101:~#


SIMFS CT
----------------
root at vz102:~# dd bs=1M count=512 if=/dev/zero of=test conv=fdatasync
512+0 records in
512+0 records out
536870912 bytes (537 MB) copied, 12.6913 s, 42.3 MB/s
root at vz102:~# df -T
Filesystem     Type     1K-blocks    Used Available Use% Mounted on
/dev/simfs     simfs      4194304 1365500   2828804  33% /
none           devtmpfs    262144       4    262140   1% /dev
none           tmpfs        52432      52     52380   1% /run
none           tmpfs         5120       0      5120   0% /run/lock
none           tmpfs       262144       0    262144   0% /run/shm
root at vz102:~#

Regards,

J J



On Fri, May 2, 2014 at 2:10 PM, jjs - mainphrame <jjs at mainphrame.com> wrote:
You know the saying, "when something seems too good to be true"...

I just installed CentOS 6.5 and OpenVZ on an older machine, and when I built an Ubuntu 12.04 CT I noticed that ploop is now the default layout. Cool. So I built another Ubuntu 12.04 CT, identical in every way except that I specified simfs, so I could do a quick performance comparison.
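For reference, the layout can be chosen explicitly at creation time with vzctl's --layout option; a sketch (the CT IDs and template name are illustrative, not taken from this setup):

# ploop layout (the new default)
vzctl create 101 --ostemplate ubuntu-12.04-x86_64 --layout ploop
# the comparison container, forced to simfs
vzctl create 102 --ostemplate ubuntu-12.04-x86_64 --layout simfs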

First I did a quick timed dd run, then I ran dbench with varying numbers of clients.

The simfs CT showed performance roughly similar to the host, which was not too surprising.
What did surprise me was that the ploop CT showed performance which was significantly better than the host, in both the dd test and the dbench tests.

I know someone will tell me "dbench is a terrible benchmark" but it's also a standard. Of course, if anyone knows a "better" benchmark, I'd love to try it.

Regards,

J J




_______________________________________________
Users mailing list
Users at openvz.org<mailto:Users at openvz.org>
https://lists.openvz.org/mailman/listinfo/users

