[Devel] OpenVZ patch for VE disk i/o accounting :: 2.6.18
Rick Blundell
rickb at rapidvps.com
Sun Nov 26 09:23:25 PST 2006
Hello OpenVZ users and developers,
I am requesting your feedback on the attached patch. It is my first
contributed patch, but I think it could add useful accounting metrics to
the project. The following patch against 2.6.18 + ovz028test005.1 adds
four new accounting-only metrics to user_beancounters as follows:
numreads
numwrites
kbytesread
kbyteswritten
Introduction-
The ability to account disk I/O activity per VE is critical when
diagnosing saturated disk bandwidth. Right now, it is entirely
possible for one VE to cripple the HN (affecting other VEs) by running
disk-intensive applications such as bonnie++, iozone, dd, etc. This has
been discussed a few times on the forum as a potential problem
situation. Although the VZ CPU scheduler does a fine job of limiting the
VE's CPU time, if that CPU time is spent churning the disk in an
uninterruptible state, other VEs cannot meet their CPU guarantee even on
a non-overcommitted server. Even a small amount of CPU time can cause a
large amount of iowait if the disk system is not extremely strong
(SCSI, RAID, etc).
Ideally we would be able to institute a shaping algorithm like tc's cbq
for disk I/O, where we could guarantee and limit each VE to xx kbit/sec
read, yy kbit/sec write, zz read trans/sec, aa write trans/sec. However,
this is not possible yet. This patch allows reactive measures against
VEs that hog disk I/O, i.e. it supplies you with information so that
you can at least know which VE is saturating your disk bandwidth. What
you do from there (stop the VE, investigate/reconfigure the application,
etc) is up to you.
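As a sketch of that reactive workflow, the following Python snippet
takes two snapshots of /proc/user_beancounters as seen from the HN and
ranks VEs by how much a disk counter grew between them. The layout
assumed here (a "uid:" prefix on the first resource line of each VE,
followed by held, maxheld, barrier, limit and failcnt columns) is the
usual user_beancounters format. The function names, the VE IDs and the
sample numbers are made up for illustration and are not part of the
patch.

```python
# Sketch only: rank VEs by growth of the new disk counters between two
# snapshots of /proc/user_beancounters text. Function names and sample
# values are hypothetical; the column layout is assumed, not taken from
# the patch.

DISK_COUNTERS = ("numreads", "numwrites", "kbytesread", "kbyteswritten")


def parse_held(text):
    """Return {veid: {resource: held}} for the disk counters."""
    result = {}
    veid = None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        # The first resource line of each VE starts with "<veid>:".
        if fields[0].endswith(":") and fields[0][:-1].isdigit():
            veid = int(fields[0][:-1])
            fields = fields[1:]
        if veid is None or len(fields) < 2:
            continue
        if fields[0] in DISK_COUNTERS and fields[1].isdigit():
            result.setdefault(veid, {})[fields[0]] = int(fields[1])
    return result


def rank_by_growth(before, after, counter="kbyteswritten"):
    """Return [(veid, delta)] sorted by largest counter growth first."""
    old = parse_held(before)
    new = parse_held(after)
    deltas = []
    for veid, counters in new.items():
        prev = old.get(veid, {}).get(counter, 0)
        deltas.append((veid, counters.get(counter, 0) - prev))
    return sorted(deltas, key=lambda item: item[1], reverse=True)


# Hypothetical snapshots for two VEs (101 and 102).
SNAP_1 = """\
       uid  resource        held  maxheld     barrier       limit  failcnt
      101:  kbytesread     40182    40182  2147483647  2147483647        0
            kbyteswritten    192      192  2147483647  2147483647        0
      102:  kbytesread       500      500  2147483647  2147483647        0
            kbyteswritten   7000     7000  2147483647  2147483647        0
"""

SNAP_2 = """\
       uid  resource        held  maxheld     barrier       limit  failcnt
      101:  kbytesread     40182    40182  2147483647  2147483647        0
            kbyteswritten   1192     1192  2147483647  2147483647        0
      102:  kbytesread       500      500  2147483647  2147483647        0
            kbyteswritten  97000    97000  2147483647  2147483647        0
"""

if __name__ == "__main__":
    for veid, delta in rank_by_growth(SNAP_1, SNAP_2):
        print(veid, delta)
```

Dividing each delta by the time between the two snapshots would give an
approximate KB/sec figure per VE.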
The CFQ I/O scheduler makes great strides in preventing disk hogs from
affecting other applications/VEs. However, CFQ has no concept of a VE;
thus a VE could spawn a disk-hogging application repeatedly and not be
punished by CFQ.
Although the user_beancounters read/write metrics are reported with a
precision of 1 KB, all read/written bytes are accounted for. If you write
1 byte 1024 separate times, kbyteswritten will increment one time. *All
four metrics are taken from storage layer hits*. Buffered reads and
virtual file reads are not counted since they do not result in a
storage layer seek. This patch is based on Andrew Morton's
2.6.19-rc6-mm1 patchset, http://userweb.kernel.org/~akpm/2.6.19-rc6-mm1/
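To illustrate the KB-precision behaviour described above, here is a
minimal Python sketch of one way to get it: keep an exact byte total
and report only whole kilobytes. The class and attribute names are
hypothetical, and this is not necessarily how the patch stores its
counters internally.

```python
# Sketch only: byte-exact accumulation reported at KB granularity.
# The class and attribute names are hypothetical, not the patch's code.

class DiskWriteAccount:
    def __init__(self):
        self.bytes_written = 0  # every written byte is added here

    def account_write(self, nbytes):
        self.bytes_written += nbytes

    @property
    def kbyteswritten(self):
        # Only whole kilobytes are reported.
        return self.bytes_written // 1024


acct = DiskWriteAccount()
for _ in range(1023):
    acct.account_write(1)
before_last = acct.kbyteswritten  # counter after 1023 one-byte writes

acct.account_write(1)
after_last = acct.kbyteswritten   # counter after the 1024th one-byte write

print(before_last, after_last)
```

No bytes are lost between increments; the reported value only moves
once a full 1024 bytes have accumulated.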
Thank you,
Rick Blundell
3 test cases demonstrating kbytesread/kbyteswritten, performed using the
patch from inside a VE.
-=-=-=-=-=--=-==-=-=-=-=-==--=-=-=-=
Case 1 :: "Write file to disk"
-=-=-=-=-=--=-==-=-=-=-=-==--=-=-=-=
Here, we will write a 1000KB file to the disk, while checking
kbyteswritten UBC before and after the write. As you can see the file
created is 1000KB in size, and kbyteswritten has increased by 1000.
-bash-3.00# grep kbyteswritten /proc/user_beancounters ; dd if=/dev/zero of=/root/bigfile3 bs=512 count=2000; grep kbyteswritten /proc/user_beancounters ;
kbyteswritten 192 192 2147483647 2147483647 0
2000+0 records in
2000+0 records out
kbyteswritten 1192 1192 2147483647 2147483647 0
-bash-3.00# ls -al /root/bigfile3
-rw-r--r-- 1 root root 1024000 Nov 26 07:42 /root/bigfile3
1192KB-192KB = 1000KB = 1024000B
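The check for this case can be written out as a short Python sketch
that reads the "held" column from the two kbyteswritten lines (with the
wrapped trailing failcnt value joined back onto each line) and compares
the growth with what dd was asked to write. held_value() is a
hypothetical helper name used only for this illustration.

```python
# Sketch only: compare the growth of a beancounter's "held" value with
# the number of bytes dd was asked to write. held_value() is hypothetical.

def held_value(line):
    """Return the 'held' column (second field) of a beancounter line."""
    fields = line.split()
    return int(fields[1])


before = held_value("kbyteswritten 192 192 2147483647 2147483647 0")
after = held_value("kbyteswritten 1192 1192 2147483647 2147483647 0")

dd_bytes = 512 * 2000      # bs=512 count=2000
delta_kb = after - before  # growth of kbyteswritten across the dd run

print(delta_kb, dd_bytes // 1024)
```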
-=-=-=-=-=--=-==-=-=-=-=-==--=-=-=-=
Case 2 :: "Read unbuffered file from disk"
-=-=-=-=-=--=-==-=-=-=-=-==--=-=-=-=
Here we will read a file which has not been read since HN boot. This
means the file is not in the buffer cache, and thus will be fetched from
the disk. As you can see, the file is 5243392 bytes, which is about
5120KB (5120.5KB). kbytesread increased by 5152 (45334 - 40182), about
32KB more than the file size.
-bash-3.00# grep bytesr /proc/user_beancounters ; cat /root/bigfile2 >/dev/null ; grep bytesr /proc/user_beancounters
kbytesread 40182 40182 3000 4000 0
kbytesread 45334 45334 3000 4000 0
-bash-3.00# ls -al /root/bigfile2
-rw-r--r-- 1 root root 5243392 Nov 26 06:45 /root/bigfile2
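The arithmetic for this case, using only the values shown in the
transcript, can be sketched in Python as follows. The variable names
are illustrative. The post does not say what the extra kilobytes beyond
the file size are, so the sketch only computes the difference.

```python
# Sketch only: arithmetic on the values shown in the Case 2 transcript.

file_bytes = 5243392              # size reported by ls -al
file_kb = file_bytes / 1024       # file size in KB (not a whole number)

delta_kb = 45334 - 40182          # growth of kbytesread across the cat
surplus_kb = delta_kb - file_kb   # KB read beyond the file's own size

print(file_kb, delta_kb, surplus_kb)
```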
-=-=-=-=-=--=-==-=-=-=-=-==--=-=-=-=
Case 3 :: "Read buffered file from disk"
-=-=-=-=-=--=-==-=-=-=-=-==--=-=-=-=
In this case we will read the same file which was read in case 2. Since
we read it recently, the contents will be in the buffer cache. As you
can see, kbytesread stays at 45346 before and after the read.
-bash-3.00# grep bytesr /proc/user_beancounters ; cat /root/bigfile2 >/dev/null ; grep bytesr /proc/user_beancounters
kbytesread 45346 45346 3000 4000 0
kbytesread 45346 45346 3000 4000 0
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: openvz_2618_io_acct.diff
URL: <http://lists.openvz.org/pipermail/devel/attachments/20061126/a97649b4/attachment-0001.ksh>