[Devel] OpenVZ patch for VE disk i/o accounting :: 2.6.18

Rick Blundell rickb at rapidvps.com
Sun Nov 26 09:23:25 PST 2006


Hello OpenVZ users and developers,

I am requesting your feedback on this attached patch. It is my first 
contributed patch but I think it could add useful accounting metrics to 
the project. The following patch against 2.6.18 + ovz028test005.1 adds 
four new accounting-only metrics to user_beancounters as follows:

numreads
numwrites
kbytesread
kbyteswritten

Introduction-
The ability to account disk I/O activity per VE is critical when 
diagnosing saturated disk bandwidth. Right now, it is entirely possible 
for one VE to cripple the HN (affecting other VEs) by running 
disk-intensive applications such as bonnie++, iozone, dd, etc. This has 
been discussed a few times on the forum as a potential problem. Although 
the VZ CPU scheduler does a fine job of limiting a VE's CPU time, if 
that CPU time is spent churning the disk in an uninterruptible state, 
other VEs cannot meet their CPU guarantee even on a non-overcommitted 
server. Even a small amount of CPU time can cause a large amount of 
iowait unless the disk subsystem is very strong (SCSI, RAID, etc.).

Ideally we would be able to apply a shaping algorithm like tc's cbq to 
disk I/O, so that we could guarantee and limit each VE to xx kbit/sec 
read, yy kbit/sec write, zz read transactions/sec, and aa write 
transactions/sec. However, this is not possible yet. This patch allows 
reactive measures against VEs that hog disk I/O, i.e. it supplies the 
information you need to at least know which VE is saturating your disk 
bandwidth. What you do from there (stop the VE, investigate/reconfigure 
the application, etc.) is up to you.
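
As a rough illustration of such a reactive check, the sketch below takes 
two snapshots of /proc/user_beancounters as read on the HN and ranks VEs 
by how fast one of the new counters grew. It assumes the usual 
user_beancounters layout (a "veid:" prefix on the first resource row of 
each VE) and that the new metrics keep their running total in the "held" 
column, as the sample output in this post suggests. The function names 
are illustrative only and are not part of the patch.

```python
from collections import defaultdict

# The four accounting-only resources named in this post.
IO_METRICS = ("numreads", "numwrites", "kbytesread", "kbyteswritten")


def parse_beancounters(text):
    """Return {veid: {resource: held}} for the I/O metrics only."""
    result = defaultdict(dict)
    veid = None
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines, the "Version:" line and the column header.
        if not fields or fields[0] in ("Version:", "uid"):
            continue
        # The first resource row of each VE starts with "<veid>:".
        if fields[0].endswith(":"):
            veid = int(fields[0][:-1])
            fields = fields[1:]
        if veid is None or len(fields) < 2:
            continue
        name, held = fields[0], fields[1]
        if name in IO_METRICS:
            result[veid][name] = int(held)
    return dict(result)


def io_rates(before, after, interval_s):
    """Per-VE growth of each counter per second between two snapshots.

    VEs missing from the first snapshot are skipped; counter wraparound
    is not handled in this sketch.
    """
    rates = {}
    for veid, counters in after.items():
        if veid not in before:
            continue
        prev = before[veid]
        rates[veid] = {
            name: (value - prev.get(name, 0)) / interval_s
            for name, value in counters.items()
        }
    return rates


def busiest(rates, metric):
    """VE ids sorted from highest to lowest rate for one metric."""
    return sorted(rates, key=lambda v: rates[v].get(metric, 0.0), reverse=True)
```

On a real HN the two snapshots would come from reading 
/proc/user_beancounters a few seconds apart; the first entry returned by 
busiest(rates, "kbytesread") is then the VE to look at first.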

The CFQ I/O scheduler makes great strides in preventing disk hogs from 
affecting other applications/VEs. However, CFQ has no concept of a VE, 
so a VE could spawn a disk-hogging application repeatedly and not be 
punished by CFQ.

Although the user_beancounters read/write metrics are reported with KB 
precision, all bytes read and written are accounted for: if you write 
1 byte 1024 separate times, kbyteswritten will increment once. *All 
four metrics are taken from storage layer hits*. Reads served from the 
buffer cache and reads of virtual files are not counted, since they do 
not result in a storage layer seek. This patch is based on Andrew 
Morton's 2.6.19-rc6-mm1 patchset, 
http://userweb.kernel.org/~akpm/2.6.19-rc6-mm1/
Thank-you,
Rick Blundell


Three test cases demonstrating kbytesread/kbyteswritten, performed with 
the patch applied, from inside a VE.

-=-=-=-=-=--=-==-=-=-=-=-==--=-=-=-=
Case 1 :: "Write file to disk"
-=-=-=-=-=--=-==-=-=-=-=-==--=-=-=-=
Here, we will write a 1000KB file to the disk, while checking 
kbyteswritten UBC before and after the write. As you can see the file 
created is 1000KB in size, and kbyteswritten has increased by 1000.

-bash-3.00# grep kbyteswritten /proc/user_beancounters ; dd if=/dev/zero of=/root/bigfile3 bs=512 count=2000; grep kbyteswritten /proc/user_beancounters ;
             kbyteswritten        192        192 2147483647 2147483647          0
2000+0 records in
2000+0 records out
             kbyteswritten       1192       1192 2147483647 2147483647          0
-bash-3.00# ls -al /root/bigfile3
-rw-r--r--  1 root root 1024000 Nov 26 07:42 /root/bigfile3

1192KB-192KB = 1000KB = 1024000B

-=-=-=-=-=--=-==-=-=-=-=-==--=-=-=-=
Case 2 :: "Read unbuffered file from disk"
-=-=-=-=-=--=-==-=-=-=-=-==--=-=-=-=
Here we will read a file which has not been read since HN boot. This 
means the file is not in the buffer cache, and thus will be fetched from 
the disk. As you can see, the file is 5243392 bytes, which is about 
5120KB. kbytesread increased by 5152 (45334 - 40182).

-bash-3.00# grep bytesr /proc/user_beancounters ; cat /root/bigfile2 >/dev/null ; grep bytesr /proc/user_beancounters
             kbytesread        40182      40182       3000       4000          0
             kbytesread        45334      45334       3000       4000          0
-bash-3.00# ls -al /root/bigfile2
-rw-r--r--  1 root root 5243392 Nov 26 06:45 /root/bigfile2
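
The gap between the file size and the counter delta in Case 2 can be 
worked out from the numbers above. The post does not say what accounts 
for the extra reads, so this sketch only quantifies the difference; the 
variable names are illustrative.

```python
FILE_BYTES = 5243392    # size of /root/bigfile2 from ls -al
HELD_BEFORE = 40182     # kbytesread before the cat
HELD_AFTER = 45334      # kbytesread after the cat

file_kb = FILE_BYTES / 1024           # file size in KB
delta_kb = HELD_AFTER - HELD_BEFORE   # KB charged to the VE
extra_kb = delta_kb - file_kb         # KB read beyond the file's own data

print(file_kb, delta_kb, extra_kb)    # 5120.5 5152 31.5
```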

-=-=-=-=-=--=-==-=-=-=-=-==--=-=-=-=
Case 3 :: "Read buffered file from disk"
-=-=-=-=-=--=-==-=-=-=-=-==--=-=-=-=
In this case we will read the same file which was read in Case 2. Since 
we read it recently, the contents will be in the buffer cache. As you 
can see, kbytesread does not change.

-bash-3.00# grep bytesr /proc/user_beancounters ; cat /root/bigfile2 >/dev/null ; grep bytesr /proc/user_beancounters
             kbytesread        45346      45346       3000       4000          0
             kbytesread        45346      45346       3000       4000          0
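
All three cases follow the same pattern: read a counter, run a command, 
read the counter again. A small helper for repeating that pattern might 
look like the sketch below. It assumes the counter's running total is in 
the "held" column (the first number after the resource name), as in the 
output above; the function names are hypothetical, and the reader is 
passed in so the helper can also be exercised against saved snapshots.

```python
def read_proc_text(path="/proc/user_beancounters"):
    """Return the current contents of the beancounters file."""
    with open(path) as f:
        return f.read()


def held_value(text, resource):
    """Return the 'held' column for one resource in beancounters text."""
    for line in text.splitlines():
        fields = line.split()
        # Drop the leading "<veid>:" (or "Version:") token if present.
        if fields and fields[0].endswith(":"):
            fields = fields[1:]
        if len(fields) >= 2 and fields[0] == resource:
            return int(fields[1])
    raise KeyError(resource)


def held_delta(resource, action, read_text=read_proc_text):
    """Run action() and return how much the counter grew across it."""
    before = held_value(read_text(), resource)
    action()
    after = held_value(read_text(), resource)
    return after - before


# Example use inside a VE with the patch applied (not run here):
#   import subprocess
#   held_delta("kbytesread",
#              lambda: subprocess.run(["cat", "/root/bigfile2"],
#                                     stdout=subprocess.DEVNULL))
```
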
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: openvz_2618_io_acct.diff
URL: <http://lists.openvz.org/pipermail/devel/attachments/20061126/a97649b4/attachment-0001.ksh>