[Devel] seems to be a flaw in cfq

Vasily Tarasov vtaras at openvz.org
Tue Dec 19 05:53:04 PST 2006


ello, Jens.

Seems, that we've found some problem in CFQ.
I used your fio tool of version 1.9 to reproduce it.
2.6.18 vanilla kernel.
Generally situation is the following:
there are several readers, which read from their own files, except
two readers, which read one shared file.
These two readers experienced huge starvation!

I want to note, that there is no such situation on deadline scheduler
and situation is much better for anticipatory scheduler.

Here is a job file:

<snip>
; cfq bug reproduce
; each file is 1Gib size

[global]
timeout=200

[reader1]
filename=file1

[reader2]
filename=file2

[reader3]
filename=file3

; --- this two reads the same file ---
[reader4]
filename=file4

[reader5]
filename=file4
; ------------------------------------

[reader6]
filename=file6

[reader7]
filename=file7


[reader8]
filename=file8


[reader9]
filename=file9

[reader10]
filename=file10

[reader11]
filename=file11
</snip>


Here is an output of fio:
# fio job1.file
reader1: (g=0): rw=read, odir=1, bs=4K-4K/4K-4K, rate=0, ioengine=sync,
iodepth=1
reader2: (g=0): rw=read, odir=1, bs=4K-4K/4K-4K, rate=0, ioengine=sync,
iodepth=1
reader3: (g=0): rw=read, odir=1, bs=4K-4K/4K-4K, rate=0, ioengine=sync,
iodepth=1
reader4: (g=0): rw=read, odir=1, bs=4K-4K/4K-4K, rate=0, ioengine=sync,
iodepth=1
reader5: (g=0): rw=read, odir=1, bs=4K-4K/4K-4K, rate=0, ioengine=sync,
iodepth=1
reader6: (g=0): rw=read, odir=1, bs=4K-4K/4K-4K, rate=0, ioengine=sync,
iodepth=1
reader7: (g=0): rw=read, odir=1, bs=4K-4K/4K-4K, rate=0, ioengine=sync,
iodepth=1
reader8: (g=0): rw=read, odir=1, bs=4K-4K/4K-4K, rate=0, ioengine=sync,
iodepth=1
reader9: (g=0): rw=read, odir=1, bs=4K-4K/4K-4K, rate=0, ioengine=sync,
iodepth=1
reader10: (g=0): rw=read, odir=1, bs=4K-4K/4K-4K, rate=0, ioengine=sync,
iodepth=1
reader11: (g=0): rw=read, odir=1, bs=4K-4K/4K-4K, rate=0, ioengine=sync,
iodepth=1
Starting 11 threads
Threads running: 11: [RRRRRRRRRRR] [100.00% done] [ 31298/ 0 kb/s]
[eta 00m:00s]
reader1: (groupid=0): err= 0:
read : io= 662MiB, bw= 3469KiB/s, runt=200058msec
slat (msec): min= 0, max= 0, avg= 0.00, dev= 0.00
clat (msec): min= 0, max= 1362, avg= 1.06, dev=30.90
bw (KiB/s) : min= 490, max=12582, per=12.03%, avg=3732.10, dev=4034.82
cpu : usr=0.15%, sys=4.43%, ctx=169761
reader2: (groupid=0): err= 0:
read : io= 644MiB, bw= 3375KiB/s, runt=200041msec
slat (msec): min= 0, max= 0, avg= 0.00, dev= 0.00
clat (msec): min= 0, max= 1454, avg= 1.09, dev=31.69
bw (KiB/s) : min= 485, max=12558, per=11.59%, avg=3595.80, dev=3795.64
cpu : usr=0.15%, sys=4.36%, ctx=165149
reader3: (groupid=0): err= 0:
read : io= 679MiB, bw= 3561KiB/s, runt=200057msec
slat (msec): min= 0, max= 0, avg= 0.00, dev= 0.00
clat (msec): min= 0, max= 1584, avg= 1.03, dev=30.33
bw (KiB/s) : min= 445, max=12689, per=12.53%, avg=3886.43, dev=4161.05
cpu : usr=0.15%, sys=4.63%, ctx=174219
reader4: (groupid=0): err= 0:
read : io= 1MiB, bw= 9KiB/s, runt=200009msec
<<< ONLY ONE MIB
slat (msec): min= 0, max= 0, avg= 0.00, dev= 0.00
clat (msec): min= 0, max= 1327, avg=411.34, dev=597.62
bw (KiB/s) : min= 3, max= 24, per=0.03%, avg= 9.92, dev=11.60
cpu : usr=0.00%, sys=0.01%, ctx=501
reader5: (groupid=0): err= 0:
read : io= 1MiB, bw= 9KiB/s, runt=200009msec
<<< ONLY ONE MIB
slat (msec): min= 0, max= 0, avg= 0.00, dev= 0.00
clat (msec): min= 0, max= 1327, avg=413.70, dev=599.16
bw (KiB/s) : min= 3, max= 24, per=0.03%, avg= 9.67, dev=11.18
cpu : usr=0.00%, sys=0.01%, ctx=491
reader6: (groupid=0): err= 0:
read : io= 661MiB, bw= 3466KiB/s, runt=200045msec
slat (msec): min= 0, max= 0, avg= 0.00, dev= 0.00
clat (msec): min= 0, max= 1453, avg= 1.06, dev=30.92
bw (KiB/s) : min= 483, max=12222, per=12.14%, avg=3765.32, dev=4002.38
cpu : usr=0.14%, sys=4.51%, ctx=169566
reader7: (groupid=0): err= 0:
read : io= 655MiB, bw= 3435KiB/s, runt=200056msec
slat (msec): min= 0, max= 0, avg= 0.00, dev= 0.00
clat (msec): min= 0, max= 1585, avg= 1.07, dev=31.36
bw (KiB/s) : min= 602, max=12115, per=11.78%, avg=3654.40, dev=3858.70
cpu : usr=0.14%, sys=4.48%, ctx=168098
reader8: (groupid=0): err= 0:
read : io= 666MiB, bw= 3496KiB/s, runt=200000msec
slat (msec): min= 0, max= 0, avg= 0.00, dev= 0.00
clat (msec): min= 0, max= 1358, avg= 1.05, dev=30.95
bw (KiB/s) : min= 7, max=12730, per=12.23%, avg=3791.85, dev=4177.52
cpu : usr=0.15%, sys=4.53%, ctx=170999
reader9: (groupid=0): err= 0:
read : io= 666MiB, bw= 3495KiB/s, runt=200016msec
slat (msec): min= 0, max= 0, avg= 0.00, dev= 0.00
clat (msec): min= 0, max= 1456, avg= 1.05, dev=30.78
bw (KiB/s) : min= 6, max=12681, per=12.13%, avg=3761.05, dev=4038.50
cpu : usr=0.17%, sys=4.52%, ctx=170982
reader10: (groupid=0): err= 0:
read : io= 648MiB, bw= 3398KiB/s, runt=200026msec
slat (msec): min= 0, max= 0, avg= 0.00, dev= 0.00
clat (msec): min= 0, max= 1356, avg= 1.09, dev=31.57
bw (KiB/s) : min= 5, max=12533, per=11.79%, avg=3657.35, dev=3930.12
cpu : usr=0.15%, sys=4.35%, ctx=166233
reader11: (groupid=0): err= 0:
read : io= 628MiB, bw= 3296KiB/s, runt=200049msec
slat (msec): min= 0, max= 0, avg= 0.00, dev= 0.00
clat (msec): min= 0, max= 1591, avg= 1.12, dev=32.34
bw (KiB/s) : min= 3, max=13910, per=11.59%, avg=3595.44, dev=4000.10
cpu : usr=0.16%, sys=4.27%, ctx=161300

Run status group 0 (all jobs):
READ: io=5917MiB, aggrb=31013, minb=9, maxb=3561, mint=200000msec,
maxt=200058msec

Disk stats (read/write):
sda: ios=1516020/65, merge=242/28, ticks=1975258/49498,
in_queue=2024681, util=100.00%


Is such behavior expected?

Thanks in advance,
Vasily.




More information about the Devel mailing list