Verifying Linux System Performance
To verify an ioDrive's performance on a Linux system, Fusion-io recommends
the fio benchmark. fio is included in many distributions and can also be
compiled from source. The latest source distribution is linked from
http://freshmeat.net/projects/fio.
Compiling fio requires the libaio development headers. For
step-by-step instructions on compiling fio from source, see
Appendix A: Compiling the fio Utility.
We recommend using raw block access to test the raw performance of
the ioDrive. The best way to verify system performance is to run the
fio jobs shown below.
WARNING: The write tests destroy any data
that currently resides on the ioDrive.
Important: Use driver stack 2.1 or later for best
performance.
Running fio Tests (Linux)
There are four fio tests of interest in the sample jobs below:
- One-card write bandwidth
- One-card read IOPS
- One-card read bandwidth
- One-card write IOPS
The following benchmarks are designed to stress the system as heavily as possible
and to expose performance issues, not to showcase the ioDrive's performance.
# Write Bandwidth test
$ fio --filename=/dev/fioa --direct=1 --rw=randwrite --bs=1m --size=5G --numjobs=4 --runtime=10 --group_reporting --name=file1
# Read IOPS test
$ fio --filename=/dev/fioa --direct=1 --rw=randread --bs=4k --size=5G --numjobs=64 --runtime=10 --group_reporting --name=file1
# Read Bandwidth test
$ fio --filename=/dev/fioa --direct=1 --rw=randread --bs=1m --size=5G --numjobs=4 --runtime=10 --group_reporting --name=file1
# Write IOPS test
$ fio --filename=/dev/fioa --direct=1 --rw=randwrite --bs=4k --size=5G --numjobs=64 --runtime=10 --group_reporting --name=file1
Figure 1 – Benchmark tests
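The four commands in Figure 1 differ only in the --rw, --bs, and --numjobs values. As a minimal sketch (the function name fio_cmd and the test labels write-bw, read-iops, read-bw, and write-iops are hypothetical, chosen here for illustration), a small shell helper can print the matching command line so that it can be reviewed before anything is run:

```shell
# Print the fio command line for one of the four sample tests.
# Usage: fio_cmd <write-bw|read-iops|read-bw|write-iops> [device]
# The labels are illustrative only. The function prints the command;
# it does not run fio or touch the device.
fio_cmd() {
    test_name=$1
    device=${2:-/dev/fioa}   # default device taken from the sample jobs

    case "$test_name" in
        write-bw)   rw=randwrite; bs=1m; jobs=4  ;;
        read-iops)  rw=randread;  bs=4k; jobs=64 ;;
        read-bw)    rw=randread;  bs=1m; jobs=4  ;;
        write-iops) rw=randwrite; bs=4k; jobs=64 ;;
        *) echo "unknown test: $test_name" >&2; return 1 ;;
    esac

    echo "fio --filename=$device --direct=1 --rw=$rw --bs=$bs --size=5G" \
         "--numjobs=$jobs --runtime=10 --group_reporting --name=file1"
}
```

Calling fio_cmd read-iops prints the Read IOPS command from Figure 1. Because the write tests destroy data on the device, the helper deliberately stops at printing the command rather than executing it.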
These tests are also available as fio job input files, which may be requested
from support@fusionio.com.
The latest expected performance numbers for your card type can be found in
the ioDrive spec sheet, available at http://www.fusionio.com/products.
The sample IOPS tests use a 4K block size and 64 jobs (fio starts one process
per job). The sample bandwidth tests use a 1MB block size and 4 jobs. For
multi-card runs, IOPS are calculated by adding together the per-card bandwidth
(per second) and dividing the total by the block size. The following section
shows sample output from each test and explains how to validate that your
system is performing properly. The key data points in the output are
highlighted.
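The multi-card IOPS calculation described above can be sketched as a small shell function. This is an illustration only: the function name multi_card_iops is hypothetical, and it assumes the per-card bandwidths and the block size are given in the same unit (KiB/s and KiB here).

```shell
# Sum the per-card bandwidths and divide the total by the block size.
# Usage: multi_card_iops <block_size_kib> <card1_kib_per_s> [card2_kib_per_s ...]
# Assumption: all bandwidth arguments are in KiB/s and the block size is in KiB.
multi_card_iops() {
    block_kib=$1
    shift
    # The remaining arguments are the per-card bandwidths.
    echo "$@" | awk -v bs="$block_kib" '{
        total = 0
        for (i = 1; i <= NF; i++) total += $i
        printf "%.0f\n", total / bs
    }'
}
```

For example, two cards that each sustain 400000 KiB/s with a 4K block size (made-up values, not measurements) give multi_card_iops 4 400000 400000, which prints 200000.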
Sample Benchmarks, Expected Results for Linux
Make sure to run the system-vetting tests in the order they are displayed
below. The initial write bandwidth test may need to be run twice, with
a short pause between the runs, to set up the card for best performance.
Note: Compare your system results to the expected published
numbers for your card type. See http://www.fusionio.com/products
for details. Typically, your results should be better than those on the
published spec sheet.
Write Bandwidth Test on Linux
The output below shows the
test achieving 733 MiB/sec, with random writes done on 1MB blocks. This is a
10-second write test.
file1: (g=0): rw=randwrite, bs=1M-1M/1M-1M, ioengine=sync, iodepth=1
...
file1: (g=0): rw=randwrite, bs=1M-1M/1M-1M, ioengine=sync, iodepth=1
Starting 4 processes
Jobs: 4 (f=4): [wwww] [100.0% done] [ 0/756023 kb/s] [eta 00m:00s]
file1: (groupid=0, jobs=4): err= 0: pid=28562
write: io=7,165MiB, bw=733MiB/s, iops=716, runt= 10004msec
clat (usec): min=1,825, max=73,427, avg=5583.05, stdev=797.26
bw (KiB/s) : min= 0, max=189326, per=23.78%, avg=178554.21, stdev=20802.46
cpu : usr=0.04%, sys=2.05%, ctx=7179, majf=0, minf=44
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued r/w : total=0/7165, short=0/0
lat (msec) : 2=0.01%, 4=0.03%, 10=99.90%, 100=0.06%
Run status group 0 (all jobs):
WRITE: io=7,165MiB, aggrb=733MiB/s, minb=733MiB/s, maxb=733MiB/s, mint=10004msec, maxt=10004msec
Disk stats (read/write):
fioa: ios=0/57320, merge=0/0, ticks=0/252622, in_queue=1852545, util=98.84%
Figure 2 – Write bandwidth test on Linux
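To pull the bandwidth and IOPS figures out of a saved fio report, a sed filter along the following lines may help. It is a sketch that assumes the summary line looks exactly like the one in Figure 2 ("bw=..., iops=..., runt=..."); other fio releases may format this line differently, so treat the pattern as something to adapt. The function name fio_summary is hypothetical.

```shell
# Read fio output on stdin and print the bw= and iops= fields from the
# per-group summary line. All other lines are dropped.
# Assumption: the line has the form "... bw=<value>, iops=<value>, runt= ...",
# as in the sample output in this section.
fio_summary() {
    sed -n 's/.*bw=\([^,]*\), iops=\([^,]*\),.*/bw=\1 iops=\2/p'
}
```

Applied to the output in Figure 2 (for example with fio_summary < results.txt, where results.txt is a hypothetical file holding the saved report), the filter prints bw=733MiB/s iops=716.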
Read IOPS Test on Linux
The output below shows the test
achieving 104,000 IOPS, with random reads done on 4K blocks.
file1: (g=0): rw=randread, bs=4K-4K/4K-4K, ioengine=sync, iodepth=1
...
file1: (g=0): rw=randread, bs=4K-4K/4K-4K, ioengine=sync, iodepth=1
Starting 64 processes
Jobs: 64 (f=64): [rrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrr] [100.0% done] [420483/ 0 kb/s] [eta 00m:00s]
file1: (groupid=0, jobs=64): err= 0: pid=27861
read : io=4,058MiB, bw=414MiB/s, iops=104K, runt= 10036msec
clat (usec): min=44, max=36,940, avg=620.03, stdev=50.53
bw (KiB/s) : min= 5873, max=15664, per=1.55%, avg=6560.25, stdev=54.21
cpu : usr=0.50%, sys=4.01%, ctx=1163152, majf=0, minf=704
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued r/w: total=1038929/0, short=0/0
lat (usec): 50=0.01%, 100=0.43%, 250=11.05%, 500=35.74%, 750=22.88%
lat (usec): 1000=14.99%
lat (msec): 2=14.88%, 4=0.02%, 10=0.01%, 50=0.01%
Run status group 0 (all jobs):
READ: io=4,058MiB, aggrb=414MiB/s, minb=414MiB/s, maxb=414MiB/s, mint=10036msec, maxt=10036msec
Disk stats (read/write):
fioa: ios=1038929/0, merge=0/0, ticks=389591/0, in_queue=61732674, util=99.12%
Figure 3 – Read IOPS test on Linux
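As a sanity check, the reported IOPS figure can be reproduced from two other fields in the same output: the "issued r/w" total and the runtime ("runt", in milliseconds). The sketch below does that division with awk; the function name iops_check is hypothetical.

```shell
# Recompute IOPS from the number of issued I/Os and the runtime.
# Usage: iops_check <issued_ios> <runtime_msec>
iops_check() {
    # Convert milliseconds to seconds, then divide and round.
    awk -v n="$1" -v ms="$2" 'BEGIN { printf "%.0f\n", n / (ms / 1000) }'
}
```

With the values from Figure 3, iops_check 1038929 10036 prints 103520, which agrees with the rounded iops=104K on the summary line.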
Read Bandwidth Test on Linux
The output below shows the test
achieving 779 MiB/sec, with random reads done on 1MB blocks.
file1: (g=0): rw=randread, bs=1M-1M/1M-1M, ioengine=sync, iodepth=1
...
file1: (g=0): rw=randread, bs=1M-1M/1M-1M, ioengine=sync, iodepth=1
Starting 4 processes
Jobs: 4 (f=4): [rrrr] [100.0% done] [811597/ 0 kb/s] [eta 00m:00s]
file1: (groupid=0, jobs=4): err= 0: pid=28781
read : io=7,614MiB, bw=779MiB/s, iops=760, runt= 10007msec
clat (usec): min=1,788, max=39,174, avg=5253.24, stdev=920.87
bw (KiB/s) : min=157286, max=221412, per=25.16%, avg=200720.59, stdev=5042.40
cpu : usr=0.03%, sys=2.79%, ctx=7880, majf=0, minf=1072
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued r/w: total=7614/0, short=0/0
lat (msec): 2=0.39%, 4=15.34%, 10=83.18%, 20=0.99%, 50=0.11%
Run status group 0 (all jobs):
READ: io=7,614MiB, aggrb=779MiB/s, minb=779MiB/s, maxb=779MiB/s, mint=10007msec, maxt=10007msec
Disk stats (read/write):
fioa: ios=60912/0, merge=0/0, ticks=218927/0, in_queue=1667617, util=98.84%
Figure 4 – Read bandwidth test on Linux
Write IOPS Test on Linux
The test below achieved 106,000
IOPS, with random writes and a 4K block size.
file1: (g=0): rw=randwrite, bs=4K-4K/4K-4K, ioengine=sync, iodepth=1
...
file1: (g=0): rw=randwrite, bs=4K-4K/4K-4K, ioengine=sync, iodepth=1
Starting 64 processes
Jobs: 64 (f=64): [wwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwww] [100.0% done] [ 0/422981 kb/s] [eta 00m:00s]
file1: (groupid=0, jobs=64): err= 0: pid=28964
write: io=4,138MiB, bw=424MiB/s, iops=106K, runt= 10001msec
clat (usec): min=31, max=68,803, avg=589.54, stdev=67.72
bw (KiB/s) : min= 0, max=17752, per=1.44%, avg=6231.17, stdev=182.09
cpu : usr=0.37%, sys=2.97%, ctx=1059983, majf=0, minf=576
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued r/w: total=0/1059304, short=0/0
lat (usec): 50=0.21%, 100=0.67%, 250=1.81%, 500=2.79%, 750=94.36%
lat (usec): 1000=0.14%
lat (msec): 2=0.01%, 100=0.01%
Run status group 0 (all jobs):
WRITE: io=4,138MiB, aggrb=424MiB/s, minb=424MiB/s, maxb=424MiB/s, mint=10001msec, maxt=10001msec
Disk stats (read/write):
fioa: ios=0/1059304, merge=0/0, ticks=0/425609, in_queue=64011510, util=99.50%
Figure 5 – Write IOPS test on Linux