Disk, filesystem benchmarks: fio, iozone, ior on Linux

Posted by Pavlo Khmel on Mon 05 February 2024

Examples were tested on Rocky Linux 9.2. The benchmarks report throughput in MiB/s or MB/s (data transferred per second).

MB vs MiB vs Mb

  • 1 Bit = 0 or 1
  • 1 Byte = 8 Bits
  • 1 Megabyte (MB) = 1,000,000 bytes or 8,000,000 bits
  • 1 Megabit (Mb) = 1,000,000 bits or 125,000 bytes
  • 1 Mebibyte (MiB) = 1,048,576 bytes or 8,388,608 bits
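
To make the difference between MiB and MB concrete, here is a small sketch that converts a throughput value from MiB/s to MB/s with awk. The function name mib_to_mb is only an illustration and is not part of any benchmark tool.

```shell
#!/bin/sh
# Convert a throughput value from MiB/s to MB/s (hypothetical helper name).
# 1 MiB = 1,048,576 bytes; 1 MB = 1,000,000 bytes.
mib_to_mb() {
    awk -v v="$1" 'BEGIN { printf "%.0f\n", v * 1048576 / 1000000 }'
}

mib_to_mb 597   # prints 626
mib_to_mb 782   # prints 820
```

The same number therefore looks about 4.9% larger when it is expressed in MB/s instead of MiB/s.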

Install benchmarks fio, iozone, ior

fio is available in the default repository:

dnf -y install fio

The iozone RPM or source can be downloaded from https://www.iozone.org

curl -O https://www.iozone.org/src/current/iozone-3-506.x86_64.rpm
rpm -ivh iozone-3-506.x86_64.rpm 

The ior benchmark is designed to test parallel shared filesystems, and it requires MPI to run across multiple servers.

Compile OpenMPI:

yum install gcc gcc-c++ make perl
curl -O https://download.open-mpi.org/release/open-mpi/v3.1/openmpi-3.1.3.tar.gz
tar xf openmpi-3.1.3.tar.gz
cd openmpi-3.1.3/
CFLAGS="-Ofast -march=native" ./configure --prefix=$HOME/openmpi-3.1.3
make
make install

# create file mpi_env.sh

#!/bin/bash
export MPI_HOME=$HOME/openmpi-3.1.3
export PATH=$MPI_HOME/bin:$PATH
export LD_LIBRARY_PATH=$MPI_HOME/lib:$LD_LIBRARY_PATH

# load new MPI environment

source mpi_env.sh

Compile ior:

curl -LJO https://github.com/hpc/ior/releases/download/3.3.0/ior-3.3.0.tar.gz
tar xf ior-3.3.0.tar.gz 
cd ior-3.3.0/
./configure
make
cd ..

Benchmark rules:

  • Benchmark data size should be 2-3 times larger than the RAM on the server, to avoid benchmarking the cache instead of the disk. The downside is that this takes a lot of time.
  • It is possible to use "O_DIRECT" to bypass the operating system's page cache.
  • If a benchmark is run as the root user, it is possible to sync data and drop caches between tests with the command:
sync; echo 3 > /proc/sys/vm/drop_caches 
  • On a newly created file system, check that the initialization process has finished. Example with an ext4 file system:
mkfs.ext4 /dev/mapper/ost0

# option init_itable=0 makes initialization faster

mount -o init_itable=0 /dev/mapper/ost0 /mnt/ost0

# wait until the ext4lazyinit process stops
# (the [e] pattern keeps grep from matching its own command line)

ps aux | grep '[e]xt4lazyinit'
  • Benchmarking RAID? Check that the RAID initialization process has completed.
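
The first rule can be turned into a quick calculation. The sketch below reads MemTotal from /proc/meminfo and suggests a total data size of twice the RAM, rounded up to whole GiB. The function name suggest_size_gib and the factor of 2 (the lower end of the 2-3 times rule) are choices made for this illustration.

```shell
#!/bin/sh
# Suggest a total benchmark data size of twice the installed RAM.
# Argument: memory size in KiB (the unit /proc/meminfo uses for MemTotal).
# Output: suggested size in GiB, rounded up (hypothetical helper name).
suggest_size_gib() {
    awk -v kib="$1" 'BEGIN {
        g = kib * 2 / 1048576      # 1 GiB = 1,048,576 KiB
        r = int(g)
        if (r < g) r++             # round up to a whole GiB
        print r
    }'
}

if [ -r /proc/meminfo ]; then
    mem_kib=$(awk '/^MemTotal:/ { print $2 }' /proc/meminfo)
    echo "RAM: ${mem_kib} KiB, suggested total data size: $(suggest_size_gib "$mem_kib") GiB"
fi
```

Divide the result by the number of jobs or threads to get a per-job size, since the tools used here take the size per job.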

Sequential read/write example on a single disk.

fio --name=seq_w --directory=/mnt/ost00 --numjobs=32 --size=10G --time_based --runtime=60s --ioengine=libaio --direct=1 --verify=0 --bs=1M --iodepth=64 --rw=write --group_reporting | grep -A1 "Run status group" | grep -v "Run status group"
fio --name=seq_r --directory=/mnt/ost00 --numjobs=32 --size=10G --time_based --runtime=60s --ioengine=libaio --direct=1 --verify=0 --bs=1M --iodepth=64 --rw=read --group_reporting | grep -A1 "Run status group" | grep -v "Run status group"

cd /mnt/ost00
/opt/iozone/bin/iozone -I -i 0 -c -e -w -r 1m -s 10G -t 16 -+n | grep Children
/opt/iozone/bin/iozone -I -i 1 -c -e -w -r 1m -s 10G -t 16 -+n | grep Children

# on one server (adjust -s, the segment count, to change the data size)
mpirun --allow-run-as-root -np 32 /root/ior-3.3.0/src/ior -a POSIX -v -F -C -e -g -k -b 1m -t 1m -s 10000 -i 1 -w -r -o /mnt/ost00/

# on multiple servers (adjust -s, the segment count, to change the data size)
mpirun --allow-run-as-root -np 32 --host serverA:16,serverB:16 /root/ior-3.3.0/src/ior -a POSIX -v -F -C -e -g -k -b 1m -t 1m -s 10000 -i 1 -w -r -o /mnt/ost00/
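
As a rough guide for choosing -s: each ior process writes block size (-b) times segment count (-s), so the total is that amount multiplied by the number of processes. The sketch below does this arithmetic for the values used above. The function name ior_total_mib is made up for this example.

```shell
#!/bin/sh
# Total data written by ior in MiB:
#   processes x block size in MiB x segment count
# (hypothetical helper name, not part of ior)
ior_total_mib() {
    echo $(( $1 * $2 * $3 ))
}

# Values from the commands above: -np 32, -b 1m, -s 10000
total=$(ior_total_mib 32 1 10000)
echo "Total: ${total} MiB (about $(( total / 1024 )) GiB)"
```

With these values the run writes 320000 MiB, a little over 312 GiB, which should be compared with the RAM size of the servers.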

Output example:

# fio
  WRITE: bw=597MiB/s (626MB/s), 597MiB/s-597MiB/s (626MB/s-626MB/s), io=35.0GiB (37.6GB), run=60147-60147msec
   READ: bw=782MiB/s (820MB/s), 782MiB/s-782MiB/s (820MB/s-820MB/s), io=45.9GiB (49.3GB), run=60077-60077msec

# iozone

  Children see throughput for 16 initial writers  =  500237.48 kB/sec
  Children see throughput for 16 readers          =  934463.98 kB/sec

# ior

  Max Write: 488.02 MiB/sec (511.72 MB/sec)
  Max Read:  946.20 MiB/sec (992.16 MB/sec)
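
The three tools report in different units, so it helps to convert before comparing. The sketch below converts the iozone numbers to MiB/s. It assumes that iozone's "kB" means 1024 bytes; check this against the iozone documentation for your version. The function name kb_to_mib is only an illustration.

```shell
#!/bin/sh
# Convert iozone throughput to MiB/s.
# ASSUMPTION: iozone's "kB" is 1024 bytes, so 1 MiB = 1024 kB.
kb_to_mib() {
    awk -v v="$1" 'BEGIN { printf "%.2f\n", v / 1024 }'
}

kb_to_mib 500237.48   # prints 488.51
kb_to_mib 934463.98   # prints 912.56
```

Under that assumption the iozone write result (about 488.5 MiB/s) is close to the ior write result above (488.02 MiB/sec).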

Sequential read/write example on 4 disks ost00 + ost01 + ost02 + ost03

fio --name=seq_w --directory=/mnt/ost00:/mnt/ost01:/mnt/ost02:/mnt/ost03 --numjobs=32 --size=10G --time_based --runtime=60s --ioengine=libaio --direct=1 --verify=0 --bs=1M --iodepth=64 --rw=write --group_reporting
fio --name=seq_r --directory=/mnt/ost00:/mnt/ost01:/mnt/ost02:/mnt/ost03 --numjobs=32 --size=10G --time_based --runtime=60s --ioengine=libaio --direct=1 --verify=0 --bs=1M --iodepth=64 --rw=read --group_reporting

/opt/iozone/bin/iozone -I -i 0 -c -e -w -r 1m -s 10G -t 32 -+n -F /mnt/ost00/file01 /mnt/ost00/file02 /mnt/ost00/file03 /mnt/ost00/file04 /mnt/ost00/file05 /mnt/ost00/file06 /mnt/ost00/file07 /mnt/ost00/file08 /mnt/ost01/file01 /mnt/ost01/file02 /mnt/ost01/file03 /mnt/ost01/file04 /mnt/ost01/file05 /mnt/ost01/file06 /mnt/ost01/file07 /mnt/ost01/file08 /mnt/ost02/file01 /mnt/ost02/file02 /mnt/ost02/file03 /mnt/ost02/file04 /mnt/ost02/file05 /mnt/ost02/file06 /mnt/ost02/file07 /mnt/ost02/file08 /mnt/ost03/file01 /mnt/ost03/file02 /mnt/ost03/file03 /mnt/ost03/file04 /mnt/ost03/file05 /mnt/ost03/file06 /mnt/ost03/file07 /mnt/ost03/file08

/opt/iozone/bin/iozone -I -i 1 -c -e -w -r 1m -s 10G -t 32 -+n -F /mnt/ost00/file01 /mnt/ost00/file02 /mnt/ost00/file03 /mnt/ost00/file04 /mnt/ost00/file05 /mnt/ost00/file06 /mnt/ost00/file07 /mnt/ost00/file08 /mnt/ost01/file01 /mnt/ost01/file02 /mnt/ost01/file03 /mnt/ost01/file04 /mnt/ost01/file05 /mnt/ost01/file06 /mnt/ost01/file07 /mnt/ost01/file08 /mnt/ost02/file01 /mnt/ost02/file02 /mnt/ost02/file03 /mnt/ost02/file04 /mnt/ost02/file05 /mnt/ost02/file06 /mnt/ost02/file07 /mnt/ost02/file08 /mnt/ost03/file01 /mnt/ost03/file02 /mnt/ost03/file03 /mnt/ost03/file04 /mnt/ost03/file05 /mnt/ost03/file06 /mnt/ost03/file07 /mnt/ost03/file08

mpirun --allow-run-as-root -np 32 /root/ior-3.3.0/src/ior -a POSIX -v -F -C -e -g -k -b 1m -t 1m -s 60000 -i 1 -w -r -o /mnt/ost00/file01@/mnt/ost00/file02@/mnt/ost00/file03@/mnt/ost00/file04@/mnt/ost00/file05@/mnt/ost00/file06@/mnt/ost00/file07@/mnt/ost00/file08@/mnt/ost01/file01@/mnt/ost01/file02@/mnt/ost01/file03@/mnt/ost01/file04@/mnt/ost01/file05@/mnt/ost01/file06@/mnt/ost01/file07@/mnt/ost01/file08@/mnt/ost02/file01@/mnt/ost02/file02@/mnt/ost02/file03@/mnt/ost02/file04@/mnt/ost02/file05@/mnt/ost02/file06@/mnt/ost02/file07@/mnt/ost02/file08@/mnt/ost03/file01@/mnt/ost03/file02@/mnt/ost03/file03@/mnt/ost03/file04@/mnt/ost03/file05@/mnt/ost03/file06@/mnt/ost03/file07@/mnt/ost03/file08
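
The long file lists above do not have to be typed by hand. Below is a minimal sketch that generates the same 32 paths in the same order, joined with a separator of your choice. The function name file_list is hypothetical; the mount points and file names are the ones used in the commands above.

```shell
#!/bin/sh
# Print the 32 test file paths: 8 files on each of the 4 mount points.
# $1 is the separator placed between paths:
#   " " for iozone -F, "@" for ior -o
file_list() {
    sep=$1
    list=""
    for d in 00 01 02 03; do
        for f in 01 02 03 04 05 06 07 08; do
            # append the separator only when the list is not empty
            list="${list:+$list$sep}/mnt/ost$d/file$f"
        done
    done
    printf '%s\n' "$list"
}

file_list " "   # space-separated, for iozone -F
file_list "@"   # @-separated, for ior -o
```

For example, the iozone command can end with -F $(file_list " ") (unquoted, so the shell splits it into 32 arguments), and the ior command with -o "$(file_list "@")".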