Friday, June 17, 2016

Measuring IOPS part 2: DISKSPD

In the previous post I talked about IOMETER, and we saw that, for several reasons, it is no longer the best tool for disk performance measurement.
ENTER DISKSPD
Crawling the net, I found an excellent blog post by Jose Barreto on a tool named Diskspd.exe, recently developed by Microsoft. As explained in that post, Diskspd is a flexible tool that produces many types of synthetic workloads against all kinds of storage, including local disks, LUNs on a SAN, Storage Spaces and SMB file shares.
The version I found is 2.0.15, released in January 2015, which officially supersedes the SQLIO Disk Subsystem Benchmark Tool. The source code is hosted on the following Github repo:
while the binaries can be found here:
The interesting things about Diskspd are that:
  • it comes with a very nice explanation of the way it works, with examples and use cases
  • the performance information it provides is extensive
  • its text output can be parsed by the PowerShell automators we have all become
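That last point deserves a quick illustration. As a minimal sketch (in Python here rather than PowerShell, purely for brevity), the "total:" summary lines of Diskspd's text output can be pulled apart like this; the field layout matches the sample run shown further down:

```python
def parse_totals(diskspd_output: str) -> list:
    """Parse the 'total:' summary lines of Diskspd's text output.

    Returns one dict per 'total:' line (Total IO, Read IO, Write IO,
    in that order) with bytes, I/O count, MB/s and IOPS.
    """
    totals = []
    for line in diskspd_output.splitlines():
        if line.strip().startswith("total:"):
            # e.g. "total:  25740902400 |  392775 |  2432.28 |  38916.47"
            fields = [f.strip() for f in line.split(":", 1)[1].split("|")]
            totals.append({
                "bytes": int(fields[0]),
                "ios": int(fields[1]),
                "mbps": float(fields[2]),
                "iops": float(fields[3]),
            })
    return totals

sample = """\
total:       25740902400 |       392775 |    2432.28 |   38916.47
total:                 0 |            0 |       0.00 |       0.00
"""
print(parse_totals(sample)[0]["iops"])  # 38916.47
```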
As a test bed for the initial runs I decided to use an HP ProLiant BL460c Gen8 with 32 CPU cores and a Smart Array P220i controller with a RAID 1 array made up of two 10,000 RPM SAS disks.
I had actually tried Diskspd against more recent hardware first, but there it failed with the following error:
diskspd test.dat

WARNING: Complete CPU utilization cannot currently be gathered within DISKSPD for this system.
        Use alternate mechanisms to gather this data such as perfmon/logman.
        Active KGroups 2 > 1 and/or processor count 16 > 64.
It looks like there’s a problem with the API that gets CPU stats, as mentioned here.

When you have a look at the Diskspd help, you will see that the only mandatory parameter is a target, in the form of a file path (c:\test.dat), a partition letter (E:) or a physical drive number (#2):

https://3.bp.blogspot.com/-KUOsNXc--5Q/VztXt_aMCVI/AAAAAAAAC0U/nyF2-nzI24oRWbGGyLnd6rPeWP8VW4LTgCKgB/s640/iops06.png

Not being an expert, I decided to run the command with just the mandatory target in an elevated prompt:
diskspd c:\test.dat

Command Line: C:\Windows\system32\diskspd.exe c:\test.dat

Input parameters:

       timespan:   1
       -------------
       duration: 10s
       warm up time: 5s
       cool down time: 0s
       random seed: 0
       path: 'test.dat'
          think time: 0ms
          burst size: 0
          using software and hardware write cache
          performing read test
          block size: 65536
          using sequential I/O (stride: 65536)
          number of outstanding I/O operations: 2
          thread stride size: 0
          threads per file: 1
          using I/O Completion Ports
          IO priority: normal

Results for timespan 1:
*******************************************************************************

actual test time:       10.09s
thread count:           1
proc count:             32

CPU |  Usage |  User  |  Kernel |  Idle
-------------------------------------------
  0| 100.00%|   3.71%|   96.30%|   0.00%
  1|   0.00%|   0.00%|    0.00%| 100.00%
  2|   2.47%|   0.62%|    1.85%|  97.38%
  3|   0.00%|   0.00%|    0.00%| 100.00%
  4|   1.39%|   0.00%|    1.39%|  98.46%
  5|   0.00%|   0.00%|    0.00%| 100.00%
  6|   0.31%|   0.15%|    0.15%|  99.70%
  7|   0.00%|   0.00%|    0.00%| 100.00%
  8|   0.15%|   0.00%|    0.15%|  99.85%
  9|   0.00%|   0.00%|    0.00%| 100.00%
 10|   0.00%|   0.00%|    0.00%| 100.00%
 11|   0.00%|   0.00%|    0.00%| 100.00%
 12|   0.00%|   0.00%|    0.00%| 100.00%
 13|   0.00%|   0.00%|    0.00%| 100.00%
 14|   0.31%|   0.00%|    0.31%|  99.70%
 15|   0.00%|   0.00%|    0.00%| 100.00%
 16|   1.24%|   1.08%|    0.15%|  98.77%
 17|   0.00%|   0.00%|    0.00%| 100.00%
 18|   6.03%|   5.41%|    0.62%|  93.98%
 19|   0.00%|   0.00%|    0.00%| 100.00%
 20|   4.79%|   4.64%|    0.15%|  95.06%
 21|   0.00%|   0.00%|    0.00%| 100.00%
 22|   0.46%|   0.15%|    0.31%|  99.54%
 23|   0.00%|   0.00%|    0.00%| 100.00%
 24|   0.00%|   0.00%|    0.00%| 100.00%
 25|   0.00%|   0.00%|    0.00%| 100.00%
 26|   0.00%|   0.00%|    0.00%| 100.00%
 27|   0.00%|   0.00%|    0.00%| 100.00%
 28|   0.00%|   0.00%|    0.00%| 100.00%
 29|   0.00%|   0.00%|    0.00%| 100.00%
 30|   0.00%|   0.00%|    0.00%| 100.00%
 31|   0.00%|   0.00%|    0.00%| 100.00%
-------------------------------------------
avg.|   3.66%|   0.49%|    3.17%|  96.33%

Total IO
thread |       bytes     |     I/Os     |     MB/s   |  I/O per s |  file
------------------------------------------------------------------------------
    0 |     25740902400 |       392775 |    2432.28 |   38916.47 | test.dat (1024MB)
------------------------------------------------------------------------------
total:       25740902400 |       392775 |    2432.28 |   38916.47

Read IO
thread |       bytes     |     I/Os     |     MB/s   |  I/O per s |  file
------------------------------------------------------------------------------
    0 |     25740902400 |       392775 |    2432.28 |   38916.47 | test.dat (1024MB)
------------------------------------------------------------------------------
total:       25740902400 |       392775 |    2432.28 |   38916.47

Write IO
thread |       bytes    |     I/Os     |     MB/s   |  I/O per s |  file
------------------------------------------------------------------------------
    0 |               0 |            0 |       0.00 |       0.00 | test.dat (1024MB)
------------------------------------------------------------------------------
total:                 0 |            0 |       0.00 |       0.00
Looking at the text output, we can see that by default Diskspd warms up for five seconds before running a ten-second sequential read test with 64K (65536-byte) blocks and two outstanding I/O operations against a 1GB file:

https://2.bp.blogspot.com/--IKuJ6o2ce4/VztXvTJK5QI/AAAAAAAAC0o/-xah3zwl_nQ_wko15xA7v_2rHJuP3-kgwCKgB/s400/iops07.png
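As a quick sanity check, the bytes, MB/s and IOPS columns of the run above are all consistent with 392775 reads of 65536 bytes over roughly ten seconds. A back-of-the-envelope sketch (the figures are copied from the output; the small remaining differences come from Diskspd's more precise internal timing):

```python
ios = 392775             # 'I/Os' column of the run above
block = 65536            # default Diskspd block size (64 KiB)
elapsed = 10.09          # 'actual test time' reported above

total_bytes = ios * block
mbps = total_bytes / (1024 * 1024) / elapsed
iops = ios / elapsed

print(total_bytes)       # 25740902400, matching the 'bytes' column
print(round(mbps, 2))    # close to the reported 2432.28 MB/s
print(round(iops, 2))    # close to the reported 38916.47 IOPS
```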

The results are split in two parts. The first section contains a bit of information on the usage of all the cores:

https://3.bp.blogspot.com/-yKQtCcuj8Pc/VztXuJDiGmI/AAAAAAAAC0Y/cUH-oL8ZRt8xCILMUf7cB4zeISDga5YbwCKgB/s640/iops08.png

In the second part are the actual IO results, split into three sections:
  • Total IO
  • Read IO
  • Write IO
Since, as I said, the default test is a read test, there is no data about write operations, and the Total IO results exactly match the Read IO results:

https://1.bp.blogspot.com/-WBf8o-nwzi0/VztXuD1B9EI/AAAAAAAAC0c/XF7EljYd9y4kxXCP0iv8Ybk2HyEAbsM3ACKgB/s640/iops09.png

If I try to balance reads and writes, I can see that my RAID 1 array, which in theory should deliver high read performance and medium write performance (due to a write IO penalty of 2, as explained here), returns oddly uniform results:

https://2.bp.blogspot.com/-5tQKrvxWiCY/VztXuijR4yI/AAAAAAAAC0g/SWF6Ujh9dyQNUaumGEtka6QJJuRlCu2TQCKgB/s640/iops10.png
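For reference, this is what the write penalty should do in theory. The sketch below is a back-of-the-envelope model, and the ~140 IOPS per 10,000 RPM SAS disk is an assumed ballpark figure, not a measurement from this array:

```python
def effective_iops(raw_iops_per_disk, disks, read_fraction, write_penalty):
    """Back-of-the-envelope effective IOPS for a RAID array.

    Reads can be served by any disk; each logical write costs
    `write_penalty` physical I/Os (2 for RAID 1, since every write
    must go to both mirror members).
    """
    raw = raw_iops_per_disk * disks
    write_fraction = 1.0 - read_fraction
    return raw / (read_fraction + write_fraction * write_penalty)

# ~140 IOPS per 10K RPM SAS disk is an assumed ballpark figure.
print(effective_iops(140, 2, 1.0, 2))   # 100% reads  -> 280.0
print(effective_iops(140, 2, 0.5, 2))   # 50/50 mix   -> ~186.7
print(effective_iops(140, 2, 0.0, 2))   # 100% writes -> 140.0
```

So a 50/50 mix should sit visibly below the pure-read figure, which is exactly the drop the screenshots above fail to show.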

When I disable caching (the -h parameter), I get much lower results, but I still cannot see the impact of the RAID 1 write penalty:

https://3.bp.blogspot.com/--LUmwLSCh1g/VztXuo6NyMI/AAAAAAAAC0k/k755gz0WsRQ38juCeIj19UzTJn9FM7adQCKgB/s640/iops11.png

When I also add the -L parameter, the same command returns latency information:

https://2.bp.blogspot.com/--Aixu-cCPpc/VztXvXDbXSI/AAAAAAAAC0s/KuoxWj2fOgcDKkBMWGippF46Go7dSi7VgCKgB/s640/iops12.png
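Even before looking at the -L output, average latency can be estimated from the defaults via Little's Law (outstanding I/Os = IOPS × average latency). A small sketch using the numbers from the first run above:

```python
# Little's Law: outstanding_ios = iops * avg_latency.
# From the default run above: 2 outstanding I/Os at ~38916 IOPS.
outstanding = 2
iops = 38916.47

avg_latency_ms = outstanding / iops * 1000
print(round(avg_latency_ms, 3))  # ~0.051 ms per 64K sequential read
```

Such sub-millisecond figures are only plausible because the cache is still in play here; the -h runs tell a different story.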

By default, Diskspd uses a 64K I/O size. Changing this parameter to 4K (-b4k) increases the number of IOPS for both reads and writes and decreases latency dramatically, but the throughput remains identical between reads and writes:

https://2.bp.blogspot.com/-xQ2Ajvo8WS4/VztXv8DulxI/AAAAAAAAC0w/4bH49fvdCkQ5n1MUe9GeFNf6BUn96fZiQCKgB/s640/iops13.png
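That trade-off is just arithmetic: throughput = IOPS × block size, so at a given throughput a 16x smaller block means 16x more I/O operations. A small illustrative sketch (the 100 MB/s figure is hypothetical, not taken from these runs):

```python
def iops_at(throughput_mbps, block_bytes):
    """IOPS needed to sustain a given throughput at a given block size."""
    return throughput_mbps * 1024 * 1024 / block_bytes

# Hypothetical 100 MB/s of sustained throughput:
print(iops_at(100, 65536))  # 1600.0 IOPS with 64K blocks
print(iops_at(100, 4096))   # 25600.0 IOPS with 4K blocks, 16x more
```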

If I switch from sequential to random operations (the -r switch), the throughput decreases while read latency increases:

https://3.bp.blogspot.com/-bX8xyPtDTnU/VztXwYRxe2I/AAAAAAAAC00/XTugFSG1fLQhZGa3Ri9Q143jcrDNEH_NACKgB/s640/iops14.png

Again, the results are strangely uniform between reads and writes. We will see why in the third post of this series.
