Thursday, August 4, 2016

Measuring IOPS part 5: enter PowerShell

We keep our quest to understand how we can use DISKSPD to measure storage performance. As suggested by Jose Barreto and as I have discussed in the previous post of this series, we need to run the tool with a large number of workloads to get a grip of this complex matter.

To do so, I use the magic of a PowerShell script that iterates through combinations of
  • 4KB to 2MB IOs
  • 1 to 64 threads per file
  • 1 to 64 queue depth
to perform 10 seconds long read operations (I reckon this impacts me more than write operations) on a 1GB file.

I also capture latency information (by means of the –L parameter) and disable caching (adding the –h parameter).

Here’s the script. I have reused some parts of Jose Barreto’s blog code to build mine (so kudos to him for that):

$container = @()
$blocksize = 4,8,16,32,64,128,512,1024,2048
$thread = 1,2,4,8,16,32,64
$outstandingIO = 1,2,4,8,16,32,64
$duration = 10
$warmup = 5
$rest = 2
$combination = $blocksize.count * $thread.count * $outstandingIO.count
"TEST SPEC".PadRight(50,'*')
"Test started at $(Get-Date)"
"$Combination combinations".PadRight(50,'*')
"Duration of each test: $duration seconds".PadRight(50,'*')
"Warm-up time of each test: $warmup seconds".PadRight(50,'*')
"Rest between tests: $rest seconds"
"Predicted finish time: $((Get-Date) + (New-TimeSpan -seconds $($combination * ($duration + $warmup + $rest))))".PadRight(50,'*')
$a = 0 
$blocksize | % {
    $b = "-b$_" + "k"
    [int]$k = $_
    $thread | % {
        $c = "-t$_"
        [int]$t = $_
        $outstandingIO | % {
            $d ="-o$_"
            $o = $_
            Write-Progress -Id 1 -Activity ("Checking IOPS") -PercentComplete ($a / $combination * 100) -Status ("Checked {0} combinations of {1}" -f $a, $combination)   
            try {
                $ErrorActionPreference = 'Stop'
                $cmd = "diskspd $b $c $d -w0 -c1G -r -h -d$duration -W$warmup -L C:\test.dat"
                $result = Invoke-Expression $cmd
                foreach ($line in $result) {if ($line -like "total:*") { $total=$line; break } }
                foreach ($line in $result) {if ($line -like "avg.*") { $avg=$line; break } }
                [double]$mbps = $total.Split("|")[2].Trim() 
                [double]$iops = $total.Split("|")[3].Trim()
                [double]$latency = $total.Split("|")[4].Trim()
                [double]$cpu = $avg.Split("|")[1].Trim() -replace '%'   
                "Param $b, $c, $d, $iops iops, $mbps MB/sec, $latency ms, $cpu %CPU"
                $current = [PSCustomObject]@{
                    Bytes =$k
                    Threads = $t
                    Queuedepth = $o
                    iops = $iops
                    MBs = $mbps
                    Latencyms = $latency
                    PercentCPU = $cpu
                    }
                }
            catch {
                "Param $b, $c, $d failed"
                $current = [PSCustomObject]@{
                    Bytes =$k
                    Threads = $t
                    Queuedepth = $o
                    iops = 'NA'
                    MBs = 'NA'
                    Latencyms = 'NA'
                    PercentCPU = 'NA'
                    }
                }
            $a++
            $container += $current
            Start-Sleep -Seconds $rest
            }
        }
    }

That's a very simple scripts that user iteration to build a list of results combined with the test specs. Here's a screenshot of the script while running with its progress bar:




This is particularly useful for filtering the data afterward: having the results from more than 400 iterations inside a custom object, gives us the possibility to check for some empirical evidence generated from within the test.


In the next post of this series we will see what results it will give. Stay tuned!

Wednesday, July 6, 2016

New PowerShell function to setup a iSCSI target on a NETAPP and mount it on Windows

I don't often use my NETAPP controllers as iSCSI targets. Most of the time I just export NFS (especially to host VMWare datastores) or CIFS and that suffices my needs. Sometimes however I am asked to mount on Windows Servers disks from the NETAPP directly on startup and not just as simple shares mounted with logon scripts or whatever.
So I decided to write a pretty do-it-all PowerShell function that does all the work of configuring the NETAPP and mounting the disk via iSCSI on my server.
If I had to do this via the OnCommand GUI, I should first setup an aggregate, then a volume, then a LUN, configure the iGroup and only then move to my Windows server and manually bind the initiator with the target.
Fortunately back in 2010 NETAPP released a module (inside the NetApp PowerShell Toolkit) to do all this stuff and more. Today you need to grab at least version 1.5 to get all the cmdlets I used in my function. Personally I have the last version (which is 4.2) which can bo downloaded here.


Concerning the requirements, the script must run on the Windows Server that acts as initiator. On that server you have to be running at least version 3.0 of Windows PowerShell and to have the DataONTAP module installed.
The Windows Server that acts as initiator will run my function that
  • configures the storage controller
  • setup the target
  • setup the initiator
  • do the mapping with the LUN in a iGroup.
Once the function has finished configuring the iSCSI disk, you just have to initialize it in Disk Management, format it and assign a letter. I already showed you how to do this in PowerShell, just search my blog.

Enough said: here's the function Mount-NAiSCSIDisk.

Just one last quick note: I am going to work on this function to improve it, so please, post suggestions or ideas in the comments below! And, for sure, share!

#Requires -Version 3.0 -Modules DataONTAP

function Mount-NAiSCSIDisk
        <#
        .SYNOPSIS
            Setup and mounts a NetApp iSCSI LUN to a Windows Server
 
        .DESCRIPTION
            Setup an aggregate, a volume, a LUN, an iGroup then mounts it as a NetApp iSCSI target to a Windows Server
             
        .PARAMETER NaController
            Name of the NetApp controller

        .PARAMETER Aggregate
            Name of the aggregate

        .PARAMETER DiskCount
            Number of disks in the aggregate
            
        .PARAMETER Volume
            Name of the volume

        .PARAMETER VolumeSize
            Size of the volume

        .PARAMETER SnapshotReserveSize
            Size of the snapshot reserve

        .PARAMETER LUNPath
            Path of the LUN

        .PARAMETER LUNType
            Type of the LUN

        .PARAMETER LUNSize
            Size of the LUN

        .PARAMETER Igroup
            Nq;e of the Igroup

        .PARAMETER IgroupProtocol
            Protocl for the Igroup
            
        .PARAMETER IgroupType
            Type of the Igroup

        .EXAMPLE
            Mount-NAiSCSIDisk -NaController netapp1 -Igroup igroup1 -IgroupProtocol iscsi -IgroupType windows
            
        .EXAMPLE
            Mount-NAiSCSIDisk -NaController netapp1 -Aggregate aggr1 -DiskCount 32 -Volume vol1 -VolumeSize 10g -SnapshotReserveSize 0 -LUNPath /vol/NA1/iscsivol1 -LUNSize 1g -LUNType windows -Igroup igroup1 -IgroupProtocol iscsi -IgroupType windows

        .AUTHOR
            happysysadm.com - Carlo MANCINI

        #>
    {
    param(
        [Parameter(Mandatory)][string]$NaController,
        [string]$Aggregate,
        [string]$DiskCount,
        [string]$Volume,
        [string]$VolumeSize,
        [string]$SnapshotReserveSize,
        [Parameter(Mandatory)][string]$LUNPath,
        [string]$LUNSize,
        [string]$LunType,
        [Parameter(Mandatory)][string]$Igroup,
        [Parameter(Mandatory)][string]$IgroupProtocol,
        [Parameter(Mandatory)][string]$IgroupType
        )
    
    write-verbose 'Started'
    
    $Error = $False

    $info = @()

    $infolastonline = @()

    Write-Verbose 'Connect to NaController'

    try {
    
        Connect-NaController $NaController -ErrorAction Stop

        }

    catch {

        Write-Warning 'Unable to connect to the controller'

        }

    write-verbose 'Starting the configuration on the netapp controlller'
    
    if(Get-NaAggr $Aggregate) {

        Write-Verbose 'Creating the aggregate'
        
        New-NaAggr $Aggregate -Use64Bit -DiskCount $DiskCount

        }

    else {

        Write-Verbose 'Aggregate already existing'

        }

    if(Get-NaVol $Volume) {

        Write-Verbose 'Creating the volume'

        New-NaVol $Volume -Aggregate $Aggregate $VolumeSize

        Write-Verbose 'Setting the snapshot reserve'

        Set-NaSnapshotReserve $Volume $SnapshotReserveSize

        }

    else {

        Write-Verbose 'Volume already existing'

        }

    if(Get-NaLun $LUNPath) {
    
        Write-Verbose 'Creating the LUN'
    
        New-NaLun -Path $LUNPath -Size $LUNSize -Type $LunType

        }

    else {

        Write-Verbose 'LUN already existing'

        }
    
    Write-Verbose 'Starting the iSCSI configuration'
    
    Write-Verbose 'Adding the storage controller to the iSCSI Initiator target portals'

    Add-NaHostIscsiTargetPortal

    Write-Verbose 'Establishing a connection to the target discovered by the iSCSI Initiator'

    Get-NaHostIscsiTarget (Get-NaIscsiNodeName) | Connect-NaHostIscsiTarget

    Write-Verbose 'Creating a new initiator group'

    New-NaIgroup $Igroup $IgroupProtocol $IgroupType

    Write-Verbose 'Adding the initiator the the initiator group'
    
    Get-NaHostIscsiAdapter | Add-NaIgroupInitiator $Igroup

    Write-Verbose 'Mapping the LUN to the initiators in the initiator group'

    Add-NaLunMap $LUNPath $Igroup

    }

Monday, July 4, 2016

Measuring IOPS part 4: binding IOMETER results to DISKSPD results

In the previous post we have seen that Diskspd can provide us with a quick information on MB/s and I/O per second (IOPS), as well as the average latency.
For the moment we have no idea if the 39k IOPS measured in the first sequential read test or the 710 total IOPS measured in the last random read and write test are good or bad result compared to the workload I have generated. The only thing that we have found for sure is that latency decreases when we use small IOs instead of large IOs. But we still don’t know if a latency of 5 ms, such the one measured in the last run, is a fancy value or not.
We need to investigate further the other Diskspd options. But before we do that we have to build a dictionary of what is measured with IOMETER and see the corrisponding terms used by DISKSPD.
BINDING IOMETER RESULTS TO DISKSPD RESULTS

If I look in the Diskspd help, and I try to reproduce the variables of IOmeter, I can build the following table of matches:

  • IOMETER Transfer request size = DISKSPD -b parameter, described as ‘Size of the IO in KB
  • IOMETER Percent Read/Write Distribution = DISKSPD –w, described as ‘Percentage of writes
  • IOMETER Percent Random/Sequential Distribution = DISKSPD –r, which forces Random operations
  • IOMETER # of Outstanding I/Os = DISKSPD –o, described as ‘Outstanding IOs or queue depth (per thread)
At this moment we have no idea of what kind of workload we have to generate to get meaningful results that can help me assess storage performance, and Jose Barreto himself states that we have to experiment with the –t and the –o parameter until we find the combination that gives the best results.

Now guess which tool I am going to use to automate this task in the next post of this series: PowerShell! Stay tuned!

Monday, June 27, 2016

Measuring IOPS part 3: of the impact of disk array cache on the results of DISKSPD

In the previous post we have welcomed the new kid on the block, namedly DISKSPD, and found out that, though it should disable hardware caching, the –h switch seems to have no effect on the HP SmartArray RAID controller cache. If we have a look at the SmartArray RAID performance factors document edited by HP, we can read that the Read Cache is useful in prefetching sequential reads:


https://1.bp.blogspot.com/--UEeQKtzQw8/VztXxJ4aS2I/AAAAAAAAC04/MGncqT1UZLkhdG9HKSwVi93f0PlteV-4QCKgB/s640/iops15.png


Concerning the Write Cache, HP states that it is used to cheat on the host application by making it believe that the data have been written when they are not, and that if the Cache fills up, the Disk controller has some mechanisms to speed up execution, such as grouping logical blocks (write coalescing) or change the execution order to lower latency (command reordering):


https://3.bp.blogspot.com/-3Bq7PHjROzs/VztXx-CtSSI/AAAAAAAAC08/6aSkkcf4iXQ3LqSX5F07pmgZl-tNMQPZgCKgB/s640/iops16.png


So, the fact of using the HP SmartArray controller cache justifies the fact that, during my tests with Diskspd, I cannot see the write penalty which is typical for RAID 1 configurations:


https://4.bp.blogspot.com/-0k05deVRQgc/VztXyapEaWI/AAAAAAAAC1A/hFv413_snr8fvb2EoDjG-tYShCAz_cSxACKgB/s640/iops17.png


To confirm the fact that the HP Controller is not honoring the request to disable caching, I can run Diskspd with the the –S switch (instead of -h). The –S switch disable the OS cache only. Since the result of this run and of the previous run are roughly the same, we can state that Diskspd can’t disable the SmartArray cache, only the OS cache:

https://1.bp.blogspot.com/-uITrRmNvXLk/VztXy84APXI/AAAAAAAAC1E/Sw2ZWIr2cnMksZ6zzu8HqzhRjTMQuPC0gCKgB/s640/iops18.png


This particular case apart, DISKSPD seems to be a nice tool by Microsoft. Let's keep studying it in the next post of this series (to be published).

Thursday, June 23, 2016

Introducing a PowerShell function to report cluster resources moves and failovers

The number of clusters I am managing these days is becoming very large, so I felt like I needed some kind of report mailed to me giving a view of what resources went online, offline, became degraded or failed in my environment. The good news when you are doing whatever you can think of on a Windows box (and not only) is that you can leverage, you know the special word, PowerShell, which, once you install the Failover Clustering feature, comes with a module for managing clusters:

Get-Module FailoverClusters | Format-List

Name              : FailoverClusters
Path              : C:\windows\system32\windowspowershell\v1.0\Modules\FailoverClusters\FailoverClusters.psd1
Description       : 
ModuleType        : Manifest
Version           : 2.0.0.0
NestedModules     : {Microsoft.FailoverClusters.PowerShell, Microsoft.FailoverClusters.PowerShell}
ExportedFunctions : 
ExportedCmdlets   : {Add-ClusterCheckpoint, Add-ClusterDisk, Add-ClusterFileServerRole, 
                    Add-ClusterGenericApplicationRole...}
ExportedVariables : 
ExportedAliases   : {Add-VMToCluster, Remove-VMFromCluster}
Now the hardest part when you want to set up a cluster reporting, is to find the proper events on which to build your solutions. After a few tests, I came up with the following events to look for:

Cluster Informational events: these are stored in a special operational log
  • Event ID 1201 under Microsoft-Windows-FailoverClustering/Operational is the cluster group coming online
  • Event ID 1204 under Microsoft-Windows-FailoverClustering/Operational is the cluster group going offline
Cluster Warning events: these are recorded in the System event log
  • Event ID 1167 under System is for resource Groups becoming degraded (message: Cluster Agent: The cluster resource resourcename has become degraded.)
Cluster Error events: these are recorded in the System event log
  • Event ID 1609 under System is for resource failures (message: Cluster resource 'resourcename' in clustered service or application 'appname' failed.)



The function I came up with leverage just a few cmdlets.


The first one is Get-WinEvent, used in conjunction with the -FilterHashtable parameter, which has the advantage of being lighting fast (especially compared to Get-EventLog), as I explained in a previous post where I used it to retrieve the system boot time (http://www.happysysadm.com/2014/07/windows-boot-time-explored-in-powershell.html - chapter 12).



The other cmdlets came all from the FailoverClustering module:

Enough said. Here's the advanced function Get-ClusterResourceInfo:

function Get-ClusterResourceInfo
        <#
        .SYNOPSIS
            Generates a report of the status of cluster resources.
 
        .DESCRIPTION
            Generates a report of the status of cluster resources with the timestamp of the last time each resource came online.
            The report can be sent as an e-mail or exported to a CSV file.
             
        .PARAMETER Cluster
            Name of the cluster to query

        .PARAMETER Passthru
            Returns the report as an object

        .PARAMETER Detailed
            Expands Resource Groups adding details about reources

        .EXAMPLE
            Get-ClusterResourceInfo -Cluster clustername -Passthru

        .EXAMPLE
            Get-ClusterResourceInfo -Cluster clustername -Passthru -Detailed

        .EXAMPLE
            Get-ClusterResourceInfo -Cluster clustername -Passthru -Detailed -Verbose

        .EXAMPLE
            Get-ClusterResourceInfo -Cluster clustername -Passthru | Format-table -AutoSize

        .EXAMPLE
            Get-ClusterResourceInfo -Cluster clustername -Passthru | Sort-Object ClusterName | Format-Table -Property ResourceGroup, LastOnline

        .EXAMPLE
            'clu1','clu2' | % { Get-ClusterResourceInfo -Cluster $_ -Passthru -Detailed -Verbose } #| Format-Table * -AutoSize

        .AUTHOR
            HappySysadm.com

        #>
    {
    [CmdletBinding(DefaultParameterSetName='__AllParameterSets')]
    param(

        [Parameter(Mandatory)][ValidateScript({Get-Cluster $_})][string]$Cluster,

        [switch]$Detailed = $False,
        
        [switch]$Passthru = $True

        )
    
    write-verbose 'Started'
    
    if($Detailed) { Write-Verbose 'Detailed mode is ON' } else { Write-Verbose 'Detailed mode is OFF' }

    Write-Verbose 'Retrieving cluster information'

    $Error = $False

    $info = @()

    $infolastonline = @()

    $infolastoffline = @()

    $infolasterror = @()

    $infolastdegraded = @()
    
    try
    
        {

         $clu = Get-cluster $cluster

         $nodes = $clu | Get-ClusterNode

         #Event ID 1201 is the cluster group coming online. Retrieving them all.
         $eventson = foreach($node in $nodes){Get-WinEvent -ComputerName $node -FilterHashtable @{LogName='Microsoft-Windows-FailoverClustering/Operational';ID = 1201} -ErrorAction SilentlyContinue}
         $eventson = $eventson | sort timecreated -Descending

         #Event ID 1204 is the cluster group going offline. Retrieving them all.
         $eventsoff = foreach($node in $nodes){Get-WinEvent -ComputerName $node -FilterHashtable @{LogName='Microsoft-Windows-FailoverClustering/Operational';ID = 1204} -ErrorAction SilentlyContinue}
         $eventsoff = $eventsoff | sort timecreated -Descending
         
         #Event ID 1609 is for resource failures. Retrieving them all.
         $eventserror = foreach($node in $nodes){Get-WinEvent -ComputerName $node -FilterHashtable @{LogName='System';ID = 1069} -ErrorAction SilentlyContinue}
         $eventsoerror = $eventserror | sort timecreated -Descending
         
         #Event ID 1167 is for resource group becoming degraded. Retrieving them all.
         $eventsdegraded = foreach($node in $nodes){Get-WinEvent -ComputerName $node -FilterHashtable @{LogName='System';ID = 1167} -ErrorAction SilentlyContinue}
         $eventsdegraded = $eventsdegraded | sort timecreated -Descending

         $clustergroups = $clu | Get-ClusterGroup
    
         foreach($clustergroup in $clustergroups) {

                Write-Verbose "`tWorking on $($clu.name) - $($clu | Get-ClusterNode) - $($clustergroup.Name)"

                #Selecting only the events 1201 for the current resource group
                $infolastonline += ($eventson | ? {$_.message -match $clustergroup.name}).TimeCreated

                #Selecting only the events 1204 for the current resource group
                $infolastoffline += ($eventsoff | ? {$_.message -match $clustergroup.name}).TimeCreated

                #Selecting all events regarding resource failures for the current resource group
                $infolasterror = ($eventserror | ? {$_.message -match $clustergroup.name}).TimeCreated

                #Selecting all events regarding current resource group becoming degraded
                $infolastdegraded = ($eventsdegraded | ? {$_.message -match $clustergroup.name}).TimeCreated

                if($detailed) {
                
                    #More detailed: retrieving all the resources inside the current resource group
                    $resources = $clustergroup | Get-ClusterResource

                    foreach($resource in $resources) {
                
                        $infores = [PSCustomObject]@{
                            ClusterType = 'MSCS'
                            ClusterName = $clu.Name
                            ResourceGroup = $resource.OwnerGroup
                            Resource = $resource.Name
                            ServerName = $resource.OwnerNode
                            ResourceStatus = $resource.State
                            LastOnline = if($infolastonline){$infolastonline[0]}else{'NA'}
                            LastOffline = if($infolastoffline){$infolastoffline[0]}else{'NA'}
                            LastError = if($infolasterror){$infolasterror[0]}else{'NA'}}

                        $info += $infores

                        }

                    }
                    
                else {

                    #Less detailed: only resource groups report.
                    
                    $inforesgroup = [PSCustomObject]@{
                            ClusterType = 'MSCS'
                            ClusterName = $clu.Name
                            ResourceGroup = $clustergroup.Name
                            ServerName = $clustergroup.OwnerNode
                            ResourceStatus = $clustergroup.State
                            LastOnline = if($infolastonline){$infolastonline[0]}else{'NA'}
                            LastOffline = if($infolastoffline){$infolastoffline[0]}else{'NA'}
                            LastDegraded = if($infolastdegraded){$infolastdegraded[0]}else{'NA'}}

                    $info += $inforesgroup

                    }
                
                $infolastonline = $null

                $infolastoffline = $null
                
                $infolasterror = $null

                $infolastdegraded = $null

                }
            
         }

    catch
        
        { 
            
        $Error = $True
            
        Write-Warning $_.Exception.Message 

        $info = $null
            
        }

    if($Error)

        {

        Write-Warning "Script aborted. No information returned on $clu.name"

        }

    elseif($Passthru)
       
        {
        
        Write-Verbose 'Returning object for further filtering.'
        
        $info
        
        }

    else

        {

        Write-Verbose "No passthru"

        }
    
    write-verbose 'Finished'
    
    }
In my case I pipe the output of this function to Send-MailMessage so that I receive a weekly report of all my clusters. Hope you like it. If you wish to contribute to this function, you can find it on my GitHub.

Friday, June 17, 2016

Measuring IOPS part 2: DISKSPD

In the previous post I have been talking of IOMETER and we have seen that today it is no more the best tool for disk performance measurement for several reasons.
ENTER DISKSPD
Crawling the net, I have found an excellent blog post by Jose Barreto on a tool named Diskspd.exe recently developed by Microsoft. As explained in that post, Diskspd is a flexible tool that produce many types of synthetic workloads using all kinds of storage, including local disks, LUNs on a SAN, Storage Spaces or SMB file shares.
The version I have found is 2.0.15, released on January 2015, which officially supersedes the SQLIO Disk Subsystem Benchmark Tool. The source code is hosted on the following Github repo:
while the binaries can be found here:
The interesting things about Diskspd are that :
  • it comes with a very nice explanation of the way it works, with examples and use cases
  • the provided performance information is extensive
  • the text output can be parsed by the PowerShell automators we have all become
As a test bed for the initial runs I have decided to use a HP Proliant BL460c Gen8 with 32 CPU cores and a Smart Array P220i controller with a RAID 1 array made up of two 10000 RPM SAS disks.
Actually I used Diskspd against more recent hardware but it failed with the following error:
diskspd test.dat

WARNING: Complete CPU utilization cannot currently be gathered within DISKSPD for this system.
        Use alternate mechanisms to gather this data such as perfmon/logman.
        Active KGroups 2 > 1 and/or processor count 16 > 64.
It looks like there’s a problem with the API that gets CPU stats, as mentioned here.

When you have a look at the Diskspd help, you will se that the only mandatory parameter is a target, in the form of a file path (c:\test.dat), a partition letter (E:) or a physical drive number (#2):

https://3.bp.blogspot.com/-KUOsNXc--5Q/VztXt_aMCVI/AAAAAAAAC0U/nyF2-nzI24oRWbGGyLnd6rPeWP8VW4LTgCKgB/s640/iops06.png

Not being an expert, I decide to run the command with just the mandatory target in an elevated prompt:
diskspd c:\test.dat

Command Line: C:\Windows\system32\diskspd.exe c:\test.dat

Input parameters:

       timespan:   1
       -------------
       duration: 10s
       warm up time: 5s
      cool down time: 0s
       random seed: 0
       path: 'test.dat'
          think time: 0ms
          burst size: 0
          using software and hardware write cache
          performing read test
          block size: 65536
          using sequential I/O (stride: 65536)
          number of outstanding I/O operations: 2
          thread stride size: 0
          threads per file: 1
          using I/O Completion Ports
          IO priority: normal

Results for timespan 1:
*******************************************************************************

actual test time:       10.09s
thread count:           1
proc count:             32

CPU |  Usage |  User  |  Kernel |  Idle
-------------------------------------------
  0| 100.00%|   3.71%|   96.30%|   0.00%
  1|   0.00%|   0.00%|    0.00%| 100.00%
  2|   2.47%|   0.62%|    1.85%|  97.38%
  3|   0.00%|   0.00%|    0.00%| 100.00%
  4|   1.39%|   0.00%|    1.39%|  98.46%
  5|   0.00%|   0.00%|    0.00%| 100.00%
  6|   0.31%|   0.15%|    0.15%|  99.70%
  7|   0.00%|   0.00%|    0.00%| 100.00%
  8|   0.15%|   0.00%|    0.15%|  99.85%
  9|   0.00%|   0.00%|    0.00%| 100.00%
 10|   0.00%|   0.00%|    0.00%| 100.00%
 11|   0.00%|   0.00%|    0.00%| 100.00%
 12|   0.00%|   0.00%|    0.00%| 100.00%
 13|   0.00%|   0.00%|    0.00%| 100.00%
 14|   0.31%|   0.00%|    0.31%|  99.70%
 15|   0.00%|   0.00%|    0.00%| 100.00%
 16|   1.24%|   1.08%|    0.15%|  98.77%
 17|   0.00%|   0.00%|    0.00%| 100.00%
 18|   6.03%|   5.41%|    0.62%|  93.98%
 19|   0.00%|   0.00%|    0.00%| 100.00%
 20|   4.79%|   4.64%|    0.15%|  95.06%
 21|   0.00%|   0.00%|    0.00%| 100.00%
 22|   0.46%|   0.15%|    0.31%|  99.54%
 23|   0.00%|   0.00%|    0.00%| 100.00%
 24|   0.00%|   0.00%|    0.00%| 100.00%
 25|   0.00%|   0.00%|    0.00%| 100.00%
 26|   0.00%|   0.00%|    0.00%| 100.00%
 27|   0.00%|   0.00%|    0.00%| 100.00%
 28|   0.00%|   0.00%|    0.00%| 100.00%
 29|   0.00%|   0.00%|    0.00%| 100.00%
 30|   0.00%|   0.00%|    0.00%| 100.00%
 31|   0.00%|   0.00%|    0.00%| 100.00%
-------------------------------------------
avg.|   3.66%|   0.49%|    3.17%|  96.33%

Total IO
thread |       bytes     |     I/Os     |     MB/s   |  I/O per s |  file
------------------------------------------------------------------------------
    0 |     25740902400 |       392775 |    2432.28 |   38916.47 | test.dat (1024MB)
------------------------------------------------------------------------------
total:       25740902400 |       392775 |    2432.28 |   38916.47

Read IO
thread |       bytes     |     I/Os     |     MB/s   |  I/O per s |  file
------------------------------------------------------------------------------
    0 |     25740902400 |       392775 |    2432.28 |   38916.47 | test.dat (1024MB)
------------------------------------------------------------------------------
total:       25740902400 |       392775 |    2432.28 |   38916.47

Write IO
thread |       bytes    |     I/Os     |     MB/s   |  I/O per s |  file
------------------------------------------------------------------------------
    0 |               0 |            0 |       0.00 |       0.00 | test.dat (1024MB)
------------------------------------------------------------------------------
total:                 0 |            0 |       0.00 |       0.00
Looking at the text output, we can see that by default Diskspd warms up for five seconds before doing a ten seconds long sequential read run with 65k blocks and two outstanding I/O operations against a 1GB file:

https://2.bp.blogspot.com/--IKuJ6o2ce4/VztXvTJK5QI/AAAAAAAAC0o/-xah3zwl_nQ_wko15xA7v_2rHJuP3-kgwCKgB/s400/iops07.png

The result are split in two. In the first section you can find a bit of information on the usage of all the cores:

https://3.bp.blogspot.com/-yKQtCcuj8Pc/VztXuJDiGmI/AAAAAAAAC0Y/cUH-oL8ZRt8xCILMUf7cB4zeISDga5YbwCKgB/s640/iops08.png

In the second section there are the actual IO results, split in three sections:
  • Total IO
  • Read IO
  • Write IO
Since, as I have said, the default test is a read test, there are no data about write operations and the Total IO results match exactly the Read IO results:

https://1.bp.blogspot.com/-WBf8o-nwzi0/VztXuD1B9EI/AAAAAAAAC0c/XF7EljYd9y4kxXCP0iv8Ybk2HyEAbsM3ACKgB/s640/iops09.png

If I try to balance reads and writes, I can see that my RAID 1, which is in theory supposed to have high read performance, and medium write performance (due to a write IO penalty of 2, as explained here), returns oddly uniform results:

https://2.bp.blogspot.com/-5tQKrvxWiCY/VztXuijR4yI/AAAAAAAAC0g/SWF6Ujh9dyQNUaumGEtka6QJJuRlCu2TQCKgB/s640/iops10.png

When I disable caching (-h parameter), I get much lower results, but I still cannot see the impact of the RAID 1 write penalty:

https://3.bp.blogspot.com/--LUmwLSCh1g/VztXuo6NyMI/AAAAAAAAC0k/k755gz0WsRQ38juCeIj19UzTJn9FM7adQCKgB/s640/iops11.png

When to this I add the –L parameter, the same command returns latency information:

https://2.bp.blogspot.com/--Aixu-cCPpc/VztXvXDbXSI/AAAAAAAAC0s/KuoxWj2fOgcDKkBMWGippF46Go7dSi7VgCKgB/s640/iops12.png

By defaults Diskspd uses a 65K I/O size. Changing this parameter to 4k (-b4k) causes an increment in the number of the IOPS for both reads and writes, latency decreases dramatically, but the throughput keeps being the same for reads and writes:

https://2.bp.blogspot.com/-xQ2Ajvo8WS4/VztXv8DulxI/AAAAAAAAC0w/4bH49fvdCkQ5n1MUe9GeFNf6BUn96fZiQCKgB/s640/iops13.png

If I switch from sequential operations to random operations (–r switch), the throughput decreases while read latency increases:

https://3.bp.blogspot.com/-bX8xyPtDTnU/VztXwYRxe2I/AAAAAAAAC00/XTugFSG1fLQhZGa3Ri9Q143jcrDNEH_NACKgB/s640/iops14.png

Again, the results are strangely uniform between reads and writes. We will see why in the third post of this series.
Related Posts Plugin for WordPress, Blogger...