Wednesday, December 3, 2014

Powershell oneliner to get disk usage by file extension

These days storing multimedia files (like family videos, travel pictues, etc) requires a lot of storage and since I am a big fan of SSDs (and those kind of disks are still quite expensive), I wanted to show the simple Powershell one-liner I wrote in the weekend that checks what kind of file extensions are (over)consuming space on all my local drives.

The first step is to retrieve a list of all my local drives, which resumes to a query to Win32_LogicalDisk:

(Get-CIMInstance -class Win32_LogicalDisk -Filter 'drivetype=3').DeviceID
A lot of drives, so a lot of work that Powershell will silently do for me.

Let's start to run Get-ChildItem against this list of local drives and hunt for every file in every subfolder:

Get-ChildItem $((Get-CIMInstance -class Win32_LogicalDisk -Filter 'drivetype=3').DeviceID) `
-File -Recurse
Once you know (thanks Get-Member!) that Get-ChildItem returns the file extension as a full grown property, you can ask Powershell to make the grouping for you:

Get-ChildItem $((Get-CIMInstance -class Win32_LogicalDisk -Filter 'drivetype=3').DeviceID) `
-File -Recurse |

 Group-Object Extension
The next step is to send this result down the pipeline to Select-Object, which is very good at defining what I call 'calculated properties'.

The properties I need for my report are:
- the extension itself
- the total size of all the files for the given extension
- the Count of those files

Since Group-Object returns four properties:

Count       Property   int Count {get;}                                                
Group       Property   System.Collections.ObjectModel.Collection[psobject] Group {get;}
Name        Property   string Name {get;}                                              
Values      Property   System.Collections.ArrayList Values {get;} 
we have to build on top of this to get what we want.

Let's start by letting Select-Object chew the Group properties (which is a PSObject, as you can see above) to get the total file size for each extension:

... | Select-Object ..., @{n='Size';e={$($_.Group | Measure-Object Length -Sum).Sum}}, ...

Name        Size Count
----        ---- -----
.JPG  2507712043  2704
.jpeg     319979     1
.mp4  1605261912   119
.tif    22024960     1

Very well, now we have to beautify this to make it readable. I'll perform rounding on the size of the files after having converted it to megabytes:

... | Select-Object Name, @{n='Size (MB)';e={[math]::Round((($_.Group | Measure-Object Length -Sum).Sum / 1MB), 2)}}, Count
Then I'll change the name property to Extension, suppress the dot at the beginning of the string (through some very complex regex!), and make the resulting string uppercase:

... | Select-Object @{n='Extension';e={($_.Name -replace '^\.').ToUpper()}}, ...
Here's the final result, sorted by total file size by extension:

gci $((gcim Win32_LogicalDisk -Filter 'drivetype=3').DeviceID) -File -Rec |
  Group Extension |

  Select @{
   e={($_.Name -replace '^\.').ToUpper()}

   n="Size (MB)";
   e={[math]::Round((($_.Group | Measure Length -Sum).Sum/1MB),2)}

   Count |

  Sort 'Size (MB)' -Desc

Extension Size (MB) Count
--------- --------- -----
JPG         2391,54  2704
MP4          1530,9   119
TIF              21     1
JPEG           0,31     1
Of course the one-liner can be modified to taste.

You could for instance pass to Get-ChildItem a manual list of drives or folders in the form or an array:

Get-ChildItem @('c:\','d:\','n:\pictures','n:\videos') -File -Recurse
Or you could use a different measure unit from MB. In Powershell V5 there are for instances five measure units: KB, MB, GB, TB and PB. The one you choose depends on the amount of files you have.

Powershell, anything is possible.


  1. Very nice. You've inspired me to take this a step further.

    1. Thanks Jeffery!
      Powershell is like building blocks and improvement possibilities are endless, seel free to make it better.


Related Posts Plugin for WordPress, Blogger...