Wednesday, June 26, 2013

My PowerShell quest for the perfect one-liner

After winning the Advanced section of the 2013 Scripting Games, Mike had the good idea of re-gifting one of his prizes and started a competition in which his readers were tasked with writing the shortest possible one-liner that returns a list of PowerShell cmdlet names with no repeating characters in them.

I published a 24-character answer, which Mike has just chosen as the shortest. So I suppose it is a good idea to detail here how I proceeded.

As you know, the Get-Command cmdlet gets all commands that are installed on your computer, including cmdlets, aliases, functions, workflows, filters, scripts, and applications. So I was going to use this cmdlet to retrieve all the others. Fine. Now what I wanted was the shortest possible alias for this cmdlet. To find it I went this way:
PS C:\> Get-Alias | ? {$_.Definition -match "Get-Command"} | Sort {$_.Name.Length}
which returned:
CommandType     Name   ModuleName
-----------     ----   ----------
Alias           gcm -> Get-Command
Just one alias was returned, and it is only three characters long! I couldn't ask for anything better than this!
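As a side note, Get-Alias can do the filtering itself through its -Definition parameter, which avoids the regex match in Where-Object. Here is a minimal sketch, assuming the default aliases have not been removed from the session (the variable name $shortest is only for illustration):

```powershell
# Ask Get-Alias directly for the aliases that point at Get-Command,
# then sort by the length of the alias name so the shortest comes first.
$shortest = Get-Alias -Definition Get-Command |
    Sort-Object { $_.Name.Length } |
    Select-Object -First 1

# Show the winning alias name.
$shortest.Name
```

The same two lines work for any other cmdlet by swapping the value given to -Definition.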

Now, as I said, Get-Command returns every type of command by default, but Mike asked for cmdlets only, so I explored the parameters of Get-Command and printed their aliases at the same time:
PS C:\> gcm Get-Command | % {$_.parameters.values | ft name, aliases}

Name            Aliases
----            -------
Name            {}
Verb            {}
Noun            {}
Module          {PSSnapin}
CommandType     {Type}
TotalCount      {}
Syntax          {}
ArgumentList    {Args}
All             {}
ListImported    {}
ParameterName   {}
ParameterType   {}
Verbose         {vb}
Debug           {db}
ErrorAction     {ea}
WarningAction   {wa}
ErrorVariable   {ev}
WarningVariable {wv}
OutVariable     {ov}
OutBuffer       {ob}
This way I found one that was good enough for me: CommandType, whose alias is Type.

At this point I checked the help to see whether the CommandType parameter was positional, because in that case I could have skipped the parameter name. Unfortunately it was not:
-CommandType 

   Required?                    false
   Position?                    Named
   Accept pipeline input?       true (ByPropertyName)
   Parameter set name           AllCommandSet
   Aliases                      Type
   Dynamic?                     false
So I had to find a way to shorten CommandType or Type. As you should know, there is an incredible piece of the PowerShell engine named the Parameter Binder that is in charge of analyzing cmdlet parameters. This piece of code is damn smart: it does not require you to specify the full name of a parameter, as long as you specify enough of it to uniquely identify the one you want.

I easily found out that I could shorten CommandType to just 'C' or Type to just 'Ty' (not just 'T' because there is another 'TotalCount' parameter that starts with the same letter). Great. I had to use 'C'.
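The prefix rule can be tried out without touching Get-Command at all. The sketch below uses a hypothetical function, Test-Prefix, that I made up to mimic just two of Get-Command's parameters (CommandType with its alias Type, and TotalCount); it is not part of PowerShell and only serves to show which prefixes bind and which one is rejected as ambiguous:

```powershell
# Hypothetical function that mimics two of Get-Command's parameters.
function Test-Prefix {
    [CmdletBinding()]
    param(
        [Alias('Type')]
        [string]$CommandType,

        [int]$TotalCount
    )
    # Echo back what was bound so the effect of each prefix is visible.
    "CommandType=$CommandType"
}

Test-Prefix -CommandType Cmdlet   # full parameter name
Test-Prefix -C Cmdlet             # unique prefix of CommandType
Test-Prefix -Ty Cmdlet            # unique prefix of the alias Type

# -T is a prefix of both the alias Type and TotalCount, so binding fails.
try {
    Test-Prefix -T Cmdlet
} catch {
    "Ambiguous: $($_.Exception.Message)"
}
```

The first three calls bind the same parameter, while the last one ends up in the catch block, which is the same reason 'T' alone is not enough on Get-Command.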

Same for the parameter value: PowerShell guesses the value of the parameter as long as you give enough letters to make clear what you are looking for. As the possible values of Get-Command -CommandType are "Alias, Function, Filter, Cmdlet, ExternalScript, Application, Script, Workflow, All", we see that only Cmdlet starts with the letter 'C'. Great!

So I was able to list all the PowerShell cmdlets with an 8-character one-liner!
gcm -c c
Since Mike asked to return just the names of cmdlets I modified the code this way:
(gcm -c c).name
Now this was the easy part.
I spent some time analyzing the ins and outs of Get-Command, and especially the types of output it produces:
    System.Management.Automation.AliasInfo
    System.Management.Automation.ApplicationInfo
    System.Management.Automation.FunctionInfo
    System.Management.Automation.CmdletInfo
    System.Management.Automation.ExternalScriptInfo
    System.Management.Automation.FilterInfo
    System.Management.Automation.WorkflowInfo
As you can see there are seven possible output types, depending on what you ask for. When querying for cmdlets, here's what I got:
gcm -c c| Get-Member
TypeName: System.Management.Automation.CmdletInfo
I checked on MSDN and found the following information about the CmdletInfo class:

"It provides information about a cmdlet, such as its definition (usage pattern), noun and verb name, and other information."

I wanted to know more and I decided to use the good old GetType() method:
(gcm -c c).GetType()

IsPublic IsSerial Name      BaseType
-------- -------- ----      --------
True     True     Object[]  System.Array
So what I had for the moment was a command that returned an array (Object[]) of CmdletInfo objects, each of which carries the cmdlet name as a string in its Name property.
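To see what each layer actually is, it helps to look at the collection, at one element, and at one name separately. Get-Member reports the type of the elements travelling down the pipeline, while GetType() on the parenthesized expression reports the type of the collection itself. A small sketch using the full parameter names (the variable name $cmdlets is only for illustration):

```powershell
# Collect all cmdlets once so the three checks look at the same data.
$cmdlets = Get-Command -CommandType Cmdlet

$cmdlets.GetType().Name          # type of the collection
$cmdlets[0].GetType().Name       # type of a single element
$cmdlets.Name[0].GetType().Name  # type of a single cmdlet name
```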

After some reasoning, I decided to go for Select-String, which was probably the best bet for me.

The Select-String cmdlet works on streams of strings. When you pass it objects, the PowerShell engine converts those objects to strings before passing them to Select-String.

Also, Mike asked for case-insensitive matching and this is what Select-String does by default. Good for me.

At this point of the story I started to write my regular expression.

I spent a lot of time verifying the benefit of using lookahead. In the end I understood that my best option was to write a regex that retrieved all the strings with a positive match for duplicate letters and then let Select-String invert the result.

The regex that I wrote looks like this:
(.).*\1
which means:
  • (.) a numbered group which matches any character (parentheses capture the expression contained within them and implicitly number it)
  • .* any character, repeated any number of times
  • \1 a backreference to the numbered group above (the number after the backslash is the ordinal position of the capturing group in the regular expression)

Basically for each char, I check whether it's repeated any number of times and if it is then a positive match is returned.
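The pattern can be checked on a few strings with the -match operator before involving Select-String. This is only a quick sketch; the sample strings are my own picks, and 'Aa' is there just to show the effect of case sensitivity:

```powershell
# The regex from above: some character, anything in between, the same character again.
$pattern = '(.).*\1'

'Get-Command' -match $pattern   # 'm' occurs twice
'Get-Job'     -match $pattern   # no character occurs twice

# -match ignores case, -cmatch does not.
'Aa' -match  $pattern
'Aa' -cmatch $pattern
```

The first and third expressions return True, the second and fourth return False.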

I didn't have to handle the dash character since it always appears only once (cmdlet names are always in the form verb-noun, no exceptions).

At this point of my quest, I had a 49-character solution:
(gcm -c c).name|Select-String -NotMatch '(.).*\1'
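To compare the inverted match with a lookahead-based one, here is a sketch on a handful of sample names. The negative-lookahead pattern is my own illustration of the alternative, not necessarily one I tried during the contest, and the variable names are made up for the example:

```powershell
# Sample names standing in for the real cmdlet list.
$names = 'Get-Command', 'Get-Job', 'Sort-Object', 'Wait-Job'

# Approach used in this post: match the repeats, then invert the result.
$inverted = $names | Select-String -NotMatch '(.).*\1'

# Alternative: a negative lookahead that rejects any repeat directly.
$lookahead = $names | Select-String '^(?!.*(.).*\1)'

# Both variables hold MatchInfo objects; Line is the original string.
$inverted.Line
$lookahead.Line
```

Both pipelines keep Get-Job and Wait-Job, but the lookahead pattern is several characters longer than the plain pattern, which matters when every character counts.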
Select-String was easily replaced by its shortest alias:
PS C:\> Get-Alias | ? {$_.Definition -match "Select-String"} | Sort {$_.Name.Length}

CommandType     Name   ModuleName
-----------     ----   ----------
Alias           sls -> Select-String

With the alias in place and the parameter names spelled out, the one-liner became:
(gcm -c c).name|sls -NotMatch -Pattern '(.).*\1'
I then focused on the -NotMatch switch. Since it is a switch, it can be moved anywhere and does not need to be followed by the pattern, since these are two different parameters:
(gcm -c c).name|sls -Pattern '(.).*\1' -NotMatch
I also changed it to -N, since Select-String has no other parameter starting with an 'N', and my code kept working:
(gcm -c c).name|sls -Pattern '(.).*\1' -N
Luckily, I was also able to remove the 'Pattern' parameter name since this is a positional parameter and its position is '0':
-Pattern

   Required?                    true
   Position?                    0
   Accept pipeline input?       false
   Parameter set name           (All)
   Aliases                      None
   Dynamic?                     false
(gcm -c c).name|sls '(.).*\1' -N
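The steps above (moving the switch, abbreviating it, and dropping the name of a positional parameter) can be reproduced on a hypothetical function with the same parameter shape. Test-Position is a name I invented for this sketch; it only echoes what was bound:

```powershell
# Hypothetical function: one positional string parameter and one switch.
function Test-Position {
    [CmdletBinding()]
    param(
        [Parameter(Mandatory, Position = 0)]
        [string]$Pattern,

        [switch]$NotMatch
    )
    # Echo back the bound values.
    "Pattern=$Pattern NotMatch=$NotMatch"
}

Test-Position -NotMatch -Pattern 'abc'   # switch first
Test-Position -Pattern 'abc' -NotMatch   # switch last
Test-Position -Pattern 'abc' -N          # switch abbreviated
Test-Position 'abc' -N                   # positional parameter name dropped
```

All four calls produce the same line of output, which is why each shortening step is safe.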
Now, there is another interesting feature in PowerShell, which is its ability to understand where a string starts and where it ends. PowerShell definitely knows that
(gcm -c c).name|sls '(.).*\1' -N
and
(gcm -c c).name|sls '(.).*\1'-N
are exactly the same because the regular expression is enclosed in single quotes, so whatever comes after the closing quote is another parameter. The additional space after the quote is not required, and since the shorter the better, I dropped it!

At this point I felt it was my lucky day and changed my solution to:
gcm -c c|sls '(.).*\1'-n
...and, believe it or not, it kept working! I was in a good mood!
Here are the returned cmdlets (output not reproduced here).
You ask what dark magic makes Select-String pick up the Name property directly? Well, I suppose it is because there are three fields returned by the default view of Get-Command: CommandType, Name and ModuleName.
PS C:\> gcm -c c |select -First 1
CommandType     Name            ModuleName
-----------     ----            ----------
Cmdlet          Add-BitsFile    BitsTransfer
Select-String takes its InputObject parameter ByValue:
-InputObject 

     Required?                    true
     Position?                    Named
     Accept pipeline input?       true (ByValue)
     Parameter set name           Object
     Aliases                      None
     Dynamic?                     false
If you remember, cmdlet parameters can accept pipeline input in one of two different ways: ByValue and ByPropertyName.
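As a reminder of how the two styles differ, here is a sketch with two hypothetical functions, one per binding style. Test-ByValue and Test-ByPropertyName are invented for this example and say nothing about how Select-String is implemented internally:

```powershell
# Binds the whole piped object to the parameter.
function Test-ByValue {
    param([Parameter(ValueFromPipeline)][string]$InputObject)
    process { "ByValue: $InputObject" }
}

# Binds the piped object's Name property to the parameter of the same name.
function Test-ByPropertyName {
    param([Parameter(ValueFromPipelineByPropertyName)][string]$Name)
    process { "ByPropertyName: $Name" }
}

'Wait-Job' | Test-ByValue
[pscustomobject]@{ Name = 'Wait-Job' } | Test-ByPropertyName
```

In the first call the string itself is bound; in the second, only the Name property of the piped object is.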

ByValue means that a parameter can accept piped objects that have the same .NET type as the parameter value, or objects that can be converted to that type. Since Select-String is looking for a string to bind to, the first of the three fields returned by Get-Command that is a string is Name:
CommandType Property System....CommandType {get;}
Name        Property string Name {get;}
ModuleName  Property string ModuleName {get;}
To confirm this, I passed my one-liner to Trace-Command and the result I got confirmed it:
PS C:\> Trace-Command -Name Parameter* -PSHost -Expression { gcm -c c wait-job|sls '(.).*\1'-n }
WriteLine   Argument count: 3
BIND NAMED cmd line args [Get-Command]
BIND arg [c] to parameter [CommandType]
COERCE arg to [System.Management.Automation.CommandTypes]
CONVERT arg type to param type using LanguagePrimitives.ConvertTo
CONVERT SUCCESSFUL using LanguagePrimitives.ConvertTo: [Cmdlet]
BIND arg [Cmdlet] to param [CommandType] SUCCESSFUL
BIND POSITIONAL cmd line args [Get-Command]
BIND arg [wait-job] to parameter [Name]
...
BIND arg [System.String[]] to param [Name] SUCCESSFUL
WriteLine   CurrentParameterSetName = AllCommandSet
WriteLine   CurrentParameterSetName = AllCommandSet
MANDATORY PARAMETER CHECK on cmdlet [Get-Command]
WriteLine   Argument count: 2
BIND NAMED cmd line args [Select-String]
BIND arg [True] to parameter [NotMatch]
COERCE arg to [System.Management.Automation.SwitchParameter]
Parameter and arg types the same, no coercion is needed.
WriteLine           isMandatory = False
BIND arg [True] to param [NotMatch] SUCCESSFUL
BIND POSITIONAL cmd line args [Select-String]
BIND arg [(.).*\1] to parameter [Pattern]
...
BIND arg [System.String[]] to param [Pattern] SUCCESSFUL
WriteLine   CurrentParameterSetName = File
WriteLine   CurrentParameterSetName = File
MANDATORY PARAMETER CHECK on cmdlet [Select-String]
CALLING BeginProcessing
CALLING BeginProcessing
CALLING EndProcessing
BIND PIPELINE object to parameters: [Select-String]
PIPELINE object TYPE = [System.Management.Automation.CmdletInfo]
RESTORING pipeline parameter's original values
Parameter [InputObject] PIPELINE INPUT ValueFromPipeline NO COERCION
WriteLine           Adding PipelineParameter name=InputObject; value=Wait-Job
BIND arg [Wait-Job] to parameter [InputObject]
WriteLine               isMandatory = True
BIND arg [Wait-Job] to param [InputObject] SUCCESSFUL
WriteLine           aParameterWasBound = True
WriteLine           CurrentParameterSetName = Object
WriteLine           aParameterWasBound = False
WriteLine           aParameterWasBound = False
WriteLine           aParameterWasBound = False
WriteLine           CurrentParameterSetName = Object
MANDATORY PARAMETER CHECK on cmdlet [Select-String]
CALLING EndProcessing
That's all for my solution. Feel free to leave a comment or share if you've enjoyed my explanation! Thanks to Mike for this fun puzzle and to all the other people who participated (especially Bartek, Lee, Shay and NoHandle) for making it an interesting contest.

3 comments:

  1. Nice step by step description, like it :) Just would use: Get-Alias -Definition instead of filtering with where-object.
    David

  2. The explanation why the connection of gcm and sls works seamlessly is IMHO bit off. The get-command cmdlet emits System.Management.Automation.CmdletInfo objects. These objects are implicitly converted to string so that the Select-string cmdlet can accept them on input. (gcm | select -First 1).toString(). The type internally defines how it is converted to string.

  3. What do you mean by 'implicitly'? as far as I can tell, there is no conversion, since cmdletinfo is already an array of strings passed over to sls. But you are the expert and I have no idea of what happens behind the scenes.

    Also I noticed that my oneliner does not work on Powershell V4, just on V3. Any idea? I checked sls and gcm manuals and found no changes...
