Monday, September 28, 2015

Powershell gymnastics - the modulo operator and beyond

I am starting today a new series of blog posts aimed at improving general knowledge of the scripting functionalities of Powershell. This will be somewhat like gymnastics, or crossfit if you like, aimed at making you more comfortable with the language and with computer programming in general.

I suppose that it's a funny idea to start the series with a post on one of the less known arithmetic operators: the modulo.

THE MODULO OPERATOR
 
Let's take a simple arithmetic problem: what's left over when you divide 13 by 3? The answer is easy to compute: divide 13 by 3 and take the remainder: 1. But how do you compute this in Powershell? The modulo operator, whose historic sign is '%', comes to the rescue. But, before we continue, let's make sure you remember what the terminology is. We are talking here of an Euclidean division, where 13 is the dividend, 3 is the divisor, 4 is the quotient and 1 is the remainder:
13 = 3 * 4 + 1
Now there is an easy way to get just the remainder, which is by using the modulo operator '%'. The syntax is:

dividend % divisor

So:
13 % 3
will give the expected result: 1

Although typically performed with the dividend and the divisor both being integers (we are talking of an Euclidean division of two integers), Powershell allows other types of numeric operands, which is a bit misleading on my opinion. Perl in this case is more strict, and limit itself to an Euclidean division, so, if it receives fractional arguments, Perl will only use the integer portion of the fraction. This means that the following two operations are equivalent in Perl:
 
#my $result1 = 13.3.5 % 3.3;
#my $result2 = 13 % 3;

In Powershell on the contrary, the arguments can be floating-point numbers, so it will accept 13.3 as dividend and 3.3 as divisor and return 0.1 as remainder:
PS C:\> 13.3%3.3
0,100000000000001
PS C:\> 13.3%3.3 | Get-Member

   TypeName: System.Double
If we want Powershell to go strict we have first to constrain the Double into an Integer:
PS C:\> [int]13.3
13
then make the modulo operation:
PS C:\> [int]13.3%[int]3.3
1
 
THE BANKER'S ROUNDING PROBLEM
 
The problem with forcing a double into an integer in modulo operations is that Powershell uses banker's rounding algorithm. This algorithm which means that if there are two nearest integers, like when the decimal part of a number is exactly 0.5, PowerShell rounds always to the nearest even:
PS C:\> [int]10.5
10
PS C:\> [int]11.5
12
PS C:\> [int]12.5
12
PS C:\> [int]13.5
14
See? The resulting integer is always even. This is how it is explained on MSDN:

Rounding to nearest, or banker's rounding
Midpoint values are rounded to the nearest even number. For example, both 3.75 and 3.85 round to 3.8, and both -3.75 and -3.85 round to -3.8. This form of rounding is represented by the MidpointRounding.ToEven enumeration member.
Rounding to nearest is the standard form of rounding used in financial and statistical operations. It conforms to IEEE Standard 754, section 4. When used in multiple rounding operations, it reduces the rounding error that is caused by consistently rounding midpoint values in a single direction. In some cases, this rounding error can be significant.

Though this is fine in most occasions but modulo operations cope badly with that and should be avoided.

When in a modulo operation there is no remainder, such is 5 % 1, the interpreter will return 0:
PS C:\> 5%1
0
 
MODULO OPERATION IN A BOOLEAN WORLD
 
With that in mind, it's useful to know here that you can put a modulo operation in an IF statement. In this case you do not need to check if the output -eq 0, because PowerShell evaluates 0 as boolean $False and 1 as boolean $True, and the code becomes shorter:
PS C:\> if(!(5%1)){"No remainder for this ops"}
No remainder for this ops
For sake of completeness, know that the modulo operation can be executed alongside an assignment to a variable. This is the syntax:

%=n Divide the value by n and assign the remainder

Here's an example:
$number = 13
$number %= 3
$number value will be set to 1, which is the remainder of 13 / 3.

THE DIVREM METHOD
 
Now that you have become familiar with the modulo operator, and to better appreciate it, let's see how you could get the same result going .NET and using the System.Math class. This class offers a DivRem method which beginners have a hard time understanding how it works. The reason is that is takes three parameters when the modulo operation logically needs just two: a dividend and a divisor.

If you use it with two parameters you'll get the following error:
PS C:\> [system.math]::DivRem(13,3)
Cannot find an overload for "DivRem" and the argument count: "2".
At line:1 char:1
+ [system.math]::DivRem(13,3)
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : NotSpecified: (:) [], MethodException
    + FullyQualifiedErrorId : MethodCountCouldNotFindBest
It looks like the third parameter is mandatory. Since we do not know what to pass to DivRem, let's try to give him a fake var:
PS C:\> [system.math]::DivRem(13,3,[ref]$fakevar)
[ref] cannot be applied to a variable that does not exist.
At line:1 char:1
+ [system.math]::DivRem(13,3,[ref]$fakevar)
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : InvalidOperation: (fakevar:VariablePath) [], RuntimeException
    + FullyQualifiedErrorId : NonExistingVariableReference
Finally a clear error message! What's happening here is that Powershell expect the third parameter be a declared variable of type System.Management.Automation.PSReference. Let's give him one:
PS C:\> $Remainder = 0

PS C:\> [system.math]::DivRem(13,3,[ref]$Remainder)
4
As you see, DivRem outputs the quotient (which is the result of the division) straight away. If we want to know the remainder, let's query the $Remainder variable:
PS C:\> $Remainder
1
Cool, it works! And differently from '%', the DivRem method will perform the banker's rounding out of the box:
PS C:\> [system.math]::DivRem(13.3,3.3,[ref]$Remainder2)
4
PS C:\> $Remainder2
1
That's because the Math.DivRem method takes Int32 (the default) or Int64 parameters, not floating-point numbers.

SHIFTING BITS
 
It looks like now you are proficient with modulo operations on this side of the river. The time has come to move on the other side into the binary world to see what else our computers have to offer for modulo operations. You probably already know that you can perform bitwise arithmetic in Powershell.


But, before we start looking at how modulo operation can be performed with bitwise operators, let's gain a better knowledge studying two interesting arithmetic operators which have been added to Powershell 3.0:
  • shr
  • shl
This section will force you to familiarize with binary operations and will make it easier to understand bitwise operations we will perform later.

What the -shr (shift-right) and -shl (shift-left) operators do is to shift bits inside processor registers, and they can also be used as a quick way to divide or multiple the number.

First some binary. We, humans, usually count with base 10, that is we have 10 digits to make numbers with, so we go in order 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, then we start adding them together to go beyond 10 numbers: 10, 11, 12, 13, 14, etc.

In base 2 though, there are only 2 digits to use: 0 and 1. So the same numbers go like this: 1, 10, 11, 100, 101, 110, 111, 1000, 1001, 1010, etc.



The conversion can be easily performed in Powershell using [Convert]:
PS C:\> 0..10 | % {[Convert]::ToString($_,2)}
0
1
10
11
100
101
110
111
1000
1001
1010

Let's add some padding zeroes to make 1 byte long integers like if we would like to ideally fill a processor register (even do processor hardware is of no importance at all here because Powershell lies on a hardware abstraction layer and knows nothing of the underlying architecture, differently from C and C++ which are intended to be highly efficient languages):
PS C:\> 0..10 | % {[Convert]::ToString($_,2).PadLeft(8,'0')}
00000000
00000001
00000010
00000011
00000100
00000101
00000110
00000111
00001000
00001001
00001010

Or if we want Int32 (4 bytes, or, 1 word):
S C:\> 0..10 | % {[Convert]::ToString($_,2).PadLeft(32,'0')}
0000000000000000000000000000000
0000000000000000000000000000001
0000000000000000000000000000010
0000000000000000000000000000011
0000000000000000000000000000100
0000000000000000000000000000101
0000000000000000000000000000110
0000000000000000000000000000111
0000000000000000000000000001000
0000000000000000000000000001001
0000000000000000000000000001010

Be aware that these are not real binary numbers, but just textual representation of binary numbers.

If we use the -shl operator, we can transform binary 2 to binary 4, then to binary 8:

Let's start from decimal 2:
PS C:\> [Convert]::ToString(2,2).PadLeft(8,'0') # decimal 2
00000010

and proceed shifting bits leftward:
PS C:\> [Convert]::ToString(2 -shl 1,2).PadLeft(8,'0') # decimal 4
00000100
PS C:\> [Convert]::ToString(2 -shl 2,2).PadLeft(8,'0') # decimal 8
00001000
PS C:\> [Convert]::ToString(2 -shl 3,2).PadLeft(8,'0') # decimal 16
00010000
PS C:\> [Convert]::ToString(2 -shl 4,2).PadLeft(8,'0') # decimal 32
00100000
PS C:\> [Convert]::ToString(2 -shl 5,2).PadLeft(8,'0') # decimal 64
01000000
PS C:\> [Convert]::ToString(2 -shl 6,2).PadLeft(8,'0') # decimal 128
10000000

In a few words, what's happening here is that -shl is shifting all the bits in the provided value 1 spot the left, and chop of the extra zero. It turns out that this operation corresponds to raising 2 to the 2nd power in a loop:
PS C:\> 0..6 | % {[Convert]::ToString([math]::Pow(2,$_),2)}
1
10
100
1000
10000
100000
1000000

When we add padding we get:
PS C:\> 0..7 | % {[Convert]::ToString([math]::Pow(2,$_),2).PadLeft(8,'0')}
00000001
00000010
00000100
00001000
00010000
00100000
01000000
10000000

which is the equivalent of:
PS C:\> 0..7 | % {[Convert]::ToString(1 -shl $_,2).PadLeft(8,'0')}
00000001
00000010
00000100
00001000
00010000
00100000
01000000
10000000

If you're not lost, at this point you should start feeling well stretched.
 
MODULO IN A BINARY WORLD
 
Now that you are familiar with all the binary representations of powers of 2, you are ready to know that there are other ways of performing modulo operations. Many experienced programmers know well that if they use a bitwise shift for multiplication or division (which is considered one of the slowest arithmetic operations) when they’re multiplying or dividing by a power of two, they will get faster calculations. One of these performance tricks works for modulo too.

The not-so-secret formula is:

x modulo y = (x & (y − 1))

which translates to the following Powershell oneliner:
$number -band ([Math]::Pow(2,$number) - 1)

where:
  • -band is the operator that does a binary-and of the bits in the values on the left and on the right side
  • [Math]::Pow is the Powershell syntax for ^
Let's try to make an example:
PS C:\> 54 % 32
22

The bit pattern of 54 is:
PS C:\> [Convert]::ToString(54,2).PadLeft(8,'0')
00110110

In the example 32 is a power of two: 2^5

Since numbers which are power of two have only one significant bit set to 1 followed by less significant bits set to 0, its binary representation is:
PS C:\> [Convert]::ToString(32,2).PadLeft(8,'0')
00100000

Now if the divisor is a power of two (as 32 is) and we decrement it by one (from 32 to 31 in decimal), this sets all the less significant bits to 1 and the rest to 0. This yields therefore to the following binary representation:
PS C:\> [Convert]::ToString(32-1,2).PadLeft(8,'0')
00011111

If we perform a bitwise-and between 54 and 31 we get:

00110110 (54)
00011111 (31)
---------------
00010110 (22)

which is the equivalent of the following Powershell code:
PS C:\> 54 -band 31
22

Once we do the bitwise-and, the most significant bit from the original value is lost, and this leaves us with the remainder of the original value divided by the divisor.

In a few words, when you perform a modulo of a number against its power of 2, you just take its lower order bits. The number of bits at the end is determined by which power of two you're using.

Simple, isn't it?

Now let's move to see some practical uses of the modulo operations.

EVEN OR ODD
 
Modulo is useful when trying to determine whether a number is even or odd: if a number can be evenly divided by 2 it’s even; if a number can’t be evenly divided by 2 then it’s odd. This translates into Powershell like this:
$number = 101
if($number % 2) 
 {"Odd"} 
  else
 {"Even"}

In the previous example, if the number in the $number variable is even, the operation will return 0, so False, and it will display "Even". If the number is odd, it will return 1, so True, and show "Odd" on the screen.

Of course in Powershell multiple if around modulo operations can be concatenated, for instance in the following example I'll find all the numbers between 1 and 100 that are multiples of 4 and 7:
1..100|?{!($_%4)-and!($_%7)}

28
56
84

As you can see I used the range operator (..) to go through all the numbers between 1 and 100, which I passed down to Where-Object with two conditions. Nice. And the interpreter does a good job of understanding my input even do there are no empty spaces in the line of code. Cool and very handful for code golfing competitions.

THE LEAP YEAR ALGORITHM
 
Modulo can also be used to determine whether a year is a leap year or not (a leap year is a year containing one additional day so to keep our calendar year synchronized with the astronomical year). There of course is a .NET method to get this piece of information, using the IsLeapYear method, as in the following example:
$year = 1900
[DateTime]::IsLeapYear($Year)

But this can be gloriously achieved also with the following Powershell code:
PS C:\> $year = 2012
PS C:\> if(!($year%4)){"$year is a leap year"}
2012 is a leap year

To be precise, our Gregorian calendar removes three leap days every 400 years, so our algorithm has to evolve toward this:
$year = 2012
if((!($year % 4) -and ($year % 100)) -or !($year % 400))
    {
    "$year is a leap year"
    }
else
    {
    "$year is not a leap year"
    }

Note that the logical and operator has higher precedence than logical or and will be evaluated first.
 
OVER-OPTIMIZING THE LEAP YEAR ALGORITHM
 
This is classical example showing the modulo operation at its full power. Notwithstanding that, it turns out that the same result can be achieved using a bitwise-and operation, like those we saw above. The circle is almost full, I guess. We are now proceeding against some optimizations aimed at showing the full power of Powershell.

Let's start optimizing this leap year algorithm.

We know already that 4 being a power of 2, the first test ($year % 4) can be converted to:
($year) -band 3

We can also replace ($year % 100) with:
($year % 25)

This is valid because 100 results from 2 x 2 x 5 x 5. Since ($year % 4) has already checked for factors of 4, we can eliminate that factor from 100, leaving 25.

Concerning the third test ($year % 400), this can be replaced by a bitwise-and operation against the number 15. In fact, like we just saw for the second test, 400 results from 2 x 2 x 2 x 2 x 5 x 5. The factor of 25 can be taken out because we already checked it in the second test. This leaves us with 2 x 2 x 2 x 2, which gives 16. Being 16 a power of 2, we can replace ($year % 16) with:
($year -band 15)

The overly optimized final algorithm is here for you to see:
$year = 2012
if((!($year -band 3) -and ($year % 25)) -or !($year -band 15))
    {
    "$year is a leap year"
    }
else
    {
    "$year is not a leap year"
    }

Funny how a good understanding of binary operations and a good grasp on bitwise operators can make your code evolve toward something more performant.

CONVERTING SECONDS
 
A last example to make sure you have grasped the use of modular arithmetics, is to implement a Powershell script that converts an elapsed time in seconds to hours, minutes, and seconds:
$sec = 1500
$hrs = [math]::Truncate($sec / 3600)
$min = [math]::Truncate(($sec / 60) % 60)
$sec = $sec % 60
"{0}:{1}:{2}" -f $hrs,$min,$sec
CONCLUSION
 
The general idea is that modular arithmetics (and the modulo operation) are particularly useful when numbers wrap around when a certain value is reached (like in a clock).

I hope you enjoyed this first post on Powershell gymnastics. More to come soon.

Do not hesitate to post your examples of operations with modulo and with shifting bits, or to share with your friends if you liked this content.

UPDATE: The second Powershell workout is online.

2 comments:

  1. Great rundown of the operator, thanks.

    I think, in the final example, you need to truncate instead of forcing to [int], which rounds. 14746 give wrong minutes, 14246 gives wrong hours.

    $sec = 14746
    $hrs = [math]::Truncate($sec / 3600)
    $min = [math]::Truncate(($sec / 60) % 60)
    $sec = $sec % 60
    "{0}:{1}:{2}" -f $hrs,$min,$sec

    ReplyDelete
    Replies
    1. Thanks Michael, I was a bit quick on this. I have updated the blog post and replaced casting to an integer with the Truncate method.
      Nice spot!
      Carlo

      Delete

Related Posts Plugin for WordPress, Blogger...