Monday, January 7, 2019

Artificial Neural Networks in PowerShell - part 1

Neural Networks had their beginnings in 1943 when Warren McCulloch, a neurophysiologist, and a young mathematician, Walter Pitts, wrote a paper on how human neurons might work.
They focused on understanding how our brain can produce highly complex patterns by using many basic cells that are bound together. These basic brain cells are known as neurons, and McCulloch and Pitts imagined a highly simplified model of a neuron in their paper. The McCulloch and Pitts model of a neuron, also known as MCP neuron for short, has made an important contribution to the development of Artificial Neural Networks (ANN), which model key features of biological brain neurons.
A few years later, in 1958, Frank Rosenblatt, a psychologist, proposed the Perceptron. The Perceptron is an algorithm for supervised learning of binary classifiers. In other words, it is a function that, given a vector of inputs, is able to assign them to one of two classes.
Today I am starting a series of posts that will explain how to write a Perceptron in PowerShell, and that is meant to show how a single node of a neural network works and how it can be trained following the model defined by Rosenblatt.

But before we proceed, we need to have a look at how a Perceptron works. And before we do so, let's see how we can mathematically emulate a neuron, whose functioning can be represented with the following diagram:

As you can see here, and to simplify things, a neuron takes a vector of inputs, $m1, $m2, $m..., and produces an output by multiplying each input by a different weight; the weights are real numbers expressing the importance of each input to the output.
Summing those weighted inputs allows us to decide whether the neuron fires or not.
In PowerShell the first thing to do is to define an Invoke-Neuron function which does exactly that, multiplying each input by a pseudo-random weight (notice the SetSeed parameter used when generating the first random weight):
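A minimal sketch of such a function could look like the following (the parameter names and the weight initialization follow the description above; the activation step comes later):

```powershell
# Pseudo-random weights between 0.01 and 0.99;
# -SetSeed makes the first draw reproducible
$w1 = Get-Random -SetSeed 1 -Minimum 0.01 -Maximum 0.99
$w2 = Get-Random -Minimum 0.01 -Maximum 0.99

function Invoke-Neuron ($m1, $m2, $w1, $w2) {
    # Weighted sum of the two inputs
    $m1 * $w1 + $m2 * $w2
}

Invoke-Neuron 4 7 $w1 $w2
```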

The values of the weight parameters are initialized randomly (this stops them all converging to a single value). We will see later that when training starts and the Perceptron is presented with a dataset, those weights are adjusted towards values that give correct output.
As you can see, in this function we have to call some kind of activation function, meaning a function that decides whether the neuron fires (when the result approaches or equals 1) or not (when the output approaches or equals 0).
Different activation functions exist. McCulloch and Pitts adopted the Heaviside step function, which produces a binary output: the function returns 1 (or true) when the input passes a threshold, and 0 (or false) when it does not. In the context of pattern classification, such an algorithm can be used to determine whether a sample belongs to one class or the other, hence the name binary classifier.
This Heaviside step function can be represented as H(t) = 0 for t < 0 and H(t) = 1 for t >= 0,

and translates to PowerShell as a simple Switch-based function:
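A sketch of that function might be the following (the threshold is assumed here to be 0, the standard form of the Heaviside step; a different cut-off could equally be used):

```powershell
function Heaviside ($t) {
    # Binary step: fire (1) when the weighted sum is non-negative
    switch ($t) {
        { $_ -ge 0 } { 1 }
        Default      { 0 }
    }
}
```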

The alternative is to use the non-linear Sigmoid function σ(t) = 1 / (1 + e^(-t))

which works perfectly to squash the output to a number between 0 and 1, whatever the input:
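Translated to PowerShell, the Sigmoid can be written as:

```powershell
function Sigmoid ($t) {
    # 1 / (1 + e^-t): maps any real number into the (0, 1) interval
    1 / (1 + [math]::Exp(-$t))
}
```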

In this function [math]::Exp returns the number e raised to the power –t, where e is the unique number whose natural logarithm is equal to one and whose value is approximately equal to 2.71828.
Let’s focus again on the Invoke-Neuron function and add another parameter which takes the name of the activation function to use:

The produced output for the same input dataset changes from a binary value to a value between 0 and 1.

You have now probably noticed that a $bias variable is added to the computation. Its role is similar to that of the constant b in a linear function of the form y = mx + b, which graphs as a straight line. The bias allows us to slide this line up and down to fit the prediction to the data better. Without it the line always passes through the axis origin (0, 0) and you may get a poorer fit.
The $bias is one of the learnable parameters of this model, the others of course being the weights, as we will see when we develop a Perceptron in the next blog post.
Here’s our final code:
$w1 = Get-Random -SetSeed 1 -Minimum 0.01 -Maximum 0.99
$w2 = Get-Random -Minimum 0.01 -Maximum 0.99
$bias = Get-Random -Minimum 0.01 -Maximum 0.99

function Heaviside ($t) {
    # Step activation: fire (1) when the input is non-negative
    switch ($t) {
        { $_ -ge 0 } { 1 }
        Default      { 0 }
    }
}

function Sigmoid ($t) {
    # Squash any input into the (0, 1) interval
    1 / (1 + [math]::Exp(-$t))
}

function Invoke-Neuron ($m1, $m2, $w1, $w2, $bias, $activation) {
    # Weighted sum of the inputs plus the bias
    $t = $m1 * $w1 + $m2 * $w2 + $bias
    switch ($activation) {
        heaviside { Heaviside $t }
        sigmoid   { Sigmoid $t }
        Default   { Sigmoid $t }
    }
}

Invoke-Neuron 4 7 $w1 $w2 $bias 'Heaviside'
Invoke-Neuron 4 7 $w1 $w2 $bias 'Sigmoid'
This piece of code acts like a human neuron, with dendrites (the inputs $m1 and $m2), synapses (the weights $w1 and $w2) and a single output, similar to the signal a neuron sends down its axon.
Simple, isn’t it?
In the next post we will see how we can code a Perceptron in PowerShell, and we’ll try to understand how it determines the best values to give to the weights and the bias.
Stay tuned.