Sunday 30 April 2023

Neural Network from Scratch Using C (Part-2)

 

Implementing a Perceptron

In this part we will implement a single perceptron and run some experiments.

The first part can be found here.

A single Perceptron

A single perceptron (neuron) can do a lot of jobs. Yes, like a neuron in our nervous system, it can do a lot of things. As one grows from infant to toddler, the neurons in the brain get trained to coordinate the limbs, coordinate activities, and so on.
Similarly, a perceptron can be trained to classify data according to its training data. However, a single perceptron can only classify linearly separable data.

What is linearly separable data?
Data is linearly separable if its plot can be split by a single straight line; that is, depending on the separation criterion, there exists a straight line dividing the data points into two regions. For example, the points of the OR truth table can be split this way, while those of XOR cannot.

Now let's see a small code implementing it step by step. NOTE: We need a matrix manipulation library for all our neural network adventures. For this purpose my tmatlib on GitHub is quite suitable, though one should not expect it to perform like BLAS. In my code I used a similar version known as smatlib. As it is continuously growing, I have not published it on GitHub, but rest assured the functionality and performance of both are the same.

// structure of our perceptron.

typedef struct perceptron
{
	int n_inp;   // number of inputs
	matrix W;    // weight matrix
	double lr;   // learning rate
	double bias; // bias
}perceptron;

A straight line on a two-dimensional plane can be expressed as y = mx + c. The bias parameter in our perceptron plays the role of c. Our training data consists of x (the input) and y (the expected, or target, output); the perceptron learns the value of m. The learning rate (lr) determines how fast or slow the perceptron learns. If the learning rate is too low, training may get trapped in a local minimum; if it is too high, the weights may oscillate and never reach a solution. The bottom line is that one has to experiment with the learning rate to achieve a good result. The weight matrix is initialized with random values and is gradually adjusted/updated during training to achieve a better result.
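To make the correspondence concrete: with two inputs, the decision boundary is the set of points where the weighted sum plus bias equals zero, which (assuming w2 is non-zero) rearranges into the familiar line form \begin{equation} w_1x_1+w_2x_2+b=0 \;\Rightarrow\; x_2=-\frac{w_1}{w_2}x_1-\frac{b}{w_2} \end{equation} so the weights play the role of the slope m and the bias shifts the intercept c.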

I have uploaded all the code at Simple-NeuralNetwork.

In the repo there are four files: perceptron.h, linearsep.c, and_gate.c, and or_gate.c. We will discuss bits from those files and see how they work.

In the perceptron.h file there are functions to create, manipulate, train, display, and delete the perceptron. The general sequence (sketched in code right after this list) is

  • create a perceptron
  • prepare data set
  • train
  • test with predict function
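
A minimal sketch of that sequence in C, using the OR truth table as the data set. The create_perceptron and delete_perceptron names here are assumptions on my part, not necessarily the exact ones in perceptron.h; train and predict are the functions discussed below.

#include <stdio.h>
#include "perceptron.h"

int main(void)
{
	// create a perceptron with 2 inputs and a learning rate of 0.1
	// (create_perceptron/delete_perceptron names are assumed)
	perceptron p = create_perceptron(2, 0.1);

	// prepare data set: the OR truth table
	double inputs[4][2] = { {0,0}, {0,1}, {1,0}, {1,1} };
	double targets[4]   = {  0,    1,    1,    1 };

	// train: many passes over the four patterns
	for (int epoch = 0; epoch < 10000; epoch++)
		for (int i = 0; i < 4; i++)
			train(&p, inputs[i], targets[i]);

	// test with predict function
	double sample[2] = { 1, 0 };
	printf("inputs 1, 0 predicted = %f\n", predict(p, sample));

	delete_perceptron(&p);
	return 0;
}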

This is simple, isn't it? Now we will look into two functions, namely predict and train. These functions are the core of this perceptron.

predict:

double predict(perceptron p, double inp[])
{
	double res = 0.0;

	// weighted sum: multiply each input by its weight and accumulate
	for(int r = 0; r < p.n_inp; r++)
	{
		res += inp[r]*get_cell(p.W,0,r);
	}
	res += p.bias;             // add the bias term
	double ret = sigmoid(res); // activate: squash into (0, 1)
	return(ret);
}

This function multiplies each input by its corresponding weight and adds them up, then adds the bias. Next, the sigmoid function is applied to activate the output; the sigmoid's output ranges from 0 to 1. Please see the first part here, where the summing process and the sigmoid function are defined.
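For reference, a minimal implementation of the sigmoid matching the formula from the first part (the version used alongside perceptron.h may differ in details):

#include <math.h>

// sigma(z) = 1 / (1 + e^(-z)); maps any real z into (0, 1)
double sigmoid(double z)
{
	return 1.0 / (1.0 + exp(-z));
}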

train:

void train(perceptron *p, double *data, double desired)
{
	double guess = predict(*p, data);

	// error = target output - actual output
	double err = desired - guess;
	for(int n = 0; n < p->n_inp; n++)
	{
		// delta weight = learning rate * error * input
		double cur_w = get_cell(p->W,0,n);
		set_cell(&p->W,0,n,cur_w + p->lr*err*data[n]);
	}
	// delta bias = learning rate * error
	p->bias += p->lr * err;
}
 

In this function, the perceptron is trained by adjusting the weights and bias. The steps are as follows.

  • calculate the output
  • find the error by subtracting the output from the target
  • find the delta weights by multiplying the error, the data (input), and the learning rate
  • find the delta bias by multiplying the error and the learning rate
  • update the weights
  • update the bias

In step 3 we calculate the delta weights. This is the most important step in the whole code. How and why the delta is the product of the error and the input is a topic in itself. As a hint: it is a gradient, so the derivative of the error term with respect to the inputs and weights has to be considered.
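To sketch the idea: measure the error of one sample with the squared loss \begin{equation} E=\frac{1}{2}(d-y)^2 \end{equation} where d is the desired output and y = sigmoid(z), with z being the weighted sum plus bias. The chain rule then gives \begin{equation} \frac{\partial E}{\partial w_n}=-(d-y)\,\sigma'(z)\,x_n \end{equation} so stepping the weights against the gradient yields an update proportional to the error times the input. Our train function uses the simplified delta rule delta_w = lr * err * data[n], dropping the sigma'(z) factor; as the outputs below show, this works fine for our linearly separable cases.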

Testing

To try out this implementation, three demo files are provided, all adhering to the general sequence. The first one is linear_sep.c. In this demo program, random data sets are created and the perceptron is trained; it is then checked against a known data set. The output on my system is as below.


************Perceptron:**************

number of inputs = 2
Weights:
[ 1.00000000 ]
[ 1.00000000 ]

Bias = 1.000000
*************************************
************Perceptron:**************

number of inputs = 2
Weights:
[ 61.70302487 ]
[ -31.13024636 ]

Bias = 0.622900
*************************************

is (2.000000 < 2 x 20.000000 + 1) predicted = 1.000000
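The check printed in the last line suggests the data was labeled by whether a point lies below the line y = 2x + 1. A hypothetical generator for such data could look like this (the value ranges and input ordering are my guesses, not necessarily what linear_sep.c does):

#include <stdlib.h>

// hypothetical: label a random point 1 when y < 2x + 1
void make_sample(double inp[2], double *target)
{
	inp[0] = (double)(rand() % 100); // x, assumed range
	inp[1] = (double)(rand() % 100); // y, assumed range
	*target = (inp[1] < 2.0*inp[0] + 1.0) ? 1.0 : 0.0;
}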

Now for the other two demo codes. Can this artificial neuron reliably mimic a digital logic gate? Well, that was the motive when the artificial neuron was first proposed. It turns out that the AND and OR gates can be simulated, but not the XOR gate. Why? From the diagrams below it is obvious that the OR and AND functions are linearly separable, but the XOR function is not.
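In fact, a short argument shows why no single line works for XOR. Suppose a single perceptron computed XOR by thresholding its weighted sum at 0. The four rows of the truth table would require \begin{equation} b \le 0, \qquad w_2+b > 0, \qquad w_1+b > 0, \qquad w_1+w_2+b \le 0 \end{equation} Adding the two middle inequalities gives w1 + w2 + 2b > 0, and since b <= 0 this forces w1 + w2 + b > 0, contradicting the last inequality.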


// OR gate implementation.
// x1 x2 y
//  0  0 0
//  0  1 1
//  1  0 1
//  1  1 1
//
//  0,1-------1,1
//   |          |
//  \|          |
//   |\         |
//   | \        |
//  0,0-\-----1,0
//       \-separating line

// AND gate implementation.
// x1 x2 y
//  0  0 0
//  0  1 0
//  1  0 0
//  1  1 1
//
//  0,1---\----1,1
//   |      \    |
//   |        \  |
//   |          \|
//   |           |\-separating line
//  0,0-------1,0

// XOR function
// x1  x2  y
//  0   0  0
//  0   1  1
//  1   0  1
//  1   1  0
//        /----------|----- two lines needed to separate
//  0,1--/-----1,1   |
//   |  /       |    |
//   | /        |    |
//   |/         |/---|
//   /          /
//  /|         /|
//  0,0-------/-1,0

Output of OR gate 

************Perceptron:**************

number of inputs = 2
Weights:
[ 0.00000000 ]
[ 0.00000000 ]

Bias = 1.000000
*************************************
************Perceptron:**************

number of inputs = 2
Weights:
[ 11.48282021 ]
[ 11.48242093 ]

Bias = -5.281243
*************************************
inputs 0, 0 predicted = 0.005060
inputs 1, 0 predicted = 0.997978
inputs 0, 1 predicted = 0.997977
inputs 1, 1 predicted = 1.000000

 

Output of AND gate

************Perceptron:**************

number of inputs = 2
Weights:
[ 0.00000000 ]
[ 0.00000000 ]

Bias = 1.000000
*************************************
************Perceptron:**************

number of inputs = 2
Weights:
[ 10.22930534 ]
[ 10.22993658 ]

Bias = -15.512550
*************************************
inputs 0, 0 predicted = 0.000000
inputs 1, 0 predicted = 0.005050
inputs 0, 1 predicted = 0.005053
inputs 1, 1 predicted = 0.992943

In the above results, if we consider 0.99 as 1 and anything less than 0.005 as 0, then our results match the truth table.
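If a hard 0/1 answer is wanted, a small helper (hypothetical, not part of perceptron.h) could threshold the sigmoid output at 0.5:

// hypothetical helper: collapse the sigmoid output to a hard 0 or 1
int classify(perceptron p, double inp[])
{
	return predict(p, inp) >= 0.5 ? 1 : 0;
}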

As we saw above, XOR is not linearly separable, so we cannot simulate it with a single perceptron. We need more than one layer to simulate it; we will do that in a future post.

Till then happy coding.

Saturday 29 April 2023

Neural Network from Scratch using C

Hello There, 

It has been around a year since we last talked, and several things have changed. I recently joined as a PhD scholar 😃 and decided to soil my hands in AI/ML. During my M.Tech days (2011-2013) we were taught soft computing, though I did my thesis on Network On Chip (NOC), which is about putting different IPs onto a silicon substrate. I devised a Genetic Algorithm based mapping as well as a deterministic, heuristic-based mapping using graphs; both codes are available on my GitHub page. But I never paid much attention to ML until 2016, when I was developing a traffic detection application for a client, though it never went beyond the PoC stage.

Now I have decided to put all my knowledge towards earning a degree (yes, a doctorate). To refresh my understanding from my master's degree days, I decided to go step by step, implementing the building blocks of this topic.

We should start by building a single neuron first. Artificial neurons, the counterparts of natural neurons, share similar characteristics.

Perceptrons - the most basic form of a neural network · Applied Go 

Dendrites are the inputs and the axon is the output. But there is one more thing we must add here: activation. It is the strength we provide as the output.

In the case of perceptrons, inputs are applied through weights; that is, different input channels have different weights. The perceptron sums the weighted inputs and applies some sort of activation before outputting the result.

So mathematically we can write

output = activation(sum(inputs x weights))

or, in array form,

output = activation(sum(input array * weights array))

So we need two one-dimensional matrices, one for the inputs and one for the weights. Upon the result of their element-wise multiplication and summation we apply the activation function.

Let's assume we have 3 inputs i1, i2, i3 with weights w1, w2, and w3. The output will now be \begin{equation} z=\sum_{m=1}^{n} i_mw_m \end{equation} (here n = 3); next, the activation of z gives the output.

There are several activation functions used to activate the output of a perceptron. The most used are the binary step, ReLU (max(0, x)), and the sigmoid (with output between 0 and 1).

We will use sigmoid for our application. \begin{equation}\sigma(z)=\frac{1}{1+e^{-z}}\end{equation}
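Putting the two formulas together, here is a minimal sketch in C of a three-input neuron (illustrative only; the real implementation comes in part 2):

#include <math.h>

// z = i1*w1 + i2*w2 + i3*w3, then sigma(z) = 1 / (1 + e^(-z))
double neuron(const double i[3], const double w[3])
{
	double z = 0.0;
	for (int m = 0; m < 3; m++)
		z += i[m] * w[m];
	return 1.0 / (1.0 + exp(-z));
}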

Next we will see a simple implementation of this perceptron.


Bye Bye for now.
