Last week I wrote a post about Machine learning using .Net Framework. Today I would like to write a simple example to create a simple linear regression using F#.

Let us consider the case scenario at Diamondback, a bicycle manufacturing company based in the USA.

There are 3 applications which run on the top of a single SQL Server instance which are: A customer ordering website with a section for direct customer sales and another section for resellers to buy in bulk, A desktop inventory control management application and A reporting solution using Power BI as a front end.

The boss of the company noticed through some basic charting in the Power BI that many resellers are dropping their order depending on the average customer reviews of the product.

After researching some different machine learning techniques, we can settle on using a simple regression to maximize the sales.

First, let’s see what is a Regression?

Regressions attempt to predict one number given a set of different numbers. Linear regressions are written using a formula like this:

y = x0 + x1 + x2 + E

Here, y is the number you want to predict and x0, x1, and x2 are some numbers that might affect y.

Now consider Diamondback problem, y is the number of bikes a retail store will order in a month, x0 is the month of the year, x1 is the order from the previous three months, and x2 is the number of other bikes that are being ordered by their immediate competitors. E is all of the things that our formula cannot account for that still affects the bike sales—like an individual store losing a key sales person.

Our goal, then, is to find as few x parameters as possible that have the greatest impact on y and then make a reasonable attempt to have our website reflect both the predicted value and to influence the users for our benefit.

Now our next step is to build and implement the model. In machine learning, it is not enough to simply implement a model and get some answers.

We also have to be able to speak to know how good our answer really is.So we need to have two vital pieces of information:

  • A model to predict
  • Some way to help us evaluate how good the model is at predicting.

The sum of squares error, often called SSE, is a common way to evaluate a simple linear regression.

The model is created based on all of the y data points and then checked how close that model fits each data point. These differences are squared, then the squared values are summed. The goal is to get the sums of squares as low as possible. Once we have the SSE, we can change our model to try to get the sum of squares lower.

A slightly better variation of the SSE is the Mean Square Error (MSE). The MSE is the SSE divided by the number of observations of the regression. Like the MSE, the number itself is not particularly useful. However, if we take the square root of the MSE, often called the Root Of Mean Square Error, or RMSE, the result is an error measure in the same units as our original numbers. we can use the RMSE when evaluating the predictive capabilities of a simple linear model.

We also have a second measure, called the r2, that calculates how much correlation our model has. The r2 takes the r (in this case, Pearson’s correlation) and squares it. The r2 is always between zero and one, with zero meaning that there is no correlation between x and y and one meaning that the regression line perfectly fits the data. In practical terms, we want a low as possible RMSE with a high as possible r2.

We are going to use Accord.NET which goes better than Math.NET Fit() as it returns the sum of the squared errors and the coefficient of determination (called r squared).

Now Let’s see coding part of this example.  First, we need to create the F# library project as shown on the screen.

Regression1

Open NuGet Package Manager and install Accord.NET to the F# project:

PM> install-package Accord

PM> install-package Accord.Statistics

Add a new script to your project and call it LinearReg.fsx. Then, add the following references and open the script file

Regression2

Next, add a record type, a list of that record type, a connection string, and a query:

Regression3

Next, add some ADO.Net code to extract the data from the database and put it into the list:

Regression4

With the data coming down, let’s see if our models reflect what the boss noticed in the Power BI charting:

Regression5

You will see the following:

Regression6

We now see a 0.29 r2 and a 451 RMSE, which shows a weak relationship between customer reviews and order quantity and that there is a 450 order margin of error.

We now have a model and we can predict some sales. In the next blog, I will put this model in the actual project.

Please feel free to comment your opinion about this article or whatever you feel like telling me. Also if you like this article, don’t forget to share this article with your friends. Thanks!

Happy Coding !!

 

One thought on “Implementation of Linear Regression using F#.Net

Leave a comment

This site uses Akismet to reduce spam. Learn how your comment data is processed.