Today I learned about the cost function for linear regression problems, which is also known as the mean squared error (MSE).

I just started the Machine Learning course on Coursera by Andrew Ng to get a better understanding of how Machine Learning works and to extend my knowledge in the field.

To recap: Linear regression is a way to model a relationship between *X* and *y*.
There is also multivariate linear regression, where multiple *X*s are used to predict a *y*. Linear regression with one variable can be described as follows:

$$h_\theta(x) = \theta_0 + \theta_1 x$$

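To measure the accuracy of the hypothesis above, we can use a cost function, which averages the squared differences between the results of the hypothesis and the actual values over all training examples:

$$J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left(h_\theta(x_i) - y_i\right)^2$$

Here $h_\theta(x_i)$ is the prediction of the hypothesis for the *i*-th training example, $y_i$ is the actual value, and *m* is the number of training examples.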

In other words: it is the mean of the squared differences between the predicted values of the hypothesis and the actual values (the course additionally halves it, which does not change where the minimum lies). That is why the cost function is also called the mean squared error (MSE).

For the hypothesis we have to choose the parameters $\theta_0$ and $\theta_1$. Then we can check the hypothesis with the cost function above. The result of the function is always non-negative, and values closer to zero are better because they mean the hypothesis fits the training data more closely. The goal is to minimize the cost $J(\theta_0, \theta_1)$ to build an optimal hypothesis.

An example to make things clear: there is an existing training set with the values *(1/1)*, *(2/3)* and *(3/5)*. We choose -1 for $\theta_0$ and 2 for $\theta_1$.
Maybe you can already picture the graph of these points and the linear function, in which case you know that these values for $\theta_0$ and $\theta_1$ are pretty accurate.
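
Plugging the three training examples into the cost function confirms this. As a worked check, assuming the course's convention $J(\theta_0, \theta_1) = \frac{1}{2m}\sum_{i=1}^{m} (h_\theta(x_i) - y_i)^2$ with $h_\theta(x) = \theta_0 + \theta_1 x$, the predictions for $x = 1, 2, 3$ are $1, 3, 5$, so:

$$J(-1, 2) = \frac{1}{2 \cdot 3}\left[(1 - 1)^2 + (3 - 3)^2 + (5 - 5)^2\right] = 0$$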

Perfect! For the 3 values above, $\theta_0 = -1$ and $\theta_1 = 2$ are optimal, because the cost is 0, the smallest value it can take.
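
To play with this yourself, here is a minimal Python sketch of the hypothesis and the cost function, evaluated on the training set from the example. The function names `hypothesis` and `cost` are my own, the factor of 1/(2m) is assumed from the course's convention, and the second parameter choice (0 and 1) is just an arbitrary alternative for comparison.

```python
def hypothesis(theta0, theta1, x):
    """Linear hypothesis h(x) = theta0 + theta1 * x."""
    return theta0 + theta1 * x


def cost(theta0, theta1, xs, ys):
    """Sum of squared errors divided by 2m (assumed course convention)."""
    m = len(xs)
    squared_errors = [
        (hypothesis(theta0, theta1, x) - y) ** 2 for x, y in zip(xs, ys)
    ]
    return sum(squared_errors) / (2 * m)


# Training set from the example: (1/1), (2/3) and (3/5)
xs = [1, 2, 3]
ys = [1, 3, 5]

perfect = cost(-1, 2, xs, ys)  # the parameters chosen in the example
worse = cost(0, 1, xs, ys)     # an arbitrary alternative for comparison

print("cost for theta0=-1, theta1=2:", perfect)
print("cost for theta0=0,  theta1=1:", worse)
```

The first choice hits every training point exactly, so its cost is zero, while the alternative misses two of the three points and ends up with a larger cost.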