TIL: Cost functions for linear regression

Today I learned about the cost function for linear regression problems, which is also known as the mean squared error (MSE).

I just started the Machine Learning course on Coursera by Andrew Ng to get a better understanding of how machine learning works and to extend my knowledge in the field.

To recap: linear regression is a way to model the relationship between an input X and an output y. There is also multivariate linear regression, where multiple Xs are used to predict y. Linear regression with one variable can be described as follows:

$h_\theta(x) = \theta_0 + \theta_1 x$
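As a small sketch, the hypothesis can be written as a plain Python function. The name `hypothesis` and the parameter values in the usage line are my own choices for illustration, not something from the course.

```python
def hypothesis(theta0: float, theta1: float, x: float) -> float:
    """Predict y for input x with the line theta0 + theta1 * x."""
    return theta0 + theta1 * x


# Illustrative parameters: intercept 1, slope 0.5.
print(hypothesis(1.0, 0.5, 4.0))  # 1 + 0.5 * 4 = 3.0
```

Here $\theta_0$ is the intercept of the line and $\theta_1$ is its slope.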

To measure the accuracy of the hypothesis above, we can use a cost function. It averages the squared differences between the predictions of the hypothesis and the actual values over all training examples, where m is the number of training examples:

$J(\theta_0, \theta_1) = \frac{1}{2m}\sum_{i=1}^{m}(h_\theta(x_{i})-y_{i})^2$

In other words: it is the mean of the squared differences between the predicted values of the hypothesis and the actual values, halved. Another name for this cost function is mean squared error (MSE). The extra factor $\frac{1}{2}$ does not change which parameters minimize the cost.
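To make the formula concrete, here is a minimal sketch of $J(\theta_0, \theta_1)$ in pure Python. The function name `cost`, the argument names `xs` and `ys`, and the two data points in the usage line are made up for illustration.

```python
from typing import Sequence


def cost(theta0: float, theta1: float,
         xs: Sequence[float], ys: Sequence[float]) -> float:
    """Return J(theta0, theta1) = 1/(2m) * sum of squared prediction errors."""
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("xs and ys must be non-empty and of equal length")
    m = len(xs)  # number of training examples
    squared_errors = [(theta0 + theta1 * x - y) ** 2 for x, y in zip(xs, ys)]
    return sum(squared_errors) / (2 * m)


# Made-up data: h(x) = x predicts 1 and 2, the targets are 1 and 4.
# The errors are 0 and -2, so J = (0 + 4) / (2 * 2) = 1.0.
print(cost(0.0, 1.0, [1.0, 2.0], [1.0, 4.0]))
```

Squaring the errors keeps every term non-negative, so positive and negative errors cannot cancel each other out.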

For the hypothesis we have to choose $\theta_0$ and $\theta_1$. Then we can check the hypothesis with the cost function above. The result of the function is always non-negative, and values closer to zero are better, because they mean the hypothesis fits the training data more closely. The goal is to minimize $J(\theta_0, \theta_1)$ to build an optimal hypothesis.

An example to make things clear: there is an existing training set with the (x, y) values (1, 1), (2, 3) and (3, 5). We choose -1 for $\theta_0$ and 2 for $\theta_1$. Maybe you can already picture these points and the linear function, and can therefore see that these values for $\theta_0$ and $\theta_1$ are pretty accurate.

$J(-1, 2) = \frac{1}{2*3}\sum_{i=1}^{3}(-1+2x_i-y_i)^2 = \frac{1}{2*3}*((-1+2*1-1)^2+(-1+2*2-3)^2+(-1+2*3-5)^2) = \frac{1}{2*3}*(0^2+0^2+0^2) = \frac{1}{2*3}*0 = 0$

Perfect! For the three training examples above, $\theta_0 = -1$ and $\theta_1 = 2$ are optimal.
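To see what "minimizing" means on this training set, the sketch below tries a handful of candidate parameters and keeps the pair with the lowest cost. The brute-force grid and its ranges are my own assumption, chosen purely for illustration; this is not an efficient method and not something the course prescribes.

```python
from itertools import product

# Training set from the example above, as (x, y) pairs.
training_set = [(1, 1), (2, 3), (3, 5)]


def cost(theta0, theta1, data):
    """J(theta0, theta1): sum of squared errors divided by 2m."""
    return sum((theta0 + theta1 * x - y) ** 2 for x, y in data) / (2 * len(data))


# Hypothetical search grid: integer intercepts -2..2 and slopes 0..3.
candidates = list(product(range(-2, 3), range(0, 4)))
best = min(candidates, key=lambda t: cost(t[0], t[1], training_set))

print(best, cost(*best, training_set))  # (-1, 2) 0.0
```

Every other pair on the grid gives a strictly positive cost. For example, $\theta_0 = 0$ and $\theta_1 = 1$ yield $J = \frac{0 + 1 + 4}{6} \approx 0.83$.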

Written on April 30th, 2018 by Lasse Schultebraucks
