Today I learned about Multivariate Linear Regression, Polynomial Regression and more about Gradient Descent.

I also created a Python script that lets me create the layout for posts, especially TIL posts, much faster.

As I mentioned in a post a couple of days ago, I learned more about multivariate Linear Regression, i.e. Linear Regression with multiple features. The hypothesis function with multiple features looks as follows:
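$$h_\theta(x) = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \cdots + \theta_n x_n$$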

By using the definition of matrix multiplication, the multivariate hypothesis function can also be written as follows:
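$$h_\theta(x) = \begin{bmatrix} \theta_0 & \theta_1 & \cdots & \theta_n \end{bmatrix} \begin{bmatrix} x_0 \\ x_1 \\ \vdots \\ x_n \end{bmatrix} = \theta^T x$$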

where $x_0^{(i)}$ is defined as $1$ (so that $\theta_0$ simply acts as a constant term).

The form of gradient descent for multivariate linear regression looks as follows:
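$$\theta_j := \theta_j - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)} \qquad \text{simultaneously for } j = 0, \ldots, n$$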

$x_0^{(i)}$ is again defined as $1$. The rest basically works just like in the single-variable case.
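To make the update rule a bit more concrete, here is a minimal vectorized sketch in Python with NumPy. It is just an illustration and assumes a design matrix `X` whose first column is all ones (the $x_0 = 1$ convention from above) and a target vector `y`:

```python
import numpy as np

def gradient_descent(X, y, alpha=0.01, num_iters=1500):
    """Vectorized gradient descent for multivariate linear regression.

    Assumes X is an (m, n+1) matrix whose first column is all ones
    (the x_0 = 1 convention) and y is an (m,) target vector.
    """
    m = len(y)
    theta = np.zeros(X.shape[1])
    for _ in range(num_iters):
        predictions = X @ theta                # h_theta(x) for every example
        errors = predictions - y               # h_theta(x^(i)) - y^(i)
        theta -= (alpha / m) * (X.T @ errors)  # update all theta_j simultaneously
    return theta
```

The single expression `X.T @ errors` computes $\sum_i \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}$ for every $j$ at once, which is exactly the simultaneous update of all parameters.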

A few words about Feature Scaling and the Learning Rate. Feature scaling is important because it speeds up gradient descent: $\theta$ descends more quickly on small ranges than on large ones. Ideally all features lie roughly in the range $-1 \le x_i \le 1$.

A feature can be scaled with the following formula:
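$$x_i := \frac{x_i - \mu_i}{s_i}$$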

where $\mu_i$ is the average of all the values for feature $i$ and $s_i$ is the range of values $(\max - \min)$. Example: if $x_i$ is a feature with a range of values from 10 to 100 and a mean of 55, then
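$$x_i := \frac{x_i - 55}{90}$$

The same mean normalization can be done per column in NumPy; here is a small sketch (assuming `X_raw` holds only the raw feature columns, before the column of ones is added):

```python
import numpy as np

def scale_features(X_raw):
    """Mean-normalize every feature column: (x - mu) / (max - min).

    Apply this to the raw feature columns only; the column of ones for
    x_0 must be added afterwards (its range would be zero).
    """
    mu = X_raw.mean(axis=0)                    # mu_i per feature
    s = X_raw.max(axis=0) - X_raw.min(axis=0)  # s_i = max - min per feature
    return (X_raw - mu) / s
```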

Again, some words about the Learning Rate $\alpha$. It is important that $\alpha$ is not too large, otherwise $J(\theta)$ may not decrease on every iteration and may fail to converge. On the other hand, if $\alpha$ is too small, $J(\theta)$ will converge very slowly.
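A simple way to check whether $\alpha$ is well chosen is to record $J(\theta)$ after every iteration and make sure it keeps decreasing. A small sketch, reusing the hypothetical scaled `X` and `y` from above:

```python
import numpy as np

def cost(X, y, theta):
    """J(theta) = 1/(2m) * sum((h_theta(x) - y)^2)."""
    m = len(y)
    errors = X @ theta - y
    return (errors @ errors) / (2 * m)

def cost_history(X, y, alpha, num_iters=100):
    """Run gradient descent and track J(theta) per iteration."""
    m = len(y)
    theta = np.zeros(X.shape[1])
    history = []
    for _ in range(num_iters):
        theta -= (alpha / m) * (X.T @ (X @ theta - y))
        history.append(cost(X, y, theta))
    return history

# Try a few candidate learning rates; if J(theta) ever increases, alpha is too large.
# for alpha in (0.001, 0.01, 0.1, 1.0):
#     print(alpha, cost_history(X, y, alpha)[-1])
```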

Last but not least, some more words on Polynomial Regression: the hypothesis does not have to be a linear function; it can also be a polynomial function, e.g. a quadratic one: $h_\theta(x) = \theta_0 + \theta_1 x_1 + \theta_2 x_1^2$. I will probably write more about Polynomial Regression in the coming days; I just wanted to note it here.
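Worth noting already: a quadratic hypothesis like the one above can be treated as ordinary multivariate linear regression by defining a new feature, e.g. $x_2 := x_1^2$, so that $h_\theta(x) = \theta_0 + \theta_1 x_1 + \theta_2 x_2$ is again linear in the features. Feature scaling then becomes even more important, because the squared feature has a much larger range.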