Today I learned about Multivariate Linear Regression, Polynomial Regression and more about Gradient Descent.

I also created a Python script that lets me create the layout for posts, especially TIL posts, much faster.

As I mentioned in a post a couple of days ago, I learned more about multivariate Linear Regression, i.e. Linear Regression with multiple features. The hypothesis function with multiple features looks as follows:

$$ h_\theta (x) = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \dots + \theta_n x_n $$

Using the definition of matrix multiplication, the multivariate hypothesis function can also be written as follows:

$$ h_\theta (x) = \begin{bmatrix} \theta_0 & \theta_1 & \dots & \theta_n \end{bmatrix} \begin{bmatrix} x_0 \\ x_1 \\ \vdots \\ x_n \end{bmatrix} = \theta^T x $$

where $ x_0^{(i)} $ is defined as $ 1 $ ($ \theta_0 $ simply acts as a constant term).
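
With NumPy, the vectorized form $ h_\theta (x) = \theta^T x $ is a single dot product. A minimal sketch with made-up values (both `theta` and `x` are hypothetical, with $ x_0 = 1 $ prepended):

```python
import numpy as np

# Hypothetical parameters and one sample; x_0 = 1 is prepended.
theta = np.array([2.0, 0.5, -1.0])  # theta_0, theta_1, theta_2
x = np.array([1.0, 3.0, 4.0])       # x_0 = 1, x_1, x_2

# h_theta(x) = theta^T x
h = theta @ x
print(h)  # 2.0 + 0.5*3.0 - 1.0*4.0 = -0.5
```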

The form of gradient descent for multivariate linear regression looks as follows:

$$ \theta_j := \theta_j - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta (x^{(i)}) - y^{(i)} \right) x_j^{(i)} \qquad \text{for } j = 0, \dots, n $$

$ x_0^{(i)} $ is again defined as $ 1 $. The rest works much like in the single-variable case, except that all $ \theta_j $ are updated simultaneously.
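
This update rule can be sketched in NumPy, assuming the $ x_0 = 1 $ column is already part of `X` (the toy data and hyperparameters below are made up for illustration):

```python
import numpy as np

def gradient_descent(X, y, alpha=0.1, iterations=1000):
    """Batch gradient descent for linear regression (sketch).
    X is expected to already contain the x_0 = 1 column."""
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(iterations):
        # Simultaneous update: theta_j -= alpha * (1/m) * sum((h - y) * x_j)
        errors = X @ theta - y
        theta -= alpha / m * (X.T @ errors)
    return theta

# Toy data generated from y = 1 + 2*x_1 (a hypothetical example).
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])
print(gradient_descent(X, y))  # close to [1., 2.]
```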

Some words about Feature Scaling and the Learning Rate. Feature scaling is important because it speeds up gradient descent: $ \theta $ descends more quickly on small ranges than on large ones. Ideally all features lie in the range $ -1 \leq x_i \leq 1 $.

A feature can be scaled with the following formula:

$$ x_i := \frac{x_i - \mu_i}{s_i} $$

where $ \mu_i $ is the average of all the values of feature $ i $ and $ s_i $ is the range of values $ (\max - \min) $. Example: If $ x_i $ is a feature with values ranging from 10 to 100 and a mean of 55, then

$$ x_i := \frac{x_i - 55}{90} $$

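This scaling can be sketched in NumPy; the feature values below are made up to match the example (values between 10 and 100, mean 55, range 90):

```python
import numpy as np

# Hypothetical feature column matching the example above.
x = np.array([10.0, 40.0, 70.0, 100.0])

mu = x.mean()          # average of the feature values (55.0)
s = x.max() - x.min()  # range, max - min (90.0)
x_scaled = (x - mu) / s
print(x_scaled)  # values now lie in [-0.5, 0.5]
```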
A few more words about the Learning Rate $ \alpha $: It is important that $ \alpha $ is not too large, otherwise $ J(\theta) $ may not decrease and may fail to converge. On the other hand, if $ \alpha $ is too small, $ J(\theta) $ will converge very slowly.

Last but not least, a few words about Polynomial Regression: The hypothesis does not have to be a linear function; it can also be a polynomial function, e.g. a quadratic function: $ h_\theta (x) = \theta_0 + \theta_1 x_1 + \theta_2 x_1^2 $. I will probably write more about Polynomial Regression in the coming days. Just wanted to note it here.
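
As a sketch of the idea: a quadratic hypothesis can reuse the same linear-regression machinery by adding $ x_1^2 $ as an extra feature column (the values here are made up):

```python
import numpy as np

# Hypothetical single feature x.
x = np.array([1.0, 2.0, 3.0])

# Design matrix with columns x_0 = 1, x_1 = x, x_2 = x^2, so
# h_theta(x) = theta_0 + theta_1*x + theta_2*x^2 stays linear in theta.
X = np.column_stack([np.ones_like(x), x, x ** 2])
print(X)
```

Note that feature scaling becomes especially important here, since a squared feature has a much larger range than the original one.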