Linear Regression

Introduction

Linear regression, and more generally regression analysis, is one of the most fundamental and commonly used methods in machine learning. It is a supervised learning technique, meaning both input features and output labels are required to train the model.

Imagine you want to buy a bike (avid biker here!) and you want to estimate its price using different features such as weight, brand, type (road, city, mountain, etc.), condition (new, mint, used, etc.), mileage, material (carbon, aluminum, steel, etc.), and so on. Linear regression is a possible method. By plotting the bike price against each individual feature, you can use regression to determine how each feature correlates with the price. The regression function $f$ can be represented as

$$ Y = f(X), $$

where $f$ is a linear function, $Y$ is the bike price, and $X$ is the set of input features (or a subset of them) we discussed earlier. The variable $Y$ (bike price in this case) is called the dependent variable because we make the assumption that its value depends on the independent variable(s), i.e., the bike features. In other words, our goal is to use the independent variables to understand the behavior of the dependent variable.

The linear regression function $f$ can be further expanded in terms of the independent variables as

$$ Y = f(X) = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \dots + \theta_n x_n. $$

This equation is a general expression for a linear regression model with $n$ features (independent variables). Here, $x_i$ represents feature $i$ and $\theta_i$ is the corresponding coefficient. $\theta_0$ is called the bias or intercept. From a mathematical perspective, the bias can be defined as the mean of the dependent variable when all the independent variables are set to zero, but more on that in another post!
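To make the equation concrete, here is a minimal sketch in Python that fits this model to synthetic bike data with scikit-learn. The feature names, value ranges, and coefficients are invented for illustration only; the point is simply that the fitted intercept and coefficients play the roles of $\theta_0$ and $\theta_1, \dots, \theta_n$ above.

```python
# A minimal sketch of fitting Y = theta_0 + theta_1*x_1 + ... + theta_n*x_n.
# The bike data is synthetic and the feature names are illustrative.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)

# Hypothetical features: weight (kg), mileage (km), age (years).
X = rng.uniform([7, 0, 0], [15, 20000, 10], size=(200, 3))

# Generate prices from known coefficients plus noise, so we can
# check that the fit roughly recovers them.
true_theta = np.array([-80.0, -0.02, -30.0])  # theta_1..theta_n
true_bias = 2000.0                            # theta_0 (intercept)
y = true_bias + X @ true_theta + rng.normal(0, 50, size=200)

model = LinearRegression().fit(X, y)
print("intercept (theta_0):", model.intercept_)
print("coefficients (theta_1..theta_n):", model.coef_)
print("predicted price for a 9 kg, 500 km, 1-year-old bike:",
      model.predict([[9.0, 500.0, 1.0]])[0])
```

Because the synthetic prices were generated from known coefficients, the printed `intercept_` and `coef_` values should land close to `true_bias` and `true_theta`, which is a quick sanity check that the fitted model matches the equation above.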