 # Regression in machine learning

If you want to get started with machine learning, the road starts with regression. For anyone who wants to become a data scientist, regression is usually among the first algorithms to learn.

Important: Linear and logistic regression are the two most commonly used regression algorithms, and both are good starting points for predictive modeling.

Beyond these two, this article lists seven types of regression, each useful in its own right.

Types of Regression :

1.   Linear Regression
2.   Logistic Regression
3.   Polynomial Regression
4.   Stepwise Regression
5.   Ridge Regression
6.   Lasso Regression
7.   ElasticNet Regression

Regression Analysis :

Regression analysis is a predictive modeling technique that investigates the relationship between a dependent (target) variable and one or more independent (predictor) variables.
It is generally used for :

•  Forecasting
•  Time series modeling
•  Causal effect relationships

Example: The relationship between rash driving and the number of road accidents caused by a driver is a popular subject for regression analysis.

Why use regression analysis :

As described above, regression analysis estimates the relationship between two or more variables. Consider an example.
1.  Suppose we want to estimate a company's sales growth based on current economic conditions. Recent company data indicates that sales growth is around two or more times the growth in the economy. Using this relationship, we can predict the company's future sales.
2.  Regression analysis also lets us compare the effects of variables measured on different scales, such as the effect of a price change and the number of promotional activities.
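
The first point can be sketched in a few lines of Python. The numbers below are synthetic, generated on the assumption that sales growth is roughly twice economic growth plus some noise. They are not real company data, and the variable names are only illustrative.

```python
import numpy as np

# Synthetic data (an assumption for illustration, not real company figures):
# sales growth is generated as about 2x economic growth plus small noise.
rng = np.random.default_rng(0)
economy_growth = np.linspace(0.5, 4.0, 20)  # percent per year
sales_growth = 2.0 * economy_growth + rng.normal(0.0, 0.2, size=20)

# Fit a straight line: sales_growth ≈ slope * economy_growth + intercept
slope, intercept = np.polyfit(economy_growth, sales_growth, deg=1)

# Predict sales growth for a hypothetical 3% growth in the economy
predicted = slope * 3.0 + intercept
print(f"slope={slope:.2f}, intercept={intercept:.2f}, predicted={predicted:.2f}")
```

The fitted slope should land close to 2, which is the relationship the synthetic data was built with. With real data the slope is what you would be trying to discover.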

Metrics for Regression :

The types of regression differ mainly along three metrics:

•  Number of independent variables
•  Type of dependent variable
•  The shape of the regression line

Linear Regression :

Linear regression is the most widely used modeling technique and usually the first one people learn.
Here:

•  The dependent variable is continuous
•  The independent variables can be continuous or discrete

It is used to estimate real-valued quantities such as the cost of houses or the number of calls.

It models the relationship between the dependent and independent variables with a straight line.
Example :

Y = a*X + b, where a is the slope of the line and b is the intercept.
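
For a single predictor, one common way to choose the slope a and the intercept b is ordinary least squares, which minimizes the sum of squared differences between the observed and predicted Y. The sketch below applies the closed-form formulas to a handful of made-up points that lie exactly on Y = 2*X + 1, so it should recover a = 2 and b = 1.

```python
import numpy as np

# Made-up points lying exactly on Y = 2*X + 1 (illustrative only)
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([3.0, 5.0, 7.0, 9.0, 11.0])

# Ordinary least squares for one predictor:
#   a = sum((X - mean(X)) * (Y - mean(Y))) / sum((X - mean(X))^2)
#   b = mean(Y) - a * mean(X)
x_mean, y_mean = X.mean(), Y.mean()
a = np.sum((X - x_mean) * (Y - y_mean)) / np.sum((X - x_mean) ** 2)
b = y_mean - a * x_mean

print(a, b)
```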

#####################################################

    import pandas as pd
    import matplotlib.pyplot as plt
    from sklearn import linear_model

    # Load the CSV file (expects 'price' and 'lotsize' columns)
    data_frame = pd.read_csv("Housing.csv")

    # Predictor: lot size; target: price (matches the axis labels below)
    X = data_frame['lotsize']
    Y = data_frame['price']
    X = X.values.reshape(len(X), 1)
    Y = Y.values.reshape(len(Y), 1)

    # Hold out the last 250 rows as the test set
    X_train = X[:-250]
    X_test = X[-250:]
    Y_train = Y[:-250]
    Y_test = Y[-250:]

    # Plot the test data
    plt.scatter(X_test, Y_test, color='black')
    plt.title('Test case data')
    plt.xlabel('Size')
    plt.ylabel('Price')
    plt.xticks(())
    plt.yticks(())

    # Fit linear regression on the training data and draw the fitted line
    regr = linear_model.LinearRegression()
    regr.fit(X_train, Y_train)
    plt.plot(X_test, regr.predict(X_test), color='red', linewidth=3)
    plt.show()
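
Once a model is fitted, the slope and intercept of Y = a*X + b can be read back from it, and the fit can be scored on held-out data. Housing.csv is not included here, so the sketch below uses synthetic lot sizes and prices (assumed values, not real housing data). It also uses scikit-learn's `train_test_split` for a random split in place of slicing off the last 250 rows.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for Housing.csv (assumed values, not real data):
# price = 5 * lotsize + 30000 plus noise
rng = np.random.default_rng(42)
lotsize = rng.uniform(2000, 12000, size=500).reshape(-1, 1)
price = 5.0 * lotsize[:, 0] + 30000 + rng.normal(0, 5000, size=500)

# Random 75/25 split into training and test sets
X_train, X_test, y_train, y_test = train_test_split(
    lotsize, price, test_size=0.25, random_state=0
)

model = LinearRegression().fit(X_train, y_train)

a = model.coef_[0]                # slope of the fitted line
b = model.intercept_              # intercept of the fitted line
r2 = model.score(X_test, y_test)  # R^2 on the held-out test set
print(f"a={a:.2f}, b={b:.0f}, R^2={r2:.3f}")
```

A random split is usually a safer default than taking the last rows, because a file sorted by price or date would otherwise give a test set that looks different from the training set.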