Ridge and Lasso Regression In Python

Manoj Gadde
Published in Analytics Vidhya
Mar 28, 2021

In my previous post, I covered the theory of ridge and lasso regression and the equations behind them.

In this article, let’s implement ridge and lasso regression in Python.

Note: if you are not familiar with the maths and theory behind ridge and lasso, I highly recommend reading that post first: https://manojgadde.medium.com/ridge-and-lasso-regression-made-easy-343df45a90b9#4be1-463c284cfb2d

Introduction :

Fortunately, implementing a machine learning algorithm is not very difficult, because the scikit-learn library already provides many of them. Let’s implement ridge and lasso regression using scikit-learn.

We are using the Surprise Housing dataset. The task is to build a regularised regression model that predicts the actual value of prospective properties, so that we can decide whether to invest in them or not.

You can find the dataset here: https://www.kaggle.com/c/house-prices-advanced-regression-techniques/data

Note: I won’t go through all the preprocessing steps; I will jump directly to the ridge and lasso implementation.

We have split the dataset into a train set (70%) and a test set (30%), and scaled the train and test data using MinMaxScaler.
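The split and scaling step could look roughly like the sketch below. It is only an illustration: the synthetic data from `make_regression` is an assumption standing in for the preprocessed housing features, and the `random_state` values are arbitrary.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

# Assumption: synthetic data stands in for the preprocessed housing dataset.
X, y = make_regression(n_samples=200, n_features=10, noise=10.0, random_state=42)

# Hold out 30% of the rows for testing; the remaining 70% are used for training.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Fit the scaler on the training data only, then apply it to both sets,
# so that no information from the test set leaks into training.
scaler = MinMaxScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

print(X_train.shape, X_test.shape)
```

Because the scaler is fitted on the training set only, test values can fall slightly outside the 0 to 1 range, which is expected.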

1. Ridge Regression :

Here we have imported Ridge from the sklearn library and fit the model using X_train and y_train, where y_train contains the target variable (the sale price) and X_train contains all the independent variables.

One thing to notice is that we passed an additional parameter called alpha to Ridge. This alpha is the lambda value in the ridge regression equation. It is a hyperparameter, so we need to tune it to find the optimal value.

We start with an arbitrary alpha value of 0.01 and see how the ridge regression model performs on the training data. Here we got an R2 score of 91% on the training data.

Let’s also make predictions on the test data to check how the model performs on unseen data.

We got an R2 score of 89%, which is pretty good, but we cannot say that alpha = 0.01 is the best value. We need to find the optimal alpha using GridSearchCV, so let’s do that.
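As a rough sketch, fitting ridge with alpha = 0.01 and scoring it on the training data might look like this. The data here is synthetic (an assumption, generated with `make_regression`), so the printed score will not match the 91% reported for the housing data.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

# Assumption: synthetic data stands in for the preprocessed housing dataset.
X, y = make_regression(n_samples=200, n_features=10, noise=10.0, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

scaler = MinMaxScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# alpha plays the role of lambda in the ridge equation.
ridge = Ridge(alpha=0.01)
ridge.fit(X_train, y_train)

# R2 score on the data the model was trained on.
y_train_pred = ridge.predict(X_train)
train_r2 = r2_score(y_train, y_train_pred)
print(f"Train R2: {train_r2:.3f}")
```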
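Checking the same model on unseen data could be sketched as follows, again on synthetic stand-in data (an assumption), so the number printed is not the 89% from the housing dataset.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

# Assumption: synthetic data stands in for the preprocessed housing dataset.
X, y = make_regression(n_samples=200, n_features=10, noise=10.0, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

scaler = MinMaxScaler().fit(X_train)
X_train = scaler.transform(X_train)
X_test = scaler.transform(X_test)

ridge = Ridge(alpha=0.01).fit(X_train, y_train)

# Predict on the held-out 30% and compare with the true targets.
y_test_pred = ridge.predict(X_test)
test_r2 = r2_score(y_test, y_test_pred)
print(f"Test R2: {test_r2:.3f}")
```

A test score close to the train score suggests the model is not overfitting badly at this alpha.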

Hyperparameter tuning :

Here we have defined 5-fold cross-validation and a range of alpha values (the hyperparameter values), and chosen the R2 score as the evaluation metric. We got an optimal alpha value of 1.0, and we will train the model again using this alpha.

Finally, the optimal alpha value of 1.0 gave the best train (91%) and test (90%) results for ridge regression.

Note: ridge regression also reduces the magnitude of the coefficients.
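A minimal sketch of this search with GridSearchCV is shown below. The list of candidate alphas is hypothetical (the article does not list the exact range), and the data is synthetic, so the best alpha found here need not be 1.0.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV, KFold

# Assumption: synthetic data stands in for the scaled training set.
X_train, y_train = make_regression(n_samples=140, n_features=10, noise=10.0, random_state=42)

# Hypothetical grid of alpha values to try.
params = {"alpha": [0.0001, 0.001, 0.01, 0.1, 1.0, 10.0, 100.0]}

# 5-fold cross-validation, scored with R2.
folds = KFold(n_splits=5, shuffle=True, random_state=42)
grid = GridSearchCV(
    estimator=Ridge(),
    param_grid=params,
    scoring="r2",
    cv=folds,
    return_train_score=True,
)
grid.fit(X_train, y_train)

best_alpha = grid.best_params_["alpha"]
print("Best alpha:", best_alpha)
print(f"Mean cross-validated R2: {grid.best_score_:.3f}")

# With the default refit=True, this is a Ridge model retrained on the
# whole training set using the best alpha.
ridge_best = grid.best_estimator_
```

Since GridSearchCV refits by default, `ridge_best` can be used directly for predictions; training a fresh `Ridge(alpha=best_alpha)` by hand gives an equivalent model.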
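One way to see this shrinkage is to fit ridge with increasing alpha values and compare the size of the coefficient vector. This is only a sketch on synthetic data (an assumption); the alpha values are picked for illustration.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge

# Assumption: synthetic data, used only to illustrate the effect.
X, y = make_regression(n_samples=140, n_features=10, noise=10.0, random_state=42)

norms = []
for alpha in [0.01, 1.0, 100.0]:
    ridge = Ridge(alpha=alpha).fit(X, y)
    # L2 norm of the coefficient vector: smaller means more shrinkage.
    norm = np.linalg.norm(ridge.coef_)
    norms.append(norm)
    print(f"alpha={alpha}: coefficient norm = {norm:.2f}")
```

The norm gets smaller as alpha grows, but the individual coefficients generally stay non-zero.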

2. Lasso Regression :

Here we have imported Lasso from the sklearn library and fit the model using X_train and y_train, where y_train contains the target variable (the sale price) and X_train contains all the independent variables.
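A sketch of the lasso fit follows. The starting alpha of 0.01 and the `max_iter` value are assumptions made for illustration, and the data is synthetic rather than the housing dataset.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

# Assumption: synthetic data stands in for the preprocessed housing dataset.
X, y = make_regression(n_samples=200, n_features=10, noise=10.0, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

scaler = MinMaxScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# Assumption: alpha=0.01 is just a starting value; a larger max_iter gives
# the coordinate descent solver more room to converge.
lasso = Lasso(alpha=0.01, max_iter=10000)
lasso.fit(X_train, y_train)

lasso_train_r2 = r2_score(y_train, lasso.predict(X_train))
lasso_test_r2 = r2_score(y_test, lasso.predict(X_test))
print(f"Train R2: {lasso_train_r2:.3f}, Test R2: {lasso_test_r2:.3f}")
```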

Hyperparameter tuning :

Here we have used 5-fold cross-validation, defined a range of alpha values, and chosen the R2 score as the evaluation metric. This gave a best alpha value of 0.001. Training the model once again with this alpha gave train and test scores that are almost the same.

Note: lasso regression also shrinks the coefficients of redundant variables to exactly zero, which means it helps with feature selection.

This concludes our article on the Python implementation of ridge and lasso regression. I hope it helped you understand the implementation.

This is my second Medium post. If you liked this article, please show your support by clapping for it below; it really motivates me to write more interesting articles. If you have any questions, leave a comment below. I’d love to hear from you.

I’m a data science enthusiast, currently pursuing a PG Diploma in Machine Learning & AI from the International Institute of Information Technology, Bangalore.

You can also find me on LinkedIn: www.linkedin.com/in/manoj-gadde
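The lasso search can be sketched the same way, followed by a train and test comparison for the best model. The alpha grid is hypothetical and the data is synthetic, so the best alpha here may differ from 0.001.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso
from sklearn.model_selection import GridSearchCV, KFold, train_test_split
from sklearn.preprocessing import MinMaxScaler

# Assumption: synthetic data stands in for the preprocessed housing dataset.
X, y = make_regression(n_samples=200, n_features=10, noise=10.0, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

scaler = MinMaxScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# Hypothetical grid of alpha values to try.
lasso_params = {"alpha": [0.0001, 0.001, 0.01, 0.1, 1.0]}
folds = KFold(n_splits=5, shuffle=True, random_state=42)

lasso_grid = GridSearchCV(
    estimator=Lasso(max_iter=10000),
    param_grid=lasso_params,
    scoring="r2",
    cv=folds,
)
lasso_grid.fit(X_train, y_train)
best_lasso_alpha = lasso_grid.best_params_["alpha"]

# score() evaluates the refitted best model with the chosen metric (R2).
train_score = lasso_grid.score(X_train, y_train)
test_score = lasso_grid.score(X_test, y_test)
print("Best alpha:", best_lasso_alpha)
print(f"Train R2: {train_score:.3f}, Test R2: {test_score:.3f}")
```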
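To illustrate the feature selection effect, the sketch below builds synthetic data in which only 3 of 10 features actually influence the target (an assumption made for the demo) and counts how many lasso coefficients end up exactly zero. The alpha of 1.0 is also chosen for illustration only.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso

# Assumption: synthetic data where only 3 of the 10 features are informative.
X, y = make_regression(
    n_samples=200, n_features=10, n_informative=3, noise=1.0, random_state=42
)

lasso = Lasso(alpha=1.0)
lasso.fit(X, y)

# Features whose coefficient is exactly zero are effectively dropped.
n_zero = int(np.sum(lasso.coef_ == 0))
kept = np.flatnonzero(lasso.coef_)
print("Coefficients set to zero:", n_zero)
print("Indices of features kept:", kept)
```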
