What is regularization in Machine Learning

Original

object_allen

2018-08-21 23:12

Reposted from: https://www.quora.com/What-is-regularization-in-machine-learning

Regularization is a technique used in an attempt to solve the overfitting[1] problem in statistical models.*

First of all, I want to clarify how this problem of overfitting arises.

When you want to model a problem, say predicting someone's wage from their age, you will first try a linear regression model with age as the independent variable and wage as the dependent one. This model will usually perform poorly, since it is too simple to capture what drives wages.

Then, you might think: well, I also have the sex and the education of each individual in my data set. I could add these as explanatory variables.

Your model becomes more interesting and more complex. You measure its accuracy with a loss metric L(X,Y), where X is your design matrix and Y is the vector of observations (also called targets; here, the wages).

You find that your results are quite good, but not as good as you would like.

So you add more variables: location, profession of parents, social background, number of children, weight, number of books, favorite color, favorite meal, last holiday destination, and so on and so forth.

Your model will do well on the data it was fit on, but it is probably overfitting, i.e. it will probably have poor prediction and generalization power on new data: it sticks too closely to the training data and has probably learned the background noise while being fit. This, of course, isn't acceptable.
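To make this concrete, here is a minimal sketch in Python with NumPy. Everything in it is an assumption made for illustration: the wages are synthetic, they depend only on age, and the extra columns are pure random noise standing in for variables like "favorite color". With ordinary least squares, the training error can only go down as columns are added; the error on fresh data typically does not follow.

```python
import numpy as np

rng = np.random.default_rng(0)

n_train, n_test = 30, 200
n_noise = 20  # hypothetical number of irrelevant variables


def make_data(n):
    # Synthetic data: wage depends on age only, plus random noise.
    age = rng.uniform(20, 60, size=n)
    noise_features = rng.normal(size=(n, n_noise))
    wage = 1000 + 50 * age + rng.normal(scale=300, size=n)
    return age, noise_features, wage


def design(age, noise_features, k):
    # Design matrix: intercept, age, and the first k irrelevant features.
    return np.column_stack([np.ones_like(age), age, noise_features[:, :k]])


def mse(X, y, w):
    return float(np.mean((X @ w - y) ** 2))


age_tr, noise_tr, y_tr = make_data(n_train)
age_te, noise_te, y_te = make_data(n_test)

results = []
for k in (0, 5, 10, 20):
    X_tr = design(age_tr, noise_tr, k)
    X_te = design(age_te, noise_te, k)
    # Unregularized least-squares fit on the training set only.
    w, *_ = np.linalg.lstsq(X_tr, y_tr, rcond=None)
    results.append((k, mse(X_tr, y_tr, w), mse(X_te, y_te, w)))

for k, train_err, test_err in results:
    print(f"{k:2d} extra features: train MSE {train_err:10.1f}, test MSE {test_err:10.1f}")
```

A growing gap between the two columns as k increases is the usual symptom of overfitting.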

So how do you solve this?

This is where regularization comes in handy.

You penalize your loss function by adding a multiple of an L1 (LASSO[2]) or an L2 (Ridge[3]) norm of your weights vector w (the vector of the learned parameters in your linear regression; ridge regression is usually written with the squared L2 norm). You get the following objective to minimize:

L(X,Y) + λN(w)

(where λ ≥ 0 controls the strength of the penalty and N is either the L1, the L2, or any other norm)
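As a small sketch of how this penalized objective is evaluated, the snippet below uses plain Python. The function names (`squared_error`, `penalized_loss`, and so on) and the toy numbers are illustrative assumptions, and mean squared error is assumed as the loss L.

```python
def squared_error(X, y, w):
    """Mean squared error of the linear model X·w against the targets y."""
    total = 0.0
    for row, target in zip(X, y):
        pred = sum(x_j * w_j for x_j, w_j in zip(row, w))
        total += (pred - target) ** 2
    return total / len(y)


def l1_norm(w):
    # LASSO penalty: sum of absolute values of the weights.
    return sum(abs(w_j) for w_j in w)


def l2_norm_squared(w):
    # Ridge penalty: squared Euclidean norm of the weights.
    return sum(w_j ** 2 for w_j in w)


def penalized_loss(X, y, w, lam, penalty):
    """L(X, Y) + lam * N(w) for a chosen penalty function N."""
    return squared_error(X, y, w) + lam * penalty(w)


# Toy data that w = [1, 2] fits exactly, so only the penalty contributes.
X = [[1.0, 2.0], [3.0, 4.0]]
y = [5.0, 11.0]
w = [1.0, 2.0]

print(penalized_loss(X, y, w, 0.5, l1_norm))          # 0 + 0.5 * (1 + 2) = 1.5
print(penalized_loss(X, y, w, 0.5, l2_norm_squared))  # 0 + 0.5 * (1 + 4) = 2.5
```

Even a perfect fit pays a price for large weights, which is what pushes the optimizer toward simpler models.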

This will help you avoid overfitting and, for certain regularization norms, will perform feature selection at the same time (the L1 norm in the LASSO does the job, because it drives some weights to exactly zero).
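Why L1 selects features while L2 does not can be seen in a simplified one-weight case. Assume (this is a simplification, not the general regression setting) that the unpenalized best value of a weight is a. Minimizing 0.5·(w − a)² + λ|w| gives the soft-thresholding rule, which sets small weights exactly to zero; minimizing 0.5·(w − a)² + 0.5·λ·w² only shrinks w proportionally. The function names below are illustrative.

```python
def lasso_1d(a, lam):
    """Minimizer of 0.5*(w - a)**2 + lam*abs(w): soft-thresholding."""
    if a > lam:
        return a - lam
    if a < -lam:
        return a + lam
    return 0.0  # weights with |a| <= lam are removed entirely


def ridge_1d(a, lam):
    """Minimizer of 0.5*(w - a)**2 + 0.5*lam*w**2: proportional shrinkage."""
    return a / (1.0 + lam)


unpenalized = [3.0, 0.4, -0.2, -2.0]  # hypothetical unpenalized weights
lam = 0.5

lasso_w = [lasso_1d(a, lam) for a in unpenalized]
ridge_w = [ridge_1d(a, lam) for a in unpenalized]

print(lasso_w)  # [2.5, 0.0, 0.0, -1.5]
print(ridge_w)  # all shrunk toward zero, none exactly zero
```

The two small weights vanish under the L1 penalty, which amounts to dropping those features from the model.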

Finally, you might ask: OK, I have everything now. How can I tune the regularization coefficient λ?

One possible answer is to use cross-validation: you divide your training data into subsets, train your model for a fixed value of λ on some of them, test it on the remaining subsets, and repeat this procedure while varying λ. Then you select the λ that minimizes your loss function on the held-out subsets.
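The procedure can be sketched as follows with NumPy. This is one possible setup, not the only one: it assumes ridge regression with its closed-form solution, 5 folds, synthetic data in which only 3 of 15 features matter, and a hand-picked grid of λ values. The rows are generated in random order, so the folds are taken without extra shuffling.

```python
import numpy as np


def ridge_fit(X, y, lam):
    """Closed-form ridge solution: (X^T X + lam*I)^(-1) X^T y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)


def cv_loss(X, y, lam, k=5):
    """Average validation MSE of ridge regression over k folds."""
    folds = np.array_split(np.arange(len(y)), k)
    losses = []
    for val_idx in folds:
        # Train on everything except the current fold.
        train_mask = np.ones(len(y), dtype=bool)
        train_mask[val_idx] = False
        w = ridge_fit(X[train_mask], y[train_mask], lam)
        # Evaluate on the held-out fold only.
        losses.append(np.mean((X[val_idx] @ w - y[val_idx]) ** 2))
    return float(np.mean(losses))


# Synthetic data: 15 features, only the first 3 influence the target.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 15))
true_w = np.zeros(15)
true_w[:3] = [2.0, -1.0, 0.5]
y = X @ true_w + rng.normal(scale=1.0, size=60)

lambdas = [0.01, 0.1, 1.0, 10.0, 100.0]  # hypothetical search grid
scores = {lam: cv_loss(X, y, lam) for lam in lambdas}
best_lam = min(scores, key=scores.get)

print(best_lam, scores[best_lam])
```

After choosing λ this way, the model is normally refit on the full training set with that value.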

I hope this was helpful. Let me know if there are any mistakes. I will try to add some graphs and eventually some R or Python code to illustrate this concept.

Also, you can read more about these topics (regularization and cross validation) here:

* Actually this is only one of the many uses. According to Wikipedia, it can be used to solve ill-posed problems. Here is the article for reference: Regularization (mathematics).

As always, make sure to follow me for more insights about machine learning and its pitfalls: http://quora.com/profile/Yassine...

Footnotes

[1] Overfitting

[2] Lasso (statistics)

[3] Tikhonov regularization
