Gradient Descent 0 - Feature Scaling

In Multiple Variable Linear Regression, the value ranges of different features can vary greatly.

This makes gradient descent take a long time to converge.

In the house price example, the features can be something like this:

x1 = size (0 ~ 2000 feet²)
x2 = number of bedrooms (1 ~ 5)

The contours of the cost function form a skinny ellipse, so gradient descent follows a zigzag path toward the minimum.

The basic idea for handling this problem is to make sure all features are on a similar scale.

For example, divide each feature by its range: x1 = size / 2000 and x2 = number of bedrooms / 5, so both features fall into the range 0 ~ 1.

After that, the contours of the cost function become closer to circles, which makes gradient descent converge faster.
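
To make the effect concrete, here is a minimal sketch (my own illustration, not from the class material): it runs batch gradient descent on synthetic house data with raw features and with range-scaled features. The data, learning rates, and the run_gd helper are all assumptions chosen for the demo.

```python
# A minimal sketch (not from the post): batch gradient descent on
# synthetic house data, comparing raw vs. range-scaled features.
import numpy as np

rng = np.random.default_rng(0)
m = 100
size = rng.uniform(0, 2000, m)        # x1: size in feet^2, range 0 ~ 2000
bedrooms = rng.integers(1, 6, m)      # x2: number of bedrooms, range 1 ~ 5
price = 150 * size + 10000 * bedrooms + rng.normal(0, 5000, m)

def run_gd(X, y, alpha, iters):
    """Minimize J(theta) = (1/2m) * ||X @ theta - y||^2 by batch gradient descent."""
    n = len(y)
    theta = np.zeros(X.shape[1])
    for _ in range(iters):
        theta -= alpha * (X.T @ (X @ theta - y)) / n
    return np.sum((X @ theta - y) ** 2) / (2 * n)

ones = np.ones(m)
X_raw = np.column_stack([ones, size, bedrooms])
X_scaled = np.column_stack([ones, size / 2000, bedrooms / 5])

# Raw features force a tiny learning rate (larger values diverge here),
# so after 1000 iterations the cost is still far from the minimum;
# the scaled version tolerates a much larger rate and gets close.
print("raw features:   ", run_gd(X_raw, price, alpha=1e-7, iters=1000))
print("scaled features:", run_gd(X_scaled, price, alpha=0.5, iters=1000))
```

Running it shows the scaled version reaching a much lower cost in the same number of iterations, which is exactly the skinny-ellipse vs. circle picture above.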

Another frequently used formula is mean normalization:

x := (x − μ) / (max − min)

where μ is the average value of the feature in the training set. It makes every feature fall roughly into the range −0.5 ~ 0.5.
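
A minimal sketch of this formula (the mean_normalize helper and the sample data are my own, not from the class material):

```python
# A minimal sketch (not from the post) of mean normalization.
import numpy as np

def mean_normalize(x):
    """Rescale a feature to roughly [-0.5, 0.5]: (x - mean) / (max - min)."""
    return (x - x.mean()) / (x.max() - x.min())

rng = np.random.default_rng(1)
size = rng.uniform(0, 2000, 100)   # hypothetical house sizes
scaled = mean_normalize(size)
print(scaled.min(), scaled.max())  # approximately -0.5 and 0.5
```

In practice the denominator can also be the standard deviation (standardization); the range version above is the one that gives the −0.5 ~ 0.5 result mentioned here.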


This material comes from the Machine Learning class on Coursera.
