Using random forest regression to help HR accurately estimate a new employee's salary

Original post

HereIcome

2018-09-03 08:34

The data is the same as in the previous post, https://blog.csdn.net/HereIcome/article/details/80435395. The only difference is that the model trained here is a random forest regression model.

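from sklearn.ensemble import RandomForestRegressor

# n_estimators (the number of trees) is a tunable parameter.
regressor = RandomForestRegressor(n_estimators=100, random_state=0)

# x is the level feature, as a 2-D array of shape (n_samples, 1); y is the salary.
regressor.fit(x, y)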
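The snippet above assumes that x and y are already loaded. As a minimal, self-contained sketch of the same fit-and-predict flow, the example below uses made-up levels and salaries (they are not the data from the previous post) and a made-up query level of 6.5.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Hypothetical data for illustration only: position levels 1..10 and made-up salaries.
levels = np.arange(1, 11, dtype=float)
salaries = np.array([40, 48, 58, 72, 95, 130, 180, 260, 400, 700]) * 1000.0

# scikit-learn expects a 2-D feature matrix of shape (n_samples, n_features).
x = levels.reshape(-1, 1)
y = salaries

regressor = RandomForestRegressor(n_estimators=100, random_state=0)
regressor.fit(x, y)

# Predict the salary for a candidate at level 6.5 (a made-up query).
predicted = regressor.predict(np.array([[6.5]]))
print(predicted)
```

Because every tree predicts an average of training salaries, the forest's prediction always lies between the smallest and the largest salary in the training data.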

Plot of the results:

Understanding RandomForestRegressor:

Reference post: https://blog.csdn.net/qq_16633405/article/details/61200502, by 春雨里de太阳

RandomForestRegressor is built on top of DecisionTreeRegressor: a random forest is made up of a number of decision trees. It works as follows:

Step 1: Pick K data points at random from the training set.

Step 2: Build the decision tree associated with these K data points.

Step 3: Choose the number Ntree of trees you want to build and repeat steps 1 and 2.

Step 4: For a new data point, make each of your Ntree trees predict the value of Y for that data point, and assign the new data point the average of all the predicted Y values.
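The four steps above can be sketched by hand with DecisionTreeRegressor. This is a simplified illustration, not scikit-learn's actual RandomForestRegressor implementation. The function names fit_forest and predict_forest, the toy data, and the choice to draw the K points with replacement are all assumptions made for the example.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor


def fit_forest(x, y, n_trees=50, k=None, seed=0):
    """Steps 1-3: fit n_trees decision trees, each on K randomly drawn points."""
    rng = np.random.default_rng(seed)
    n_samples = x.shape[0]
    k = n_samples if k is None else k
    trees = []
    for _ in range(n_trees):
        # Step 1: pick K data points at random (assumed here: with replacement).
        idx = rng.integers(0, n_samples, size=k)
        # Step 2: build a decision tree on those K points.
        tree = DecisionTreeRegressor(random_state=0)
        tree.fit(x[idx], y[idx])
        trees.append(tree)
    # Step 3 is the loop itself: repeat until n_trees trees have been built.
    return trees


def predict_forest(trees, x_new):
    """Step 4: average the predictions of all trees for each new point."""
    all_predictions = np.stack([tree.predict(x_new) for tree in trees])
    return all_predictions.mean(axis=0)


# Hypothetical toy data for illustration only.
x = np.arange(1, 11, dtype=float).reshape(-1, 1)
y = np.array([40, 48, 58, 72, 95, 130, 180, 260, 400, 700]) * 1000.0

trees = fit_forest(x, y, n_trees=50)
prediction = predict_forest(trees, np.array([[6.5]]))
print(prediction)
```

Each tree sees a slightly different sample of the data, so individual trees disagree; averaging them in step 4 is what smooths the final prediction.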

The parameter n_estimators: integer, optional (default=10 at the time of writing; the default has been 100 since scikit-learn 0.22). The number of trees in the forest.

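More trees generally give the model better performance, but they also make your code slower. Choose as high a value as your processor can handle, because it makes your predictions better and more stable, although the improvement levels off once the forest is large enough.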
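One rough way to see the stability effect is to refit the forest with different random seeds and measure how much the prediction at one point moves. The sketch below does this for 1, 10, and 100 trees. The toy data, the query level 6.5, and the use of 10 seeds are assumptions chosen only for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Hypothetical toy data (not the data set used in this post).
x = np.arange(1, 11, dtype=float).reshape(-1, 1)
y = np.array([40, 48, 58, 72, 95, 130, 180, 260, 400, 700]) * 1000.0
query = np.array([[6.5]])

spread = {}
for n_trees in (1, 10, 100):
    predictions = []
    for seed in range(10):
        model = RandomForestRegressor(n_estimators=n_trees, random_state=seed)
        model.fit(x, y)
        predictions.append(model.predict(query)[0])
    # Standard deviation of the prediction across seeds: lower means more stable.
    spread[n_trees] = float(np.std(predictions))

for n_trees, value in spread.items():
    print(n_trees, round(value, 2))
```

On a data set this small the training time is negligible; on real data the cost of fitting grows roughly in proportion to the number of trees, which is the trade-off described above.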

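To quote 春雨里de太阳: machine learning tools such as random forests, support vector machines, and neural networks all deliver high performance. They perform very well, but users generally do not understand how they actually work. Not knowing the statistics behind a model is not a problem, but not knowing how to tune the model to fit the training data limits how fully the user can exploit the algorithm's potential.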
