How Do You Build Machine Learning Models with Python?

{"type":"doc","content":[{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In this article, we use Python packages to build several machine learning models."}]}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Templates for Building Machine Learning Models"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"This notebook contains code templates for the main machine learning algorithms. scikit-learn already provides implementations of these algorithms, so we only need to set their parameters, feed them data, train them to produce a model, and then make predictions."}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"1. Linear Regression"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"For linear regression, we import linear_model from the sklearn library. We prepare the training and test data, then create the predictive model by instantiating LinearRegression, a class in the linear_model module. Next we train the model with fit() and evaluate it with score(), which for regressors returns the coefficient of determination R² (computed here on the training data). Finally, we print the coefficients and use the model to make new predictions."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"python"},"content":[{"type":"text","text":"# Import modules\nfrom sklearn import linear_model\n\n# Create training and test subsets (the right-hand names are placeholders for your own data)\nx_train = train_dataset_predictor_variables\ny_train = train_dataset_predicted_variable\n\nx_test = test_dataset_predictor_variables\n\n# Create linear regression object\nlinear = linear_model.LinearRegression()\n\n# Train the model with training data and check the score\nlinear.fit(x_train, y_train)\nprint('R^2 on training data:', linear.score(x_train, y_train))\n\n# Collect coefficients\nprint('Coefficient: \\n', linear.coef_)\nprint('Intercept: \\n', linear.intercept_)\n\n# Make predictions\npredicted_values = 
linear.predict(x_test)"}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"2. Logistic Regression"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Going from linear regression to logistic regression, the only thing that changes in this template is the algorithm: we replace LinearRegression with LogisticRegression. Note that logistic regression is a classifier, so y_train must hold class labels, and score() returns mean accuracy."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"python"},"content":[{"type":"text","text":"# Import modules\nfrom sklearn.linear_model import LogisticRegression\n\n# Create training and test subsets\nx_train = train_dataset_predictor_variables\ny_train = train_dataset_predicted_variable\n\nx_test = test_dataset_predictor_variables\n\n# Create logistic regression object\nmodel = LogisticRegression()\n\n# Train the model with training data and check the score (mean accuracy)\nmodel.fit(x_train, y_train)\nprint('Training accuracy:', model.score(x_train, y_train))\n\n# Collect coefficients\nprint('Coefficient: \\n', model.coef_)\nprint('Intercept: \\n', model.intercept_)\n\n# Make predictions\npredicted_values = model.predict(x_test)"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"3. Decision Tree"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"We change the algorithm again, this time to a decision tree. DecisionTreeRegressor handles regression targets and DecisionTreeClassifier handles class labels:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"python"},"content":[{"type":"text","text":"# Import modules\nfrom sklearn import tree\n\n# Create training and test subsets\nx_train = train_dataset_predictor_variables\ny_train = train_dataset_predicted_variable\n\nx_test = test_dataset_predictor_variables\n\n# Create Decision Tree Regressor Object\nmodel = 
tree.DecisionTreeRegressor()\n\n# For a classification task, create a Decision Tree Classifier instead:\n# model = tree.DecisionTreeClassifier()\n\n# Train the model with training data and check the score\nmodel.fit(x_train, y_train)\nprint('Training score:', model.score(x_train, y_train))\n\n# Make predictions\npredicted_values = model.predict(x_test)"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"4. Naive Bayes"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"This time we change the algorithm to GaussianNB, the Gaussian naive Bayes classifier:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"python"},"content":[{"type":"text","text":"# Import modules\nfrom sklearn.naive_bayes import GaussianNB\n\n# Create training and test subsets\nx_train = train_dataset_predictor_variables\ny_train = train_dataset_predicted_variable\n\nx_test = test_dataset_predictor_variables\n\n# Create GaussianNB object\nmodel = GaussianNB()\n\n# Train the model with training data\nmodel.fit(x_train, y_train)\n\n# Make predictions\npredicted_values = model.predict(x_test)"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"5. Support Vector Machine"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In this example we use the SVC class from the svm module, which is a classifier. Its counterpart, SVR, performs regression:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"python"},"content":[{"type":"text","text":"# Import modules\nfrom sklearn import svm\n\n# Create training and test subsets\nx_train = 
train_dataset_predictor_variables\ny_train = train_dataset_predicted_variable\n\nx_test = test_dataset_predictor_variables\n\n# Create SVM Classifier object (the class name is upper-case SVC)\nmodel = svm.SVC()\n\n# Train the model with training data and check the score\nmodel.fit(x_train, y_train)\nprint('Training accuracy:', model.score(x_train, y_train))\n\n# Make predictions\npredicted_values = model.predict(x_test)"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"6. K-Nearest Neighbors"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The KNeighborsClassifier algorithm has a hyperparameter called n_neighbors, the number of neighbors that vote on each prediction. This is the main setting we tune for this algorithm."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"python"},"content":[{"type":"text","text":"# Import modules\nfrom sklearn.neighbors import KNeighborsClassifier\n\n# Create training and test subsets\nx_train = train_dataset_predictor_variables\ny_train = train_dataset_predicted_variable\n\nx_test = test_dataset_predictor_variables\n\n# Create KNeighbors Classifier object\nmodel = KNeighborsClassifier(n_neighbors=6)  # default value = 5\n\n# Train the model with training data\nmodel.fit(x_train, y_train)\n\n# Make predictions\npredicted_values = model.predict(x_test)"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"7. K-Means"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"python"},"content":[{"type":"text","text":"# Import modules\nfrom sklearn.cluster import KMeans\n\n# Create training and test subsets\n# K-means is unsupervised, so no target variable is needed\nx_train = train_dataset_predictor_variables\n\nx_test = 
test_dataset_predictor_variables\n\n# Create KMeans object\nmodel = KMeans(n_clusters=3, random_state=0)\n\n# Fit the clusters on the training data\nmodel.fit(x_train)\n\n# Assign each test sample to its nearest cluster\npredicted_values = model.predict(x_test)"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"8. Random Forest"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"python"},"content":[{"type":"text","text":"# Import modules\nfrom sklearn.ensemble import RandomForestClassifier\n\n# Create training and test subsets\nx_train = train_dataset_predictor_variables\ny_train = train_dataset_predicted_variable\n\nx_test = test_dataset_predictor_variables\n\n# Create Random Forest Classifier object\nmodel = RandomForestClassifier()\n\n# Train the model with training data (features and labels)\nmodel.fit(x_train, y_train)\n\n# Make predictions\npredicted_values = model.predict(x_test)"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"9. Dimensionality Reduction"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"python"},"content":[{"type":"text","text":"# Import modules\nfrom sklearn import decomposition\n\n# Create training and test subsets\n# PCA and factor analysis are unsupervised, so no target variable is needed\nx_train = train_dataset_predictor_variables\n\nx_test = test_dataset_predictor_variables\n\n# Create PCA decomposition object\n# k is a placeholder for the number of components to keep\npca = decomposition.PCA(n_components=k)\n\n# Create Factor Analysis decomposition object (an alternative to PCA)\nfa = decomposition.FactorAnalysis()\n\n# Reduce the dimensionality of the training set using PCA\nreduced_train = pca.fit_transform(x_train)\n\n# Apply the same fitted projection to the test set\nreduced_test = 
pca.transform(x_test)"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"10. Gradient Boosting and AdaBoost"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"python"},"content":[{"type":"text","text":"# Import modules\nfrom sklearn.ensemble import GradientBoostingClassifier\n\n# Create training and test subsets\nx_train = train_dataset_predictor_variables\ny_train = train_dataset_predicted_variable\n\nx_test = test_dataset_predictor_variables\n\n# Create Gradient Boosting Classifier object\n# (AdaBoostClassifier from sklearn.ensemble follows the same fit/predict pattern)\nmodel = GradientBoostingClassifier(n_estimators=100, learning_rate=1.0, max_depth=1, random_state=0)\n\n# Train the model with training data (features and labels)\nmodel.fit(x_train, y_train)\n\n# Make predictions\npredicted_values = model.predict(x_test)"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Our job is to turn each of these algorithm blocks into a project: define a business problem, preprocess the data, train the algorithm, tune the hyperparameters, and obtain verifiable results, iterating through this process until we reach satisfactory accuracy and the predictions we want."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Original article:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"https:\/\/levelup.gitconnected.com\/10-templates-for-building-machine-learning-models-with-notebook-282c4eb0987f"}]}]}
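The closing workflow (preprocess, train, tune hyperparameters, validate on held-out data) can be sketched end to end with one of the templates. This is a minimal illustration rather than the article's own code: the built-in iris dataset stands in for real project data, and the pipeline step names (`scale`, `knn`), the candidate grid for `n_neighbors`, and the 25% test split are all assumptions chosen for the example.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Load a small built-in dataset (a stand-in for real project data)
X, y = load_iris(return_X_y=True)

# Hold out a test set so the final score is measured on unseen data
x_train, x_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Put preprocessing and the model in one pipeline, so the scaler
# is fit only on the training folds during cross-validation
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("knn", KNeighborsClassifier()),
])

# Tune n_neighbors with 5-fold cross-validation on the training set
param_grid = {"knn__n_neighbors": [3, 5, 7, 9]}
search = GridSearchCV(pipeline, param_grid, cv=5)
search.fit(x_train, y_train)

# Evaluate the best pipeline on the held-out test set
best_k = search.best_params_["knn__n_neighbors"]
test_accuracy = search.score(x_test, y_test)
predicted_values = search.predict(x_test)

print("Best n_neighbors:", best_k)
print("Test accuracy:", round(test_accuracy, 3))
```

Scoring on `x_test` rather than `x_train` is the main difference from the templates above: a training-set score tends to overstate how well a model will do on new data.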