Python in Practice: Bank Customer Churn Prediction Explained

Project Introduction

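In this project we build a bank customer churn prediction model. Let's look at the data first. It is stored in two files: 'Churn-Modelling.csv' holds the training data and 'Churn-Modelling-Test-Data.csv' holds the test data. The fields are described below: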

The data is real data from an overseas source, anonymized before release.

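RowNumber: row number
CustomerID: customer ID
Surname: customer surname
CreditScore: credit score
Geography: customer's country/region
Gender: customer's gender
Age: age
Tenure: number of years as a customer of this bank
Balance: deposit and loan balance
NumOfProducts: number of bank products used
HasCrCard: whether the customer holds this bank's credit card
IsActiveMember: whether the customer is an active member
EstimatedSalary: estimated salary
Exited: whether the customer has churned; this is our label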

First, import some commonly used modules:

import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn import neighbors
from sklearn.metrics import classification_report
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import LabelEncoder

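Next, read the data with numpy. Because some columns contain strings, set dtype to str when reading. (Older code used the np.str alias here, but it was removed in NumPy 1.24; the built-in str behaves the same.)

train_data = np.genfromtxt('Churn-Modelling.csv', delimiter=',', dtype=str)
test_data = np.genfromtxt('Churn-Modelling-Test-Data.csv', delimiter=',', dtype=str)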
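To get a feel for what genfromtxt returns with a string dtype, here is a minimal sketch that reads a tiny in-memory CSV instead of the real file. The three lines of sample text and the variable names sample_csv and sample are made up for illustration; only the genfromtxt call mirrors the one above.

```python
import io
import numpy as np

# Hypothetical three-line CSV standing in for 'Churn-Modelling.csv'.
sample_csv = io.StringIO(
    "RowNumber,CustomerId,Surname,CreditScore,Geography,Exited\n"
    "1,15634602,Hargrave,619,France,1\n"
    "2,15647311,Hill,608,Spain,0\n"
)

sample = np.genfromtxt(sample_csv, delimiter=',', dtype=str)

print(sample.shape)  # the header line is read as an ordinary row
print(sample[0])     # the header row, as strings
print(sample[1])     # the first customer, also all strings
```

Since the header comes back as row 0 and every cell is a string, the later steps have to skip that row and convert types explicitly.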

Split the data: skip the header row; every column except the last is a feature, and the last column is the label.

x_train = train_data[1:,:-1]
y_train = train_data[1:,-1]
x_test = test_data[1:,:-1]
y_test = test_data[1:,-1]

Columns 0, 1, and 2 are the row number, customer ID, and surname. These three should have little effect on the outcome, so we can drop them.

x_train = np.delete(x_train,[0,1,2],axis=1)
x_test = np.delete(x_test,[0,1,2],axis=1)
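The slicing and deletion steps change which index each feature sits at, which is easy to lose track of. The sketch below replays both steps on a made-up miniature table (the array mini and its two customer rows are hypothetical) and prints the resulting index of each remaining column.

```python
import numpy as np

# Hypothetical miniature of the table: header row plus two customers.
mini = np.array([
    ["RowNumber", "CustomerId", "Surname", "CreditScore", "Geography", "Gender", "Exited"],
    ["1", "15634602", "Hargrave", "619", "France", "Female", "1"],
    ["2", "15647311", "Hill", "608", "Spain", "Male", "0"],
])

header = mini[0, :-1]  # feature names only (label column excluded)
x = mini[1:, :-1]      # drop the header row, keep all but the last column
y = mini[1:, -1]       # the last column is the label

# Remove the three identifier columns from both the names and the features.
header = np.delete(header, [0, 1, 2])
x = np.delete(x, [0, 1, 2], axis=1)

for idx, name in enumerate(header):
    print(idx, name)
```

In this miniature, CreditScore ends up at index 0, Geography at 1, and Gender at 2.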

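After dropping those three columns, columns 1 and 2 are now the country/region and gender. Both are strings and must be converted to numbers before a model can be built.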

labelencoder1 = LabelEncoder()
x_train[:,1] = labelencoder1.fit_transform(x_train[:,1])
x_test[:,1] = labelencoder1.transform(x_test[:,1])
labelencoder2 = LabelEncoder()
x_train[:,2] = labelencoder2.fit_transform(x_train[:,2])
x_test[:,2] = labelencoder2.transform(x_test[:,2])
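As a rough illustration of what LabelEncoder does here, the sketch below encodes a few made-up Geography values. The country names and the variables train_geo and test_geo are assumptions for the example; the real column may hold different values.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder

# Hypothetical Geography values; the real column may contain other countries.
train_geo = np.array(["France", "Spain", "Germany", "France"])
test_geo = np.array(["Germany", "Spain"])

encoder = LabelEncoder()
train_codes = encoder.fit_transform(train_geo)  # learn the mapping on training data
test_codes = encoder.transform(test_geo)        # reuse the same mapping on test data

print(encoder.classes_)  # categories in sorted order; the position is the code
print(train_codes)
print(test_codes)
```

Calling transform on the test set, rather than fit_transform, keeps the codes consistent between the two files. One caveat worth keeping in mind: integer codes imply an order between categories, which a distance-based model such as KNN will treat as meaningful, so one-hot encoding is a common alternative for columns like Geography.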

Because the data was read in as strings, convert everything from string to float before training.

x_train = x_train.astype(np.float32)
x_test = x_test.astype(np.float32)
y_train = y_train.astype(np.float32)
y_test = y_test.astype(np.float32)
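The conversion only works once every cell holds numeric text, which is why the label encoding has to come first. A small sketch with made-up values (the arrays row and the flag failed are illustrative only):

```python
import numpy as np

# Hypothetical feature row after label encoding: every cell is still a string.
row = np.array(["619", "0", "1", "42", "83807.86"])
row_float = row.astype(np.float32)
print(row_float)

# A cell that was never encoded cannot be parsed as a number.
failed = False
try:
    np.array(["619", "France"]).astype(np.float32)
except ValueError as err:
    failed = True
    print("conversion failed:", err)
```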

Then standardize the data:

sc = StandardScaler()
x_train = sc.fit_transform(x_train)
x_test = sc.transform(x_test)
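Note that the scaler is fitted on the training set only and then reused on the test set. The sketch below shows the effect on a single made-up feature; the numbers and the names train and test are assumptions for illustration.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical single feature (for example Age) for a few customers.
train = np.array([[20.0], [30.0], [40.0]])
test = np.array([[30.0], [50.0]])

scaler = StandardScaler()
train_scaled = scaler.fit_transform(train)  # learns mean and std from train only
test_scaled = scaler.transform(test)        # applies the train statistics to test

print(scaler.mean_, scaler.scale_)
print(train_scaled.ravel())
print(test_scaled.ravel())
```

The scaled training column has zero mean, while the test column generally does not, because it is shifted and scaled by the training statistics. That is intended: the model should see test data through the same transformation it was trained with.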

Build a KNN model and evaluate it on the test set:

knn = neighbors.KNeighborsClassifier(n_neighbors=5)
knn.fit(x_train, y_train)
predictions = knn.predict(x_test)
print(classification_report(y_test, predictions))

              precision    recall  f1-score   support

         0.0       0.80      0.95      0.87       740
         1.0       0.69      0.33      0.45       260

   micro avg       0.79      0.79      0.79      1000
   macro avg       0.75      0.64      0.66      1000
weighted avg       0.77      0.79      0.76      1000
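To read the KNN report, it helps to see how the summary numbers follow from the per-class rows. The sketch below recomputes the F1 score for the churned class (1.0) and the two averaged F1 scores, using only the rounded values printed in the report, so small rounding differences are possible in general.

```python
# Values copied from the KNN report above (already rounded to two decimals).
precision_1, recall_1, support_1 = 0.69, 0.33, 260
f1_0, support_0 = 0.87, 740

# F1 is the harmonic mean of precision and recall.
f1_1 = 2 * precision_1 * recall_1 / (precision_1 + recall_1)

# Macro average: plain mean over classes. Weighted average: mean weighted by support.
macro_f1 = (f1_0 + f1_1) / 2
weighted_f1 = (f1_0 * support_0 + f1_1 * support_1) / (support_0 + support_1)

print(round(f1_1, 2), round(macro_f1, 2), round(weighted_f1, 2))  # 0.45 0.66 0.76
```

The recall of 0.33 for class 1.0 means the model caught roughly 86 of the 260 customers who actually churned, even though overall accuracy looks reasonable at 0.79.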

Build an MLP model and evaluate it on the test set:

mlp = MLPClassifier(hidden_layer_sizes=(20,10) ,max_iter=500)
mlp.fit(x_train,y_train)
predictions = mlp.predict(x_test)
print(classification_report(y_test, predictions))

              precision    recall  f1-score   support

         0.0       0.82      0.96      0.88       740
         1.0       0.77      0.38      0.51       260

   micro avg       0.81      0.81      0.81      1000
   macro avg       0.79      0.67      0.70      1000
weighted avg       0.80      0.81      0.79      1000
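MLPClassifier initializes its weights randomly, so rerunning the MLP code may give numbers slightly different from the report shown. A minimal sketch of making runs repeatable with the random_state parameter follows; it uses synthetic data from make_classification as a stand-in for the bank dataset, and the helper fit_predict is a hypothetical name.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in data; this is not the bank dataset.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

def fit_predict(seed):
    """Train the same network architecture with a fixed seed and predict on X."""
    model = MLPClassifier(hidden_layer_sizes=(20, 10), max_iter=500, random_state=seed)
    model.fit(X, y)
    return model.predict(X)

# Two runs with the same seed should produce identical predictions.
same_seed_match = np.array_equal(fit_predict(0), fit_predict(0))
print(same_seed_match)
```

Fixing the seed does not make the model better; it only makes a comparison such as KNN versus MLP reproducible from one run to the next.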

Project Package

Baidu Netdisk
Password: 4t6k
