Deep Learning Notes: Code Practice with TensorFlow 2.2.0 (Lesson 2)

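Before we begin:
This post follows directly from the previous one,
Deep Learning Notes: Code Practice with TensorFlow 2.2.0 (Lesson 1).
It is TensorFlow 2.0 code practice: I work through KGP Talkie's [TensorFlow 2.0] Hands-On Advanced Tutorial and fix the parts of the code that no longer run.
This post covers Lesson 2 of that tutorial, which is very popular on YouTube ([TensorFlow 2.0] Hands-On Advanced Tutorial, Chinese/English subtitles + code practice).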

Data for this lesson: https://pan.baidu.com/s/1Lpo3l3UaPANOGE_HGJf2TQ
Extraction code: dqo4
Note: place the data file in the Jupyter working directory.

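How to build your first ANN

1 Preprocess the data
2 Build the input layer
3 Randomly initialize the weights W
4 Build the hidden layers
5 Choose the optimizer, loss, and accuracy metric
6 Compile the model
7 Train the model with model.fit
8 Evaluate the model
9 Tune the model if needed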

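# Import TensorFlow and Keras (use the public tensorflow.keras path rather than the private tensorflow.python.keras one)
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Flatten, Dense
# Import the data-handling packages
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split  # splits the data into training and test sets
dataset = pd.read_csv('customer_Churn_Modelling.csv')  # read the data; the CSV must sit in the same directory as this notebook
dataset.head()  # preview the first rows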
RowNumber CustomerId Surname CreditScore Geography Gender Age Tenure Balance NumOfProducts HasCrCard IsActiveMember EstimatedSalary Exited
0 1 15634602 Hargrave 619 France Female 42 2 0.00 1 1 1 101348.88 1
1 2 15647311 Hill 608 Spain Female 41 1 83807.86 1 0 1 112542.58 0
2 3 15619304 Onio 502 France Female 42 8 159660.80 3 1 0 113931.57 1
3 4 15701354 Boni 699 France Female 39 1 0.00 2 0 0 93826.63 0
4 5 15737888 Mitchell 850 Spain Female 43 2 125510.82 1 1 1 79084.10 0
X = dataset.drop(labels=['CustomerId','Surname','RowNumber','Exited'],axis =1)  # drop the identifier columns and the target; what remains is the feature matrix X
y = dataset['Exited']  # target labels
X.head()
CreditScore Geography Gender Age Tenure Balance NumOfProducts HasCrCard IsActiveMember EstimatedSalary
0 619 France Female 42 2 0.00 1 1 1 101348.88
1 608 Spain Female 41 1 83807.86 1 0 1 112542.58
2 502 France Female 42 8 159660.80 3 1 0 113931.57
3 699 France Female 39 1 0.00 2 0 0 93826.63
4 850 Spain Female 43 2 125510.82 1 1 1 79084.10
y.head()
0    1
1    0
2    1
3    0
4    0
Name: Exited, dtype: int64
# Encode the categorical columns
# Convert the strings in Geography (country) and Gender to numbers
from sklearn.preprocessing import LabelEncoder
label1 = LabelEncoder()
X['Geography'] = label1.fit_transform(X['Geography'])  # encode the country names as integers with LabelEncoder
X.head()
CreditScore Geography Gender Age Tenure Balance NumOfProducts HasCrCard IsActiveMember EstimatedSalary
0 619 0 Female 42 2 0.00 1 1 1 101348.88
1 608 2 Female 41 1 83807.86 1 0 1 112542.58
2 502 0 Female 42 8 159660.80 3 1 0 113931.57
3 699 0 Female 39 1 0.00 2 0 0 93826.63
4 850 2 Female 43 2 125510.82 1 1 1 79084.10
label2 = LabelEncoder()
X['Gender'] = label2.fit_transform(X['Gender'])  # encode gender as integers with its own LabelEncoder
X.head()
CreditScore Geography Gender Age Tenure Balance NumOfProducts HasCrCard IsActiveMember EstimatedSalary
0 619 0 0 42 2 0.00 1 1 1 101348.88
1 608 2 0 41 1 83807.86 1 0 1 112542.58
2 502 0 0 42 8 159660.80 3 1 0 113931.57
3 699 0 0 39 1 0.00 2 0 0 93826.63
4 850 2 0 43 2 125510.82 1 1 1 79084.10
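To make the encoding step less of a black box, here is a minimal plain-Python sketch of what a label encoder does: it collects the distinct labels, sorts them, and maps each one to its position. The helper names `fit_label_encoder` and `encode` are hypothetical, and the sample values are illustrative rather than taken from the CSV.

```python
def fit_label_encoder(values):
    """Map each distinct label to an integer, following the sorted order of the labels."""
    classes = sorted(set(values))
    return {label: index for index, label in enumerate(classes)}


def encode(values, mapping):
    """Replace every label with its integer code."""
    return [mapping[value] for value in values]


# Illustrative sample, not the real column
geography = ["France", "Spain", "France", "Germany", "Spain"]
mapping = fit_label_encoder(geography)
encoded = encode(geography, mapping)

print(mapping)  # {'France': 0, 'Germany': 1, 'Spain': 2}
print(encoded)  # [0, 2, 0, 1, 2]
```

Because the codes follow alphabetical order, France becomes 0 and Spain becomes 2, which matches the Geography column shown above.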

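# One-hot encode the country: each indicator column is 1 for that country and 0 otherwise; drop_first=True drops the first category, which becomes the all-zero baseline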
X = pd.get_dummies(X, drop_first=True, columns=['Geography'])
X.head(30)
CreditScore Gender Age Tenure Balance NumOfProducts HasCrCard IsActiveMember EstimatedSalary Geography_1 Geography_2
0 619 0 42 2 0.00 1 1 1 101348.88 0 0
1 608 0 41 1 83807.86 1 0 1 112542.58 0 1
2 502 0 42 8 159660.80 3 1 0 113931.57 0 0
3 699 0 39 1 0.00 2 0 0 93826.63 0 0
4 850 0 43 2 125510.82 1 1 1 79084.10 0 1
5 645 1 44 8 113755.78 2 1 0 149756.71 0 1
6 822 1 50 7 0.00 2 1 1 10062.80 0 0
7 376 0 29 4 115046.74 4 1 0 119346.88 1 0
8 501 1 44 4 142051.07 2 0 1 74940.50 0 0
9 684 1 27 2 134603.88 1 1 1 71725.73 0 0
10 528 1 31 6 102016.72 2 0 0 80181.12 0 0
11 497 1 24 3 0.00 2 1 0 76390.01 0 1
12 476 0 34 10 0.00 2 1 0 26260.98 0 0
13 549 0 25 5 0.00 2 0 0 190857.79 0 0
14 635 0 35 7 0.00 2 1 1 65951.65 0 1
15 616 1 45 3 143129.41 2 0 1 64327.26 1 0
16 653 1 58 1 132602.88 1 1 0 5097.67 1 0
17 549 0 24 9 0.00 2 1 1 14406.41 0 1
18 587 1 45 6 0.00 1 0 0 158684.81 0 1
19 726 0 24 6 0.00 2 1 1 54724.03 0 0
20 732 1 41 8 0.00 2 1 1 170886.17 0 0
21 636 0 32 8 0.00 2 1 0 138555.46 0 1
22 510 0 38 4 0.00 1 1 0 118913.53 0 1
23 669 1 46 3 0.00 2 0 1 8487.75 0 0
24 846 0 38 5 0.00 1 1 1 187616.16 0 0
25 577 1 25 3 0.00 2 0 1 124508.29 0 0
26 756 1 36 2 136815.64 1 1 1 170041.95 1 0
27 571 1 44 9 0.00 2 0 0 38433.35 0 0
28 574 0 43 3 141349.43 1 1 1 100187.43 1 0
29 411 1 29 0 59697.17 2 1 1 53483.21 0 0
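The table above has only two indicator columns for three countries. The sketch below, in plain Python, shows the idea behind one-hot encoding with the first category dropped. The helper `one_hot_drop_first` is a hypothetical name, and the input codes are illustrative.

```python
def one_hot_drop_first(codes, prefix):
    """One-hot encode integer codes, dropping the first (smallest) category."""
    categories = sorted(set(codes))
    kept = categories[1:]  # the dropped category is represented by all zeros
    return [
        {f"{prefix}_{category}": int(code == category) for category in kept}
        for code in codes
    ]


# Illustrative codes: 0, 1 and 2 stand for the three encoded countries
geography_codes = [0, 2, 0, 1, 2]
rows = one_hot_drop_first(geography_codes, "Geography")
for row in rows:
    print(row)
```

A row with code 0 comes out as `Geography_1 = 0, Geography_2 = 0`, so no information is lost by dropping that column. Note that in pandas 2.0 and later `get_dummies` returns boolean columns by default; passing `dtype=int` should give the 0/1 output shown here.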

Feature standardization

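# Standardize with scikit-learn's built-in preprocessing module
from sklearn.preprocessing import StandardScaler
X_train, X_test,y_train,y_test = train_test_split(X,y,test_size = 0.2, random_state = 0, stratify = y)  # hold out 20% as the test set; random_state=0 fixes the shuffle so the split is reproducible, and stratify=y keeps the class ratio of y in both splits to avoid an unbalanced split
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)  # learn the mean and standard deviation from the training set, then standardize it
X_test = scaler.transform(X_test)  # standardize the test set with the training-set statistics; refitting on the test set would leak test information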
y_test
1344    1
8167    0
4747    0
5004    1
3124    1
       ..
9107    0
8249    0
8337    0
6279    1
412     0
Name: Exited, Length: 2000, dtype: int64
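Standardization subtracts a column's mean and divides by its standard deviation. A common convention is to learn both statistics from the training split only and reuse them on the test split. The sketch below shows this for a single column in plain Python. The helpers `fit_scaler` and `transform` are hypothetical names, and the handful of Age values stands in for the real splits.

```python
import statistics


def fit_scaler(column):
    """Return the mean and population standard deviation of a training column."""
    mean = statistics.fmean(column)
    std = statistics.pstdev(column)  # population std (divide by n), as StandardScaler does
    return mean, std


def transform(column, mean, std):
    """Standardize a column with previously fitted statistics."""
    return [(x - mean) / std for x in column]


# A few Age values used as stand-ins for the real training and test splits
train_age = [42.0, 41.0, 42.0, 39.0, 43.0]
test_age = [44.0, 50.0, 29.0]

mean, std = fit_scaler(train_age)
train_scaled = transform(train_age, mean, std)
test_scaled = transform(test_age, mean, std)  # reuses the training statistics

print(mean, std)
print(train_scaled)
print(test_scaled)
```

After the transform the training column has mean 0 and standard deviation 1. The test column generally does not, and that is expected, because it is measured on the training set's scale.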

Building the ANN

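model = Sequential()  # sequential model: a linear stack of layers
model.add(Dense(X.shape[1],activation='relu',input_dim = X.shape[1]))  # first layer; input_dim = X.shape[1] sets the input size to the number of features
model.add(Dense(128,activation = 'relu'))  # hidden layer
model.add(Dense(1,activation = 'sigmoid'))  # output layer; the sigmoid outputs the probability of class 1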
WARNING:tensorflow:From F:\Anaconda3\lib\site-packages\tensorflow\python\ops\resource_variable_ops.py:435: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
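As a quick sanity check on the size of this network, the number of trainable parameters can be worked out by hand: a dense layer has one weight per input-unit pair plus one bias per unit. The sketch below assumes 11 input features, which is the number of columns in the feature table above. `dense_params` is a hypothetical helper name.

```python
def dense_params(inputs, units):
    """Trainable parameters of a dense layer: a weight matrix plus one bias per unit."""
    return inputs * units + units


n_features = 11                     # columns of X after one-hot encoding (assumed from the table above)
layer_units = [n_features, 128, 1]  # units of the three Dense layers, in order

counts = []
inputs = n_features
for units in layer_units:
    counts.append(dense_params(inputs, units))
    inputs = units                  # this layer's output feeds the next layer

total = sum(counts)
print(counts)  # [132, 1536, 129]
print(total)   # 1797
```

If the feature count is right, `model.summary()` should report the same per-layer figures.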
model.compile(optimizer = 'adam',loss ='binary_crossentropy',metrics=['accuracy'])  # Adam optimizer (a stochastic gradient method), binary cross-entropy loss, accuracy as the metric
model.fit(X_train,y_train.to_numpy(),batch_size=10,epochs=10,verbose=1)
WARNING:tensorflow:From F:\Anaconda3\lib\site-packages\tensorflow\python\ops\math_ops.py:3066: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
Epoch 1/10
8000/8000 [==============================] - 1s 94us/sample - loss: 0.4515 - acc: 0.8049
Epoch 2/10
8000/8000 [==============================] - 1s 80us/sample - loss: 0.4185 - acc: 0.8202
Epoch 3/10
8000/8000 [==============================] - 1s 80us/sample - loss: 0.4057 - acc: 0.8324
Epoch 4/10
8000/8000 [==============================] - 1s 77us/sample - loss: 0.3752 - acc: 0.8431
Epoch 5/10
8000/8000 [==============================] - 1s 79us/sample - loss: 0.3507 - acc: 0.8571
Epoch 6/10
8000/8000 [==============================] - 1s 78us/sample - loss: 0.3415 - acc: 0.8591
Epoch 7/10
8000/8000 [==============================] - 1s 79us/sample - loss: 0.3363 - acc: 0.8620
Epoch 8/10
8000/8000 [==============================] - 1s 84us/sample - loss: 0.3345 - acc: 0.8619
Epoch 9/10
8000/8000 [==============================] - 1s 74us/sample - loss: 0.3328 - acc: 0.8602
Epoch 10/10
8000/8000 [==============================] - 1s 74us/sample - loss: 0.3302 - acc: 0.8626





<tensorflow.python.keras.callbacks.History at 0x1d77c75d248>
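The loss column in the training log is the binary cross-entropy between the labels and the sigmoid output. The sketch below computes it by hand for three made-up samples. The logits and labels are illustrative, and the clipping constant `eps` is an assumption added for numerical safety.

```python
import math


def sigmoid(z):
    """Squash a real-valued score into a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean binary cross-entropy; probabilities are clipped so log(0) cannot occur."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


# Made-up raw scores and labels for three samples
logits = [2.0, -1.0, 0.0]
labels = [1, 0, 1]

probs = [sigmoid(z) for z in logits]
loss = binary_crossentropy(labels, probs)

print(probs)
print(loss)
```

For reference, a model that always predicts 0.5 has a loss of ln 2, about 0.693, so the final training loss of about 0.33 in the log above is well below that baseline.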
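y_pred = (model.predict(X_test) > 0.5).astype('int32')  # threshold the sigmoid output at 0.5; model.predict_classes did the same but was removed in later TensorFlow releases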
y_pred
array([[0],
       [0],
       [0],
       ...,
       [0],
       [1],
       [0]])
y_test
1344    1
8167    0
4747    0
5004    1
3124    1
       ..
9107    0
8249    0
8337    0
6279    1
412     0
Name: Exited, Length: 2000, dtype: int64
model.evaluate(X_test, y_test.to_numpy())  # measure the trained model's loss and accuracy on the test set
2000/2000 [==============================] - 0s 34us/sample - loss: 0.3583 - acc: 0.8535





[0.3583366745710373, 0.8535]
# Another way to compute the accuracy
from sklearn.metrics import confusion_matrix, accuracy_score
confusion_matrix(y_test,y_pred)
array([[1525,   68],
       [ 225,  182]], dtype=int64)
accuracy_score(y_test,y_pred)
0.8535
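The confusion matrix carries more information than the single accuracy figure. The sketch below recomputes the accuracy from the matrix printed above and also derives precision and recall for the positive class (Exited = 1). It relies on scikit-learn's layout, where rows are true labels and columns are predicted labels.

```python
# Confusion matrix from the output above: rows = true class, columns = predicted class
cm = [[1525, 68],
      [225, 182]]

(tn, fp), (fn, tp) = cm
total = tn + fp + fn + tp

accuracy = (tn + tp) / total   # share of all predictions that are correct
precision = tp / (tp + fp)     # of the customers predicted to exit, how many did
recall = tp / (tp + fn)        # of the customers who exited, how many were caught

print(accuracy)          # 0.8535
print(precision)         # 0.728
print(round(recall, 4))  # 0.4472
```

The accuracy matches `accuracy_score`, but the recall shows that the model finds fewer than half of the customers who actually left, which is worth keeping in mind when tuning the model.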
