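Before the main text:
This post follows directly from the previous one,
"Deep Learning Notes: Code Practice Based on TensorFlow 2.2.0 (Lesson 1)".
It covers TensorFlow 2.0 code practice, following KGP Talkie's "TensorFlow 2.0" hands-on advanced tutorial, and fixes the tutorial code that no longer works.
This post follows Lesson 2 of that tutorial, which is very popular on YouTube (Chinese and English subtitles, with hands-on code).
Data required for the course: https://pan.baidu.com/s/1Lpo3l3UaPANOGE_HGJf2TQ
Extraction code: dqo4
Note: the data file must be placed in the Jupyter working directory.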
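How to build your first ANN
1 Process the data
2 Build the input layer
3 Randomly initialize the input weights W
4 Build the hidden layers
5 Choose the optimizer, loss, and accuracy metric
6 Compile the model
7 Train the model with model.fit
8 Evaluate the model
9 Tune the model if needed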
import tensorflow as tf
from tensorflow import keras
# Import from the public tensorflow.keras namespace rather than the private tensorflow.python.keras one
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Flatten, Dense
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
dataset = pd.read_csv('customer_Churn_Modelling.csv')
dataset.head()
   RowNumber  CustomerId   Surname  CreditScore  Geography  Gender  Age  Tenure    Balance  NumOfProducts  HasCrCard  IsActiveMember  EstimatedSalary  Exited
0          1    15634602  Hargrave          619     France  Female   42       2       0.00              1          1               1        101348.88       1
1          2    15647311      Hill          608      Spain  Female   41       1   83807.86              1          0               1        112542.58       0
2          3    15619304      Onio          502     France  Female   42       8  159660.80              3          1               0        113931.57       1
3          4    15701354      Boni          699     France  Female   39       1       0.00              2          0               0         93826.63       0
4          5    15737888  Mitchell          850      Spain  Female   43       2  125510.82              1          1               1         79084.10       0
X = dataset.drop(labels=['CustomerId', 'Surname', 'RowNumber', 'Exited'], axis=1)
y = dataset['Exited']
X.head()
   CreditScore  Geography  Gender  Age  Tenure    Balance  NumOfProducts  HasCrCard  IsActiveMember  EstimatedSalary
0          619     France  Female   42       2       0.00              1          1               1        101348.88
1          608      Spain  Female   41       1   83807.86              1          0               1        112542.58
2          502     France  Female   42       8  159660.80              3          1               0        113931.57
3          699     France  Female   39       1       0.00              2          0               0         93826.63
4          850      Spain  Female   43       2  125510.82              1          1               1         79084.10
y.head()
0 1
1 0
2 1
3 0
4 0
Name: Exited, dtype: int64
from sklearn.preprocessing import LabelEncoder
label1 = LabelEncoder()
X['Geography'] = label1.fit_transform(X['Geography'])
X.head()
   CreditScore  Geography  Gender  Age  Tenure    Balance  NumOfProducts  HasCrCard  IsActiveMember  EstimatedSalary
0          619          0  Female   42       2       0.00              1          1               1        101348.88
1          608          2  Female   41       1   83807.86              1          0               1        112542.58
2          502          0  Female   42       8  159660.80              3          1               0        113931.57
3          699          0  Female   39       1       0.00              2          0               0         93826.63
4          850          2  Female   43       2  125510.82              1          1               1         79084.10
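To see why France became 0 and Spain became 2 in the table above, here is a minimal pure-Python sketch of the mapping that LabelEncoder applies: it sorts the distinct labels and replaces each value with its position in that sorted list. The helper name `label_encode` and the sample list are illustrative assumptions, not part of scikit-learn or the course data. The third country, Germany, does not appear in the rows shown and is assumed here from the standard Churn_Modelling dataset.

```python
def label_encode(values):
    """Map each label to its index in the sorted list of distinct labels."""
    classes = sorted(set(values))
    index = {label: i for i, label in enumerate(classes)}
    return [index[v] for v in values], classes


# Hypothetical sample; "Germany" is assumed to be the third category.
sample = ["France", "Spain", "France", "Germany", "Spain"]
codes, classes = label_encode(sample)

print(classes)  # ['France', 'Germany', 'Spain']
print(codes)    # [0, 2, 0, 1, 2]
```

Because these integer codes suggest an ordering (France < Germany < Spain) that has no real meaning, a column with more than two categories is usually one-hot encoded afterwards.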
label2 = LabelEncoder()
# Use the second encoder for Gender so label1 keeps the Geography mapping
X['Gender'] = label2.fit_transform(X['Gender'])
X.head()
   CreditScore  Geography  Gender  Age  Tenure    Balance  NumOfProducts  HasCrCard  IsActiveMember  EstimatedSalary
0          619          0       0   42       2       0.00              1          1               1        101348.88
1          608          2       0   41       1   83807.86              1          0               1        112542.58
2          502          0       0   42       8  159660.80              3          1               0        113931.57
3          699          0       0   39       1       0.00              2          0               0         93826.63
4          850          2       0   43       2  125510.82              1          1               1         79084.10
X = pd.get_dummies(X, drop_first=True, columns=['Geography'])
X.head(30)
    CreditScore  Gender  Age  Tenure    Balance  NumOfProducts  HasCrCard  IsActiveMember  EstimatedSalary  Geography_1  Geography_2
 0          619       0   42       2       0.00              1          1               1        101348.88            0            0
 1          608       0   41       1   83807.86              1          0               1        112542.58            0            1
 2          502       0   42       8  159660.80              3          1               0        113931.57            0            0
 3          699       0   39       1       0.00              2          0               0         93826.63            0            0
 4          850       0   43       2  125510.82              1          1               1         79084.10            0            1
 5          645       1   44       8  113755.78              2          1               0        149756.71            0            1
 6          822       1   50       7       0.00              2          1               1         10062.80            0            0
 7          376       0   29       4  115046.74              4          1               0        119346.88            1            0
 8          501       1   44       4  142051.07              2          0               1         74940.50            0            0
 9          684       1   27       2  134603.88              1          1               1         71725.73            0            0
10          528       1   31       6  102016.72              2          0               0         80181.12            0            0
11          497       1   24       3       0.00              2          1               0         76390.01            0            1
12          476       0   34      10       0.00              2          1               0         26260.98            0            0
13          549       0   25       5       0.00              2          0               0        190857.79            0            0
14          635       0   35       7       0.00              2          1               1         65951.65            0            1
15          616       1   45       3  143129.41              2          0               1         64327.26            1            0
16          653       1   58       1  132602.88              1          1               0          5097.67            1            0
17          549       0   24       9       0.00              2          1               1         14406.41            0            1
18          587       1   45       6       0.00              1          0               0        158684.81            0            1
19          726       0   24       6       0.00              2          1               1         54724.03            0            0
20          732       1   41       8       0.00              2          1               1        170886.17            0            0
21          636       0   32       8       0.00              2          1               0        138555.46            0            1
22          510       0   38       4       0.00              1          1               0        118913.53            0            1
23          669       1   46       3       0.00              2          0               1          8487.75            0            0
24          846       0   38       5       0.00              1          1               1        187616.16            0            0
25          577       1   25       3       0.00              2          0               1        124508.29            0            0
26          756       1   36       2  136815.64              1          1               1        170041.95            1            0
27          571       1   44       9       0.00              2          0               0         38433.35            0            0
28          574       0   43       3  141349.43              1          1               1        100187.43            1            0
29          411       1   29       0   59697.17              2          1               1         53483.21            0            0
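The two Geography columns in this output come from one-hot encoding with the first level dropped. The sketch below reproduces that idea in plain Python so the mapping is easy to follow. The function name `one_hot_drop_first` and the sample codes are illustrative assumptions; this is not how pandas implements get_dummies internally.

```python
def one_hot_drop_first(codes, column="Geography"):
    """One-hot encode integer codes, dropping the column for the smallest code."""
    levels = sorted(set(codes))
    kept = levels[1:]  # the dropped first level becomes the all-zeros baseline
    names = [f"{column}_{level}" for level in kept]
    rows = [[1 if code == level else 0 for level in kept] for code in codes]
    return names, rows


# Hypothetical sample of label-encoded Geography values.
names, rows = one_hot_drop_first([0, 2, 0, 0, 2, 1])

print(names)  # ['Geography_1', 'Geography_2']
for row in rows:
    print(row)
```

Code 0 is represented by zeros in both columns, so no information is lost by dropping it, and the remaining columns are not perfectly collinear.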
Feature standardization
from sklearn.preprocessing import StandardScaler
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0, stratify=y)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
# Reuse the mean and standard deviation fitted on the training set; refitting on the test set would scale the two splits differently
X_test = scaler.transform(X_test)
y_test
1344 1
8167 0
4747 0
5004 1
3124 1
..
9107 0
8249 0
8337 0
6279 1
412 0
Name: Exited, Length: 2000, dtype: int64
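Standardization subtracts a column's mean and divides by its standard deviation. A scaler is normally fitted on the training split only and then reused on the test split. The sketch below shows that arithmetic with the standard library; the helper names and the small `train_age` and `test_age` lists are made up for illustration. It uses the population standard deviation, which is what StandardScaler computes.

```python
import statistics


def fit_scaler(column):
    """Return the mean and population standard deviation of a training column."""
    return statistics.fmean(column), statistics.pstdev(column)


def transform(column, mean, std):
    """Standardize values using statistics computed on the training data."""
    return [(x - mean) / std for x in column]


# Illustrative values only, not taken from the real split.
train_age = [42, 41, 42, 39, 43]
test_age = [44, 50]

mean, std = fit_scaler(train_age)
train_scaled = transform(train_age, mean, std)
test_scaled = transform(test_age, mean, std)  # reuses the training statistics

print(mean, std)
print(test_scaled)
```

After this step the training column has mean 0 and standard deviation 1, while the test column generally does not, which is expected.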
Building the ANN
model = Sequential()
model.add(Dense(X.shape[1], activation='relu', input_dim=X.shape[1]))
model.add(Dense(128, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
WARNING:tensorflow:From F:\Anaconda3\lib\site-packages\tensorflow\python\ops\resource_variable_ops.py:435: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
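As a rough check on the size of the network defined above, the number of trainable parameters in each Dense layer is inputs × units for the weights plus one bias per unit. The sketch below works this out by hand, assuming X has the 11 columns shown after one-hot encoding. The helper `dense_params` is a hypothetical name, and the total should agree with what model.summary() reports.

```python
def dense_params(n_inputs, n_units):
    """Weights plus biases of a fully connected layer."""
    return n_inputs * n_units + n_units


n_features = 11                    # assumed: columns of X after encoding
layer_units = [n_features, 128, 1]  # the three Dense layers, in order

total = 0
n_inputs = n_features
for units in layer_units:
    count = dense_params(n_inputs, units)
    print(f"Dense({units}): {count} parameters")
    total += count
    n_inputs = units  # this layer's outputs feed the next layer

print("total:", total)  # total: 1797
```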
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
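The binary_crossentropy loss chosen here penalizes a predicted probability p by -log(p) when the true label is 1 and by -log(1 - p) when it is 0, averaged over the samples. The sketch below computes it by hand for a few made-up predictions. The function name, the clipping constant `eps`, and the sample values are assumptions for illustration, not the Keras implementation.

```python
import math


def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean of -[y*log(p) + (1-y)*log(1-p)] over all samples."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)


y_true = [1, 0, 1, 0]
y_prob = [0.9, 0.2, 0.4, 0.1]  # made-up sigmoid outputs

loss = binary_crossentropy(y_true, y_prob)
print(round(loss, 4))
```

The third sample (true label 1, predicted 0.4) contributes most of the loss, which is the behaviour that pushes the sigmoid output toward the correct class during training.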
model.fit(X_train, y_train.to_numpy(), batch_size=10, epochs=10, verbose=1)
WARNING:tensorflow:From F:\Anaconda3\lib\site-packages\tensorflow\python\ops\math_ops.py:3066: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
Epoch 1/10
8000/8000 [==============================] - 1s 94us/sample - loss: 0.4515 - acc: 0.8049
Epoch 2/10
8000/8000 [==============================] - 1s 80us/sample - loss: 0.4185 - acc: 0.8202
Epoch 3/10
8000/8000 [==============================] - 1s 80us/sample - loss: 0.4057 - acc: 0.8324
Epoch 4/10
8000/8000 [==============================] - 1s 77us/sample - loss: 0.3752 - acc: 0.8431
Epoch 5/10
8000/8000 [==============================] - 1s 79us/sample - loss: 0.3507 - acc: 0.8571
Epoch 6/10
8000/8000 [==============================] - 1s 78us/sample - loss: 0.3415 - acc: 0.8591
Epoch 7/10
8000/8000 [==============================] - 1s 79us/sample - loss: 0.3363 - acc: 0.8620
Epoch 8/10
8000/8000 [==============================] - 1s 84us/sample - loss: 0.3345 - acc: 0.8619
Epoch 9/10
8000/8000 [==============================] - 1s 74us/sample - loss: 0.3328 - acc: 0.8602
Epoch 10/10
8000/8000 [==============================] - 1s 74us/sample - loss: 0.3302 - acc: 0.8626
<tensorflow.python.keras.callbacks.History at 0x1d77c75d248>
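# predict_classes was removed in TensorFlow 2.6; thresholding the sigmoid output at 0.5 gives the same class labels
y_pred = (model.predict(X_test) > 0.5).astype('int32')
y_pred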
array([[0],
[0],
[0],
...,
[0],
[1],
[0]])
y_test
1344 1
8167 0
4747 0
5004 1
3124 1
..
9107 0
8249 0
8337 0
6279 1
412 0
Name: Exited, Length: 2000, dtype: int64
model.evaluate(X_test, y_test.to_numpy())
2000/2000 [==============================] - 0s 34us/sample - loss: 0.3583 - acc: 0.8535
[0.3583366745710373, 0.8535]
from sklearn.metrics import confusion_matrix, accuracy_score
confusion_matrix(y_test, y_pred)
array([[1525, 68],
[ 225, 182]], dtype=int64)
accuracy_score(y_test, y_pred)
0.8535
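The accuracy above can be recovered directly from the confusion matrix, and the same four counts also give precision and recall for the churn class, which say more than accuracy alone on an imbalanced target. The sketch below does the arithmetic by hand using the matrix printed above (rows are true classes, columns are predicted classes, as scikit-learn lays it out). The variable names are illustrative.

```python
# Confusion matrix reported above: rows = true class 0/1, columns = predicted class 0/1
cm = [[1525, 68],
      [225, 182]]

(tn, fp), (fn, tp) = cm
total = tn + fp + fn + tp

accuracy = (tn + tp) / total
precision = tp / (tp + fp)  # of the customers predicted to leave, how many did
recall = tp / (tp + fn)     # of the customers who left, how many were found

print(f"accuracy={accuracy:.4f} precision={precision:.4f} recall={recall:.4f}")
# accuracy=0.8535 precision=0.7280 recall=0.4472
```

A recall of about 0.45 means the model finds fewer than half of the customers who actually left. Since 1593 of the 2000 test rows belong to class 0, always predicting "stays" would already score about 0.80 accuracy.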