I. Code:
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
network_shape=[1,5,10,1]
learning_rate=0.1
display_step=500
num_steps=5000
x_dot=np.linspace(1,2,300,dtype=np.float32)[:,np.newaxis]
y_dot=2*np.power(x_dot,3)+np.power(x_dot,2)+np.random.normal(0,0.5,x_dot.shape)
X_p=tf.placeholder(dtype=tf.float32,shape=[None,network_shape[0]],name="input")
Y_p=tf.placeholder(dtype=tf.float32,shape=[None,network_shape[-1]],name="output")
w={"w1":tf.Variable(tf.random_normal([network_shape[0],network_shape[1]])),
   "w2":tf.Variable(tf.random_normal([network_shape[1],network_shape[2]])),
   "out":tf.Variable(tf.random_normal([network_shape[2],network_shape[3]]))}
b={"b1":tf.Variable(tf.random_normal([network_shape[1]])),
   "b2":tf.Variable(tf.random_normal([network_shape[2]])),
   "out":tf.Variable(tf.random_normal([network_shape[3]]))}
def network(x):
    layer1=tf.nn.relu(tf.matmul(x,w['w1'])+b['b1'])
    layer2=tf.nn.relu(tf.matmul(layer1,w['w2'])+b['b2'])
    output=tf.matmul(layer2,w['out'])+b['out']
    return output
prediction=network(X_p)
loss = tf.reduce_mean(tf.reduce_sum(tf.square(Y_p-prediction), reduction_indices=[1]))
train_step=tf.train.AdamOptimizer(learning_rate).minimize(loss)
init=tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    Plt=plt.figure().add_subplot(1, 1, 1)
    Plt.scatter(x_dot,y_dot)
    plt.ion()  # switch matplotlib to interactive mode, so the script keeps running even after plt.show()
    plt.show()
    for i in range(1,num_steps+1):
        _,Loss=sess.run([train_step,loss], feed_dict={X_p: x_dot, Y_p: y_dot})
        if i%display_step ==0 or i ==1:
            print("epoch : ",i,"loss = ",Loss)
            prediction_value=sess.run(prediction,feed_dict={X_p:x_dot})  # shape=(300,1)
            if i !=1:
                lines[0].remove()  # remove the curve drawn last time
            # try:
            #     lines[0].remove()
            # except Exception:
            #     pass
            lines=Plt.plot(x_dot,prediction_value)
            plt.pause(1)  # pause so the figure can redraw; otherwise drawing is too fast and the window closes as soon as it finishes
II. Pitfalls I stepped into:
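1. Shape of the placeholders
network_shape=[1,5,10,1] defines the network structure (1 input, hidden layers of 5 and 10 units, 1 output), and x_dot shows that there are 300 points. Training feeds all 300 points into the network at once, so the input layer must accept input of shape [None,1]; likewise, the output has shape [None,1]. These are the shapes of the two placeholders.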
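The shape bookkeeping can be checked without TensorFlow. What follows is a minimal NumPy sketch, not the TensorFlow graph itself: the helper name forward, the weights and biases lists, and the fixed random seed are assumptions made for illustration. It pushes batches of different sizes through matrices sized from network_shape and shows that the batch dimension (the None in the placeholder) passes through unchanged.

```python
import numpy as np

network_shape = [1, 5, 10, 1]
rng = np.random.default_rng(0)  # fixed seed, an assumption for reproducibility

# One weight matrix and one bias vector per pair of adjacent layers.
weights = [rng.normal(size=(n_in, n_out))
           for n_in, n_out in zip(network_shape[:-1], network_shape[1:])]
biases = [rng.normal(size=(n_out,)) for n_out in network_shape[1:]]

def forward(x):
    """Hypothetical NumPy stand-in for network(x): ReLU on hidden layers, linear output."""
    h = x
    for k, (W, b) in enumerate(zip(weights, biases)):
        h = h @ W + b
        if k < len(weights) - 1:
            h = np.maximum(h, 0.0)  # ReLU on every layer except the last
    return h

# The batch size can be anything; only the last dimension is fixed by the network.
for batch_size in (1, 7, 300):
    x = np.linspace(1, 2, batch_size, dtype=np.float32)[:, np.newaxis]
    print(x.shape, "->", forward(x).shape)
```

Each printed pair has the same first dimension on both sides, which is what shape=[None,1] expresses for both placeholders.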
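2. Shape of the input data
From item 1, the network input has shape [None,1], so the 300 input points must have shape [300,1] rather than [300]. That is why [:,np.newaxis] is used to add a dimension.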
a=np.array([1,3,5])
print(a,a.shape)
print(a[:,np.newaxis],a[:,np.newaxis].shape)
[1 3 5] (3,)
[[1]
 [3]
 [5]] (3, 1)
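The extra dimension also matters outside the placeholder: NumPy does not complain when a (3,) array meets a (3,1) array, it silently broadcasts them into a matrix. The sketch below, with illustrative variable names that are not part of the original code, shows three equivalent ways to add the dimension and what happens when the two shapes are mixed.

```python
import numpy as np

a = np.array([1, 3, 5])

# Three equivalent ways to turn shape (3,) into shape (3, 1).
col1 = a[:, np.newaxis]
col2 = a.reshape(-1, 1)
col3 = np.expand_dims(a, axis=1)
print(col1.shape, col2.shape, col3.shape)  # (3, 1) (3, 1) (3, 1)

# Mixing the two shapes raises no error; it broadcasts to a 3x3 matrix.
diff = a - col1
print(diff.shape)  # (3, 3)
```

So if targets and predictions end up with different ranks, a loss computed from their difference can be wrong without any error being raised.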
3. Mean squared error
loss = tf.reduce_mean(tf.reduce_sum(tf.square(Y_p-prediction), reduction_indices=[1]))
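This measures the difference between the actual y_dot and the network's prediction, comparing two arrays that both have shape (300,1). The name maps directly onto the code: mean (reduce_mean, reduce_sum), squared (square), error (-). Take care not to leave out reduce_sum: with a single output column it only drops the last axis, but with several outputs it adds up the squared errors of each sample.
reduction_indices selects the axis along which the function reduces (it is the older name for the axis argument). reduction_indices=[1] reduces along axis 1, i.e. within each row; reduction_indices=[0] reduces along axis 0, i.e. down each column. Here the squared errors are first summed within each row, and the mean of those sums is taken afterwards.
For example: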
import tensorflow as tf
import numpy as np
a=np.array([1.,3.,5.])[:,np.newaxis]
b=np.array([3.,4.,5.])[:,np.newaxis]
with tf.Session() as sess:
    c=sess.run(tf.square(a-b))
    print(c,c.shape)
    re=sess.run(tf.reduce_sum(tf.square(a-b)))
    print('re :',re)
    re2 = sess.run(tf.reduce_sum(tf.square(a - b),reduction_indices=[1]))
    print('re2 : ' , re2.shape,re2)
    re3 = sess.run(tf.reduce_mean(tf.reduce_sum(tf.square(a - b), reduction_indices=[1])))
    print('re3 : ', re3)
    re4 = sess.run(tf.reduce_sum(tf.square(a - b), reduction_indices=[0]))
    print('re4 : ', re4.shape, re4)
[[4.]
 [1.]
 [0.]] (3, 1)
re : 5.0
re2 : (3,) [4. 1. 0.]
re3 : 1.6666666666666667
re4 : (1,) [5.]
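With only one output column, the row-wise sum has nothing to add up, so its effect is easy to miss. The NumPy sketch below uses made-up targets and predictions with two outputs per sample (y_true and y_pred are illustrative names, not from the original code) to show how summing along axis 1 before averaging differs from averaging over every entry.

```python
import numpy as np

# Made-up targets and predictions: 3 samples, 2 outputs each.
y_true = np.array([[1., 2.], [3., 4.], [5., 6.]])
y_pred = np.array([[3., 2.], [4., 6.], [5., 6.]])

sq_err = np.square(y_true - y_pred)   # element-wise squared error, shape (3, 2)
per_sample = np.sum(sq_err, axis=1)   # sum across the outputs of each sample, shape (3,)
loss = np.mean(per_sample)            # average of the per-sample sums

print(per_sample)        # [4. 5. 0.]
print(loss)              # 3.0
print(np.mean(sq_err))   # 1.5, the mean over all 6 entries, a different quantity
```

The two results differ by a factor equal to the number of outputs, so the choice mainly rescales the loss and its gradients.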
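4. init
Create the initializer op with tf.global_variables_initializer() only after every variable has been defined. The op covers only the variables that exist at the moment it is created, so a variable defined afterwards stays uninitialized and raises an error when it is used.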
III. Reference links:
https://blog.csdn.net/xierhacker/article/details/53697515
https://blog.csdn.net/williamyi96/article/details/78313924
https://blog.csdn.net/shine19930820/article/details/78359249
https://blog.csdn.net/qq_34562093/article/details/80611611 reduction_indices