TensorFlow Natural Language Processing: Reading Notes 24-30 0828

Original

2019-09-03 09:58

This post records my own learning process.

1. Inputs, Variables, Outputs, and Operations

import tensorflow as tf
import numpy as np
import os

# Defining the graph and session

graph = tf.Graph() # Creates a graph
session = tf.InteractiveSession(graph=graph) # Creates a session

# Building the graph

# A placeholder is a symbolic input
x = tf.placeholder(shape=[1,10],dtype=tf.float32,name='x') 

# Variable
W = tf.Variable(tf.random_uniform(shape=[10,5], minval=-0.1, maxval=0.1, dtype=tf.float32),name='W') 
# Variable
b = tf.Variable(tf.zeros(shape=[5],dtype=tf.float32),name='b') 

h = tf.nn.sigmoid(tf.matmul(x,W) + b) # Operation to be performed

# Executing operations and evaluating nodes in the graph
tf.global_variables_initializer().run() # Initialize the variables

# Run the operation by providing a value to the symbolic input x
h_eval = session.run(h,feed_dict={x: np.random.rand(1,10)}) 

print(h_eval)
session.close() # Frees all the resources associated with the session

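Take ordering at a restaurant as an analogy.
Each node in the graph corresponds to a dish you order, and all the dishes you order (the order as a whole) make up the graph.
The waiter corresponds to the session: he relays your order to the kitchen, and the session ends when you leave.

After receiving the order, the waiter passes it to the kitchen manager, who corresponds to the master server in distributed TensorFlow.
The kitchen manager assigns the work to two workers: worker 1 is the head chef (the operation executor) and worker 2 is the cook (the parameter server).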

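In the graph, inputs (the independent variables) are defined with tf.placeholder. Parameters are defined with tf.Variable (mutable) and constants with tf.constant (immutable). Operations are defined with functions such as tf.nn.sigmoid. tf.cast converts between types, for example: tf.cast(x,dtype=tf.float32)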

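The overall execution flow:

First create the graph and the session, then define the inputs, the parameters and constants, and the operations you need, and finally call session.run to obtain the output.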
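
To make the "define first, run later" flow concrete, here is a minimal plain-Python sketch of a deferred computation graph. It is not TensorFlow: the names Node, placeholder, variable, matmul, add, sigmoid, and run are hypothetical helpers written for this note, and the weights are set to zero (an assumption, unlike the random initialization in the code above) so the result is easy to check by hand.

```python
import math

class Node:
    """A graph node: stores an operation and its inputs, computes nothing until run()."""
    def __init__(self, op, inputs=(), value=None, name=None):
        self.op = op
        self.inputs = inputs
        self.value = value
        self.name = name

# Graph-building helpers (hypothetical, loosely mirroring the TensorFlow names)
def placeholder(name):
    return Node('placeholder', name=name)

def variable(value, name):
    return Node('variable', value=value, name=name)

def matmul(a, b):
    return Node('matmul', (a, b))

def add(a, b):
    return Node('add', (a, b))

def sigmoid(a):
    return Node('sigmoid', (a,))

def run(node, feed_dict):
    """Evaluate a node recursively; placeholder values come from feed_dict."""
    if node.op == 'placeholder':
        return feed_dict[node]
    if node.op == 'variable':
        return node.value
    args = [run(n, feed_dict) for n in node.inputs]
    if node.op == 'matmul':
        a, b = args
        return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
                 for j in range(len(b[0]))] for i in range(len(a))]
    if node.op == 'add':
        m, bias = args  # the bias vector is added to every row
        return [[v + bias[j] for j, v in enumerate(row)] for row in m]
    if node.op == 'sigmoid':
        return [[1.0 / (1.0 + math.exp(-v)) for v in row] for row in args[0]]
    raise ValueError('unknown op: %s' % node.op)

# 1. Build the graph: nothing is computed here
x = placeholder('x')                               # input of shape [1, 10]
W = variable([[0.0] * 5 for _ in range(10)], 'W')  # shape [10, 5], zeros for easy checking
b = variable([0.0] * 5, 'b')                       # shape [5]
h = sigmoid(add(matmul(x, W), b))

# 2. Run the graph by feeding a value for x
h_eval = run(h, {x: [[1.0] * 10]})
print(h_eval)  # a 1x5 result; every entry is sigmoid(0) = 0.5
```

The point of the sketch is only the separation of the two phases: building h creates nodes, and values appear only when run is called with a feed for x, which is the role session.run and feed_dict play in the TensorFlow code.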

2. Pipelines

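A pipeline is used to process large amounts of data. It works in parallel, mainly reading data from disk and feeding it to the functions that consume it. It consists of the following elements:

A list of filenames
A filename queue, which produces filenames for the input reader
A record reader
A decoder, which decodes the records that were read
Preprocessing steps (optional)
A queue of decoded inputs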
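
Before the TensorFlow version, the stages listed above can be sketched with plain Python generators. This is only an illustration under stated assumptions: the helper names (write_demo_files, read_records, decode, batches) and the demo file contents are made up for this note, the optional preprocessing step is omitted, and everything runs in a single thread without shuffling.

```python
import os
import tempfile
from collections import deque

def write_demo_files(directory, n_files=3, rows_per_file=2, n_cols=10):
    """Create small comma-separated text files (hypothetical demo data)."""
    paths = []
    for i in range(1, n_files + 1):
        path = os.path.join(directory, 'test%d.txt' % i)
        with open(path, 'w') as f:
            for r in range(rows_per_file):
                f.write(','.join(str(float(i * 10 + r)) for _ in range(n_cols)) + '\n')
        paths.append(path)
    return paths

def read_records(filename_queue):
    """Record reader: take filenames from the queue and yield one line at a time."""
    while filename_queue:
        path = filename_queue.popleft()
        with open(path) as f:
            for line in f:
                yield line.rstrip('\n')

def decode(records):
    """Decoder: turn each comma-separated record into a list of floats."""
    for record in records:
        yield [float(v) for v in record.split(',')]

def batches(rows, batch_size):
    """Group decoded rows into full batches; an incomplete final batch is dropped."""
    batch = []
    for row in rows:
        batch.append(row)
        if len(batch) == batch_size:
            yield batch
            batch = []

with tempfile.TemporaryDirectory() as d:
    filenames = write_demo_files(d)                  # list of filenames
    filename_queue = deque(filenames)                # filename queue
    rows = decode(read_records(filename_queue))      # record reader + decoder
    all_batches = list(batches(rows, batch_size=3))  # decoded inputs, consumed in batches

print(len(all_batches), 'batches of', len(all_batches[0]), 'rows each')
```

Each stage only pulls from the previous one when it needs more data, which is the same idea the queue-based TensorFlow code below implements with background threads.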


# Defining the graph and session
graph = tf.Graph() # Creates a graph
session = tf.InteractiveSession(graph=graph) # Creates a session

# The filename queue
filenames = ['test%d.txt'%i for i in range(1,4)]
filename_queue = tf.train.string_input_producer(filenames, capacity=3, shuffle=True,name='string_input_producer')

# check if all files are there
for f in filenames:
    if not tf.gfile.Exists(f):
        raise ValueError('Failed to find file: ' + f)
    else:
        print('File %s found.'%f)

# Reader which takes a filename queue and 
# read() which outputs data one by one
reader = tf.TextLineReader()

# read the data of the file and output it as key,value pairs 
# (the key is not used below)
key, value = reader.read(filename_queue, name='text_read_op')

# default value (and dtype) for each of the 10 columns,
# used when a field in a record is empty
record_defaults = [[-1.0], [-1.0], [-1.0], [-1.0], [-1.0], [-1.0], [-1.0], [-1.0], [-1.0], [-1.0]]

# decoding the read value to columns
col1, col2, col3, col4, col5, col6, col7, col8, col9, col10 = tf.decode_csv(value, record_defaults=record_defaults)
features = tf.stack([col1, col2, col3, col4, col5, col6, col7, col8, col9, col10])

# output x is randomly assigned a batch of data of batch_size 
# where the data is read from the txt files
x = tf.train.shuffle_batch([features], batch_size=3,
                           capacity=5, name='data_batch', 
                           min_after_dequeue=1,num_threads=1)

# QueueRunners fill the queues with data and we need to explicitly start them
# Coordinator coordinates multiple QueueRunners
coord = tf.train.Coordinator()
threads = tf.train.start_queue_runners(coord=coord, sess=session)

# Building the graph by defining the variables and calculations

W = tf.Variable(tf.random_uniform(shape=[10,5], minval=-0.1, maxval=0.1, dtype=tf.float32),name='W') # Variable
b = tf.Variable(tf.zeros(shape=[5],dtype=tf.float32),name='b') # Variable

h = tf.nn.sigmoid(tf.matmul(x,W) + b) # Operation to be performed

# Executing operations and evaluating nodes in the graph
tf.global_variables_initializer().run() # Initialize the variables

# Calculate h with x and print the results for 5 steps
for step in range(5):
    x_eval, h_eval = session.run([x,h]) 
    print('========== Step %d =========='%step)
    print('Evaluated data (x)')
    print(x_eval)
    print('Evaluated data (h)')
    print(h_eval)
    print('')

# We also need to explicitly stop the coordinator 
# otherwise the process will hang indefinitely
coord.request_stop()
coord.join(threads)
session.close()

x = tf.train.shuffle_batch([features], batch_size=3,
                           capacity=5, name='data_batch',
                           min_after_dequeue=1,num_threads=1)

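batch_size is the size of each sampled batch, capacity is the capacity of the data queue, and min_after_dequeue is the minimum number of elements that must remain in the queue after a dequeue.

num_threads defines the number of threads used to produce a batch of data. The tf.train.shuffle_batch call is what creates the pipeline.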
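
To see how these parameters interact, here is a rough single-threaded sketch of a shuffling queue in plain Python. It is not how TensorFlow implements tf.train.shuffle_batch: the function name shuffle_batches and the seed parameter are hypothetical, and the handling of the end of input (draining the buffer and dropping an incomplete final batch) is an assumption made to keep the example short.

```python
import random

def shuffle_batches(source, batch_size, capacity, min_after_dequeue, seed=0):
    """Yield randomly sampled batches from a bounded buffer.

    The buffer holds at most `capacity` elements, and an element is only
    dequeued while more than `min_after_dequeue` elements are buffered,
    so at least that many remain afterwards to mix with new arrivals.
    """
    if capacity <= min_after_dequeue:
        raise ValueError('capacity must be larger than min_after_dequeue')
    rng = random.Random(seed)
    source = iter(source)
    buffer, batch = [], []
    exhausted = False
    while True:
        # Enqueue: top the buffer up to its capacity
        while not exhausted and len(buffer) < capacity:
            try:
                buffer.append(next(source))
            except StopIteration:
                exhausted = True
        if not buffer:
            break  # nothing left; an incomplete batch is dropped
        # Dequeue one random element; once the input is exhausted,
        # the min_after_dequeue constraint is ignored to drain the buffer
        if len(buffer) > min_after_dequeue or exhausted:
            batch.append(buffer.pop(rng.randrange(len(buffer))))
            if len(batch) == batch_size:
                yield batch
                batch = []

result = list(shuffle_batches(range(10), batch_size=3,
                              capacity=5, min_after_dequeue=1))
for i, batch in enumerate(result):
    print('batch %d: %s' % (i, batch))
```

With capacity=5 and min_after_dequeue=1, as in the call above, each sample is drawn from at most five buffered elements, so the shuffling is only local. A larger min_after_dequeue gives better mixing at the cost of more memory and a longer warm-up.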

coord = tf.train.Coordinator() # thread coordinator
threads = tf.train.start_queue_runners(coord=coord, sess=session) # start the queue-runner threads
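
The start, request-stop, join pattern can be mimicked with the Python standard library. The sketch below is an analogy only, assuming a threading.Event stands in for the coordinator's stop signal and a single hypothetical producer function stands in for a queue runner. It does not reproduce the real tf.train.Coordinator API.

```python
import queue
import threading

def producer(q, stop_event):
    """Worker thread: keeps enqueuing integers until asked to stop."""
    i = 0
    while not stop_event.is_set():
        try:
            q.put(i, timeout=0.05)  # wait briefly if the queue is full
            i += 1
        except queue.Full:
            continue  # queue still full: re-check the stop signal

stop_event = threading.Event()  # plays the role of the coordinator's stop signal
q = queue.Queue(maxsize=5)      # bounded queue, like `capacity`

# Analogous to tf.train.start_queue_runners: start the filling threads
threads = [threading.Thread(target=producer, args=(q, stop_event))]
for t in threads:
    t.start()

# The main thread consumes ten items, like repeated session.run calls
consumed = [q.get() for _ in range(10)]
print(consumed)

stop_event.set()   # analogous to coord.request_stop()
for t in threads:
    t.join()       # analogous to coord.join(threads)
```

Without the stop signal the producer would loop forever and the program would never exit, which is why the TensorFlow code stops the coordinator explicitly before closing the session.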
