ptb_producer, annotated

import tensorflow as tf

def ptb_producer(raw_data, batch_size, num_steps, name=None):
  """Iterate on the raw PTB data.

  This chunks up raw_data into batches of examples and returns Tensors that
  are drawn from these batches.

  Args:
    raw_data: one of the raw data outputs from ptb_raw_data.
    batch_size: int, the batch size.
    num_steps: int, the number of unrolls.
    name: the name of this operation (optional).

  Returns:
    A pair of Tensors, each shaped [batch_size, num_steps]. The second element
    of the tuple is the same data time-shifted to the right by one.

  Raises:
    tf.errors.InvalidArgumentError: if batch_size or num_steps are too high.
  """
  with tf.name_scope(name, "PTBProducer", [raw_data, batch_size, num_steps]):
    # The raw data is a sequence of word ids; convert it to a tensor
    raw_data = tf.convert_to_tensor(raw_data, name="raw_data", dtype=tf.int32)
    # Total number of words
    data_len = tf.size(raw_data)
    # Number of words per row once the data is split into batch_size rows
    batch_len = data_len // batch_size
    # Reshape the data into a matrix
    # Rows: batch_size; the matrix is later consumed in blocks of columns
    # Columns: batch_len, the number of words per row
    data = tf.reshape(raw_data[0 : batch_size * batch_len],
                      [batch_size, batch_len])

    # epoch_size is the per-row length divided by the number of time steps,
    # i.e. how many num_steps-wide slices make up one epoch
    epoch_size = (batch_len - 1) // num_steps
    assertion = tf.assert_positive(
        epoch_size,
        message="epoch_size == 0, decrease batch_size or num_steps")
    with tf.control_dependencies([assertion]):
      epoch_size = tf.identity(epoch_size, name="epoch_size")

    # Create a queue of length epoch_size, without shuffling
    # i is a dequeue op; each dequeue yields the next index, i.e. one num_steps-wide slice
    i = tf.train.range_input_producer(epoch_size, shuffle=False).dequeue()
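    # Slice the data: the begin point is [0, i * num_steps],
    # the end point is [batch_size, (i + 1) * num_steps] (exclusive).
    # batch_size in the end point means all rows are taken;
    # (i + 1) * num_steps is the column at which the slice stops.
    # So this takes columns i * num_steps up to (i + 1) * num_steps of data, i.e. num_steps columns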
    x = tf.strided_slice(data, [0, i * num_steps],
                         [batch_size, (i + 1) * num_steps])
    # Set the static shape of the slice (this does not rearrange the data)
    x.set_shape([batch_size, num_steps])
    # y is sliced like x but shifted one column to the right, because each word is used to predict the next word
    y = tf.strided_slice(data, [0, i * num_steps + 1],
                         [batch_size, (i + 1) * num_steps + 1])
    y.set_shape([batch_size, num_steps])
    return x, y
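
To make the slicing concrete, here is a minimal sketch of the same batching logic in plain Python, without queues or tensors. The function name ptb_batches is hypothetical and is not part of the TensorFlow reader; it only mirrors the arithmetic in the comments above, under the assumption that the data is a flat list of word ids.

```python
def ptb_batches(raw_data, batch_size, num_steps):
    """Yield (x, y) pairs for one epoch, sliced the way ptb_producer slices them."""
    raw_data = list(raw_data)
    batch_len = len(raw_data) // batch_size      # words per row
    # Split the flat word list into batch_size rows of batch_len words each
    data = [raw_data[r * batch_len:(r + 1) * batch_len] for r in range(batch_size)]
    epoch_size = (batch_len - 1) // num_steps    # num_steps-wide slices per epoch
    if epoch_size <= 0:
        raise ValueError("epoch_size == 0, decrease batch_size or num_steps")
    for i in range(epoch_size):
        # x covers columns [i * num_steps, (i + 1) * num_steps)
        x = [row[i * num_steps:(i + 1) * num_steps] for row in data]
        # y is the same window shifted one column to the right
        y = [row[i * num_steps + 1:(i + 1) * num_steps + 1] for row in data]
        yield x, y


# 20 fake word ids, 2 rows of 10 words, windows of 3 words
batches = list(ptb_batches(range(20), batch_size=2, num_steps=3))
for x, y in batches:
    print(x, y)
```

With 20 words and batch_size=2, each row holds 10 words, so epoch_size is (10 - 1) // 3 = 3 and the last word of each row is used only as a label.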

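My understanding of some parameters in ptb_word_lm.py:
num_steps is the time_steps of the RNN. The way I see it, take a sentence of time_steps words
and n such sentences, which gives an n x time_steps matrix. At each step the network is fed the words
at the same position in every sentence and produces an output, say one predicted word.
After time_steps steps it has predicted n x time_steps words (the labels are the inputs shifted by one word),
which are compared against the labels to compute the loss and its gradient.
When that gradient is backpropagated, it reaches back at most time_steps steps.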

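config.hidden_size is the number of hidden units in each layer. It is also the dimension of the output vector ht and the dimension of each input word after embedding.
In other words, each input vector has size config.hidden_size, while the recurrence itself should run time_steps times.
I am not sure whether this understanding is correct.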
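
The shapes involved can be checked with a small NumPy sketch. The sizes below are made-up illustrative values, not the ones in the PTB configs, and the embedding matrix is random rather than trained; the point is only which dimension is hidden_size and which loop runs num_steps times.

```python
import numpy as np

# Illustrative sizes only (assumptions, not taken from ptb_word_lm.py)
batch_size, num_steps, hidden_size, vocab_size = 4, 5, 8, 50

rng = np.random.default_rng(0)
# One row of length hidden_size per word id
embedding = rng.standard_normal((vocab_size, hidden_size))
# Word ids shaped like the x returned by ptb_producer
x = rng.integers(0, vocab_size, size=(batch_size, num_steps))

# Embedding lookup: every word id becomes a hidden_size-dimensional vector
inputs = embedding[x]              # shape [batch_size, num_steps, hidden_size]

# The recurrent cell is applied once per time step, so the loop runs num_steps times
steps_run = 0
for t in range(num_steps):
    x_t = inputs[:, t, :]          # t-th word of every sequence: [batch_size, hidden_size]
    steps_run += 1

print(inputs.shape, x_t.shape, steps_run)
```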

Below is a simple sketch of a single RNN step update, with an explanation:

import numpy as np

class RNN:
  # ...

  def step(self, x):
    # update the hidden state
    self.h = np.tanh(np.dot(self.W_hh, self.h) + np.dot(self.W_xh, x))
    # compute the output vector
    y = np.dot(self.W_hy, self.h)
    return y

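W_xh: the input weight matrix
W_hy: the output weight matrix
W_hh: the recurrent connections; in theory W_hh can capture any form of influence that the entire input history has on the final output
x: the input
y: the output
h: the hidden state, i.e. the state of every neuron in the network.
It is what people usually call the body of the network and is what makes recurrence possible,
because it acts like a reservoir that can (in theory) store an unbounded amount of history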
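
To see the hidden state carrying history, the step update can be run end to end. This is a minimal sketch: the class name VanillaRNN, the constructor, the random weight initialization and the sizes are all assumptions added for illustration, since the original snippet elides everything except step.

```python
import numpy as np

class VanillaRNN:
    def __init__(self, input_size, hidden_size, output_size, seed=0):
        # Random weights, chosen only so the example runs (assumption)
        rng = np.random.default_rng(seed)
        self.W_xh = rng.standard_normal((hidden_size, input_size)) * 0.5
        self.W_hh = rng.standard_normal((hidden_size, hidden_size)) * 0.5
        self.W_hy = rng.standard_normal((output_size, hidden_size)) * 0.5
        self.h = np.zeros(hidden_size)   # hidden state starts empty

    def step(self, x):
        # update the hidden state from the previous state and the new input
        self.h = np.tanh(np.dot(self.W_hh, self.h) + np.dot(self.W_xh, x))
        # compute the output vector from the updated state
        return np.dot(self.W_hy, self.h)


rnn = VanillaRNN(input_size=3, hidden_size=4, output_size=2)
x = np.ones(3)
y1 = rnn.step(x)
y2 = rnn.step(x)   # same input as before, but h now remembers the first step
print(y1, y2)
```

Feeding the same x twice gives two different outputs, because the second call sees a non-zero h left behind by the first.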

[Figure: LSTM structure diagram]

The official TensorFlow site has an LSTM language-modeling example; an LSTM can also be used to recognize handwritten digits:

https://morvanzhou.github.io/tutorials/machine-learning/tensorflow/5-08-RNN2/
