PyTorch -- a simple RNN, and the API I can never remember

Original article

Yif_Zhou

2020-02-21 23:14

A flowchart that is not so simple after all

Layer API

As the flowchart shows, we need a few layers. For the encoder we pick nn.Embedding, followed by a recurrent neural network layer.

nn.Embedding

torch.nn.Embedding(num_embeddings, embedding_dim, padding_idx=None, max_norm=None, norm_type=2.0, scale_grad_by_freq=False, sparse=False, _weight=None)
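- num_embeddings: the size of the vocabulary. If the data contains 5000 distinct words (tokens) and you pass 100, any token id of 100 or above fails with an index-out-of-range error.
- embedding_dim: how many numbers are used to represent each token's vector.
- padding_idx: the index of the padding token. The vector at this index is initialized to zeros and is not updated during training. The default None means no index gets this treatment; it does not mean index 0 is used.
- Output: (*, H), where * is the input shape and H=embedding_dim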
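A minimal sketch of the three parameters above in action. The sizes (10 tokens, 4 dimensions) and the choice of index 0 as the padding token are assumptions made for illustration only.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Assumed toy sizes: 10 tokens in the vocabulary, 4 numbers per token,
# and index 0 reserved for padding.
embedding = nn.Embedding(num_embeddings=10, embedding_dim=4, padding_idx=0)

# A batch of 2 sequences of length 3; the trailing 0s are padding.
token_ids = torch.tensor([[1, 2, 0], [3, 0, 0]])
vectors = embedding(token_ids)

print(vectors.shape)        # (*, H): input shape (2, 3) plus embedding_dim 4
print(embedding.weight[0])  # the row at padding_idx is all zeros

# A token id >= num_embeddings is rejected on CPU.
# (The exception type has varied across PyTorch versions, so catch both.)
try:
    embedding(torch.tensor([10]))
    out_of_range_failed = False
except (IndexError, RuntimeError):
    out_of_range_failed = True
print(out_of_range_failed)
```

The input has to be an integer tensor of token ids, not one-hot vectors; the layer is just a lookup table with num_embeddings rows.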

Recurrent neural network (nn.RNN)

First, a quick review of the formula

$h_t = \tanh(W_{ih} x_t + b_{ih} + W_{hh} h_{t-1} + b_{hh})$

where $h_t$ is the hidden state at time t,
$x_t$ is the input at time t,
$h_{t-1}$ is the hidden state at time t-1, or the initial hidden state at time 0.
If nonlinearity is 'relu', then ReLU is used instead of tanh.
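To check the formula against the library, the sketch below recomputes a single-layer tanh nn.RNN by hand from its own weights. The sizes are arbitrary assumptions; weight_ih_l0, weight_hh_l0, bias_ih_l0 and bias_hh_l0 are the parameter names nn.RNN uses for its first layer.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

rnn = nn.RNN(input_size=3, hidden_size=5)  # one layer, tanh by default
x = torch.randn(4, 1, 3)                   # (seq_len, batch, input_size)
h0 = torch.zeros(1, 1, 5)                  # (num_layers, batch, hidden_size)
output, h_n = rnn(x, h0)

# Apply h_t = tanh(W_ih x_t + b_ih + W_hh h_{t-1} + b_hh) step by step.
h = h0[0]
steps = []
for t in range(x.size(0)):
    h = torch.tanh(x[t] @ rnn.weight_ih_l0.T + rnn.bias_ih_l0
                   + h @ rnn.weight_hh_l0.T + rnn.bias_hh_l0)
    steps.append(h)
manual = torch.stack(steps)                # (seq_len, batch, hidden_size)

matches = torch.allclose(output, manual, atol=1e-5)
print(matches)
```

The last hand-computed step is also what nn.RNN returns as h_n, the final hidden state.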

API parameters

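input_size – The number of expected features in the input x. With one-hot inputs this would be the number of tokens, but since the input has already gone through the embedding layer, it equals embedding_dim.
hidden_size – The number of features in the hidden state h. This one is a bit harder to grasp. Look back at the formula: the previous hidden state is added to the new input, and vectors of different sizes cannot be added. The weight matrices take care of this: $W_{ih}$ has shape (hidden_size, input_size) and $W_{hh}$ has shape (hidden_size, hidden_size), so both terms have hidden_size entries. That leaves hidden_size as a hyperparameter you are free to tune.
num_layers – Number of recurrent layers. E.g., setting num_layers=2 would mean stacking two RNNs together to form a stacked RNN, with the second RNN taking in outputs of the first RNN and computing the final results. Default: 1
nonlinearity – The non-linearity to use. Can be either 'tanh' or 'relu'. Default: 'tanh'
bias – If False, then the layer does not use bias weights $b_{ih}$ and $b_{hh}$. Default: True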
batch_first – If True, then the input and output tensors are provided as (batch, seq, feature). Default: False
dropout – If non-zero, introduces a Dropout layer on the outputs of each RNN layer except the last layer, with dropout probability equal to dropout. Default: 0
bidirectional – If True, becomes a bidirectional RNN. Default: False
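The parameters above mostly show up as tensor shapes, so here is a small sketch that prints them. All sizes are assumptions chosen for illustration.

```python
import torch
import torch.nn as nn

rnn = nn.RNN(input_size=8, hidden_size=16, num_layers=2,
             batch_first=True, bidirectional=True)

# The first-layer weights map input_size -> hidden_size and
# hidden_size -> hidden_size, which is why the two terms can be added.
print(rnn.weight_ih_l0.shape)  # (hidden_size, input_size)
print(rnn.weight_hh_l0.shape)  # (hidden_size, hidden_size)

x = torch.randn(4, 10, 8)      # (batch, seq, feature) because batch_first=True
output, h_n = rnn(x)           # h_0 defaults to zeros when it is omitted

print(output.shape)  # (batch, seq, 2 * hidden_size) for a bidirectional RNN
print(h_n.shape)     # (num_layers * 2, batch, hidden_size)
```

Note that batch_first only changes the layout of the input and output tensors; h_n keeps the layer dimension first.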

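Of course, plain RNNs are rarely used these days, having been almost entirely replaced by LSTM and GRU. The idea is much the same though, and once you can use one you can use the others, so I won't go into them here.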
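As a rough illustration of how close the APIs are, the sketch below runs nn.GRU and nn.LSTM with the same constructor arguments (sizes are assumptions). The one difference worth remembering is that nn.LSTM carries a tuple of hidden state and cell state.

```python
import torch
import torch.nn as nn

x = torch.randn(5, 2, 8)  # (seq_len, batch, input_size)

gru = nn.GRU(input_size=8, hidden_size=16)
gru_out, gru_h = gru(x)                # same return shape as nn.RNN

lstm = nn.LSTM(input_size=8, hidden_size=16)
lstm_out, (lstm_h, lstm_c) = lstm(x)   # hidden state AND cell state

print(gru_out.shape, gru_h.shape)
print(lstm_out.shape, lstm_h.shape, lstm_c.shape)
```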

decoder

This part is simple: use nn.Linear(). It is simple enough that I won't say much about it.
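For completeness, a small sketch of the decoder step, with assumed sizes (hidden size 16, vocabulary of 100). It maps every hidden state to one score per vocabulary token, first by flattening the sequence and batch dimensions and then by passing the 3-D tensor directly, which nn.Linear also accepts.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

nhid, ntoken = 16, 100                  # assumed hidden size and vocabulary size
decoder = nn.Linear(nhid, ntoken)

rnn_output = torch.randn(5, 2, nhid)    # (seq_len, batch, nhid)

# Flatten to 2-D, apply the layer, then restore the original layout.
flat = rnn_output.view(-1, nhid)
logits = decoder(flat).view(5, 2, ntoken)

# nn.Linear acts on the last dimension, so the 3-D tensor works as is.
direct = decoder(rnn_output)

same = torch.allclose(logits, direct, atol=1e-5)
print(logits.shape, same)
```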

The code

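import torch.nn as nn

class RNNModel(nn.Module):
    """ A simple recurrent neural network."""
    def __init__(self, rnn_type, ntoken, ninp, nhid, nlayers, dropout=0.5):
        ''' The model contains the following layers:
            - a word embedding layer
            - a recurrent layer (RNN, LSTM, GRU)
            - a linear layer mapping the hidden state to the output vocabulary
            - a dropout layer, used for regularization
            ntoken is the vocabulary size, ninp the embedding dimension,
            nhid the hidden size, nlayers the number of recurrent layers.
        '''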
        super(RNNModel, self).__init__()
        self.drop = nn.Dropout(dropout)
        self.encoder = nn.Embedding(ntoken, ninp)
        if rnn_type in ['LSTM', 'GRU']:
            self.rnn = getattr(nn, rnn_type)(ninp, nhid, nlayers, dropout=dropout)
        else:
            try:
                nonlinearity = {'RNN_TANH': 'tanh', 'RNN_RELU': 'relu'}[rnn_type]
            except KeyError:
                raise ValueError("An invalid option for `--model` was supplied, "
                                 "options are ['LSTM', 'GRU', 'RNN_TANH' or 'RNN_RELU']")
            self.rnn = nn.RNN(ninp, nhid, nlayers, nonlinearity=nonlinearity, dropout=dropout)
        self.decoder = nn.Linear(nhid, ntoken)
        self.init_weights()
        self.rnn_type = rnn_type
        self.nhid = nhid
        self.nlayers = nlayers

    def init_weights(self):
        initrange = 0.1
        self.encoder.weight.data.uniform_(-initrange, initrange)
        self.decoder.bias.data.zero_()
        self.decoder.weight.data.uniform_(-initrange, initrange)

    def init_hidden(self, bsz, requires_grad=True):
        weight = next(self.parameters())
        if self.rnn_type == 'LSTM':
            return (weight.new_zeros((self.nlayers, bsz, self.nhid), requires_grad=requires_grad),
                    weight.new_zeros((self.nlayers, bsz, self.nhid), requires_grad=requires_grad))
        else:
            return weight.new_zeros((self.nlayers, bsz, self.nhid), requires_grad=requires_grad)
    
    def forward(self, input, hidden):
        ''' Forward pass:
            - word embedding
            - feed the embeddings into the recurrent network
            - a linear layer maps the hidden state to the output vocabulary
        '''
        emb = self.drop(self.encoder(input))
        output, hidden = self.rnn(emb, hidden)
        output = self.drop(output)
        decoded = self.decoder(output.view(output.size(0)*output.size(1), output.size(2)))
        return decoded.view(output.size(0), output.size(1), decoded.size(1)), hidden
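To see the shapes flow end to end, here is a smoke test on random data. So that the snippet runs on its own, it uses TinyRNNModel, a hypothetical condensed stand-in for the class above (GRU only, same encoder -> rnn -> decoder layout); the sizes and the cross-entropy loss are assumptions for illustration, not part of the original code.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

class TinyRNNModel(nn.Module):
    """Hypothetical condensed stand-in: embedding -> GRU -> linear."""
    def __init__(self, ntoken, ninp, nhid, nlayers, dropout=0.5):
        super().__init__()
        self.drop = nn.Dropout(dropout)
        self.encoder = nn.Embedding(ntoken, ninp)
        self.rnn = nn.GRU(ninp, nhid, nlayers, dropout=dropout)
        self.decoder = nn.Linear(nhid, ntoken)

    def forward(self, input, hidden):
        emb = self.drop(self.encoder(input))       # (seq_len, batch, ninp)
        output, hidden = self.rnn(emb, hidden)     # (seq_len, batch, nhid)
        decoded = self.decoder(self.drop(output))  # (seq_len, batch, ntoken)
        return decoded, hidden

ntoken, seq_len, bsz, nhid, nlayers = 50, 7, 3, 16, 2
model = TinyRNNModel(ntoken, ninp=8, nhid=nhid, nlayers=nlayers)

tokens = torch.randint(0, ntoken, (seq_len, bsz))   # token ids, (seq_len, batch)
targets = torch.randint(0, ntoken, (seq_len, bsz))  # random stand-in targets
hidden = torch.zeros(nlayers, bsz, nhid)            # (nlayers, batch, nhid)

logits, hidden = model(tokens, hidden)

# One score per vocabulary token at every position, compared with the target id.
loss = nn.functional.cross_entropy(logits.view(-1, ntoken), targets.view(-1))
loss.backward()
print(logits.shape, hidden.shape, loss.item())
```

With the full RNNModel the call pattern is the same, except that the initial hidden state comes from init_hidden(bsz) and is a tuple when rnn_type is 'LSTM'.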
