DL Study Notes [20]: Simple layers in the nn package

From the tutorial:
https://github.com/torch/nn/blob/master/doc/simple.md
These layers are fairly easy to understand, so I'll only jot down brief notes.

Parameterized Modules

Linear 
Formula:
y = Ax + b
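
A minimal usage sketch (the sizes 10 and 5 are just for illustration):

require 'nn'

module = nn.Linear(10, 5)   -- A is a 5x10 weight matrix, b a 5-dimensional bias
x = torch.randn(10)
y = module:forward(x)       -- y = A*x + b
print(y:size())             -- 5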

SparseLinear 
A sparse Linear layer; its input x is different from the usual dense input:
x = torch.Tensor({ {1, 0.1}, {2, 0.3}, {10, 0.3}, {31, 0.2} })

 print(x)

  1.0000   0.1000
  2.0000   0.3000
 10.0000   0.3000
 31.0000   0.2000
In each row, the first entry is the index (position) and the second entry is the value at that position.
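
A short sketch of how this input is used (10000 and 2 are arbitrary sizes; only the indices listed in x contribute to the output):

require 'nn'

module = nn.SparseLinear(10000, 2)   -- 10000 sparse inputs, 2 outputs
x = torch.Tensor({ {1, 0.1}, {2, 0.3}, {10, 0.3}, {31, 0.2} })
print(module:forward(x))             -- a 2-dimensional output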

Bilinear 
Formula:
\forall k: y_k = x_1^T A_k x_2 + b_k
Example code:
 module = nn.Bilinear(10, 5, 3)  -- input sizes 10 and 5, output size 3
 input = {torch.randn(128, 10), torch.randn(128, 5)}  -- 128 input examples
 module:forward(input)

PartialLinear 
It lets you use only part of the input; for example, with a 5-dimensional input we can compute using only two of the indices, and later restore it to use all 5 again.
Example code:
module = nn.PartialLinear(5, 3)  -- 5 inputs, 3 outputs
module:setPartition(torch.Tensor({2,4})) -- only compute the 2nd and 4th indices out of a total of 5 indices

Add
Learns only a bias, which is added to the input.

CAdd
A component-wise (multi-dimensional) bias.

Mul
Learns only a single scalar weight w.

CMul
Component-wise (multi-dimensional) weights w.
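
A quick sketch of the four modules side by side (constructor arguments as I understand them from the nn docs; treat the exact signatures as assumptions):

require 'nn'

x = torch.randn(5)

add  = nn.Add(5)      -- learns a 5-dimensional bias; nn.Add(5, true) would learn a single scalar
mul  = nn.Mul()       -- learns a single scalar weight
cadd = nn.CAdd(5)     -- learns a component-wise bias of size 5
cmul = nn.CMul(5)     -- learns component-wise weights of size 5

print(add:forward(x))   -- x + add.bias
print(mul:forward(x))   -- x * mul.weight[1]
print(cadd:forward(x))  -- x + cadd.bias
print(cmul:forward(x))  -- element-wise product of x and cmul.weight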

Euclidean
Formula:
y_j = || w_j - x ||
Why can the weights and the input be subtracted? Because nn.Euclidean(inputSize, outputSize) keeps outputSize weight vectors w_j, each with the same dimension as the input x, so w_j - x is well defined; y_j is then the distance from x to the j-th weight vector. A small check is sketched below.
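
A minimal sketch of this, assuming the weights are stored as an inputSize x outputSize matrix whose columns are the w_j (treat the column access as an assumption):

require 'nn'

module = nn.Euclidean(5, 3)   -- 3 weight vectors w_j, each of dimension 5
x = torch.randn(5)
y = module:forward(x)         -- y[j] = || w_j - x ||, so y has 3 entries

print(y)
-- distance to the first weight vector, taken as the first column of module.weight
-- (assumption: weight is stored as inputSize x outputSize)
print(torch.dist(module.weight:select(2, 1), x))  -- should match y[1]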

WeightedEuclidean
Formula:
y_j = || c_j * (w_j - x) ||
where c_j is a learned per-dimension scaling (a diagonal metric) for the j-th weight vector.

Cosine
Formula:
y_j = (x · w_j) / ( || w_j || * || x || )


I didn't fully understand the Identity example code at first. nn.Identity() simply forwards its input unchanged; in the example below it lets the true label y pass through untouched, so the criterion at the end receives both the prediction (from pred_mlp) and the original y:
require 'nn'

pred_mlp = nn.Sequential()  -- A network that makes predictions given x.
pred_mlp:add(nn.Linear(5, 4))
pred_mlp:add(nn.Linear(4, 3))

xy_mlp = nn.ParallelTable() -- A network for predictions and for keeping the
xy_mlp:add(pred_mlp)        -- true label for comparison with a criterion
xy_mlp:add(nn.Identity())   -- by forwarding both x and y through the network.

mlp = nn.Sequential()       -- The main network that takes both x and y.
mlp:add(xy_mlp)             -- It feeds x and y to parallel networks;
cr = nn.MSECriterion()
cr_wrap = nn.CriterionTable(cr)
mlp:add(cr_wrap)            -- and then applies the criterion.

for i = 1, 100 do           -- Do a few training iterations
   x = torch.ones(5)        -- Make input features.
   y = torch.Tensor(3)
   y:copy(x:narrow(1,1,3))  -- Make output label.
   err = mlp:forward{x,y}   -- Forward both input and output.
   print(err)               -- Print error from criterion.

   mlp:zeroGradParameters() -- Do backprop...
   mlp:backward({x, y})
   mlp:updateParameters(0.05)
end


This is getting long... I don't feel like writing out every module, so I'll just copy and paste the rest of the summary list, haha.

Modules that adapt basic Tensor methods :

Copy : copy of the input with type casting ; from the description it copies the input to the output, optionally casting the tensor type, so the output has the same values as the input (possibly as a different tensor type); see the short sketch after this list.

Narrow : a narrow operation over a given dimension ;

Replicate : repeats input n times along its first dimension ;

Reshape : a reshape of the inputs ;

View : a view of the inputs ;

Contiguous : makes the input contiguous in memory (a no-op if it already is) ;

Select : a select over a given dimension ;

MaskedSelect : a masked select module performs the torch.maskedSelect operation ;

Index : an index operation over a given dimension ;

Squeeze : squeezes the input;

Unsqueeze : unsqueeze the input, i.e., insert singleton dimension;

Transpose : transposes the input ;
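
For Copy, a minimal sketch of the type-casting behaviour mentioned above (the concrete types are just an example):

require 'nn'

m = nn.Copy('torch.DoubleTensor', 'torch.FloatTensor')
x = torch.randn(3)        -- a DoubleTensor by default
y = m:forward(x)
print(torch.type(y))      -- torch.FloatTensor
print(y)                  -- same values as x (up to float precision)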

Modules that adapt mathematical Tensor methods :

AddConstant : adding a constant ;

MulConstant : multiplying a constant ;

Max : a max operation over a given dimension ;

Min : a min operation over a given dimension ;

Mean : a mean operation over a given dimension ;

Sum : a sum operation over a given dimension ;

Exp : an element-wise exp operation ;

Log : an element-wise log operation ;

Abs : an element-wise abs operation ;

Power : an element-wise pow operation ;

Square : an element-wise square operation ;

Sqrt : an element-wise sqrt operation ;

Clamp : an element-wise clamp operation ;

Normalize : normalizes the input to have unit L_p norm ;

MM : matrix-matrix multiplication (also supports batches of matrices); see the sketch below ;
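
A minimal MM sketch (sizes are arbitrary):

require 'nn'

mm = nn.MM()
A = torch.randn(4, 5)
B = torch.randn(5, 3)
C = mm:forward({A, B})    -- matrix product, a 4x3 tensor
print(C:size())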

Miscellaneous Modules :

BatchNormalization : mean/std normalization over the mini-batch inputs (with an optional affine transform) ;

PixelShuffle : Rearranges elements in a tensor of shape [C*r*r, H, W] to a tensor of shape [C, H*r, W*r] ;

Identity : forward input as-is to output (useful with ParallelTable) ;

Dropout : masks parts of the input using binary samples from a bernoulli distribution ;

SpatialDropout : same as Dropout but for spatial inputs where adjacent pixels are strongly correlated ;

VolumetricDropout : same as Dropout but for volumetric inputs where adjacent voxels are strongly correlated ;

Padding : adds padding to a dimension ;

L1Penalty : adds an L1 penalty to an input (for sparsity) ;

GradientReversal : reverses the gradient (to maximize an objective function) ;

GPU : decorates a module so that it can be executed on a specific GPU device.

TemporalDynamicKMaxPooling : selects the k highest values in a sequence. k can be calculated based on sequence length ;
