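Preface
- Having finished reading the Transformer code, I wanted to see how it does on classification, even though this is not a very rigorous experiment…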
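Parameters
- Word embeddings are randomly initialized
- Added the position vectors and mask from the original paper
- Two heads
- Three blocks
- num_epochs = 20 # epochs
- batch_size = 32 # batch_size
- The code is essentially the encoder part of the Transformer, except that the final output is changed so it can be used for classification.
- I compared my code against this article: "Text Classification: Transformer Model" (文本分類——Transformer模型)
- The code still has an overfitting problem, but the results are better than all the classifiers I tried before.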
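To make the position vectors, the mask, and the "changed final output" a bit more concrete, here is a minimal pure-Python sketch. It is not the code behind the results below. The function names, `pad_id=0`, and the choice of masked mean pooling as the classification input are assumptions for illustration; the post does not say how the encoder output is reduced before the classifier. Only the sinusoidal formula follows the original Transformer paper, and the values in `CONFIG` are the ones listed above.

```python
import math

# Hyperparameters stated in the post; the dict and its key names are illustrative.
CONFIG = {"num_heads": 2, "num_blocks": 3, "num_epochs": 20, "batch_size": 32}


def positional_encoding(max_len, d_model):
    """Sinusoidal position vectors.

    PE[pos][2i]   = sin(pos / 10000^(2i / d_model))
    PE[pos][2i+1] = cos(pos / 10000^(2i / d_model))
    """
    pe = [[0.0] * d_model for _ in range(max_len)]
    for pos in range(max_len):
        # i walks over the even dimensions, so i plays the role of 2i above.
        for i in range(0, d_model, 2):
            angle = pos / (10000 ** (i / d_model))
            pe[pos][i] = math.sin(angle)
            if i + 1 < d_model:
                pe[pos][i + 1] = math.cos(angle)
    return pe


def padding_mask(token_ids, pad_id=0):
    """1.0 for real tokens, 0.0 for padding (pad_id=0 is an assumption)."""
    return [0.0 if t == pad_id else 1.0 for t in token_ids]


def masked_mean_pool(encoder_out, mask):
    """Average the encoder output vectors over non-padding positions.

    This is one common way to turn a sequence of vectors into a single
    vector for a classifier; it is an assumed choice, not the author's.
    """
    d_model = len(encoder_out[0])
    total = sum(mask)
    if total == 0:
        return [0.0] * d_model
    return [
        sum(vec[j] * m for vec, m in zip(encoder_out, mask)) / total
        for j in range(d_model)
    ]


# Hypothetical token ids, padded with 0 to length 5.
token_ids = [12, 7, 3, 0, 0]
mask = padding_mask(token_ids)
pe = positional_encoding(max_len=len(token_ids), d_model=4)
```

The pooled vector would then go through a dense layer with one output per class. Using the mask during pooling keeps padding positions from diluting the sentence representation.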
Training results
It is epoch 1
step: 100,train loss: 0.911, train accuracy: 0.500, val loss: 0.782, val accuracy: 0.509,training speed: 0.426sec/batch *
step: 200,train loss: 0.779, train accuracy: 0.344, val loss: 0.689, val accuracy: 0.507,training speed: 0.409sec/batch
It is epoch 2
step: 300,train loss: 0.702, train accuracy: 0.625, val loss: 0.817, val accuracy: 0.482,training speed: 0.161sec/batch
step: 400,train loss: 0.712, train accuracy: 0.469, val loss: 0.697, val accuracy: 0.483,training speed: 0.422sec/batch
step: 500,train loss: 0.833, train accuracy: 0.500, val loss: 0.865, val accuracy: 0.505,training speed: 0.422sec/batch
It is epoch 3
step: 600,train loss: 0.685, train accuracy: 0.500, val loss: 0.689, val accuracy: 0.483,training speed: 0.285sec/batch
step: 700,train loss: 0.670, train accuracy: 0.688, val loss: 0.708, val accuracy: 0.482,training speed: 0.420sec/batch
It is epoch 4
step: 800,train loss: 0.727, train accuracy: 0.375, val loss: 0.686, val accuracy: 0.515,training speed: 0.008sec/batch *
step: 900,train loss: 0.701, train accuracy: 0.500, val loss: 0.693, val accuracy: 0.510,training speed: 0.407sec/batch
step: 1000,train loss: 0.728, train accuracy: 0.438, val loss: 0.697, val accuracy: 0.482,training speed: 0.423sec/batch
It is epoch 5
step: 1100,train loss: 0.742, train accuracy: 0.375, val loss: 0.690, val accuracy: 0.507,training speed: 0.157sec/batch
step: 1200,train loss: 0.702, train accuracy: 0.469, val loss: 0.687, val accuracy: 0.486,training speed: 0.408sec/batch
step: 1300,train loss: 0.702, train accuracy: 0.500, val loss: 0.690, val accuracy: 0.507,training speed: 0.402sec/batch
It is epoch 6
step: 1400,train loss: 0.771, train accuracy: 0.438, val loss: 0.702, val accuracy: 0.508,training speed: 0.272sec/batch
step: 1500,train loss: 0.686, train accuracy: 0.500, val loss: 0.686, val accuracy: 0.504,training speed: 0.391sec/batch
It is epoch 7
step: 1600,train loss: 0.765, train accuracy: 0.438, val loss: 0.722, val accuracy: 0.483,training speed: 0.016sec/batch
step: 1700,train loss: 0.698, train accuracy: 0.500, val loss: 0.687, val accuracy: 0.482,training speed: 0.389sec/batch
step: 1800,train loss: 0.680, train accuracy: 0.562, val loss: 0.704, val accuracy: 0.510,training speed: 0.394sec/batch
It is epoch 8
step: 1900,train loss: 0.700, train accuracy: 0.500, val loss: 0.686, val accuracy: 0.509,training speed: 0.149sec/batch
step: 2000,train loss: 0.685, train accuracy: 0.594, val loss: 0.687, val accuracy: 0.510,training speed: 0.432sec/batch
step: 2100,train loss: 0.691, train accuracy: 0.562, val loss: 0.693, val accuracy: 0.482,training speed: 0.552sec/batch
It is epoch 9
step: 2200,train loss: 0.687, train accuracy: 0.531, val loss: 0.684, val accuracy: 0.549,training speed: 0.371sec/batch *
step: 2300,train loss: 0.575, train accuracy: 0.781, val loss: 0.693, val accuracy: 0.607,training speed: 0.527sec/batch *
It is epoch 10
step: 2400,train loss: 0.594, train accuracy: 0.750, val loss: 0.607, val accuracy: 0.701,training speed: 0.031sec/batch *
step: 2500,train loss: 0.474, train accuracy: 0.844, val loss: 0.601, val accuracy: 0.725,training speed: 0.531sec/batch *
step: 2600,train loss: 0.532, train accuracy: 0.750, val loss: 0.667, val accuracy: 0.721,training speed: 0.516sec/batch
It is epoch 11
step: 2700,train loss: 0.416, train accuracy: 0.875, val loss: 0.714, val accuracy: 0.740,training speed: 0.208sec/batch *
step: 2800,train loss: 0.374, train accuracy: 0.875, val loss: 0.724, val accuracy: 0.754,training speed: 0.515sec/batch *
step: 2900,train loss: 0.380, train accuracy: 0.875, val loss: 0.628, val accuracy: 0.753,training speed: 0.533sec/batch
It is epoch 12
step: 3000,train loss: 0.346, train accuracy: 0.906, val loss: 0.667, val accuracy: 0.759,training speed: 0.384sec/batch *
step: 3100,train loss: 0.498, train accuracy: 0.844, val loss: 0.663, val accuracy: 0.754,training speed: 0.537sec/batch
It is epoch 13
step: 3200,train loss: 0.505, train accuracy: 0.812, val loss: 0.900, val accuracy: 0.736,training speed: 0.042sec/batch
step: 3300,train loss: 0.286, train accuracy: 0.938, val loss: 0.737, val accuracy: 0.744,training speed: 0.519sec/batch
step: 3400,train loss: 0.470, train accuracy: 0.875, val loss: 0.680, val accuracy: 0.750,training speed: 0.519sec/batch
It is epoch 14
step: 3500,train loss: 0.229, train accuracy: 0.969, val loss: 0.814, val accuracy: 0.751,training speed: 0.218sec/batch
step: 3600,train loss: 0.386, train accuracy: 0.906, val loss: 0.743, val accuracy: 0.747,training speed: 0.515sec/batch
step: 3700,train loss: 0.354, train accuracy: 0.906, val loss: 0.745, val accuracy: 0.742,training speed: 0.519sec/batch
It is epoch 15
step: 3800,train loss: 0.325, train accuracy: 0.906, val loss: 0.741, val accuracy: 0.747,training speed: 0.395sec/batch
step: 3900,train loss: 0.259, train accuracy: 1.000, val loss: 0.701, val accuracy: 0.740,training speed: 0.541sec/batch
It is epoch 16
step: 4000,train loss: 0.375, train accuracy: 0.906, val loss: 0.823, val accuracy: 0.748,training speed: 0.052sec/batch
step: 4100,train loss: 0.401, train accuracy: 0.875, val loss: 0.760, val accuracy: 0.747,training speed: 0.528sec/batch
step: 4200,train loss: 0.445, train accuracy: 0.812, val loss: 0.724, val accuracy: 0.731,training speed: 0.518sec/batch
It is epoch 17
step: 4300,train loss: 0.433, train accuracy: 0.844, val loss: 0.752, val accuracy: 0.748,training speed: 0.228sec/batch
step: 4400,train loss: 0.316, train accuracy: 0.969, val loss: 0.742, val accuracy: 0.738,training speed: 0.519sec/batch
step: 4500,train loss: 0.454, train accuracy: 0.781, val loss: 0.809, val accuracy: 0.739,training speed: 0.528sec/batch
It is epoch 18
step: 4600,train loss: 0.401, train accuracy: 0.906, val loss: 0.844, val accuracy: 0.733,training speed: 0.408sec/batch
step: 4700,train loss: 0.246, train accuracy: 0.969, val loss: 0.853, val accuracy: 0.741,training speed: 0.517sec/batch
It is epoch 19
step: 4800,train loss: 0.300, train accuracy: 0.906, val loss: 0.872, val accuracy: 0.739,training speed: 0.062sec/batch
step: 4900,train loss: 0.229, train accuracy: 1.000, val loss: 0.794, val accuracy: 0.742,training speed: 0.521sec/batch
step: 5000,train loss: 0.252, train accuracy: 1.000, val loss: 0.828, val accuracy: 0.738,training speed: 0.488sec/batch
No optimization over 1000 steps, stop training
Train acc is 0.71375
Value acc is 0.632273792780122
MAX train acc is 1.0
MAX value acc is 0.7590248476324426
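The `*` markers and the final "No optimization over 1000 steps, stop training" line suggest patience-based early stopping on validation accuracy. Below is a minimal sketch of that idea. The class name `EarlyStopper`, its `update` method, and the exact comparison are assumptions for illustration, not the code that produced the log. The replayed accuracies are copied from steps 2800 to 4100 of the log above.

```python
class EarlyStopper:
    """Stop when validation accuracy has not improved for `patience_steps` steps."""

    def __init__(self, patience_steps=1000):
        self.patience_steps = patience_steps
        self.best_acc = float("-inf")
        self.best_step = 0

    def update(self, step, val_acc):
        """Record one evaluation and return (improved, should_stop)."""
        improved = val_acc > self.best_acc
        if improved:
            self.best_acc = val_acc
            self.best_step = step
        should_stop = step - self.best_step > self.patience_steps
        return improved, should_stop


# (step, val accuracy) pairs copied from the log, steps 2800 to 4100.
history = [
    (2800, 0.754), (2900, 0.753), (3000, 0.759), (3100, 0.754),
    (3200, 0.736), (3300, 0.744), (3400, 0.750), (3500, 0.751),
    (3600, 0.747), (3700, 0.742), (3800, 0.747), (3900, 0.740),
    (4000, 0.748), (4100, 0.747),
]

stopper = EarlyStopper(patience_steps=1000)
stopped_at = None
for step, acc in history:
    improved, should_stop = stopper.update(step, acc)
    print(step, acc, "*" if improved else "")
    if should_stop:
        stopped_at = step
        break
```

Under this particular rule the replay keeps 0.759 at step 3000 as the best value and stops at step 4100. The run above continued to step 5000, so the check actually used presumably differs in some detail, for example in where or how often it is evaluated.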