tensorflow - tensor2tensor - v1.0.12

Library of deep learning models and datasets designed to make deep learning more accessible and accelerate ML research.
tensor2tensor is a library that bundles many ready-made models.

https://github.com/tensorflow/tensor2tensor
https://github.com/tensorflow/tensor2tensor/releases
https://github.com/tensorflow/tensor2tensor/tags
https://github.com/tensorflow/tensor2tensor/releases/tag/v1.0.12
https://github.com/tensorflow/tensor2tensor/tree/v1.0.12
https://github.com/tensorflow/tensorflow/tree/r1.4
https://tensorflow.google.cn/install/source

T2T is a modular and extensible library and binaries for supervised learning with TensorFlow and with support for sequence tasks. It is actively used and maintained by researchers and engineers within the Google Brain team. You can read more about Tensor2Tensor in the recent Google Research Blog post introducing it.
https://github.com/tensorflow/tensor2tensor/issues
https://github.com/tensorflow/tensor2tensor/blob/v1.0.12/CONTRIBUTING.md
https://gitter.im/tensor2tensor/Lobby

A tensor2tensor workflow has four main steps: data generation, training, decoding, and scoring.

0. References

The Illustrated Transformer
https://jalammar.github.io/illustrated-transformer/
Kyubyong - transformer
https://github.com/Kyubyong/transformer

1. T2T: Tensor2Tensor Transformers - v1.0.12

yongqiang@yongqiang:~$ ls
yongqiang@yongqiang:~$
yongqiang@yongqiang:~$ mkdir software
yongqiang@yongqiang:~$ mkdir tensorflow_work
yongqiang@yongqiang:~$ mkdir yongqiang
yongqiang@yongqiang:~$ ls
software  tensorflow_work  yongqiang
yongqiang@yongqiang:~$
yongqiang@yongqiang:~$ ll
total 8
drwxr-xr-x 1 yongqiang yongqiang  512 May 30 07:27 ./
drwxr-xr-x 1 root      root       512 Apr 30 17:52 ../
-rw------- 1 yongqiang yongqiang  396 May 30 07:26 .bash_history
-rw-r--r-- 1 yongqiang yongqiang  220 Apr 30 17:52 .bash_logout
-rw-r--r-- 1 yongqiang yongqiang 3771 Apr 30 17:52 .bashrc
drwxrwxrwx 1 yongqiang yongqiang  512 Apr 30 17:52 .cache/
drwx------ 1 yongqiang yongqiang  512 Apr 30 17:52 .config/
-rw-r--r-- 1 yongqiang yongqiang  807 Apr 30 17:52 .profile
-rw------- 1 yongqiang yongqiang    7 May  2 09:17 .python_history
-rw-r--r-- 1 yongqiang yongqiang    0 May  2 08:15 .sudo_as_admin_successful
drwxrwxrwx 1 yongqiang yongqiang  512 May 30 07:27 software/
drwxrwxrwx 1 yongqiang yongqiang  512 May 30 07:27 tensorflow_work/
drwxrwxrwx 1 yongqiang yongqiang  512 May 30 07:27 yongqiang/
yongqiang@yongqiang:~$
yongqiang@yongqiang:~$ mkdir tensor2tensor
yongqiang@yongqiang:~$ cd tensor2tensor/
yongqiang@yongqiang:~/tensor2tensor$ mkdir v1_0_12
yongqiang@yongqiang:~/tensor2tensor$ ls
v1_0_12
yongqiang@yongqiang:~/tensor2tensor$ cd v1_0_12/
yongqiang@yongqiang:~/tensor2tensor/v1_0_12$ ls
yongqiang@yongqiang:~/tensor2tensor/v1_0_12$
yongqiang@yongqiang:~/tensor2tensor/v1_0_12$ git clone https://github.com/tensorflow/tensor2tensor.git
Cloning into 'tensor2tensor'...
remote: Enumerating objects: 5, done.
remote: Counting objects: 100% (5/5), done.
remote: Compressing objects: 100% (5/5), done.
remote: Total 32312 (delta 0), reused 5 (delta 0), pack-reused 32307
Receiving objects: 100% (32312/32312), 15.47 MiB | 46.00 KiB/s, done.
Resolving deltas: 100% (26143/26143), done.
yongqiang@yongqiang:~/tensor2tensor/v1_0_12$
yongqiang@yongqiang:~/tensor2tensor/v1_0_12$ ls
tensor2tensor
yongqiang@yongqiang:~/tensor2tensor/v1_0_12$
yongqiang@yongqiang:~/tensor2tensor/v1_0_12$ cd tensor2tensor/
yongqiang@yongqiang:~/tensor2tensor/v1_0_12/tensor2tensor$ ll
total 52
drwxrwxrwx 1 yongqiang yongqiang   512 May 30 07:49 ./
drwxrwxrwx 1 yongqiang yongqiang   512 May 30 07:43 ../
drwxrwxrwx 1 yongqiang yongqiang   512 May 30 07:49 .git/
-rw-rw-rw- 1 yongqiang yongqiang   310 May 30 07:49 .gitignore
-rw-rw-rw- 1 yongqiang yongqiang   779 May 30 07:49 .travis.yml
-rw-rw-rw- 1 yongqiang yongqiang   311 May 30 07:49 AUTHORS
-rw-rw-rw- 1 yongqiang yongqiang  1280 May 30 07:49 CONTRIBUTING.md
-rw-rw-rw- 1 yongqiang yongqiang   266 May 30 07:49 ISSUE_TEMPLATE.md
-rw-rw-rw- 1 yongqiang yongqiang 11358 May 30 07:49 LICENSE
-rw-rw-rw- 1 yongqiang yongqiang 19726 May 30 07:49 README.md
drwxrwxrwx 1 yongqiang yongqiang   512 May 30 07:49 docs/
-rw-rw-rw- 1 yongqiang yongqiang    34 May 30 07:49 floyd.yml
-rw-rw-rw- 1 yongqiang yongqiang    14 May 30 07:49 floyd_requirements.txt
drwxrwxrwx 1 yongqiang yongqiang   512 May 30 07:49 oss_scripts/
-rw-rw-rw- 1 yongqiang yongqiang  7866 May 30 07:49 pylintrc
-rw-rw-rw- 1 yongqiang yongqiang  3413 May 30 07:49 setup.py
drwxrwxrwx 1 yongqiang yongqiang   512 May 30 07:49 tensor2tensor/
yongqiang@yongqiang:~/tensor2tensor/v1_0_12/tensor2tensor$
yongqiang@yongqiang:~/tensor2tensor/v1_0_12/tensor2tensor$ git status
On branch master
Your branch is up to date with 'origin/master'.

nothing to commit, working tree clean
yongqiang@yongqiang:~/tensor2tensor/v1_0_12/tensor2tensor$

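git branch - list all local branches (the entry marked with * is the branch currently in use).
Run without arguments, git branch prints a list of all local branches. Note the * before master: it marks the branch that is currently checked out, that is, the branch HEAD points to. A commit made at this point would move the master branch forward with the new work. To see the last commit on each branch, run git branch -v.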

yongqiang@yongqiang:~/tensor2tensor/v1_0_12/tensor2tensor$ git branch
* master
yongqiang@yongqiang:~/tensor2tensor/v1_0_12/tensor2tensor$
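
The * marker can also be read by a script. Below is a minimal Python sketch, assuming the plain-text output format shown in these listings; current_branch is a hypothetical helper written for this post, not part of Git.

```python
def current_branch(branch_output):
    """Return the entry marked with '*' in `git branch` output, or None."""
    for line in branch_output.splitlines():
        if line.startswith("* "):
            return line[2:].strip()
    return None


# Sample text shaped like the `git branch` listings in this post.
sample = """* master
  remotes/origin/HEAD -> origin/master
  remotes/origin/master
"""
print(current_branch(sample))
```

In a detached HEAD state the starred line is a description such as (HEAD detached at v1.0.12) rather than a branch name, so a real script would need to handle that case separately.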

git branch -a - list all branches (local and remote-tracking).

yongqiang@yongqiang:~/tensor2tensor/v1_0_12/tensor2tensor$ git branch -a
* master
  remotes/origin/HEAD -> origin/master
  remotes/origin/master
  remotes/origin/revert-1726-master
  remotes/origin/revert-1749-revert-1726-master
yongqiang@yongqiang:~/tensor2tensor/v1_0_12/tensor2tensor$

git branch -r - list all remote-tracking branches.

yongqiang@yongqiang:~/tensor2tensor/v1_0_12/tensor2tensor$ git branch -r
  origin/HEAD -> origin/master
  origin/master
  origin/revert-1726-master
  origin/revert-1749-revert-1726-master
yongqiang@yongqiang:~/tensor2tensor/v1_0_12/tensor2tensor$

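Git can tag a specific commit in the repository history to mark it as important. Typically this is used to mark release points (v1.0, v2.0, and so on).

Listing the existing tags is simple: run git tag (optionally with -l or --list). The command lists tags in alphabetical order; the order in which they appear carries no other significance.

Listing tags by wildcard pattern requires -l or --list. If you only want the complete list, git tag on its own assumes you want a listing and prints it, so -l or --list is optional. If you supply a pattern to match tag names, -l or --list is mandatory.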

yongqiang@yongqiang:~/tensor2tensor/v1_0_12/tensor2tensor$ git tag
1.3.2
v.1.12.0
v1.0.10
v1.0.11
v1.0.12
v1.0.13
v1.0.14
......
v1.9.0
yongqiang@yongqiang:~/tensor2tensor/v1_0_12/tensor2tensor$
yongqiang@yongqiang:~/tensor2tensor/v1_0_12/tensor2tensor$ git tag -l
1.3.2
v.1.12.0
v1.0.10
v1.0.11
v1.0.12
v1.0.13
v1.0.14
......
v1.9.0
yongqiang@yongqiang:~/tensor2tensor/v1_0_12/tensor2tensor$
yongqiang@yongqiang:~/tensor2tensor/v1_0_12/tensor2tensor$ git tag -l 'v1.0.*'
v1.0.10
v1.0.11
v1.0.12
v1.0.13
v1.0.14
v1.0.3
v1.0.4
v1.0.5
v1.0.6
v1.0.7
v1.0.8
v1.0.9
yongqiang@yongqiang:~/tensor2tensor/v1_0_12/tensor2tensor$
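
The alphabetical ordering explains why v1.0.10 is listed before v1.0.3 above. Below is a small Python sketch of the difference between alphabetical and numeric ordering, assuming tags of the form vMAJOR.MINOR.PATCH. The helper version_key is hypothetical, and fnmatch from the standard library only approximates Git's pattern matching.

```python
import fnmatch

# A subset of the tags printed by `git tag` above.
tags = ["1.3.2", "v.1.12.0", "v1.0.10", "v1.0.11", "v1.0.12",
        "v1.0.3", "v1.0.9", "v1.9.0"]


def version_key(tag):
    """Turn 'v1.0.12' into (1, 0, 12) so that tags compare numerically."""
    return tuple(int(part) for part in tag.lstrip("v.").split("."))


# Keep only the tags matching the same wildcard used with `git tag -l`.
matching = fnmatch.filter(tags, "v1.0.*")

alphabetical = sorted(matching)               # order used in the listing above
numeric = sorted(matching, key=version_key)   # order by release number
print(alphabetical)
print(numeric)
```

Recent Git versions can also sort by version directly, for example with git tag -l --sort=version:refname.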

git show v1.0.12 - show the version information for tag v1.0.12.

yongqiang@yongqiang:~/tensor2tensor/v1_0_12/tensor2tensor$ git show v1.0.12
commit 83ac9fb735930c4e1b3b14504e63c1c55b95da75 (tag: v1.0.12)
Merge: fbb6f9ae 54622a51
Author: Lukasz Kaiser <[email protected]>
Date:   Fri Jul 7 17:30:39 2017 -0700

    Merge pull request #118 from lukaszkaiser/push

    1.0.12

yongqiang@yongqiang:~/tensor2tensor/v1_0_12/tensor2tensor$

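git checkout tag_name - switch to a tag. Git may tell you that you are now in a detached HEAD state. A tag is effectively a fixed snapshot, so it does not move forward when you commit. To make changes on top of the tagged code, create a new branch.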

yongqiang@yongqiang:~/tensor2tensor/v1_0_12/tensor2tensor$ git checkout v1.0.12
Note: checking out 'v1.0.12'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by performing another checkout.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -b with the checkout command again. Example:

  git checkout -b <new-branch-name>

HEAD is now at 83ac9fb7 Merge pull request #118 from lukaszkaiser/push
yongqiang@yongqiang:~/tensor2tensor/v1_0_12/tensor2tensor$
yongqiang@yongqiang:~/tensor2tensor/v1_0_12/tensor2tensor$ git status
HEAD detached at v1.0.12
nothing to commit, working tree clean
yongqiang@yongqiang:~/tensor2tensor/v1_0_12/tensor2tensor$
yongqiang@yongqiang:~/tensor2tensor/v1_0_12/tensor2tensor$ ls
AUTHORS  CONTRIBUTING.md  LICENSE  README.md  setup.py  tensor2tensor
yongqiang@yongqiang:~/tensor2tensor/v1_0_12/tensor2tensor$ ll
total 32
drwxrwxrwx 1 yongqiang yongqiang   512 May 30 08:10 ./
drwxrwxrwx 1 yongqiang yongqiang   512 May 30 07:43 ../
drwxrwxrwx 1 yongqiang yongqiang   512 May 30 08:10 .git/
-rw-rw-rw- 1 yongqiang yongqiang   251 May 30 08:10 .gitignore
-rw-rw-rw- 1 yongqiang yongqiang   293 May 30 08:10 AUTHORS
-rw-rw-rw- 1 yongqiang yongqiang   969 May 30 08:10 CONTRIBUTING.md
-rw-rw-rw- 1 yongqiang yongqiang 11358 May 30 07:49 LICENSE
-rw-rw-rw- 1 yongqiang yongqiang  9069 May 30 08:10 README.md
-rw-rw-rw- 1 yongqiang yongqiang   988 May 30 08:10 setup.py
drwxrwxrwx 1 yongqiang yongqiang   512 May 30 08:10 tensor2tensor/
yongqiang@yongqiang:~/tensor2tensor/v1_0_12/tensor2tensor$

2. TensorFlow 1.4.1

(base) yongqiang@yongqiang:~$ conda env list
# conda environments:
#
base                  *  /home/yongqiang/miniconda3
tf_cpu_1.4.1             /home/yongqiang/miniconda3/envs/tf_cpu_1.4.1

(base) yongqiang@yongqiang:~$
(base) yongqiang@yongqiang:~$ conda activate tf_cpu_1.4.1
(tf_cpu_1.4.1) yongqiang@yongqiang:~$
(tf_cpu_1.4.1) yongqiang@yongqiang:~$ python
Python 3.5.6 |Anaconda, Inc.| (default, Aug 26 2018, 21:41:56)
[GCC 7.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
>>>
>>> hello = tf.constant('Hello, TensorFlow!')
>>> sess = tf.Session()
2020-05-30 21:41:06.278325: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
>>> sess.run(hello)
b'Hello, TensorFlow!'
>>> a = tf.constant(10)
>>> b = tf.constant(32)
>>> sess.run(a + b)
42
>>> sess.close()
>>> exit()
(tf_cpu_1.4.1) yongqiang@yongqiang:~$ conda deactivate
(base) yongqiang@yongqiang:~$

3. Walkthrough

Here’s a walkthrough training a good English-to-German translation model using the Transformer model from Attention Is All You Need on WMT data.
https://arxiv.org/abs/1706.03762

WMT: Workshop on Statistical Machine Translation

3.1 Install tensor2tensor 1.0.12

https://pypi.org/project/tensor2tensor/
Tensor2Tensor, or T2T for short, is a library of deep learning models and datasets designed to make deep learning more accessible and accelerate ML research. We keep it running and welcome bug-fixes, but encourage users to use the successor library Trax.

  • Install tensor2tensor 1.0.12 - pip install tensor2tensor==1.0.12
(base) yongqiang@yongqiang:~$ conda env list
# conda environments:
#
base                  *  /home/yongqiang/miniconda3
tf_cpu_1.4.1             /home/yongqiang/miniconda3/envs/tf_cpu_1.4.1

(base) yongqiang@yongqiang:~$
(base) yongqiang@yongqiang:~$ conda activate tf_cpu_1.4.1
(tf_cpu_1.4.1) yongqiang@yongqiang:~$
(tf_cpu_1.4.1) yongqiang@yongqiang:~$ ls
miniconda3  pycharm_work  software  tensor2tensor  tensorflow_work  yongqiang
(tf_cpu_1.4.1) yongqiang@yongqiang:~$
(tf_cpu_1.4.1) yongqiang@yongqiang:~$
(tf_cpu_1.4.1) yongqiang@yongqiang:~$
(tf_cpu_1.4.1) yongqiang@yongqiang:~$ pip install tensor2tensor==1.0.12
Collecting tensor2tensor==1.0.12
  Downloading https://files.pythonhosted.org/packages/9c/c3/d14192a122d7beb5f915b11e3f809ee07e636c5aae4259eedaa73dedc7a9/tensor2tensor-1.0.12-py2.py3-none-any.whl (195kB)
    100% |████████████████████████████████| 204kB 9.1kB/s
Requirement already satisfied: numpy in ./miniconda3/envs/tf_cpu_1.4.1/lib/python3.5/site-packages (from tensor2tensor==1.0.12) (1.16.4)
Requirement already satisfied: six in ./miniconda3/envs/tf_cpu_1.4.1/lib/python3.5/site-packages (from tensor2tensor==1.0.12) (1.15.0)
Collecting sympy (from tensor2tensor==1.0.12)
  Downloading https://files.pythonhosted.org/packages/a4/ea/590e1f2c73a1b8f878a1398b0edbf261d660439a9f851accb39db73d8e2f/sympy-1.6-py3-none-any.whl (5.8MB)
    100% |████████████████████████████████| 5.8MB 88kB/s
Collecting mpmath>=0.19 (from sympy->tensor2tensor==1.0.12)
  Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ReadTimeoutError("HTTPSConnectionPool(host='pypi.org', port=443): Read timed out. (read timeout=15)",)': /simple/mpmath/
  Downloading https://files.pythonhosted.org/packages/ca/63/3384ebb3b51af9610086b23ea976e6d27d6d97bf140a76a365bd77a3eb32/mpmath-1.1.0.tar.gz (512kB)
    100% |████████████████████████████████| 522kB 151kB/s
Building wheels for collected packages: mpmath
  Running setup.py bdist_wheel for mpmath ... done
  Stored in directory: /home/yongqiang/.cache/pip/wheels/63/9d/8e/37c3f6506ed3f152733a699e92d8e0c9f5e5f01dea262f80ad
Successfully built mpmath
Installing collected packages: mpmath, sympy, tensor2tensor
Successfully installed mpmath-1.1.0 sympy-1.6 tensor2tensor-1.0.12
You are using pip version 10.0.1, however version 20.2b1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
(tf_cpu_1.4.1) yongqiang@yongqiang:~$
  • t2t-trainer --registry_help
    See what problems, models, and hyperparameter sets are available. You can easily swap between them (and add new ones).
(tf_cpu_1.4.1) yongqiang@yongqiang:~$ t2t-trainer --registry_help
INFO:tensorflow:
Registry contents:
------------------

  Models: ['attention_lm', 'attention_lm_moe', 'baseline_lstm_seq2seq', 'baseline_lstm_seq2seq_attention', 'blue_net', 'byte_net', 'diagonal_neural_gpu', 'multi_model', 'neural_gpu', 'slice_net', 'transformer', 'xception']

  HParams (by model):
    * attention: ['attention_lm_base', 'attention_lm_moe_base', 'attention_lm_moe_large', 'attention_lm_moe_small']
    * basic: ['basic_1']
    * bluenet: ['bluenet_base', 'bluenet_tiny']
    * bytenet: ['bytenet_base']
    * lstm: ['lstm_attention']
    * multimodel: ['multimodel_base', 'multimodel_tiny']
    * neuralgpu: ['neuralgpu_1']
    * slicenet: ['slicenet_1', 'slicenet_1noam', 'slicenet_1tiny']
    * transformer: ['transformer_base', 'transformer_base_single_gpu', 'transformer_big', 'transformer_big_dr1', 'transformer_big_dr2', 'transformer_big_enfr', 'transformer_big_single_gpu', 'transformer_dr0', 'transformer_dr2', 'transformer_ff1024', 'transformer_ff4096', 'transformer_h1', 'transformer_h16', 'transformer_h32', 'transformer_h4', 'transformer_hs1024', 'transformer_hs256', 'transformer_k128', 'transformer_k256', 'transformer_l2', 'transformer_l4', 'transformer_l8', 'transformer_ls0', 'transformer_ls2', 'transformer_parameter_attention_a', 'transformer_parameter_attention_b', 'transformer_parsing_base', 'transformer_parsing_big', 'transformer_tiny']
    * xception: ['xception_base', 'xception_tiny']

  RangedHParams: ['basic1', 'slicenet1', 'transformer_big_single_gpu']

  Modalities: ['audio:audio_spectral_modality', 'audio:default', 'audio:identity', 'class_label:class_label_2d', 'class_label:default', 'class_label:identity', 'generic:default', 'image:default', 'image:identity', 'image:small_image_modality', 'symbol:default', 'symbol:identity']

(tf_cpu_1.4.1) yongqiang@yongqiang:~$
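
The registry lists components by name, which is what makes swapping them a matter of changing a flag value. Below is a minimal Python sketch of a decorator-based name registry that illustrates the idea. Every name in it (REGISTRY, register, ToyTransformer, toy_base, registry_help) is a hypothetical illustration and not the tensor2tensor API.

```python
# Hypothetical registry: one dictionary of name -> object per component kind.
REGISTRY = {"models": {}, "hparams": {}}


def register(kind, name):
    """Decorator that files an object under REGISTRY[kind][name]."""
    def decorator(obj):
        if name in REGISTRY[kind]:
            raise ValueError("%s %r is already registered" % (kind, name))
        REGISTRY[kind][name] = obj
        return obj
    return decorator


@register("models", "toy_transformer")
class ToyTransformer(object):
    """Placeholder model class; it exists only to be looked up by name."""


@register("hparams", "toy_base")
def toy_base():
    """Placeholder hyperparameter set returned as a plain dictionary."""
    return {"hidden_size": 512, "num_layers": 6}


def registry_help():
    """Return the registered names for each kind, sorted alphabetically."""
    return {kind: sorted(entries) for kind, entries in REGISTRY.items()}


print(registry_help())
```

With this layout, adding a new component means decorating one more class or function, and selecting one means looking it up by the string passed on the command line.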

3.2 Create the project directories

PROBLEM=wmt_ende_tokens_32k
MODEL=transformer
HPARAMS=transformer_base_single_gpu

DATA_DIR=$HOME/t2t_data
TMP_DIR=/tmp/t2t_datagen
TRAIN_DIR=$HOME/t2t_train/$PROBLEM/$MODEL-$HPARAMS

mkdir -p $DATA_DIR $TMP_DIR $TRAIN_DIR

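It is better to change TMP_DIR=/tmp/t2t_datagen to TMP_DIR=$HOME/t2t_datagen, because the raw downloads total several gigabytes and /tmp is often cleared on reboot.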

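PROBLEM selects the dataset to generate and defines the task type. This walkthrough uses PROBLEM=wmt_ende_tokens_32k, an English-to-German (ende) translation task; the current README names the corresponding problem translate_ende_wmt32k. MODEL selects the transformer model.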

HPARAMS: the hyperparameter set.
DATA_DIR: the data used for training.
TMP_DIR: the raw, unprocessed data that tensor2tensor downloads.
TRAIN_DIR: the files produced during training.
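
The same directory layout can be built from a script. Below is a minimal Python sketch, assuming TMP_DIR is placed under the home directory as suggested above. The helper t2t_dirs is hypothetical, and a temporary directory stands in for $HOME so that running the sketch leaves the real home directory untouched.

```python
import os
import tempfile


def t2t_dirs(home, problem, model, hparams):
    """Build the three directory paths used in this walkthrough."""
    return {
        "DATA_DIR": os.path.join(home, "t2t_data"),
        "TMP_DIR": os.path.join(home, "t2t_datagen"),
        "TRAIN_DIR": os.path.join(home, "t2t_train", problem,
                                  model + "-" + hparams),
    }


# A throwaway directory stands in for $HOME in this sketch.
home = tempfile.mkdtemp()
dirs = t2t_dirs(home, "wmt_ende_tokens_32k", "transformer",
                "transformer_base_single_gpu")

for path in dirs.values():
    os.makedirs(path, exist_ok=True)  # same effect as `mkdir -p`

print(dirs["TRAIN_DIR"])
```

Keeping the problem, model, and hyperparameter set in the TRAIN_DIR path means that runs with different settings do not overwrite each other's files.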

(tf_cpu_1.4.1) yongqiang@yongqiang:~$ echo $HOME
/home/yongqiang
(tf_cpu_1.4.1) yongqiang@yongqiang:~$
(tf_cpu_1.4.1) yongqiang@yongqiang:~$ ll
total 20
drwxr-xr-x 1 yongqiang yongqiang  512 May 31 09:17 ./
drwxr-xr-x 1 root      root       512 Apr 30 17:52 ../
-rw------- 1 yongqiang yongqiang 3816 May 31 10:24 .bash_history
-rw-r--r-- 1 yongqiang yongqiang  220 Apr 30 17:52 .bash_logout
-rw-r--r-- 1 yongqiang yongqiang 4377 May 31 09:17 .bashrc
drwxrwxrwx 1 yongqiang yongqiang  512 May 31 00:03 .cache/
drwxrwxrwx 1 yongqiang yongqiang  512 May 30 07:53 .conda/
drwx------ 1 yongqiang yongqiang  512 May 31 09:53 .config/
drwxrwxrwx 1 yongqiang yongqiang  512 May 31 00:03 .java/
drwxrwxrwx 1 yongqiang yongqiang  512 May 30 20:20 .keras/
drwxrwxrwx 1 yongqiang yongqiang  512 May 31 00:04 .local/
-rw-r--r-- 1 yongqiang yongqiang  807 Apr 30 17:52 .profile
-rw------- 1 yongqiang yongqiang  355 May 30 21:42 .python_history
-rw-r--r-- 1 yongqiang yongqiang    0 May  2 08:15 .sudo_as_admin_successful
-rw------- 1 yongqiang yongqiang  794 May 31 09:17 .viminfo
drwxrwxrwx 1 yongqiang yongqiang  512 May 30 07:53 miniconda3/
drwxrwxrwx 1 yongqiang yongqiang  512 May 31 09:09 pycharm_work/
drwxrwxrwx 1 yongqiang yongqiang  512 May 31 09:55 software/
drwxrwxrwx 1 yongqiang yongqiang  512 May 30 07:43 tensor2tensor/
drwxrwxrwx 1 yongqiang yongqiang  512 May 30 07:27 tensorflow_work/
drwxrwxrwx 1 yongqiang yongqiang  512 May 30 07:27 yongqiang/
(tf_cpu_1.4.1) yongqiang@yongqiang:~$

(tf_cpu_1.4.1) yongqiang@yongqiang:~$
(tf_cpu_1.4.1) yongqiang@yongqiang:~$ PROBLEM=wmt_ende_tokens_32k
(tf_cpu_1.4.1) yongqiang@yongqiang:~$ MODEL=transformer
(tf_cpu_1.4.1) yongqiang@yongqiang:~$ HPARAMS=transformer_base_single_gpu
(tf_cpu_1.4.1) yongqiang@yongqiang:~$
(tf_cpu_1.4.1) yongqiang@yongqiang:~$ DATA_DIR=$HOME/t2t_data
(tf_cpu_1.4.1) yongqiang@yongqiang:~$ TMP_DIR=/tmp/t2t_datagen
(tf_cpu_1.4.1) yongqiang@yongqiang:~$ TRAIN_DIR=$HOME/t2t_train/$PROBLEM/$MODEL-$HPARAMS
(tf_cpu_1.4.1) yongqiang@yongqiang:~$
(tf_cpu_1.4.1) yongqiang@yongqiang:~$ mkdir -p $DATA_DIR $TMP_DIR $TRAIN_DIR
(tf_cpu_1.4.1) yongqiang@yongqiang:~$
(tf_cpu_1.4.1) yongqiang@yongqiang:~$ ll
total 20
drwxr-xr-x 1 yongqiang yongqiang  512 May 31 16:28 ./
drwxr-xr-x 1 root      root       512 Apr 30 17:52 ../
-rw------- 1 yongqiang yongqiang 3816 May 31 10:24 .bash_history
-rw-r--r-- 1 yongqiang yongqiang  220 Apr 30 17:52 .bash_logout
-rw-r--r-- 1 yongqiang yongqiang 4377 May 31 09:17 .bashrc
drwxrwxrwx 1 yongqiang yongqiang  512 May 31 00:03 .cache/
drwxrwxrwx 1 yongqiang yongqiang  512 May 30 07:53 .conda/
drwx------ 1 yongqiang yongqiang  512 May 31 09:53 .config/
drwxrwxrwx 1 yongqiang yongqiang  512 May 31 00:03 .java/
drwxrwxrwx 1 yongqiang yongqiang  512 May 30 20:20 .keras/
drwxrwxrwx 1 yongqiang yongqiang  512 May 31 00:04 .local/
-rw-r--r-- 1 yongqiang yongqiang  807 Apr 30 17:52 .profile
-rw------- 1 yongqiang yongqiang  355 May 30 21:42 .python_history
-rw-r--r-- 1 yongqiang yongqiang    0 May  2 08:15 .sudo_as_admin_successful
-rw------- 1 yongqiang yongqiang  794 May 31 09:17 .viminfo
drwxrwxrwx 1 yongqiang yongqiang  512 May 30 07:53 miniconda3/
drwxrwxrwx 1 yongqiang yongqiang  512 May 31 09:09 pycharm_work/
drwxrwxrwx 1 yongqiang yongqiang  512 May 31 09:55 software/
drwxrwxrwx 1 yongqiang yongqiang  512 May 31 16:28 t2t_data/
drwxrwxrwx 1 yongqiang yongqiang  512 May 31 16:28 t2t_train/
drwxrwxrwx 1 yongqiang yongqiang  512 May 30 07:43 tensor2tensor/
drwxrwxrwx 1 yongqiang yongqiang  512 May 30 07:27 tensorflow_work/
drwxrwxrwx 1 yongqiang yongqiang  512 May 30 07:27 yongqiang/
(tf_cpu_1.4.1) yongqiang@yongqiang:~$
(tf_cpu_1.4.1) yongqiang@yongqiang:~$ ls /tmp/t2t_datagen/
(tf_cpu_1.4.1) yongqiang@yongqiang:~$

3.3 Generate the data

Command from the v1.0.12 README (https://github.com/tensorflow/tensor2tensor/tree/v1.0.12):

# Generate data
t2t-datagen --data_dir=$DATA_DIR --tmp_dir=$TMP_DIR --num_shards=100 --problem=$PROBLEM

cp $TMP_DIR/tokens.vocab.* $DATA_DIR

Command from the README on master (https://github.com/tensorflow/tensor2tensor):

# Generate data
t2t-datagen --data_dir=$DATA_DIR --tmp_dir=$TMP_DIR --problem=$PROBLEM

The run below uses the v1.0.12 form of the command:

(tf_cpu_1.4.1) yongqiang@yongqiang:~$ t2t-datagen --data_dir=$DATA_DIR --tmp_dir=$TMP_DIR --num_shards=100 --problem=$PROBLEM
INFO:tensorflow:Generating problems:
  * wmt_ende_tokens_32k

INFO:tensorflow:Generating training data for wmt_ende_tokens_32k.
INFO:tensorflow:Generating vocab from: [['http://data.statmt.org/wmt16/translation-task/training-parallel-nc-v11.tgz', ['training-parallel-nc-v11/news-commentary-v11.de-en.en', 'training-parallel-nc-v11/news-commentary-v11.de-en.de']], ['http://www.statmt.org/wmt13/training-parallel-commoncrawl.tgz', ['commoncrawl.de-en.en', 'commoncrawl.de-en.de', 'commoncrawl.fr-en.en', 'commoncrawl.fr-en.fr']], ['http://www.statmt.org/wmt13/training-parallel-europarl-v7.tgz', ['training/europarl-v7.de-en.en', 'training/europarl-v7.de-en.de', 'training/europarl-v7.fr-en.en', 'training/europarl-v7.fr-en.fr']], ['http://www.statmt.org/wmt10/training-giga-fren.tar', ['giga-fren.release2.fixed.en.gz', 'giga-fren.release2.fixed.fr.gz']], ['http://www.statmt.org/wmt13/training-parallel-un.tgz', ['un/undoc.2000.fr-en.en', 'un/undoc.2000.fr-en.fr']]]
INFO:tensorflow:Downloading http://data.statmt.org/wmt16/translation-task/training-parallel-nc-v11.tgz to /tmp/t2t_datagen/training-parallel-nc-v11.tgz
100% completed
INFO:tensorflow:Succesfully downloaded training-parallel-nc-v11.tgz, 75178032 bytes.
INFO:tensorflow:Reading file: training-parallel-nc-v11/news-commentary-v11.de-en.en
INFO:tensorflow:Reading file: training-parallel-nc-v11/news-commentary-v11.de-en.de
INFO:tensorflow:Downloading http://www.statmt.org/wmt13/training-parallel-commoncrawl.tgz to /tmp/t2t_datagen/training-parallel-commoncrawl.tgz
100% completed
INFO:tensorflow:Succesfully downloaded training-parallel-commoncrawl.tgz, 918311367 bytes.
INFO:tensorflow:Reading file: commoncrawl.de-en.en
INFO:tensorflow:Reading file: commoncrawl.de-en.de
INFO:tensorflow:Reading file: commoncrawl.fr-en.en
INFO:tensorflow:Reading file: commoncrawl.fr-en.fr
INFO:tensorflow:Downloading http://www.statmt.org/wmt13/training-parallel-europarl-v7.tgz to /tmp/t2t_datagen/training-parallel-europarl-v7.tgz
100% completed
INFO:tensorflow:Succesfully downloaded training-parallel-europarl-v7.tgz, 657632379 bytes.
INFO:tensorflow:Reading file: training/europarl-v7.de-en.en
INFO:tensorflow:Reading file: training/europarl-v7.de-en.de
INFO:tensorflow:Reading file: training/europarl-v7.fr-en.en
INFO:tensorflow:Reading file: training/europarl-v7.fr-en.fr
INFO:tensorflow:Downloading http://www.statmt.org/wmt10/training-giga-fren.tar to /tmp/t2t_datagen/training-giga-fren.tar
100% completed
INFO:tensorflow:Succesfully downloaded training-giga-fren.tar, 2595102720 bytes.
INFO:tensorflow:Reading file: giga-fren.release2.fixed.en.gz
INFO:tensorflow:Unpacking subdirectory /tmp/t2t_datagen/giga-fren.release2.fixed.en.gz
INFO:tensorflow:Unpacking /tmp/t2t_datagen/giga-fren.release2.fixed.en.gz to /tmp/t2t_datagen/giga-fren.release2.fixed.en
INFO:tensorflow:Reading file: giga-fren.release2.fixed.fr.gz
INFO:tensorflow:Unpacking subdirectory /tmp/t2t_datagen/giga-fren.release2.fixed.fr.gz
INFO:tensorflow:Unpacking /tmp/t2t_datagen/giga-fren.release2.fixed.fr.gz to /tmp/t2t_datagen/giga-fren.release2.fixed.fr
INFO:tensorflow:Downloading http://www.statmt.org/wmt13/training-parallel-un.tgz to /tmp/t2t_datagen/training-parallel-un.tgz
100% completed
INFO:tensorflow:Succesfully downloaded training-parallel-un.tgz, 2365634246 bytes.
INFO:tensorflow:Reading file: un/undoc.2000.fr-en.en
INFO:tensorflow:Reading file: un/undoc.2000.fr-en.fr
INFO:tensorflow:Trying min_count 500
INFO:tensorflow:Iteration 0
INFO:tensorflow:vocab_size = 2692
INFO:tensorflow:Iteration 1
INFO:tensorflow:vocab_size = 1276
INFO:tensorflow:Iteration 2
INFO:tensorflow:vocab_size = 1387
INFO:tensorflow:Iteration 3
INFO:tensorflow:vocab_size = 1368
[680, 355, 406, 56, 294, 836, 993, 9, 150, 6, 847, 225, 386, 66, 824, 178, 75, 662, 200, 33, 5]
['This_', 'sen', 'ten', 'ce_', 'was_', 'enc', 'ode', 'd_', 'by_', 'the_', 'Su', 'b', 'wor', 'd', 'Te', 'x', 't', 'En', 'co', 'der_', '._']
This sentence was encoded by the SubwordTextEncoder.
INFO:tensorflow:Trying min_count 250
INFO:tensorflow:Iteration 0
INFO:tensorflow:vocab_size = 4921
INFO:tensorflow:Iteration 1
INFO:tensorflow:vocab_size = 2180
INFO:tensorflow:Iteration 2
INFO:tensorflow:vocab_size = 2346
INFO:tensorflow:Iteration 3
INFO:tensorflow:vocab_size = 2285
[395, 1321, 483, 43, 135, 2261, 2019, 2, 72, 6, 500, 298, 755, 101, 488, 311, 103, 954, 207, 31, 5]
['This_', 'sen', 'ten', 'ce_', 'was_', 'enco', 'ded', '_', 'by_', 'the_', 'Su', 'b', 'wor', 'd', 'Te', 'x', 't', 'En', 'co', 'der_', '._']
This sentence was encoded by the SubwordTextEncoder.
INFO:tensorflow:Trying min_count 125
INFO:tensorflow:Iteration 0
INFO:tensorflow:vocab_size = 9071
INFO:tensorflow:Iteration 1
INFO:tensorflow:vocab_size = 3656
INFO:tensorflow:Iteration 2
INFO:tensorflow:vocab_size = 3904
INFO:tensorflow:Iteration 3
INFO:tensorflow:vocab_size = 3833
[192, 3413, 657, 90, 3537, 1574, 60, 6, 256, 283, 1522, 404, 2153, 945, 455, 29, 4]
['This_', 'sent', 'ence_', 'was_', 'enco', 'ded_', 'by_', 'the_', 'Su', 'b', 'word', 'Te', 'xt', 'En', 'co', 'der_', '._']
This sentence was encoded by the SubwordTextEncoder.
INFO:tensorflow:Trying min_count 62
INFO:tensorflow:Iteration 0
INFO:tensorflow:vocab_size = 16030
INFO:tensorflow:Iteration 1
INFO:tensorflow:vocab_size = 6094
INFO:tensorflow:Iteration 2
INFO:tensorflow:vocab_size = 6398
INFO:tensorflow:Iteration 3
INFO:tensorflow:vocab_size = 6337
[147, 1998, 1162, 62, 85, 1056, 6214, 15, 55, 5, 534, 455, 1287, 271, 563, 3898, 770, 574, 29, 4]
['This_', 'sen', 'ten', 'ce_', 'was_', 'enc', 'ode', 'd_', 'by_', 'the_', 'Su', 'b', 'wor', 'd', 'Te', 'xt', 'En', 'co', 'der_', '._']
This sentence was encoded by the SubwordTextEncoder.
INFO:tensorflow:Trying min_count 31
INFO:tensorflow:Iteration 0
INFO:tensorflow:vocab_size = 26906
INFO:tensorflow:Iteration 1
INFO:tensorflow:vocab_size = 9865
INFO:tensorflow:Iteration 2
INFO:tensorflow:vocab_size = 10224
INFO:tensorflow:Iteration 3
INFO:tensorflow:vocab_size = 10176
[132, 2700, 1926, 75, 4758, 2027, 53, 5, 3728, 2585, 346, 5471, 329, 1095, 7275, 40, 4]
['This_', 'sent', 'ence_', 'was_', 'enco', 'ded_', 'by_', 'the_', 'Sub', 'wor', 'd', 'Tex', 't', 'En', 'cod', 'er_', '._']
This sentence was encoded by the SubwordTextEncoder.
INFO:tensorflow:Trying min_count 15
INFO:tensorflow:Iteration 0
INFO:tensorflow:vocab_size = 44146
INFO:tensorflow:Iteration 1
INFO:tensorflow:vocab_size = 15834
INFO:tensorflow:Iteration 2
INFO:tensorflow:vocab_size = 16294
INFO:tensorflow:Iteration 3
INFO:tensorflow:vocab_size = 16224
[119, 10073, 565, 71, 6733, 2317, 49, 5, 4054, 3493, 998, 10477, 1899, 6760, 40, 3]
['This_', 'sente', 'nce_', 'was_', 'enco', 'ded_', 'by_', 'the_', 'Sub', 'wor', 'd', 'Text', 'En', 'code', 'r_', '._']
This sentence was encoded by the SubwordTextEncoder.
INFO:tensorflow:Trying min_count 7
INFO:tensorflow:Iteration 0
INFO:tensorflow:vocab_size = 71625
INFO:tensorflow:Iteration 1
INFO:tensorflow:vocab_size = 25196
INFO:tensorflow:Iteration 2
INFO:tensorflow:vocab_size = 25741
INFO:tensorflow:Iteration 3
INFO:tensorflow:vocab_size = 25660
[113, 12454, 5, 66, 14726, 3859, 46, 4, 5460, 18242, 15158, 1333, 3106, 17159, 54, 3]
['This_', 'sentence', '_', 'was_', 'enco', 'ded_', 'by_', 'the_', 'Sub', 'word', 'Tex', 't', 'En', 'code', 'r_', '._']
This sentence was encoded by the SubwordTextEncoder.
INFO:tensorflow:Trying min_count 3
INFO:tensorflow:Iteration 0
INFO:tensorflow:vocab_size = 118383
INFO:tensorflow:Iteration 1
INFO:tensorflow:vocab_size = 40335
INFO:tensorflow:Iteration 2
INFO:tensorflow:vocab_size = 41029
INFO:tensorflow:Iteration 3
INFO:tensorflow:vocab_size = 40938
[108, 15212, 62, 24807, 23, 42, 4, 14289, 17583, 26393, 11393, 19634, 95, 3]
['This_', 'sentence_', 'was_', 'encode', 'd_', 'by_', 'the_', 'Sub', 'word', 'Text', 'En', 'code', 'r_', '._']
This sentence was encoded by the SubwordTextEncoder.
INFO:tensorflow:Trying min_count 5
INFO:tensorflow:Iteration 0
INFO:tensorflow:vocab_size = 87678
INFO:tensorflow:Iteration 1
INFO:tensorflow:vocab_size = 30452
INFO:tensorflow:Iteration 2
INFO:tensorflow:vocab_size = 31135
INFO:tensorflow:Iteration 3
INFO:tensorflow:vocab_size = 31022
[111, 20623, 64, 15493, 4478, 45, 4, 6387, 14372, 12204, 2310, 30424, 23, 3]
['This_', 'sentence_', 'was_', 'enco', 'ded_', 'by_', 'the_', 'Sub', 'word', 'Tex', 't', 'Enco', 'der_', '._']
This sentence was encoded by the SubwordTextEncoder.
INFO:tensorflow:Trying min_count 4
INFO:tensorflow:Iteration 0
INFO:tensorflow:vocab_size = 100818
INFO:tensorflow:Iteration 1
INFO:tensorflow:vocab_size = 34547
INFO:tensorflow:Iteration 2
INFO:tensorflow:vocab_size = 35306
INFO:tensorflow:Iteration 3
INFO:tensorflow:vocab_size = 35174
[110, 17462, 63, 30500, 23, 45, 4, 19525, 12792, 11021, 3072, 25323, 21, 3]
['This_', 'sentence_', 'was_', 'encode', 'd_', 'by_', 'the_', 'Sub', 'word', 'Tex', 't', 'Enco', 'der_', '._']
This sentence was encoded by the SubwordTextEncoder.
INFO:tensorflow:Generating case 100000 for wmt_ende_tokens_32k-unshuffled-train.
INFO:tensorflow:Generating case 200000 for wmt_ende_tokens_32k-unshuffled-train.
INFO:tensorflow:Generating case 300000 for wmt_ende_tokens_32k-unshuffled-train.
INFO:tensorflow:Generating case 400000 for wmt_ende_tokens_32k-unshuffled-train.
INFO:tensorflow:Generating case 500000 for wmt_ende_tokens_32k-unshuffled-train.
INFO:tensorflow:Generating case 600000 for wmt_ende_tokens_32k-unshuffled-train.
INFO:tensorflow:Generating case 700000 for wmt_ende_tokens_32k-unshuffled-train.
INFO:tensorflow:Generating case 800000 for wmt_ende_tokens_32k-unshuffled-train.
INFO:tensorflow:Generating case 900000 for wmt_ende_tokens_32k-unshuffled-train.
INFO:tensorflow:Generating case 1000000 for wmt_ende_tokens_32k-unshuffled-train.
INFO:tensorflow:Generating case 1100000 for wmt_ende_tokens_32k-unshuffled-train.
INFO:tensorflow:Generating case 1200000 for wmt_ende_tokens_32k-unshuffled-train.
INFO:tensorflow:Generating case 1300000 for wmt_ende_tokens_32k-unshuffled-train.
INFO:tensorflow:Generating case 1400000 for wmt_ende_tokens_32k-unshuffled-train.
INFO:tensorflow:Generating case 1500000 for wmt_ende_tokens_32k-unshuffled-train.
INFO:tensorflow:Generating case 1600000 for wmt_ende_tokens_32k-unshuffled-train.
INFO:tensorflow:Generating case 1700000 for wmt_ende_tokens_32k-unshuffled-train.
INFO:tensorflow:Generating case 1800000 for wmt_ende_tokens_32k-unshuffled-train.
INFO:tensorflow:Generating case 1900000 for wmt_ende_tokens_32k-unshuffled-train.
INFO:tensorflow:Generating case 2000000 for wmt_ende_tokens_32k-unshuffled-train.
INFO:tensorflow:Generating case 2100000 for wmt_ende_tokens_32k-unshuffled-train.
INFO:tensorflow:Generating case 2200000 for wmt_ende_tokens_32k-unshuffled-train.
INFO:tensorflow:Generating case 2300000 for wmt_ende_tokens_32k-unshuffled-train.
INFO:tensorflow:Generating case 2400000 for wmt_ende_tokens_32k-unshuffled-train.
INFO:tensorflow:Generating case 2500000 for wmt_ende_tokens_32k-unshuffled-train.
INFO:tensorflow:Generating case 2600000 for wmt_ende_tokens_32k-unshuffled-train.
INFO:tensorflow:Generating case 2700000 for wmt_ende_tokens_32k-unshuffled-train.
INFO:tensorflow:Generating case 2800000 for wmt_ende_tokens_32k-unshuffled-train.
INFO:tensorflow:Generating case 2900000 for wmt_ende_tokens_32k-unshuffled-train.
INFO:tensorflow:Generating case 3000000 for wmt_ende_tokens_32k-unshuffled-train.
INFO:tensorflow:Generating case 3100000 for wmt_ende_tokens_32k-unshuffled-train.
INFO:tensorflow:Generating case 3200000 for wmt_ende_tokens_32k-unshuffled-train.
INFO:tensorflow:Generating case 3300000 for wmt_ende_tokens_32k-unshuffled-train.
INFO:tensorflow:Generating case 3400000 for wmt_ende_tokens_32k-unshuffled-train.
INFO:tensorflow:Generating case 3500000 for wmt_ende_tokens_32k-unshuffled-train.
INFO:tensorflow:Generating case 3600000 for wmt_ende_tokens_32k-unshuffled-train.
INFO:tensorflow:Generating case 3700000 for wmt_ende_tokens_32k-unshuffled-train.
INFO:tensorflow:Generating case 3800000 for wmt_ende_tokens_32k-unshuffled-train.
INFO:tensorflow:Generating case 3900000 for wmt_ende_tokens_32k-unshuffled-train.
INFO:tensorflow:Generating case 4000000 for wmt_ende_tokens_32k-unshuffled-train.
INFO:tensorflow:Generating case 4100000 for wmt_ende_tokens_32k-unshuffled-train.
INFO:tensorflow:Generating case 4200000 for wmt_ende_tokens_32k-unshuffled-train.
INFO:tensorflow:Generating case 4300000 for wmt_ende_tokens_32k-unshuffled-train.
INFO:tensorflow:Generating case 4400000 for wmt_ende_tokens_32k-unshuffled-train.
INFO:tensorflow:Generating case 4500000 for wmt_ende_tokens_32k-unshuffled-train.
INFO:tensorflow:Generating development data for wmt_ende_tokens_32k.
INFO:tensorflow:Found vocab file: /tmp/t2t_datagen/tokens.vocab.32768
INFO:tensorflow:Downloading http://data.statmt.org/wmt16/translation-task/dev.tgz to /tmp/t2t_datagen/dev.tgz
100% completed
INFO:tensorflow:Succesfully downloaded dev.tgz, 22836484 bytes.
INFO:tensorflow:Shuffling data...
(tf_cpu_1.4.1) yongqiang@yongqiang:~$

Contents of the --tmp_dir=$TMP_DIR directory:

(tf_cpu_1.4.1) yongqiang@yongqiang:~$ ll /tmp/t2t_datagen/
total 21037120
drwxrwxrwx 1 yongqiang yongqiang        512 Jun  1 03:51 ./
drwxrwxrwt 1 root      root             512 Jun  1 07:42 ../
-rw-r--r-- 1 yongqiang yongqiang    2310368 Jan 12  2013 commoncrawl.cs-en.annotation
-rw-r--r-- 1 yongqiang yongqiang   20745616 Jan 12  2013 commoncrawl.cs-en.cs
-rw-r--r-- 1 yongqiang yongqiang   20655491 Jan 12  2013 commoncrawl.cs-en.en
-rw-r--r-- 1 yongqiang yongqiang   35544656 Jan 12  2013 commoncrawl.de-en.annotation
-rw-r--r-- 1 yongqiang yongqiang  340513464 Jan 12  2013 commoncrawl.de-en.de
-rw-r--r-- 1 yongqiang yongqiang  314178160 Jan 12  2013 commoncrawl.de-en.en
-rw-r--r-- 1 yongqiang yongqiang   25951409 Jan 12  2013 commoncrawl.es-en.annotation
-rw-r--r-- 1 yongqiang yongqiang  248845283 Jan 12  2013 commoncrawl.es-en.en
-rw-r--r-- 1 yongqiang yongqiang  272813663 Jan 12  2013 commoncrawl.es-en.es
-rw-r--r-- 1 yongqiang yongqiang   43896615 Jan 12  2013 commoncrawl.fr-en.annotation
-rw-r--r-- 1 yongqiang yongqiang  434655470 Jan 12  2013 commoncrawl.fr-en.en
-rw-r--r-- 1 yongqiang yongqiang  500374763 Jan 12  2013 commoncrawl.fr-en.fr
-rw-r--r-- 1 yongqiang yongqiang   14945508 Jan 12  2013 commoncrawl.ru-en.annotation
-rw-r--r-- 1 yongqiang yongqiang  116164488 Jan 12  2013 commoncrawl.ru-en.en
-rw-r--r-- 1 yongqiang yongqiang  214585860 Feb 26  2013 commoncrawl.ru-en.ru
drwxr-xr-x 1 yongqiang yongqiang        512 Jan  6  2016 dev/
-rw-rw-rw- 1 yongqiang yongqiang   22836484 Jun  1 03:51 dev.tgz
-rw-rw-rw- 1 yongqiang yongqiang 3789873031 Jun  1 02:22 giga-fren.release2.fixed.en
-rw-rw-r-- 1 yongqiang yongqiang 1214224978 Aug 31  2016 giga-fren.release2.fixed.en.gz
-rw-rw-rw- 1 yongqiang yongqiang 4565271815 Jun  1 02:24 giga-fren.release2.fixed.fr
-rw-rw-r-- 1 yongqiang yongqiang 1380871453 Aug 30  2016 giga-fren.release2.fixed.fr.gz
-rw-rw-rw- 1 yongqiang yongqiang     306558 Jun  1 03:32 tokens.vocab.32768
drwxrwxrwx 1 yongqiang yongqiang        512 Jun  1 00:51 training/
-rw-rw-rw- 1 yongqiang yongqiang 2595102720 Jun  1 02:20 training-giga-fren.tar
-rw-rw-rw- 1 yongqiang yongqiang  918311367 May 31 21:48 training-parallel-commoncrawl.tgz
-rw-rw-rw- 1 yongqiang yongqiang  657632379 Jun  1 00:51 training-parallel-europarl-v7.tgz
drwxrwxr-x 1 yongqiang yongqiang        512 Jan  6  2016 training-parallel-nc-v11/
-rw-rw-rw- 1 yongqiang yongqiang   75178032 May 31 17:39 training-parallel-nc-v11.tgz
-rw-rw-rw- 1 yongqiang yongqiang 2365634246 Jun  1 03:28 training-parallel-un.tgz
drwxrwxrwx 1 yongqiang yongqiang        512 Jun  1 03:30 un/
-rw-rw-rw- 1 yongqiang yongqiang     332974 Jun  1 03:51 wmt_ende_tok_dev.lang1
-rw-rw-rw- 1 yongqiang yongqiang     385433 Jun  1 03:51 wmt_ende_tok_dev.lang2
-rw-rw-rw- 1 yongqiang yongqiang  636460447 Jun  1 03:33 wmt_ende_tok_train.lang1
-rw-rw-rw- 1 yongqiang yongqiang  710260525 Jun  1 03:34 wmt_ende_tok_train.lang2
(tf_cpu_1.4.1) yongqiang@yongqiang:~$

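The .lang1 files hold the English corpus (the source text to be translated); the .lang2 files hold the German corpus (the target translations).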
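
To inspect how the two files line up, the sketch below pairs them line by line. It assumes the .lang1 and .lang2 files are line-aligned (line N of one is the translation of line N of the other). The helper name read_parallel and the tiny demo files are hypothetical and are not part of tensor2tensor.

```python
import itertools
import os
import tempfile


def read_parallel(src_path, tgt_path, limit=None):
    """Yield (source, target) sentence pairs from two line-aligned text files."""
    with open(src_path, encoding="utf-8") as src, \
            open(tgt_path, encoding="utf-8") as tgt:
        pairs = zip(src, tgt)
        if limit is not None:
            # Stop early so very large corpora are not read in full.
            pairs = itertools.islice(pairs, limit)
        for source_line, target_line in pairs:
            yield source_line.rstrip("\n"), target_line.rstrip("\n")


# Demo on two tiny hypothetical files standing in for *.lang1 / *.lang2.
with tempfile.TemporaryDirectory() as tmp:
    src_file = os.path.join(tmp, "demo.lang1")
    tgt_file = os.path.join(tmp, "demo.lang2")
    with open(src_file, "w", encoding="utf-8") as f:
        f.write("Hello\nThank you\n")
    with open(tgt_file, "w", encoding="utf-8") as f:
        f.write("Hallo\nDanke\n")
    demo_pairs = list(read_parallel(src_file, tgt_file))

print(demo_pairs)
```

On the real data, something like read_parallel("/tmp/t2t_datagen/wmt_ende_tok_dev.lang1", "/tmp/t2t_datagen/wmt_ende_tok_dev.lang2", limit=5) would show the first five sentence pairs.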

(tf_cpu_1.4.1) yongqiang@yongqiang:~$ ll /tmp/t2t_datagen/un
total 8287744
drwxrwxrwx 1 yongqiang yongqiang        512 Jun  1 03:30 ./
drwxrwxrwx 1 yongqiang yongqiang        512 Jun  1 03:51 ../
-rw-r--r-- 1 yongqiang yongqiang 1851755840 Oct 13  2011 undoc.2000.es-en.en
-rw-r--r-- 1 yongqiang yongqiang 2121526762 Oct 13  2011 undoc.2000.es-en.es
-rw-r--r-- 1 yongqiang yongqiang 2085411017 Oct 13  2011 undoc.2000.fr-en.en
-rw-r--r-- 1 yongqiang yongqiang 2427161501 Oct 13  2011 undoc.2000.fr-en.fr
(tf_cpu_1.4.1) yongqiang@yongqiang:~$
(tf_cpu_1.4.1) yongqiang@yongqiang:~$ ll /tmp/t2t_datagen/dev/
total 83520
drwxr-xr-x 1 yongqiang yongqiang    512 Jan  6  2016 ./
drwxrwxrwx 1 yongqiang yongqiang    512 Jun  1 03:51 ../
-rw-r--r-- 1 yongqiang yongqiang  16384 Jan 16  2014 .newsdev2014-ref.en.sgm.swp
-rw-r--r-- 1 yongqiang yongqiang  16384 Jan 16  2014 .newstest2013-ref.en.sgm.swp
-rw-r--r-- 1 yongqiang yongqiang 326194 Dec  3  2009 news-test2008-ref.cs.sgm
-rw-r--r-- 1 yongqiang yongqiang 360808 Dec  3  2009 news-test2008-ref.de.sgm
-rw-r--r-- 1 yongqiang yongqiang 325852 Dec  3  2009 news-test2008-ref.en.sgm
-rw-r--r-- 1 yongqiang yongqiang 351112 Dec  3  2009 news-test2008-ref.es.sgm
-rw-r--r-- 1 yongqiang yongqiang 351196 Dec  3  2009 news-test2008-ref.fr.sgm
-rw-r--r-- 1 yongqiang yongqiang 349559 Dec  3  2009 news-test2008-ref.hu.sgm
-rw-r--r-- 1 yongqiang yongqiang 325101 Dec  3  2009 news-test2008-src.cs.sgm
-rw-r--r-- 1 yongqiang yongqiang 359715 Dec  3  2009 news-test2008-src.de.sgm
-rw-r--r-- 1 yongqiang yongqiang 324759 Dec  3  2009 news-test2008-src.en.sgm
-rw-r--r-- 1 yongqiang yongqiang 350019 Dec  3  2009 news-test2008-src.es.sgm
-rw-r--r-- 1 yongqiang yongqiang 350103 Dec  3  2009 news-test2008-src.fr.sgm
-rw-r--r-- 1 yongqiang yongqiang 348466 Dec  3  2009 news-test2008-src.hu.sgm
-rw-r--r-- 1 yongqiang yongqiang 267659 Dec 23  2009 news-test2008.cs
-rw-r--r-- 1 yongqiang yongqiang 298369 Dec 23  2009 news-test2008.de
-rw-r--r-- 1 yongqiang yongqiang 263240 Dec 23  2009 news-test2008.en
-rw-r--r-- 1 yongqiang yongqiang 288487 Dec 23  2009 news-test2008.es
-rw-r--r-- 1 yongqiang yongqiang 301220 Dec 23  2009 news-test2008.fr
-rw-r--r-- 1 yongqiang yongqiang  68931 Jan 16  2014 newsdev2014-ref.en.sgm
-rw-r--r-- 1 yongqiang yongqiang 137001 Jan 16  2014 newsdev2014-ref.hi.sgm
-rw-r--r-- 1 yongqiang yongqiang  68918 Jan 13  2014 newsdev2014-src.en.sgm
-rw-r--r-- 1 yongqiang yongqiang 136988 Jan 13  2014 newsdev2014-src.hi.sgm
-rw-r--r-- 1 yongqiang yongqiang  55133 Jan 13  2014 newsdev2014.en
-rw-r--r-- 1 yongqiang yongqiang 123204 Jan 13  2014 newsdev2014.hi
-rw-r--r-- 1 yongqiang yongqiang 216726 Jan 28  2015 newsdev2015-enfi-ref.fi.sgm
-rw-r--r-- 1 yongqiang yongqiang 207286 Jan 28  2015 newsdev2015-enfi-src.en.sgm
-rw-r--r-- 1 yongqiang yongqiang 207299 Jan 28  2015 newsdev2015-fien-ref.en.sgm
-rw-r--r-- 1 yongqiang yongqiang 216713 Jan 28  2015 newsdev2015-fien-src.fi.sgm
-rw-rw-r-- 1 yongqiang yongqiang 336460 Jan  6  2016 newsdev2016-enro-ref.ro.sgm
-rw-rw-r-- 1 yongqiang yongqiang 306002 Jan  6  2016 newsdev2016-enro-src.en.sgm
-rw-rw-r-- 1 yongqiang yongqiang 139478 Jan  6  2016 newsdev2016-entr-ref.tr.sgm
-rw-rw-r-- 1 yongqiang yongqiang 135181 Jan  6  2016 newsdev2016-entr-src.en.sgm
-rw-rw-r-- 1 yongqiang yongqiang 306015 Jan  6  2016 newsdev2016-roen-ref.en.sgm
-rw-rw-r-- 1 yongqiang yongqiang 336447 Jan  6  2016 newsdev2016-roen-src.ro.sgm
-rw-rw-r-- 1 yongqiang yongqiang 135194 Jan  6  2016 newsdev2016-tren-ref.en.sgm
-rw-rw-r-- 1 yongqiang yongqiang 139465 Jan  6  2016 newsdev2016-tren-src.tr.sgm
-rw-r--r-- 1 yongqiang yongqiang 188019 Jan 28  2015 newsdiscussdev2015-enfr-ref.fr.sgm
-rw-r--r-- 1 yongqiang yongqiang 171890 Jan 28  2015 newsdiscussdev2015-enfr-src.en.sgm
-rw-r--r-- 1 yongqiang yongqiang 171903 Jan 28  2015 newsdiscussdev2015-fren-ref.en.sgm
-rw-r--r-- 1 yongqiang yongqiang 188006 Jan 28  2015 newsdiscussdev2015-fren-src.fr.sgm
-rw-rw-r-- 1 yongqiang yongqiang 188490 Jan  6  2016 newsdiscusstest2015-enfr-ref.fr.sgm
-rw-rw-r-- 1 yongqiang yongqiang 170340 Jan  6  2016 newsdiscusstest2015-enfr-src.en.sgm
-rw-rw-r-- 1 yongqiang yongqiang 170353 Jan  6  2016 newsdiscusstest2015-fren-ref.en.sgm
-rw-rw-r-- 1 yongqiang yongqiang 188477 Jan  6  2016 newsdiscusstest2015-fren-src.fr.sgm
-rw-r--r-- 1 yongqiang yongqiang  77096 Dec  3  2009 newssyscomb2009-ref.cs.sgm
-rw-r--r-- 1 yongqiang yongqiang  83899 Dec  3  2009 newssyscomb2009-ref.de.sgm
-rw-r--r-- 1 yongqiang yongqiang  76771 Dec  3  2009 newssyscomb2009-ref.en.sgm
-rw-r--r-- 1 yongqiang yongqiang  83167 Dec  3  2009 newssyscomb2009-ref.es.sgm
-rw-r--r-- 1 yongqiang yongqiang  84788 Dec  3  2009 newssyscomb2009-ref.fr.sgm
-rw-r--r-- 1 yongqiang yongqiang  81826 Dec  3  2009 newssyscomb2009-ref.hu.sgm
-rw-r--r-- 1 yongqiang yongqiang  80940 Dec  3  2009 newssyscomb2009-ref.it.sgm
-rw-r--r-- 1 yongqiang yongqiang  76783 Dec  3  2009 newssyscomb2009-src.cs.sgm
-rw-r--r-- 1 yongqiang yongqiang  83587 Dec  3  2009 newssyscomb2009-src.de.sgm
-rw-r--r-- 1 yongqiang yongqiang  76458 Dec  3  2009 newssyscomb2009-src.en.sgm
-rw-r--r-- 1 yongqiang yongqiang  82856 Dec  3  2009 newssyscomb2009-src.es.sgm
-rw-r--r-- 1 yongqiang yongqiang  84482 Dec  3  2009 newssyscomb2009-src.fr.sgm
-rw-r--r-- 1 yongqiang yongqiang  81513 Dec  3  2009 newssyscomb2009-src.hu.sgm
-rw-r--r-- 1 yongqiang yongqiang  80654 Dec  3  2009 newssyscomb2009-src.it.sgm
-rw-r--r-- 1 yongqiang yongqiang  62880 Dec 10  2011 newssyscomb2009.cs
-rw-r--r-- 1 yongqiang yongqiang  69689 Dec 10  2011 newssyscomb2009.de
-rw-r--r-- 1 yongqiang yongqiang  62557 Dec 10  2011 newssyscomb2009.en
-rw-r--r-- 1 yongqiang yongqiang  68947 Dec 10  2011 newssyscomb2009.es
-rw-r--r-- 1 yongqiang yongqiang  70529 Dec 10  2011 newssyscomb2009.fr
-rw-r--r-- 1 yongqiang yongqiang 417868 Dec  3  2009 newstest2009-ref.cs.sgm
-rw-r--r-- 1 yongqiang yongqiang 452329 Dec  3  2009 newstest2009-ref.de.sgm
-rw-r--r-- 1 yongqiang yongqiang 415276 Dec  3  2009 newstest2009-ref.en.sgm
-rw-r--r-- 1 yongqiang yongqiang 443880 Dec  3  2009 newstest2009-ref.es.sgm
-rw-r--r-- 1 yongqiang yongqiang 460824 Dec  3  2009 newstest2009-ref.fr.sgm
-rw-r--r-- 1 yongqiang yongqiang 448263 Dec  3  2009 newstest2009-ref.hu.sgm
-rw-r--r-- 1 yongqiang yongqiang 434489 Dec  3  2009 newstest2009-ref.it.sgm
-rw-r--r-- 1 yongqiang yongqiang 416524 Dec  3  2009 newstest2009-src.cs.sgm
-rw-r--r-- 1 yongqiang yongqiang 450989 Dec  3  2009 newstest2009-src.de.sgm
-rw-r--r-- 1 yongqiang yongqiang 413934 Dec  3  2009 newstest2009-src.en.sgm
-rw-r--r-- 1 yongqiang yongqiang 442554 Dec  3  2009 newstest2009-src.es.sgm
-rw-r--r-- 1 yongqiang yongqiang 459552 Dec  3  2009 newstest2009-src.fr.sgm
-rw-r--r-- 1 yongqiang yongqiang 446918 Dec  3  2009 newstest2009-src.hu.sgm
-rw-r--r-- 1 yongqiang yongqiang 433226 Dec  3  2009 newstest2009-src.it.sgm
-rw-r--r-- 1 yongqiang yongqiang 413934 Dec  3  2009 newstest2009-src.xx.sgm
-rw-r--r-- 1 yongqiang yongqiang 348154 Dec 23  2009 newstest2009.cs
-rw-r--r-- 1 yongqiang yongqiang 382617 Dec 23  2009 newstest2009.de
-rw-r--r-- 1 yongqiang yongqiang 345562 Dec 23  2009 newstest2009.en
-rw-r--r-- 1 yongqiang yongqiang 374178 Dec 23  2009 newstest2009.es
-rw-r--r-- 1 yongqiang yongqiang 391108 Dec 23  2009 newstest2009.fr
-rw-r--r-- 1 yongqiang yongqiang 406600 Mar  5  2010 newstest2010-ref.cs.sgm
-rw-r--r-- 1 yongqiang yongqiang 453580 Mar  5  2010 newstest2010-ref.de.sgm
-rw-r--r-- 1 yongqiang yongqiang 398937 Mar  5  2010 newstest2010-ref.en.sgm
-rw-r--r-- 1 yongqiang yongqiang 434326 Mar  5  2010 newstest2010-ref.es.sgm
-rw-r--r-- 1 yongqiang yongqiang 453687 Mar  5  2010 newstest2010-ref.fr.sgm
-rw-r--r-- 1 yongqiang yongqiang 405159 Mar  5  2010 newstest2010-src.cs.sgm
-rw-r--r-- 1 yongqiang yongqiang 452139 Mar  5  2010 newstest2010-src.de.sgm
-rw-r--r-- 1 yongqiang yongqiang 397496 Mar  5  2010 newstest2010-src.en.sgm
-rw-r--r-- 1 yongqiang yongqiang 432885 Mar  5  2010 newstest2010-src.es.sgm
-rw-r--r-- 1 yongqiang yongqiang 452246 Mar  5  2010 newstest2010-src.fr.sgm
-rw-r--r-- 1 yongqiang yongqiang 336311 Mar  5  2010 newstest2010.cs
-rw-r--r-- 1 yongqiang yongqiang 383291 Mar  5  2010 newstest2010.de
-rw-r--r-- 1 yongqiang yongqiang 328648 Mar  5  2010 newstest2010.en
-rw-r--r-- 1 yongqiang yongqiang 364037 Mar  5  2010 newstest2010.es
-rw-r--r-- 1 yongqiang yongqiang 383398 Mar  5  2010 newstest2010.fr
-rw-r--r-- 1 yongqiang yongqiang 485879 Nov 24  2011 newstest2011-ref.cs.sgm
-rw-r--r-- 1 yongqiang yongqiang 530959 Mar 16  2011 newstest2011-ref.de.sgm
-rw-r--r-- 1 yongqiang yongqiang 477272 Mar 16  2011 newstest2011-ref.en.sgm
-rw-r--r-- 1 yongqiang yongqiang 514062 Mar 16  2011 newstest2011-ref.es.sgm
-rw-r--r-- 1 yongqiang yongqiang 537007 Mar 16  2011 newstest2011-ref.fr.sgm
-rw-r--r-- 1 yongqiang yongqiang 484546 Mar 16  2011 newstest2011-src.cs.sgm
-rw-r--r-- 1 yongqiang yongqiang 529626 Mar 16  2011 newstest2011-src.de.sgm
-rw-r--r-- 1 yongqiang yongqiang 475939 Mar 16  2011 newstest2011-src.en.sgm
-rw-r--r-- 1 yongqiang yongqiang 512729 Mar 16  2011 newstest2011-src.es.sgm
-rw-r--r-- 1 yongqiang yongqiang 535674 Mar 16  2011 newstest2011-src.fr.sgm
-rw-r--r-- 1 yongqiang yongqiang 405486 Dec 10  2011 newstest2011.cs
-rw-r--r-- 1 yongqiang yongqiang 450571 Dec 10  2011 newstest2011.de
-rw-r--r-- 1 yongqiang yongqiang 396884 Dec 10  2011 newstest2011.en
-rw-r--r-- 1 yongqiang yongqiang 433659 Dec 10  2011 newstest2011.es
-rw-r--r-- 1 yongqiang yongqiang 456618 Dec 10  2011 newstest2011.fr
-rw-r--r-- 1 yongqiang yongqiang 484553 Feb 16  2012 newstest2012-ref.cs.sgm
-rw-r--r-- 1 yongqiang yongqiang 524750 Feb 16  2012 newstest2012-ref.de.sgm
-rw-r--r-- 1 yongqiang yongqiang 459733 Mar  3  2012 newstest2012-ref.en.sgm
-rw-r--r-- 1 yongqiang yongqiang 504934 Feb 16  2012 newstest2012-ref.es.sgm
-rw-r--r-- 1 yongqiang yongqiang 518968 Feb 16  2012 newstest2012-ref.fr.sgm
-rw-r--r-- 1 yongqiang yongqiang 776643 Jan 23  2013 newstest2012-ref.ru.sgm
-rw-r--r-- 1 yongqiang yongqiang 483352 Mar  3  2012 newstest2012-src.cs.sgm
-rw-r--r-- 1 yongqiang yongqiang 523549 Mar  3  2012 newstest2012-src.de.sgm
-rw-r--r-- 1 yongqiang yongqiang 458532 Mar  3  2012 newstest2012-src.en.sgm
-rw-r--r-- 1 yongqiang yongqiang 503733 Mar  3  2012 newstest2012-src.es.sgm
-rw-r--r-- 1 yongqiang yongqiang 517767 Mar  3  2012 newstest2012-src.fr.sgm
-rw-r--r-- 1 yongqiang yongqiang 776618 Jan 23  2013 newstest2012-src.ru.sgm
-rw-r--r-- 1 yongqiang yongqiang 406578 Jan 23  2013 newstest2012.cs
-rw-r--r-- 1 yongqiang yongqiang 446765 Jan 23  2013 newstest2012.de
-rw-r--r-- 1 yongqiang yongqiang 381734 Jan 23  2013 newstest2012.en
-rw-r--r-- 1 yongqiang yongqiang 426936 Jan 23  2013 newstest2012.es
-rw-r--r-- 1 yongqiang yongqiang 440985 Jan 23  2013 newstest2012.fr
-rw-r--r-- 1 yongqiang yongqiang 699810 Jan 23  2013 newstest2012.ru
-rw-r--r-- 1 yongqiang yongqiang 423686 Dec 13  2013 newstest2013-ref.cs.sgm
-rw-r--r-- 1 yongqiang yongqiang 457608 Dec 13  2013 newstest2013-ref.de.sgm
-rw-r--r-- 1 yongqiang yongqiang 405170 Dec 13  2013 newstest2013-ref.en.sgm
-rw-r--r-- 1 yongqiang yongqiang 451575 Dec 13  2013 newstest2013-ref.es.sgm
-rw-r--r-- 1 yongqiang yongqiang 465895 Dec 13  2013 newstest2013-ref.fr.sgm
-rw-r--r-- 1 yongqiang yongqiang 699161 Dec 13  2013 newstest2013-ref.ru.sgm
-rw-r--r-- 1 yongqiang yongqiang 423049 Dec 13  2013 newstest2013-src.cs.sgm
-rw-r--r-- 1 yongqiang yongqiang 456971 Dec 13  2013 newstest2013-src.de.sgm
-rw-r--r-- 1 yongqiang yongqiang 404533 Dec 13  2013 newstest2013-src.en.sgm
-rw-r--r-- 1 yongqiang yongqiang 450938 Dec 13  2013 newstest2013-src.es.sgm
-rw-r--r-- 1 yongqiang yongqiang 465258 Dec 13  2013 newstest2013-src.fr.sgm
-rw-r--r-- 1 yongqiang yongqiang 698524 Dec 13  2013 newstest2013-src.ru.sgm
-rw-r--r-- 1 yongqiang yongqiang 351491 Dec 13  2013 newstest2013.cs
-rw-r--r-- 1 yongqiang yongqiang 385433 Dec 13  2013 newstest2013.de
-rw-r--r-- 1 yongqiang yongqiang 332974 Dec 13  2013 newstest2013.en
-rw-r--r-- 1 yongqiang yongqiang 379357 Dec 13  2013 newstest2013.es
-rw-r--r-- 1 yongqiang yongqiang 393465 Dec 13  2013 newstest2013.fr
-rw-r--r-- 1 yongqiang yongqiang 626964 Dec 13  2013 newstest2013.ru
-rw-r--r-- 1 yongqiang yongqiang 444462 Jul  1  2014 newstest2014-csen-ref.cs.sgm
-rw-r--r-- 1 yongqiang yongqiang 428063 Jul  1  2014 newstest2014-csen-ref.en.sgm
-rw-r--r-- 1 yongqiang yongqiang 442757 Jul  1  2014 newstest2014-csen-src.cs.sgm
-rw-r--r-- 1 yongqiang yongqiang 426358 Jul  1  2014 newstest2014-csen-src.en.sgm
-rw-r--r-- 1 yongqiang yongqiang 468154 Jul  1  2014 newstest2014-deen-ref.de.sgm
-rw-r--r-- 1 yongqiang yongqiang 428650 Jul  1  2014 newstest2014-deen-ref.en.sgm
-rw-r--r-- 1 yongqiang yongqiang 466173 Jul  1  2014 newstest2014-deen-src.de.sgm
-rw-r--r-- 1 yongqiang yongqiang 426669 Jul  1  2014 newstest2014-deen-src.en.sgm
-rw-r--r-- 1 yongqiang yongqiang 443970 Jul  1  2014 newstest2014-fren-ref.en.sgm
-rw-r--r-- 1 yongqiang yongqiang 514587 Jul  1  2014 newstest2014-fren-ref.fr.sgm
-rw-r--r-- 1 yongqiang yongqiang 441845 Jul  1  2014 newstest2014-fren-src.en.sgm
-rw-r--r-- 1 yongqiang yongqiang 512462 Jul  1  2014 newstest2014-fren-src.fr.sgm
-rw-r--r-- 1 yongqiang yongqiang 356077 Jul  1  2014 newstest2014-hien-ref.en.sgm
-rw-r--r-- 1 yongqiang yongqiang 820215 Jul  1  2014 newstest2014-hien-ref.hi.sgm
-rw-r--r-- 1 yongqiang yongqiang 353792 Jul  1  2014 newstest2014-hien-src.en.sgm
-rw-r--r-- 1 yongqiang yongqiang 818658 Jul  1  2014 newstest2014-hien-src.hi.sgm
-rw-r--r-- 1 yongqiang yongqiang 443149 Jul  1  2014 newstest2014-ruen-ref.en.sgm
-rw-r--r-- 1 yongqiang yongqiang 763821 Jul  1  2014 newstest2014-ruen-ref.ru.sgm
-rw-r--r-- 1 yongqiang yongqiang 441036 Jul  1  2014 newstest2014-ruen-src.en.sgm
-rw-r--r-- 1 yongqiang yongqiang 761708 Jul  1  2014 newstest2014-ruen-src.ru.sgm
-rw-rw-r-- 1 yongqiang yongqiang 333884 Jan  6  2016 newstest2015-csen-ref.en.sgm
-rw-rw-r-- 1 yongqiang yongqiang 337118 Jan  6  2016 newstest2015-csen-src.cs.sgm
-rw-rw-r-- 1 yongqiang yongqiang 288866 Jan  6  2016 newstest2015-deen-ref.en.sgm
-rw-rw-r-- 1 yongqiang yongqiang 316788 Jan  6  2016 newstest2015-deen-src.de.sgm
-rw-rw-r-- 1 yongqiang yongqiang 337131 Jan  6  2016 newstest2015-encs-ref.cs.sgm
-rw-rw-r-- 1 yongqiang yongqiang 333871 Jan  6  2016 newstest2015-encs-src.en.sgm
-rw-rw-r-- 1 yongqiang yongqiang 316801 Jan  6  2016 newstest2015-ende-ref.de.sgm
-rw-rw-r-- 1 yongqiang yongqiang 288853 Jan  6  2016 newstest2015-ende-src.en.sgm
-rw-rw-r-- 1 yongqiang yongqiang 185637 Jan  6  2016 newstest2015-enfi-ref.fi.sgm
-rw-rw-r-- 1 yongqiang yongqiang 178431 Jan  6  2016 newstest2015-enfi-src.en.sgm
-rw-rw-r-- 1 yongqiang yongqiang 658969 Jan  6  2016 newstest2015-enru-ref.ru.sgm
-rw-rw-r-- 1 yongqiang yongqiang 406258 Jan  6  2016 newstest2015-enru-src.en.sgm
-rw-rw-r-- 1 yongqiang yongqiang 178444 Jan  6  2016 newstest2015-fien-ref.en.sgm
-rw-rw-r-- 1 yongqiang yongqiang 185624 Jan  6  2016 newstest2015-fien-src.fi.sgm
-rw-rw-r-- 1 yongqiang yongqiang 406271 Jan  6  2016 newstest2015-ruen-ref.en.sgm
-rw-rw-r-- 1 yongqiang yongqiang 658956 Jan  6  2016 newstest2015-ruen-src.ru.sgm
(tf_cpu_1.4.1) yongqiang@yongqiang:~$
(tf_cpu_1.4.1) yongqiang@yongqiang:~$ ll /tmp/t2t_datagen/training/
total 2028544
drwxrwxrwx 1 yongqiang yongqiang       512 Jun  1 00:51 ./
drwxrwxrwx 1 yongqiang yongqiang       512 Jun  1 03:51 ../
-rw-r--r-- 1 yongqiang yongqiang  98150314 Nov 21  2011 europarl-v7.cs-en.cs
-rw-r--r-- 1 yongqiang yongqiang  94299411 Nov 21  2011 europarl-v7.cs-en.en
-rw-r--r-- 1 yongqiang yongqiang 328463491 Nov 21  2011 europarl-v7.de-en.de
-rw-r--r-- 1 yongqiang yongqiang 287250069 Nov 21  2011 europarl-v7.de-en.en
-rw-r--r-- 1 yongqiang yongqiang 294931656 Nov 21  2011 europarl-v7.es-en.en
-rw-r--r-- 1 yongqiang yongqiang 324915736 Nov 21  2011 europarl-v7.es-en.es
-rw-r--r-- 1 yongqiang yongqiang 301523301 Nov 21  2011 europarl-v7.fr-en.en
-rw-r--r-- 1 yongqiang yongqiang 346919801 Nov 21  2011 europarl-v7.fr-en.fr
(tf_cpu_1.4.1) yongqiang@yongqiang:~$

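training-parallel-nc-v11 is the News Commentary v11 parallel corpus. Data generation first checks whether training-parallel-nc-v11.tgz exists. If it does not, the archive is downloaded; if it does, the corpus already in the training-parallel-nc-v11 folder is used directly. To use your own corpus, either replace the contents of that folder or define a custom problem.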
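
The check-then-download behaviour described above can be sketched as follows. This is a minimal illustration of the pattern, not tensor2tensor's own code: the names maybe_fetch and fake_fetch are hypothetical, and the "download" is replaced by a stub that writes a placeholder file so the example runs offline.

```python
import os
import tempfile


def maybe_fetch(directory, filename, fetch):
    """Return (path, fetched). Call fetch(path) only if the file is missing."""
    path = os.path.join(directory, filename)
    if os.path.exists(path):
        # The archive is already present, so reuse it as-is.
        return path, False
    os.makedirs(directory, exist_ok=True)
    fetch(path)
    return path, True


fetch_calls = []


def fake_fetch(path):
    """Stand-in for a real download: records the call and writes a stub file."""
    fetch_calls.append(path)
    with open(path, "wb") as f:
        f.write(b"placeholder")


with tempfile.TemporaryDirectory() as tmp:
    first = maybe_fetch(tmp, "training-parallel-nc-v11.tgz", fake_fetch)
    second = maybe_fetch(tmp, "training-parallel-nc-v11.tgz", fake_fetch)

print(first[1], second[1], len(fetch_calls))
```

This also shows why replacing the files on disk works: as long as the expected file name exists, the download step is skipped.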

Contents of the TRAIN_DIR=$HOME/t2t_train/$PROBLEM/$MODEL-$HPARAMS directory:

(tf_cpu_1.4.1) yongqiang@yongqiang:~$ ll /home/yongqiang/t2t_train/
total 0
drwxrwxrwx 1 yongqiang yongqiang 512 May 31 16:28 ./
drwxr-xr-x 1 yongqiang yongqiang 512 May 31 16:28 ../
drwxrwxrwx 1 yongqiang yongqiang 512 May 31 16:28 wmt_ende_tokens_32k/
(tf_cpu_1.4.1) yongqiang@yongqiang:~$
(tf_cpu_1.4.1) yongqiang@yongqiang:~$ ll /home/yongqiang/t2t_train/wmt_ende_tokens_32k/
total 0
drwxrwxrwx 1 yongqiang yongqiang 512 May 31 16:28 ./
drwxrwxrwx 1 yongqiang yongqiang 512 May 31 16:28 ../
drwxrwxrwx 1 yongqiang yongqiang 512 May 31 16:28 transformer-transformer_base_single_gpu/
(tf_cpu_1.4.1) yongqiang@yongqiang:~$
(tf_cpu_1.4.1) yongqiang@yongqiang:~$ ll /home/yongqiang/t2t_train/wmt_ende_tokens_32k/transformer-transformer_base_single_gpu/
total 0
drwxrwxrwx 1 yongqiang yongqiang 512 May 31 16:28 ./
drwxrwxrwx 1 yongqiang yongqiang 512 May 31 16:28 ../
(tf_cpu_1.4.1) yongqiang@yongqiang:~$

Data generation writes 100 train files and 1 dev file, all in TFRecord format, for use during training. It also generates a vocabulary file (tokens.vocab.32768).

(tf_cpu_1.4.1) yongqiang@yongqiang:~$ cp $TMP_DIR/tokens.vocab.* $DATA_DIR
(tf_cpu_1.4.1) yongqiang@yongqiang:~$ ll /home/yongqiang/t2t_train/wmt_ende_tokens_32k/transformer-transformer_base_single_gpu/
total 0
drwxrwxrwx 1 yongqiang yongqiang 512 May 31 16:28 ./
drwxrwxrwx 1 yongqiang yongqiang 512 May 31 16:28 ../
(tf_cpu_1.4.1) yongqiang@yongqiang:~$ ll /home/yongqiang/t2t_train/wmt_ende_tokens_32k/
total 0
drwxrwxrwx 1 yongqiang yongqiang 512 May 31 16:28 ./
drwxrwxrwx 1 yongqiang yongqiang 512 May 31 16:28 ../
drwxrwxrwx 1 yongqiang yongqiang 512 May 31 16:28 transformer-transformer_base_single_gpu/
(tf_cpu_1.4.1) yongqiang@yongqiang:~$ ll /home/yongqiang/t2t_train/
total 0
drwxrwxrwx 1 yongqiang yongqiang 512 May 31 16:28 ./
drwxr-xr-x 1 yongqiang yongqiang 512 May 31 16:28 ../
drwxrwxrwx 1 yongqiang yongqiang 512 May 31 16:28 wmt_ende_tokens_32k/
(tf_cpu_1.4.1) yongqiang@yongqiang:~$ ll /home/yongqiang/t2t_data/
total 753680
drwxrwxrwx 1 yongqiang yongqiang     512 Jun  1 07:51 ./
drwxr-xr-x 1 yongqiang yongqiang     512 May 31 16:28 ../
-rw-rw-rw- 1 yongqiang yongqiang  306558 Jun  1 07:51 tokens.vocab.32768
-rw-rw-rw- 1 yongqiang yongqiang  452958 Jun  1 03:52 wmt_ende_tokens_32k-dev-00000-of-00001
-rw-rw-rw- 1 yongqiang yongqiang 7667929 Jun  1 03:51 wmt_ende_tokens_32k-train-00000-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7640828 Jun  1 03:51 wmt_ende_tokens_32k-train-00001-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7674237 Jun  1 03:51 wmt_ende_tokens_32k-train-00002-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7700999 Jun  1 03:51 wmt_ende_tokens_32k-train-00003-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7663250 Jun  1 03:51 wmt_ende_tokens_32k-train-00004-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7652482 Jun  1 03:51 wmt_ende_tokens_32k-train-00005-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7642697 Jun  1 03:51 wmt_ende_tokens_32k-train-00006-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7676712 Jun  1 03:51 wmt_ende_tokens_32k-train-00007-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7639986 Jun  1 03:51 wmt_ende_tokens_32k-train-00008-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7640001 Jun  1 03:51 wmt_ende_tokens_32k-train-00009-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7673384 Jun  1 03:51 wmt_ende_tokens_32k-train-00010-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7683955 Jun  1 03:51 wmt_ende_tokens_32k-train-00011-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7647624 Jun  1 03:51 wmt_ende_tokens_32k-train-00012-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7687257 Jun  1 03:51 wmt_ende_tokens_32k-train-00013-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7688525 Jun  1 03:51 wmt_ende_tokens_32k-train-00014-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7665535 Jun  1 03:51 wmt_ende_tokens_32k-train-00015-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7690883 Jun  1 03:51 wmt_ende_tokens_32k-train-00016-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7663359 Jun  1 03:51 wmt_ende_tokens_32k-train-00017-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7686219 Jun  1 03:51 wmt_ende_tokens_32k-train-00018-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7648707 Jun  1 03:51 wmt_ende_tokens_32k-train-00019-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7718829 Jun  1 03:51 wmt_ende_tokens_32k-train-00020-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7686182 Jun  1 03:51 wmt_ende_tokens_32k-train-00021-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7675800 Jun  1 03:51 wmt_ende_tokens_32k-train-00022-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7701203 Jun  1 03:51 wmt_ende_tokens_32k-train-00023-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7657998 Jun  1 03:51 wmt_ende_tokens_32k-train-00024-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7670140 Jun  1 03:51 wmt_ende_tokens_32k-train-00025-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7675587 Jun  1 03:51 wmt_ende_tokens_32k-train-00026-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7654837 Jun  1 03:51 wmt_ende_tokens_32k-train-00027-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7677641 Jun  1 03:51 wmt_ende_tokens_32k-train-00028-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7670186 Jun  1 03:51 wmt_ende_tokens_32k-train-00029-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7667826 Jun  1 03:51 wmt_ende_tokens_32k-train-00030-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7683834 Jun  1 03:51 wmt_ende_tokens_32k-train-00031-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7690318 Jun  1 03:51 wmt_ende_tokens_32k-train-00032-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7675815 Jun  1 03:51 wmt_ende_tokens_32k-train-00033-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7634405 Jun  1 03:51 wmt_ende_tokens_32k-train-00034-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7674313 Jun  1 03:51 wmt_ende_tokens_32k-train-00035-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7668282 Jun  1 03:51 wmt_ende_tokens_32k-train-00036-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7656065 Jun  1 03:51 wmt_ende_tokens_32k-train-00037-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7643719 Jun  1 03:51 wmt_ende_tokens_32k-train-00038-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7664325 Jun  1 03:51 wmt_ende_tokens_32k-train-00039-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7647871 Jun  1 03:51 wmt_ende_tokens_32k-train-00040-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7616733 Jun  1 03:51 wmt_ende_tokens_32k-train-00041-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7637022 Jun  1 03:51 wmt_ende_tokens_32k-train-00042-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7664961 Jun  1 03:51 wmt_ende_tokens_32k-train-00043-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7645109 Jun  1 03:51 wmt_ende_tokens_32k-train-00044-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7679298 Jun  1 03:51 wmt_ende_tokens_32k-train-00045-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7655371 Jun  1 03:51 wmt_ende_tokens_32k-train-00046-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7645346 Jun  1 03:51 wmt_ende_tokens_32k-train-00047-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7660534 Jun  1 03:51 wmt_ende_tokens_32k-train-00048-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7677668 Jun  1 03:51 wmt_ende_tokens_32k-train-00049-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7656197 Jun  1 03:51 wmt_ende_tokens_32k-train-00050-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7695677 Jun  1 03:51 wmt_ende_tokens_32k-train-00051-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7681680 Jun  1 03:51 wmt_ende_tokens_32k-train-00052-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7677127 Jun  1 03:51 wmt_ende_tokens_32k-train-00053-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7658826 Jun  1 03:51 wmt_ende_tokens_32k-train-00054-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7651920 Jun  1 03:51 wmt_ende_tokens_32k-train-00055-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7661295 Jun  1 03:51 wmt_ende_tokens_32k-train-00056-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7664402 Jun  1 03:51 wmt_ende_tokens_32k-train-00057-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7664979 Jun  1 03:51 wmt_ende_tokens_32k-train-00058-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7651649 Jun  1 03:51 wmt_ende_tokens_32k-train-00059-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7708872 Jun  1 03:51 wmt_ende_tokens_32k-train-00060-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7654428 Jun  1 03:51 wmt_ende_tokens_32k-train-00061-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7648101 Jun  1 03:51 wmt_ende_tokens_32k-train-00062-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7684443 Jun  1 03:51 wmt_ende_tokens_32k-train-00063-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7681919 Jun  1 03:51 wmt_ende_tokens_32k-train-00064-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7658840 Jun  1 03:52 wmt_ende_tokens_32k-train-00065-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7674366 Jun  1 03:52 wmt_ende_tokens_32k-train-00066-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7644321 Jun  1 03:52 wmt_ende_tokens_32k-train-00067-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7664720 Jun  1 03:52 wmt_ende_tokens_32k-train-00068-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7684936 Jun  1 03:52 wmt_ende_tokens_32k-train-00069-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7673587 Jun  1 03:52 wmt_ende_tokens_32k-train-00070-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7612716 Jun  1 03:52 wmt_ende_tokens_32k-train-00071-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7673989 Jun  1 03:52 wmt_ende_tokens_32k-train-00072-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7679255 Jun  1 03:52 wmt_ende_tokens_32k-train-00073-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7658291 Jun  1 03:52 wmt_ende_tokens_32k-train-00074-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7633770 Jun  1 03:52 wmt_ende_tokens_32k-train-00075-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7660341 Jun  1 03:52 wmt_ende_tokens_32k-train-00076-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7672302 Jun  1 03:52 wmt_ende_tokens_32k-train-00077-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7668570 Jun  1 03:52 wmt_ende_tokens_32k-train-00078-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7640331 Jun  1 03:52 wmt_ende_tokens_32k-train-00079-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7679929 Jun  1 03:52 wmt_ende_tokens_32k-train-00080-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7670161 Jun  1 03:52 wmt_ende_tokens_32k-train-00081-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7629498 Jun  1 03:52 wmt_ende_tokens_32k-train-00082-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7667861 Jun  1 03:52 wmt_ende_tokens_32k-train-00083-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7633190 Jun  1 03:52 wmt_ende_tokens_32k-train-00084-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7668988 Jun  1 03:52 wmt_ende_tokens_32k-train-00085-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7715339 Jun  1 03:52 wmt_ende_tokens_32k-train-00086-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7691430 Jun  1 03:52 wmt_ende_tokens_32k-train-00087-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7693875 Jun  1 03:52 wmt_ende_tokens_32k-train-00088-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7688168 Jun  1 03:52 wmt_ende_tokens_32k-train-00089-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7680385 Jun  1 03:52 wmt_ende_tokens_32k-train-00090-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7669305 Jun  1 03:52 wmt_ende_tokens_32k-train-00091-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7673107 Jun  1 03:52 wmt_ende_tokens_32k-train-00092-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7691164 Jun  1 03:52 wmt_ende_tokens_32k-train-00093-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7735837 Jun  1 03:52 wmt_ende_tokens_32k-train-00094-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7654446 Jun  1 03:52 wmt_ende_tokens_32k-train-00095-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7679692 Jun  1 03:52 wmt_ende_tokens_32k-train-00096-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7668198 Jun  1 03:52 wmt_ende_tokens_32k-train-00097-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7650696 Jun  1 03:52 wmt_ende_tokens_32k-train-00098-of-00100
-rw-rw-rw- 1 yongqiang yongqiang 7653008 Jun  1 03:52 wmt_ende_tokens_32k-train-00099-of-00100
(tf_cpu_1.4.1) yongqiang@yongqiang:~$

3.4 Training stage

https://github.com/tensorflow/tensor2tensor/tree/v1.0.12

# Train
# If you run out of memory, add --hparams='batch_size=2048' or even 1024.
t2t-trainer --data_dir=$DATA_DIR --problems=$PROBLEM --model=$MODEL --hparams_set=$HPARAMS --output_dir=$TRAIN_DIR

https://github.com/tensorflow/tensor2tensor

# Train
# If you run out of memory, add --hparams='batch_size=1024'.
t2t-trainer --data_dir=$DATA_DIR --problem=$PROBLEM --model=$MODEL --hparams_set=$HPARAMS --output_dir=$TRAIN_DIR

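Note the difference between --problem and --problems: the v1.0.12 README uses --problems, while the current master README uses --problem. Passing the spelling that your installed version does not expect leads to a NoneType error.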
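
The two README commands above differ only in the spelling of the problem flag, plus the optional batch_size override for the out-of-memory case. A minimal sketch of a wrapper that makes both choices explicit is shown below. build_train_cmd is a hypothetical helper name, not part of tensor2tensor, and it only prints the command so that it can be inspected before being run. It assumes DATA_DIR, PROBLEM, MODEL, HPARAMS and TRAIN_DIR are already set as in the earlier steps.

```shell
# Hypothetical helper (not part of tensor2tensor): print the t2t-trainer
# command line for a given spelling of the problem flag.
#   $1: "problems" (v1.0.12 README) or "problem" (master README)
#   $2: optional batch_size override, for the out-of-memory case
build_train_cmd() {
  flag="$1"
  batch_size="$2"
  cmd="t2t-trainer --data_dir=$DATA_DIR --$flag=$PROBLEM --model=$MODEL --hparams_set=$HPARAMS"
  # Append the hparams override only when a batch size was given.
  if [ -n "$batch_size" ]; then
    cmd="$cmd --hparams='batch_size=$batch_size'"
  fi
  echo "$cmd --output_dir=$TRAIN_DIR"
}
```

For the v1.0.12 install used here, build_train_cmd problems 1024 prints the first README command with the smaller batch size added. Printing rather than executing keeps the helper safe to try, and the printed line can be pasted into the shell once it looks right.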

  • Segmentation fault (core dumped)
(tf_cpu_1.4.1) yongqiang@yongqiang:~$ t2t-trainer --data_dir=$DATA_DIR --problems=$PROBLEM --model=$MODEL --hparams_set=$HPARAMS --output_dir=$TRAIN_DIR
INFO:tensorflow:Creating experiment, storing model files in /home/yongqiang/t2t_train/wmt_ende_tokens_32k/transformer-transformer_base_single_gpu
INFO:tensorflow:datashard_devices: ['gpu:0']
INFO:tensorflow:caching_devices: None
INFO:tensorflow:Using config: {'_save_checkpoints_steps': None, '_task_id': 0, '_save_summary_steps': 100, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f1a0850cef0>, '_num_ps_replicas': 0, '_evaluation_master': '', '_log_step_count_steps': 100, '_num_worker_replicas': 0, '_save_checkpoints_secs': 600, '_tf_config': gpu_options {
  per_process_gpu_memory_fraction: 1.0
}
, '_keep_checkpoint_max': 20, '_is_chief': True, '_master': '', '_environment': 'local', '_keep_checkpoint_every_n_hours': 10000, '_tf_random_seed': None, '_model_dir': '/home/yongqiang/t2t_train/wmt_ende_tokens_32k/transformer-transformer_base_single_gpu', '_task_type': None, '_session_config': allow_soft_placement: true
graph_options {
  optimizer_options {
  }
}
}
INFO:tensorflow:Performing local training.
WARNING:tensorflow:From /home/yongqiang/miniconda3/envs/tf_cpu_1.4.1/lib/python3.5/site-packages/tensorflow/contrib/learn/python/learn/monitors.py:267: BaseMonitor.__init__ (from tensorflow.contrib.learn.python.learn.monitors) is deprecated and will be removed after 2016-12-05.
Instructions for updating:
Monitors are deprecated. Please use tf.train.SessionRunHook.
INFO:tensorflow:datashard_devices: ['gpu:0']
INFO:tensorflow:caching_devices: None
INFO:tensorflow:Doing model_fn_body took 1.564 sec.
INFO:tensorflow:This model_fn took 1.691 sec.
WARNING:tensorflow:From /home/yongqiang/miniconda3/envs/tf_cpu_1.4.1/lib/python3.5/site-packages/tensor2tensor/utils/trainer_utils.py:317: get_global_step (from tensorflow.contrib.framework.python.ops.variables) is deprecated and will be removed in a future version.
Instructions for updating:
Please switch to tf.train.get_global_step
INFO:tensorflow:Weight    body/decoder/layer_0/conv_hidden_relu/conv1_single/bias                               shape    (2048,)                size    2048
INFO:tensorflow:Weight    body/decoder/layer_0/conv_hidden_relu/conv1_single/kernel                             shape    (1, 1, 512, 2048)      size    1048576
INFO:tensorflow:Weight    body/decoder/layer_0/conv_hidden_relu/conv2_single/bias                               shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_0/conv_hidden_relu/conv2_single/kernel                             shape    (1, 1, 2048, 512)      size    1048576
INFO:tensorflow:Weight    body/decoder/layer_0/decoder_self_attention/output_transform_single/bias              shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_0/decoder_self_attention/output_transform_single/kernel            shape    (1, 1, 512, 512)       size    262144
INFO:tensorflow:Weight    body/decoder/layer_0/decoder_self_attention/qkv_transform_single/bias                 shape    (1536,)                size    1536
INFO:tensorflow:Weight    body/decoder/layer_0/decoder_self_attention/qkv_transform_single/kernel               shape    (1, 1, 512, 1536)      size    786432
INFO:tensorflow:Weight    body/decoder/layer_0/encdec_attention/kv_transform_single/bias                        shape    (1024,)                size    1024
INFO:tensorflow:Weight    body/decoder/layer_0/encdec_attention/kv_transform_single/kernel                      shape    (1, 1, 512, 1024)      size    524288
INFO:tensorflow:Weight    body/decoder/layer_0/encdec_attention/output_transform_single/bias                    shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_0/encdec_attention/output_transform_single/kernel                  shape    (1, 1, 512, 512)       size    262144
INFO:tensorflow:Weight    body/decoder/layer_0/encdec_attention/q_transform_single/bias                         shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_0/encdec_attention/q_transform_single/kernel                       shape    (1, 1, 512, 512)       size    262144
INFO:tensorflow:Weight    body/decoder/layer_0/layer_norm/layer_norm_bias                                       shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_0/layer_norm/layer_norm_scale                                      shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_0/layer_norm_1/layer_norm_bias                                     shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_0/layer_norm_1/layer_norm_scale                                    shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_0/layer_norm_2/layer_norm_bias                                     shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_0/layer_norm_2/layer_norm_scale                                    shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_1/conv_hidden_relu/conv1_single/bias                               shape    (2048,)                size    2048
INFO:tensorflow:Weight    body/decoder/layer_1/conv_hidden_relu/conv1_single/kernel                             shape    (1, 1, 512, 2048)      size    1048576
INFO:tensorflow:Weight    body/decoder/layer_1/conv_hidden_relu/conv2_single/bias                               shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_1/conv_hidden_relu/conv2_single/kernel                             shape    (1, 1, 2048, 512)      size    1048576
INFO:tensorflow:Weight    body/decoder/layer_1/decoder_self_attention/output_transform_single/bias              shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_1/decoder_self_attention/output_transform_single/kernel            shape    (1, 1, 512, 512)       size    262144
INFO:tensorflow:Weight    body/decoder/layer_1/decoder_self_attention/qkv_transform_single/bias                 shape    (1536,)                size    1536
INFO:tensorflow:Weight    body/decoder/layer_1/decoder_self_attention/qkv_transform_single/kernel               shape    (1, 1, 512, 1536)      size    786432
INFO:tensorflow:Weight    body/decoder/layer_1/encdec_attention/kv_transform_single/bias                        shape    (1024,)                size    1024
INFO:tensorflow:Weight    body/decoder/layer_1/encdec_attention/kv_transform_single/kernel                      shape    (1, 1, 512, 1024)      size    524288
INFO:tensorflow:Weight    body/decoder/layer_1/encdec_attention/output_transform_single/bias                    shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_1/encdec_attention/output_transform_single/kernel                  shape    (1, 1, 512, 512)       size    262144
INFO:tensorflow:Weight    body/decoder/layer_1/encdec_attention/q_transform_single/bias                         shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_1/encdec_attention/q_transform_single/kernel                       shape    (1, 1, 512, 512)       size    262144
INFO:tensorflow:Weight    body/decoder/layer_1/layer_norm/layer_norm_bias                                       shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_1/layer_norm/layer_norm_scale                                      shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_1/layer_norm_1/layer_norm_bias                                     shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_1/layer_norm_1/layer_norm_scale                                    shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_1/layer_norm_2/layer_norm_bias                                     shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_1/layer_norm_2/layer_norm_scale                                    shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_2/conv_hidden_relu/conv1_single/bias                               shape    (2048,)                size    2048
INFO:tensorflow:Weight    body/decoder/layer_2/conv_hidden_relu/conv1_single/kernel                             shape    (1, 1, 512, 2048)      size    1048576
INFO:tensorflow:Weight    body/decoder/layer_2/conv_hidden_relu/conv2_single/bias                               shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_2/conv_hidden_relu/conv2_single/kernel                             shape    (1, 1, 2048, 512)      size    1048576
INFO:tensorflow:Weight    body/decoder/layer_2/decoder_self_attention/output_transform_single/bias              shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_2/decoder_self_attention/output_transform_single/kernel            shape    (1, 1, 512, 512)       size    262144
INFO:tensorflow:Weight    body/decoder/layer_2/decoder_self_attention/qkv_transform_single/bias                 shape    (1536,)                size    1536
INFO:tensorflow:Weight    body/decoder/layer_2/decoder_self_attention/qkv_transform_single/kernel               shape    (1, 1, 512, 1536)      size    786432
INFO:tensorflow:Weight    body/decoder/layer_2/encdec_attention/kv_transform_single/bias                        shape    (1024,)                size    1024
INFO:tensorflow:Weight    body/decoder/layer_2/encdec_attention/kv_transform_single/kernel                      shape    (1, 1, 512, 1024)      size    524288
INFO:tensorflow:Weight    body/decoder/layer_2/encdec_attention/output_transform_single/bias                    shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_2/encdec_attention/output_transform_single/kernel                  shape    (1, 1, 512, 512)       size    262144
INFO:tensorflow:Weight    body/decoder/layer_2/encdec_attention/q_transform_single/bias                         shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_2/encdec_attention/q_transform_single/kernel                       shape    (1, 1, 512, 512)       size    262144
INFO:tensorflow:Weight    body/decoder/layer_2/layer_norm/layer_norm_bias                                       shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_2/layer_norm/layer_norm_scale                                      shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_2/layer_norm_1/layer_norm_bias                                     shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_2/layer_norm_1/layer_norm_scale                                    shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_2/layer_norm_2/layer_norm_bias                                     shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_2/layer_norm_2/layer_norm_scale                                    shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_3/conv_hidden_relu/conv1_single/bias                               shape    (2048,)                size    2048
INFO:tensorflow:Weight    body/decoder/layer_3/conv_hidden_relu/conv1_single/kernel                             shape    (1, 1, 512, 2048)      size    1048576
INFO:tensorflow:Weight    body/decoder/layer_3/conv_hidden_relu/conv2_single/bias                               shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_3/conv_hidden_relu/conv2_single/kernel                             shape    (1, 1, 2048, 512)      size    1048576
INFO:tensorflow:Weight    body/decoder/layer_3/decoder_self_attention/output_transform_single/bias              shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_3/decoder_self_attention/output_transform_single/kernel            shape    (1, 1, 512, 512)       size    262144
INFO:tensorflow:Weight    body/decoder/layer_3/decoder_self_attention/qkv_transform_single/bias                 shape    (1536,)                size    1536
INFO:tensorflow:Weight    body/decoder/layer_3/decoder_self_attention/qkv_transform_single/kernel               shape    (1, 1, 512, 1536)      size    786432
INFO:tensorflow:Weight    body/decoder/layer_3/encdec_attention/kv_transform_single/bias                        shape    (1024,)                size    1024
INFO:tensorflow:Weight    body/decoder/layer_3/encdec_attention/kv_transform_single/kernel                      shape    (1, 1, 512, 1024)      size    524288
INFO:tensorflow:Weight    body/decoder/layer_3/encdec_attention/output_transform_single/bias                    shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_3/encdec_attention/output_transform_single/kernel                  shape    (1, 1, 512, 512)       size    262144
INFO:tensorflow:Weight    body/decoder/layer_3/encdec_attention/q_transform_single/bias                         shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_3/encdec_attention/q_transform_single/kernel                       shape    (1, 1, 512, 512)       size    262144
INFO:tensorflow:Weight    body/decoder/layer_3/layer_norm/layer_norm_bias                                       shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_3/layer_norm/layer_norm_scale                                      shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_3/layer_norm_1/layer_norm_bias                                     shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_3/layer_norm_1/layer_norm_scale                                    shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_3/layer_norm_2/layer_norm_bias                                     shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_3/layer_norm_2/layer_norm_scale                                    shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_4/conv_hidden_relu/conv1_single/bias                               shape    (2048,)                size    2048
INFO:tensorflow:Weight    body/decoder/layer_4/conv_hidden_relu/conv1_single/kernel                             shape    (1, 1, 512, 2048)      size    1048576
INFO:tensorflow:Weight    body/decoder/layer_4/conv_hidden_relu/conv2_single/bias                               shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_4/conv_hidden_relu/conv2_single/kernel                             shape    (1, 1, 2048, 512)      size    1048576
INFO:tensorflow:Weight    body/decoder/layer_4/decoder_self_attention/output_transform_single/bias              shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_4/decoder_self_attention/output_transform_single/kernel            shape    (1, 1, 512, 512)       size    262144
INFO:tensorflow:Weight    body/decoder/layer_4/decoder_self_attention/qkv_transform_single/bias                 shape    (1536,)                size    1536
INFO:tensorflow:Weight    body/decoder/layer_4/decoder_self_attention/qkv_transform_single/kernel               shape    (1, 1, 512, 1536)      size    786432
INFO:tensorflow:Weight    body/decoder/layer_4/encdec_attention/kv_transform_single/bias                        shape    (1024,)                size    1024
INFO:tensorflow:Weight    body/decoder/layer_4/encdec_attention/kv_transform_single/kernel                      shape    (1, 1, 512, 1024)      size    524288
INFO:tensorflow:Weight    body/decoder/layer_4/encdec_attention/output_transform_single/bias                    shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_4/encdec_attention/output_transform_single/kernel                  shape    (1, 1, 512, 512)       size    262144
INFO:tensorflow:Weight    body/decoder/layer_4/encdec_attention/q_transform_single/bias                         shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_4/encdec_attention/q_transform_single/kernel                       shape    (1, 1, 512, 512)       size    262144
INFO:tensorflow:Weight    body/decoder/layer_4/layer_norm/layer_norm_bias                                       shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_4/layer_norm/layer_norm_scale                                      shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_4/layer_norm_1/layer_norm_bias                                     shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_4/layer_norm_1/layer_norm_scale                                    shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_4/layer_norm_2/layer_norm_bias                                     shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_4/layer_norm_2/layer_norm_scale                                    shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_5/conv_hidden_relu/conv1_single/bias                               shape    (2048,)                size    2048
INFO:tensorflow:Weight    body/decoder/layer_5/conv_hidden_relu/conv1_single/kernel                             shape    (1, 1, 512, 2048)      size    1048576
INFO:tensorflow:Weight    body/decoder/layer_5/conv_hidden_relu/conv2_single/bias                               shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_5/conv_hidden_relu/conv2_single/kernel                             shape    (1, 1, 2048, 512)      size    1048576
INFO:tensorflow:Weight    body/decoder/layer_5/decoder_self_attention/output_transform_single/bias              shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_5/decoder_self_attention/output_transform_single/kernel            shape    (1, 1, 512, 512)       size    262144
INFO:tensorflow:Weight    body/decoder/layer_5/decoder_self_attention/qkv_transform_single/bias                 shape    (1536,)                size    1536
INFO:tensorflow:Weight    body/decoder/layer_5/decoder_self_attention/qkv_transform_single/kernel               shape    (1, 1, 512, 1536)      size    786432
INFO:tensorflow:Weight    body/decoder/layer_5/encdec_attention/kv_transform_single/bias                        shape    (1024,)                size    1024
INFO:tensorflow:Weight    body/decoder/layer_5/encdec_attention/kv_transform_single/kernel                      shape    (1, 1, 512, 1024)      size    524288
INFO:tensorflow:Weight    body/decoder/layer_5/encdec_attention/output_transform_single/bias                    shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_5/encdec_attention/output_transform_single/kernel                  shape    (1, 1, 512, 512)       size    262144
INFO:tensorflow:Weight    body/decoder/layer_5/encdec_attention/q_transform_single/bias                         shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_5/encdec_attention/q_transform_single/kernel                       shape    (1, 1, 512, 512)       size    262144
INFO:tensorflow:Weight    body/decoder/layer_5/layer_norm/layer_norm_bias                                       shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_5/layer_norm/layer_norm_scale                                      shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_5/layer_norm_1/layer_norm_bias                                     shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_5/layer_norm_1/layer_norm_scale                                    shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_5/layer_norm_2/layer_norm_bias                                     shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_5/layer_norm_2/layer_norm_scale                                    shape    (512,)                 size    512
INFO:tensorflow:Weight    body/encoder/layer_0/conv_hidden_relu/conv1_single/bias                               shape    (2048,)                size    2048
INFO:tensorflow:Weight    body/encoder/layer_0/conv_hidden_relu/conv1_single/kernel                             shape    (1, 1, 512, 2048)      size    1048576
INFO:tensorflow:Weight    body/encoder/layer_0/conv_hidden_relu/conv2_single/bias                               shape    (512,)                 size    512
INFO:tensorflow:Weight    body/encoder/layer_0/conv_hidden_relu/conv2_single/kernel                             shape    (1, 1, 2048, 512)      size    1048576
INFO:tensorflow:Weight    body/encoder/layer_0/encoder_self_attention/output_transform_single/bias              shape    (512,)                 size    512
INFO:tensorflow:Weight    body/encoder/layer_0/encoder_self_attention/output_transform_single/kernel            shape    (1, 1, 512, 512)       size    262144
INFO:tensorflow:Weight    body/encoder/layer_0/encoder_self_attention/qkv_transform_single/bias                 shape    (1536,)                size    1536
INFO:tensorflow:Weight    body/encoder/layer_0/encoder_self_attention/qkv_transform_single/kernel               shape    (1, 1, 512, 1536)      size    786432
INFO:tensorflow:Weight    body/encoder/layer_0/layer_norm/layer_norm_bias                                       shape    (512,)                 size    512
INFO:tensorflow:Weight    body/encoder/layer_0/layer_norm/layer_norm_scale                                      shape    (512,)                 size    512
INFO:tensorflow:Weight    body/encoder/layer_0/layer_norm_1/layer_norm_bias                                     shape    (512,)                 size    512
INFO:tensorflow:Weight    body/encoder/layer_0/layer_norm_1/layer_norm_scale                                    shape    (512,)                 size    512
INFO:tensorflow:Weight    body/encoder/layer_1/conv_hidden_relu/conv1_single/bias                               shape    (2048,)                size    2048
INFO:tensorflow:Weight    body/encoder/layer_1/conv_hidden_relu/conv1_single/kernel                             shape    (1, 1, 512, 2048)      size    1048576
INFO:tensorflow:Weight    body/encoder/layer_1/conv_hidden_relu/conv2_single/bias                               shape    (512,)                 size    512
INFO:tensorflow:Weight    body/encoder/layer_1/conv_hidden_relu/conv2_single/kernel                             shape    (1, 1, 2048, 512)      size    1048576
INFO:tensorflow:Weight    body/encoder/layer_1/encoder_self_attention/output_transform_single/bias              shape    (512,)                 size    512
INFO:tensorflow:Weight    body/encoder/layer_1/encoder_self_attention/output_transform_single/kernel            shape    (1, 1, 512, 512)       size    262144
INFO:tensorflow:Weight    body/encoder/layer_1/encoder_self_attention/qkv_transform_single/bias                 shape    (1536,)                size    1536
INFO:tensorflow:Weight    body/encoder/layer_1/encoder_self_attention/qkv_transform_single/kernel               shape    (1, 1, 512, 1536)      size    786432
INFO:tensorflow:Weight    body/encoder/layer_1/layer_norm/layer_norm_bias                                       shape    (512,)                 size    512
INFO:tensorflow:Weight    body/encoder/layer_1/layer_norm/layer_norm_scale                                      shape    (512,)                 size    512
INFO:tensorflow:Weight    body/encoder/layer_1/layer_norm_1/layer_norm_bias                                     shape    (512,)                 size    512
INFO:tensorflow:Weight    body/encoder/layer_1/layer_norm_1/layer_norm_scale                                    shape    (512,)                 size    512
INFO:tensorflow:Weight    body/encoder/layer_2/conv_hidden_relu/conv1_single/bias                               shape    (2048,)                size    2048
INFO:tensorflow:Weight    body/encoder/layer_2/conv_hidden_relu/conv1_single/kernel                             shape    (1, 1, 512, 2048)      size    1048576
INFO:tensorflow:Weight    body/encoder/layer_2/conv_hidden_relu/conv2_single/bias                               shape    (512,)                 size    512
INFO:tensorflow:Weight    body/encoder/layer_2/conv_hidden_relu/conv2_single/kernel                             shape    (1, 1, 2048, 512)      size    1048576
INFO:tensorflow:Weight    body/encoder/layer_2/encoder_self_attention/output_transform_single/bias              shape    (512,)                 size    512
INFO:tensorflow:Weight    body/encoder/layer_2/encoder_self_attention/output_transform_single/kernel            shape    (1, 1, 512, 512)       size    262144
INFO:tensorflow:Weight    body/encoder/layer_2/encoder_self_attention/qkv_transform_single/bias                 shape    (1536,)                size    1536
INFO:tensorflow:Weight    body/encoder/layer_2/encoder_self_attention/qkv_transform_single/kernel               shape    (1, 1, 512, 1536)      size    786432
INFO:tensorflow:Weight    body/encoder/layer_2/layer_norm/layer_norm_bias                                       shape    (512,)                 size    512
INFO:tensorflow:Weight    body/encoder/layer_2/layer_norm/layer_norm_scale                                      shape    (512,)                 size    512
INFO:tensorflow:Weight    body/encoder/layer_2/layer_norm_1/layer_norm_bias                                     shape    (512,)                 size    512
INFO:tensorflow:Weight    body/encoder/layer_2/layer_norm_1/layer_norm_scale                                    shape    (512,)                 size    512
INFO:tensorflow:Weight    body/encoder/layer_3/conv_hidden_relu/conv1_single/bias                               shape    (2048,)                size    2048
INFO:tensorflow:Weight    body/encoder/layer_3/conv_hidden_relu/conv1_single/kernel                             shape    (1, 1, 512, 2048)      size    1048576
INFO:tensorflow:Weight    body/encoder/layer_3/conv_hidden_relu/conv2_single/bias                               shape    (512,)                 size    512
INFO:tensorflow:Weight    body/encoder/layer_3/conv_hidden_relu/conv2_single/kernel                             shape    (1, 1, 2048, 512)      size    1048576
INFO:tensorflow:Weight    body/encoder/layer_3/encoder_self_attention/output_transform_single/bias              shape    (512,)                 size    512
INFO:tensorflow:Weight    body/encoder/layer_3/encoder_self_attention/output_transform_single/kernel            shape    (1, 1, 512, 512)       size    262144
INFO:tensorflow:Weight    body/encoder/layer_3/encoder_self_attention/qkv_transform_single/bias                 shape    (1536,)                size    1536
INFO:tensorflow:Weight    body/encoder/layer_3/encoder_self_attention/qkv_transform_single/kernel               shape    (1, 1, 512, 1536)      size    786432
INFO:tensorflow:Weight    body/encoder/layer_3/layer_norm/layer_norm_bias                                       shape    (512,)                 size    512
INFO:tensorflow:Weight    body/encoder/layer_3/layer_norm/layer_norm_scale                                      shape    (512,)                 size    512
INFO:tensorflow:Weight    body/encoder/layer_3/layer_norm_1/layer_norm_bias                                     shape    (512,)                 size    512
INFO:tensorflow:Weight    body/encoder/layer_3/layer_norm_1/layer_norm_scale                                    shape    (512,)                 size    512
INFO:tensorflow:Weight    body/encoder/layer_4/conv_hidden_relu/conv1_single/bias                               shape    (2048,)                size    2048
INFO:tensorflow:Weight    body/encoder/layer_4/conv_hidden_relu/conv1_single/kernel                             shape    (1, 1, 512, 2048)      size    1048576
INFO:tensorflow:Weight    body/encoder/layer_4/conv_hidden_relu/conv2_single/bias                               shape    (512,)                 size    512
INFO:tensorflow:Weight    body/encoder/layer_4/conv_hidden_relu/conv2_single/kernel                             shape    (1, 1, 2048, 512)      size    1048576
INFO:tensorflow:Weight    body/encoder/layer_4/encoder_self_attention/output_transform_single/bias              shape    (512,)                 size    512
INFO:tensorflow:Weight    body/encoder/layer_4/encoder_self_attention/output_transform_single/kernel            shape    (1, 1, 512, 512)       size    262144
INFO:tensorflow:Weight    body/encoder/layer_4/encoder_self_attention/qkv_transform_single/bias                 shape    (1536,)                size    1536
INFO:tensorflow:Weight    body/encoder/layer_4/encoder_self_attention/qkv_transform_single/kernel               shape    (1, 1, 512, 1536)      size    786432
INFO:tensorflow:Weight    body/encoder/layer_4/layer_norm/layer_norm_bias                                       shape    (512,)                 size    512
INFO:tensorflow:Weight    body/encoder/layer_4/layer_norm/layer_norm_scale                                      shape    (512,)                 size    512
INFO:tensorflow:Weight    body/encoder/layer_4/layer_norm_1/layer_norm_bias                                     shape    (512,)                 size    512
INFO:tensorflow:Weight    body/encoder/layer_4/layer_norm_1/layer_norm_scale                                    shape    (512,)                 size    512
INFO:tensorflow:Weight    body/encoder/layer_5/conv_hidden_relu/conv1_single/bias                               shape    (2048,)                size    2048
INFO:tensorflow:Weight    body/encoder/layer_5/conv_hidden_relu/conv1_single/kernel                             shape    (1, 1, 512, 2048)      size    1048576
INFO:tensorflow:Weight    body/encoder/layer_5/conv_hidden_relu/conv2_single/bias                               shape    (512,)                 size    512
INFO:tensorflow:Weight    body/encoder/layer_5/conv_hidden_relu/conv2_single/kernel                             shape    (1, 1, 2048, 512)      size    1048576
INFO:tensorflow:Weight    body/encoder/layer_5/encoder_self_attention/output_transform_single/bias              shape    (512,)                 size    512
INFO:tensorflow:Weight    body/encoder/layer_5/encoder_self_attention/output_transform_single/kernel            shape    (1, 1, 512, 512)       size    262144
INFO:tensorflow:Weight    body/encoder/layer_5/encoder_self_attention/qkv_transform_single/bias                 shape    (1536,)                size    1536
INFO:tensorflow:Weight    body/encoder/layer_5/encoder_self_attention/qkv_transform_single/kernel               shape    (1, 1, 512, 1536)      size    786432
INFO:tensorflow:Weight    body/encoder/layer_5/layer_norm/layer_norm_bias                                       shape    (512,)                 size    512
INFO:tensorflow:Weight    body/encoder/layer_5/layer_norm/layer_norm_scale                                      shape    (512,)                 size    512
INFO:tensorflow:Weight    body/encoder/layer_5/layer_norm_1/layer_norm_bias                                     shape    (512,)                 size    512
INFO:tensorflow:Weight    body/encoder/layer_5/layer_norm_1/layer_norm_scale                                    shape    (512,)                 size    512
INFO:tensorflow:Weight    body/target_space_embedding/kernel                                                    shape    (32, 512)              size    16384
INFO:tensorflow:Weight    symbol_modality_31022_512/shared/weights_0                                            shape    (1939, 512)            size    992768
INFO:tensorflow:Weight    symbol_modality_31022_512/shared/weights_10                                           shape    (1939, 512)            size    992768
INFO:tensorflow:Weight    symbol_modality_31022_512/shared/weights_11                                           shape    (1939, 512)            size    992768
INFO:tensorflow:Weight    symbol_modality_31022_512/shared/weights_12                                           shape    (1939, 512)            size    992768
INFO:tensorflow:Weight    symbol_modality_31022_512/shared/weights_13                                           shape    (1939, 512)            size    992768
INFO:tensorflow:Weight    symbol_modality_31022_512/shared/weights_14                                           shape    (1938, 512)            size    992256
INFO:tensorflow:Weight    symbol_modality_31022_512/shared/weights_15                                           shape    (1938, 512)            size    992256
INFO:tensorflow:Weight    symbol_modality_31022_512/shared/weights_1                                            shape    (1939, 512)            size    992768
INFO:tensorflow:Weight    symbol_modality_31022_512/shared/weights_2                                            shape    (1939, 512)            size    992768
INFO:tensorflow:Weight    symbol_modality_31022_512/shared/weights_3                                            shape    (1939, 512)            size    992768
INFO:tensorflow:Weight    symbol_modality_31022_512/shared/weights_4                                            shape    (1939, 512)            size    992768
INFO:tensorflow:Weight    symbol_modality_31022_512/shared/weights_5                                            shape    (1939, 512)            size    992768
INFO:tensorflow:Weight    symbol_modality_31022_512/shared/weights_6                                            shape    (1939, 512)            size    992768
INFO:tensorflow:Weight    symbol_modality_31022_512/shared/weights_7                                            shape    (1939, 512)            size    992768
INFO:tensorflow:Weight    symbol_modality_31022_512/shared/weights_8                                            shape    (1939, 512)            size    992768
INFO:tensorflow:Weight    symbol_modality_31022_512/shared/weights_9                                            shape    (1939, 512)            size    992768
INFO:tensorflow:Total trainable variables size: 60038144
INFO:tensorflow:Computing gradients for global model_fn.
INFO:tensorflow:Global model_fn finished.
INFO:tensorflow:Create CheckpointSaverHook.
2020-06-01 07:55:08.844229: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
Segmentation fault (core dumped)
(tf_cpu_1.4.1) yongqiang@yongqiang:~$
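
The "Total trainable variables size: 60038144" line above can be cross-checked by hand from the per-weight shapes in the log. The following is a minimal sketch of that arithmetic, not code from tensor2tensor: the constant and variable names (D_MODEL, D_FF, ffn, and so on) are made up for illustration, and the only inputs are the shapes printed in the log (hidden size 512, filter size 2048, 6 encoder and 6 decoder layers, a 31022-row vocabulary split into 16 shards, and a (32, 512) target-space embedding).

```python
# Recompute the trainable-parameter total from the shapes printed in the log.
# All names here are illustrative; none of them come from tensor2tensor itself.

D_MODEL = 512      # hidden size, e.g. layer_norm_scale shape (512,)
D_FF = 2048        # feed-forward filter size, e.g. conv1_single/bias shape (2048,)
NUM_LAYERS = 6     # layer_0 .. layer_5 in both encoder and decoder
VOCAB = 31022      # from the variable scope name symbol_modality_31022_512
NUM_SHARDS = 16    # weights_0 .. weights_15

# conv_hidden_relu: conv1 (512 -> 2048) and conv2 (2048 -> 512), kernel + bias each.
ffn = (D_MODEL * D_FF + D_FF) + (D_FF * D_MODEL + D_MODEL)

# Self-attention: fused qkv transform (512 -> 1536) plus output transform (512 -> 512).
self_attn = (D_MODEL * 3 * D_MODEL + 3 * D_MODEL) + (D_MODEL * D_MODEL + D_MODEL)

# Encoder-decoder attention: q (512 -> 512), fused kv (512 -> 1024), output (512 -> 512).
encdec_attn = (
    (D_MODEL * D_MODEL + D_MODEL)
    + (D_MODEL * 2 * D_MODEL + 2 * D_MODEL)
    + (D_MODEL * D_MODEL + D_MODEL)
)

# Each layer_norm has a scale and a bias vector of size 512.
layer_norm = 2 * D_MODEL

encoder_layer = ffn + self_attn + 2 * layer_norm                # layer_norm, layer_norm_1
decoder_layer = ffn + self_attn + encdec_attn + 3 * layer_norm  # plus layer_norm_2

# body/target_space_embedding/kernel has shape (32, 512).
target_space_embedding = 32 * D_MODEL

# The shared embedding is split row-wise into 16 shards: the first
# VOCAB % 16 shards get one extra row (1939 rows), the rest get 1938.
shard_rows = [
    VOCAB // NUM_SHARDS + (1 if i < VOCAB % NUM_SHARDS else 0)
    for i in range(NUM_SHARDS)
]
embedding = sum(rows * D_MODEL for rows in shard_rows)

total = (
    NUM_LAYERS * encoder_layer
    + NUM_LAYERS * decoder_layer
    + target_space_embedding
    + embedding
)

print("encoder layer:", encoder_layer)
print("decoder layer:", decoder_layer)
print("shard rows   :", shard_rows)
print("total        :", total)  # 60038144, matching the log line
```

The sum reproduces the logged total, and the shard sizes reproduce the 14 shards of shape (1939, 512) and 2 shards of shape (1938, 512) seen above. The roughly 60M parameters therefore break down into about 44.1M for the encoder and decoder stacks and about 15.9M for the shared symbol embedding.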
  • t2t-trainer --data_dir=$DATA_DIR --problems=$PROBLEM --model=$MODEL --hparams_set=$HPARAMS --output_dir=$TRAIN_DIR --hparams='batch_size=1024'
(tf_cpu_1.4.1) yongqiang@yongqiang:~$ t2t-trainer --data_dir=$DATA_DIR --problems=$PROBLEM --model=$MODEL --hparams_set=$HPARAMS --output_dir=$TRAIN_DIR --hparams='batch_size=1024'
INFO:tensorflow:Creating experiment, storing model files in /home/yongqiang/t2t_train/wmt_ende_tokens_32k/transformer-transformer_base_single_gpu
INFO:tensorflow:datashard_devices: ['gpu:0']
INFO:tensorflow:caching_devices: None
INFO:tensorflow:Using config: {'_log_step_count_steps': 100, '_keep_checkpoint_max': 20, '_save_checkpoints_steps': None, '_environment': 'local', '_save_checkpoints_secs': 600, '_task_type': None, '_tf_random_seed': None, '_num_ps_replicas': 0, '_num_worker_replicas': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_session_config': allow_soft_placement: true
graph_options {
  optimizer_options {
  }
}
, '_task_id': 0, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7fe1d260cef0>, '_model_dir': '/home/yongqiang/t2t_train/wmt_ende_tokens_32k/transformer-transformer_base_single_gpu', '_save_summary_steps': 100, '_keep_checkpoint_every_n_hours': 10000, '_tf_config': gpu_options {
  per_process_gpu_memory_fraction: 1.0
}
}
INFO:tensorflow:Performing local training.
WARNING:tensorflow:From /home/yongqiang/miniconda3/envs/tf_cpu_1.4.1/lib/python3.5/site-packages/tensorflow/contrib/learn/python/learn/monitors.py:267: BaseMonitor.__init__ (from tensorflow.contrib.learn.python.learn.monitors) is deprecated and will be removed after 2016-12-05.
Instructions for updating:
Monitors are deprecated. Please use tf.train.SessionRunHook.
INFO:tensorflow:datashard_devices: ['gpu:0']
INFO:tensorflow:caching_devices: None
INFO:tensorflow:Doing model_fn_body took 1.528 sec.
INFO:tensorflow:This model_fn took 1.657 sec.
WARNING:tensorflow:From /home/yongqiang/miniconda3/envs/tf_cpu_1.4.1/lib/python3.5/site-packages/tensor2tensor/utils/trainer_utils.py:317: get_global_step (from tensorflow.contrib.framework.python.ops.variables) is deprecated and will be removed in a future version.
Instructions for updating:
Please switch to tf.train.get_global_step
INFO:tensorflow:Weight    body/decoder/layer_0/conv_hidden_relu/conv1_single/bias                               shape    (2048,)                size    2048
INFO:tensorflow:Weight    body/decoder/layer_0/conv_hidden_relu/conv1_single/kernel                             shape    (1, 1, 512, 2048)      size    1048576
INFO:tensorflow:Weight    body/decoder/layer_0/conv_hidden_relu/conv2_single/bias                               shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_0/conv_hidden_relu/conv2_single/kernel                             shape    (1, 1, 2048, 512)      size    1048576
INFO:tensorflow:Weight    body/decoder/layer_0/decoder_self_attention/output_transform_single/bias              shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_0/decoder_self_attention/output_transform_single/kernel            shape    (1, 1, 512, 512)       size    262144
INFO:tensorflow:Weight    body/decoder/layer_0/decoder_self_attention/qkv_transform_single/bias                 shape    (1536,)                size    1536
INFO:tensorflow:Weight    body/decoder/layer_0/decoder_self_attention/qkv_transform_single/kernel               shape    (1, 1, 512, 1536)      size    786432
INFO:tensorflow:Weight    body/decoder/layer_0/encdec_attention/kv_transform_single/bias                        shape    (1024,)                size    1024
INFO:tensorflow:Weight    body/decoder/layer_0/encdec_attention/kv_transform_single/kernel                      shape    (1, 1, 512, 1024)      size    524288
INFO:tensorflow:Weight    body/decoder/layer_0/encdec_attention/output_transform_single/bias                    shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_0/encdec_attention/output_transform_single/kernel                  shape    (1, 1, 512, 512)       size    262144
INFO:tensorflow:Weight    body/decoder/layer_0/encdec_attention/q_transform_single/bias                         shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_0/encdec_attention/q_transform_single/kernel                       shape    (1, 1, 512, 512)       size    262144
INFO:tensorflow:Weight    body/decoder/layer_0/layer_norm/layer_norm_bias                                       shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_0/layer_norm/layer_norm_scale                                      shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_0/layer_norm_1/layer_norm_bias                                     shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_0/layer_norm_1/layer_norm_scale                                    shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_0/layer_norm_2/layer_norm_bias                                     shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_0/layer_norm_2/layer_norm_scale                                    shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_1/conv_hidden_relu/conv1_single/bias                               shape    (2048,)                size    2048
INFO:tensorflow:Weight    body/decoder/layer_1/conv_hidden_relu/conv1_single/kernel                             shape    (1, 1, 512, 2048)      size    1048576
INFO:tensorflow:Weight    body/decoder/layer_1/conv_hidden_relu/conv2_single/bias                               shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_1/conv_hidden_relu/conv2_single/kernel                             shape    (1, 1, 2048, 512)      size    1048576
INFO:tensorflow:Weight    body/decoder/layer_1/decoder_self_attention/output_transform_single/bias              shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_1/decoder_self_attention/output_transform_single/kernel            shape    (1, 1, 512, 512)       size    262144
INFO:tensorflow:Weight    body/decoder/layer_1/decoder_self_attention/qkv_transform_single/bias                 shape    (1536,)                size    1536
INFO:tensorflow:Weight    body/decoder/layer_1/decoder_self_attention/qkv_transform_single/kernel               shape    (1, 1, 512, 1536)      size    786432
INFO:tensorflow:Weight    body/decoder/layer_1/encdec_attention/kv_transform_single/bias                        shape    (1024,)                size    1024
INFO:tensorflow:Weight    body/decoder/layer_1/encdec_attention/kv_transform_single/kernel                      shape    (1, 1, 512, 1024)      size    524288
INFO:tensorflow:Weight    body/decoder/layer_1/encdec_attention/output_transform_single/bias                    shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_1/encdec_attention/output_transform_single/kernel                  shape    (1, 1, 512, 512)       size    262144
INFO:tensorflow:Weight    body/decoder/layer_1/encdec_attention/q_transform_single/bias                         shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_1/encdec_attention/q_transform_single/kernel                       shape    (1, 1, 512, 512)       size    262144
INFO:tensorflow:Weight    body/decoder/layer_1/layer_norm/layer_norm_bias                                       shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_1/layer_norm/layer_norm_scale                                      shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_1/layer_norm_1/layer_norm_bias                                     shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_1/layer_norm_1/layer_norm_scale                                    shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_1/layer_norm_2/layer_norm_bias                                     shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_1/layer_norm_2/layer_norm_scale                                    shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_2/conv_hidden_relu/conv1_single/bias                               shape    (2048,)                size    2048
INFO:tensorflow:Weight    body/decoder/layer_2/conv_hidden_relu/conv1_single/kernel                             shape    (1, 1, 512, 2048)      size    1048576
INFO:tensorflow:Weight    body/decoder/layer_2/conv_hidden_relu/conv2_single/bias                               shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_2/conv_hidden_relu/conv2_single/kernel                             shape    (1, 1, 2048, 512)      size    1048576
INFO:tensorflow:Weight    body/decoder/layer_2/decoder_self_attention/output_transform_single/bias              shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_2/decoder_self_attention/output_transform_single/kernel            shape    (1, 1, 512, 512)       size    262144
INFO:tensorflow:Weight    body/decoder/layer_2/decoder_self_attention/qkv_transform_single/bias                 shape    (1536,)                size    1536
INFO:tensorflow:Weight    body/decoder/layer_2/decoder_self_attention/qkv_transform_single/kernel               shape    (1, 1, 512, 1536)      size    786432
INFO:tensorflow:Weight    body/decoder/layer_2/encdec_attention/kv_transform_single/bias                        shape    (1024,)                size    1024
INFO:tensorflow:Weight    body/decoder/layer_2/encdec_attention/kv_transform_single/kernel                      shape    (1, 1, 512, 1024)      size    524288
INFO:tensorflow:Weight    body/decoder/layer_2/encdec_attention/output_transform_single/bias                    shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_2/encdec_attention/output_transform_single/kernel                  shape    (1, 1, 512, 512)       size    262144
INFO:tensorflow:Weight    body/decoder/layer_2/encdec_attention/q_transform_single/bias                         shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_2/encdec_attention/q_transform_single/kernel                       shape    (1, 1, 512, 512)       size    262144
INFO:tensorflow:Weight    body/decoder/layer_2/layer_norm/layer_norm_bias                                       shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_2/layer_norm/layer_norm_scale                                      shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_2/layer_norm_1/layer_norm_bias                                     shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_2/layer_norm_1/layer_norm_scale                                    shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_2/layer_norm_2/layer_norm_bias                                     shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_2/layer_norm_2/layer_norm_scale                                    shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_3/conv_hidden_relu/conv1_single/bias                               shape    (2048,)                size    2048
INFO:tensorflow:Weight    body/decoder/layer_3/conv_hidden_relu/conv1_single/kernel                             shape    (1, 1, 512, 2048)      size    1048576
INFO:tensorflow:Weight    body/decoder/layer_3/conv_hidden_relu/conv2_single/bias                               shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_3/conv_hidden_relu/conv2_single/kernel                             shape    (1, 1, 2048, 512)      size    1048576
INFO:tensorflow:Weight    body/decoder/layer_3/decoder_self_attention/output_transform_single/bias              shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_3/decoder_self_attention/output_transform_single/kernel            shape    (1, 1, 512, 512)       size    262144
INFO:tensorflow:Weight    body/decoder/layer_3/decoder_self_attention/qkv_transform_single/bias                 shape    (1536,)                size    1536
INFO:tensorflow:Weight    body/decoder/layer_3/decoder_self_attention/qkv_transform_single/kernel               shape    (1, 1, 512, 1536)      size    786432
INFO:tensorflow:Weight    body/decoder/layer_3/encdec_attention/kv_transform_single/bias                        shape    (1024,)                size    1024
INFO:tensorflow:Weight    body/decoder/layer_3/encdec_attention/kv_transform_single/kernel                      shape    (1, 1, 512, 1024)      size    524288
INFO:tensorflow:Weight    body/decoder/layer_3/encdec_attention/output_transform_single/bias                    shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_3/encdec_attention/output_transform_single/kernel                  shape    (1, 1, 512, 512)       size    262144
INFO:tensorflow:Weight    body/decoder/layer_3/encdec_attention/q_transform_single/bias                         shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_3/encdec_attention/q_transform_single/kernel                       shape    (1, 1, 512, 512)       size    262144
INFO:tensorflow:Weight    body/decoder/layer_3/layer_norm/layer_norm_bias                                       shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_3/layer_norm/layer_norm_scale                                      shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_3/layer_norm_1/layer_norm_bias                                     shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_3/layer_norm_1/layer_norm_scale                                    shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_3/layer_norm_2/layer_norm_bias                                     shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_3/layer_norm_2/layer_norm_scale                                    shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_4/conv_hidden_relu/conv1_single/bias                               shape    (2048,)                size    2048
INFO:tensorflow:Weight    body/decoder/layer_4/conv_hidden_relu/conv1_single/kernel                             shape    (1, 1, 512, 2048)      size    1048576
INFO:tensorflow:Weight    body/decoder/layer_4/conv_hidden_relu/conv2_single/bias                               shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_4/conv_hidden_relu/conv2_single/kernel                             shape    (1, 1, 2048, 512)      size    1048576
INFO:tensorflow:Weight    body/decoder/layer_4/decoder_self_attention/output_transform_single/bias              shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_4/decoder_self_attention/output_transform_single/kernel            shape    (1, 1, 512, 512)       size    262144
INFO:tensorflow:Weight    body/decoder/layer_4/decoder_self_attention/qkv_transform_single/bias                 shape    (1536,)                size    1536
INFO:tensorflow:Weight    body/decoder/layer_4/decoder_self_attention/qkv_transform_single/kernel               shape    (1, 1, 512, 1536)      size    786432
INFO:tensorflow:Weight    body/decoder/layer_4/encdec_attention/kv_transform_single/bias                        shape    (1024,)                size    1024
INFO:tensorflow:Weight    body/decoder/layer_4/encdec_attention/kv_transform_single/kernel                      shape    (1, 1, 512, 1024)      size    524288
INFO:tensorflow:Weight    body/decoder/layer_4/encdec_attention/output_transform_single/bias                    shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_4/encdec_attention/output_transform_single/kernel                  shape    (1, 1, 512, 512)       size    262144
INFO:tensorflow:Weight    body/decoder/layer_4/encdec_attention/q_transform_single/bias                         shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_4/encdec_attention/q_transform_single/kernel                       shape    (1, 1, 512, 512)       size    262144
INFO:tensorflow:Weight    body/decoder/layer_4/layer_norm/layer_norm_bias                                       shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_4/layer_norm/layer_norm_scale                                      shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_4/layer_norm_1/layer_norm_bias                                     shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_4/layer_norm_1/layer_norm_scale                                    shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_4/layer_norm_2/layer_norm_bias                                     shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_4/layer_norm_2/layer_norm_scale                                    shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_5/conv_hidden_relu/conv1_single/bias                               shape    (2048,)                size    2048
INFO:tensorflow:Weight    body/decoder/layer_5/conv_hidden_relu/conv1_single/kernel                             shape    (1, 1, 512, 2048)      size    1048576
INFO:tensorflow:Weight    body/decoder/layer_5/conv_hidden_relu/conv2_single/bias                               shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_5/conv_hidden_relu/conv2_single/kernel                             shape    (1, 1, 2048, 512)      size    1048576
INFO:tensorflow:Weight    body/decoder/layer_5/decoder_self_attention/output_transform_single/bias              shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_5/decoder_self_attention/output_transform_single/kernel            shape    (1, 1, 512, 512)       size    262144
INFO:tensorflow:Weight    body/decoder/layer_5/decoder_self_attention/qkv_transform_single/bias                 shape    (1536,)                size    1536
INFO:tensorflow:Weight    body/decoder/layer_5/decoder_self_attention/qkv_transform_single/kernel               shape    (1, 1, 512, 1536)      size    786432
INFO:tensorflow:Weight    body/decoder/layer_5/encdec_attention/kv_transform_single/bias                        shape    (1024,)                size    1024
INFO:tensorflow:Weight    body/decoder/layer_5/encdec_attention/kv_transform_single/kernel                      shape    (1, 1, 512, 1024)      size    524288
INFO:tensorflow:Weight    body/decoder/layer_5/encdec_attention/output_transform_single/bias                    shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_5/encdec_attention/output_transform_single/kernel                  shape    (1, 1, 512, 512)       size    262144
INFO:tensorflow:Weight    body/decoder/layer_5/encdec_attention/q_transform_single/bias                         shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_5/encdec_attention/q_transform_single/kernel                       shape    (1, 1, 512, 512)       size    262144
INFO:tensorflow:Weight    body/decoder/layer_5/layer_norm/layer_norm_bias                                       shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_5/layer_norm/layer_norm_scale                                      shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_5/layer_norm_1/layer_norm_bias                                     shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_5/layer_norm_1/layer_norm_scale                                    shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_5/layer_norm_2/layer_norm_bias                                     shape    (512,)                 size    512
INFO:tensorflow:Weight    body/decoder/layer_5/layer_norm_2/layer_norm_scale                                    shape    (512,)                 size    512
INFO:tensorflow:Weight    body/encoder/layer_0/conv_hidden_relu/conv1_single/bias                               shape    (2048,)                size    2048
INFO:tensorflow:Weight    body/encoder/layer_0/conv_hidden_relu/conv1_single/kernel                             shape    (1, 1, 512, 2048)      size    1048576
INFO:tensorflow:Weight    body/encoder/layer_0/conv_hidden_relu/conv2_single/bias                               shape    (512,)                 size    512
INFO:tensorflow:Weight    body/encoder/layer_0/conv_hidden_relu/conv2_single/kernel                             shape    (1, 1, 2048, 512)      size    1048576
INFO:tensorflow:Weight    body/encoder/layer_0/encoder_self_attention/output_transform_single/bias              shape    (512,)                 size    512
INFO:tensorflow:Weight    body/encoder/layer_0/encoder_self_attention/output_transform_single/kernel            shape    (1, 1, 512, 512)       size    262144
INFO:tensorflow:Weight    body/encoder/layer_0/encoder_self_attention/qkv_transform_single/bias                 shape    (1536,)                size    1536
INFO:tensorflow:Weight    body/encoder/layer_0/encoder_self_attention/qkv_transform_single/kernel               shape    (1, 1, 512, 1536)      size    786432
INFO:tensorflow:Weight    body/encoder/layer_0/layer_norm/layer_norm_bias                                       shape    (512,)                 size    512
INFO:tensorflow:Weight    body/encoder/layer_0/layer_norm/layer_norm_scale                                      shape    (512,)                 size    512
INFO:tensorflow:Weight    body/encoder/layer_0/layer_norm_1/layer_norm_bias                                     shape    (512,)                 size    512
INFO:tensorflow:Weight    body/encoder/layer_0/layer_norm_1/layer_norm_scale                                    shape    (512,)                 size    512
INFO:tensorflow:Weight    body/encoder/layer_1/conv_hidden_relu/conv1_single/bias                               shape    (2048,)                size    2048
INFO:tensorflow:Weight    body/encoder/layer_1/conv_hidden_relu/conv1_single/kernel                             shape    (1, 1, 512, 2048)      size    1048576
INFO:tensorflow:Weight    body/encoder/layer_1/conv_hidden_relu/conv2_single/bias                               shape    (512,)                 size    512
INFO:tensorflow:Weight    body/encoder/layer_1/conv_hidden_relu/conv2_single/kernel                             shape    (1, 1, 2048, 512)      size    1048576
INFO:tensorflow:Weight    body/encoder/layer_1/encoder_self_attention/output_transform_single/bias              shape    (512,)                 size    512
INFO:tensorflow:Weight    body/encoder/layer_1/encoder_self_attention/output_transform_single/kernel            shape    (1, 1, 512, 512)       size    262144
INFO:tensorflow:Weight    body/encoder/layer_1/encoder_self_attention/qkv_transform_single/bias                 shape    (1536,)                size    1536
INFO:tensorflow:Weight    body/encoder/layer_1/encoder_self_attention/qkv_transform_single/kernel               shape    (1, 1, 512, 1536)      size    786432
INFO:tensorflow:Weight    body/encoder/layer_1/layer_norm/layer_norm_bias                                       shape    (512,)                 size    512
INFO:tensorflow:Weight    body/encoder/layer_1/layer_norm/layer_norm_scale                                      shape    (512,)                 size    512
INFO:tensorflow:Weight    body/encoder/layer_1/layer_norm_1/layer_norm_bias                                     shape    (512,)                 size    512
INFO:tensorflow:Weight    body/encoder/layer_1/layer_norm_1/layer_norm_scale                                    shape    (512,)                 size    512
INFO:tensorflow:Weight    body/encoder/layer_2/conv_hidden_relu/conv1_single/bias                               shape    (2048,)                size    2048
INFO:tensorflow:Weight    body/encoder/layer_2/conv_hidden_relu/conv1_single/kernel                             shape    (1, 1, 512, 2048)      size    1048576
INFO:tensorflow:Weight    body/encoder/layer_2/conv_hidden_relu/conv2_single/bias                               shape    (512,)                 size    512
INFO:tensorflow:Weight    body/encoder/layer_2/conv_hidden_relu/conv2_single/kernel                             shape    (1, 1, 2048, 512)      size    1048576
INFO:tensorflow:Weight    body/encoder/layer_2/encoder_self_attention/output_transform_single/bias              shape    (512,)                 size    512
INFO:tensorflow:Weight    body/encoder/layer_2/encoder_self_attention/output_transform_single/kernel            shape    (1, 1, 512, 512)       size    262144
INFO:tensorflow:Weight    body/encoder/layer_2/encoder_self_attention/qkv_transform_single/bias                 shape    (1536,)                size    1536
INFO:tensorflow:Weight    body/encoder/layer_2/encoder_self_attention/qkv_transform_single/kernel               shape    (1, 1, 512, 1536)      size    786432
INFO:tensorflow:Weight    body/encoder/layer_2/layer_norm/layer_norm_bias                                       shape    (512,)                 size    512
INFO:tensorflow:Weight    body/encoder/layer_2/layer_norm/layer_norm_scale                                      shape    (512,)                 size    512
INFO:tensorflow:Weight    body/encoder/layer_2/layer_norm_1/layer_norm_bias                                     shape    (512,)                 size    512
INFO:tensorflow:Weight    body/encoder/layer_2/layer_norm_1/layer_norm_scale                                    shape    (512,)                 size    512
INFO:tensorflow:Weight    body/encoder/layer_3/conv_hidden_relu/conv1_single/bias                               shape    (2048,)                size    2048
INFO:tensorflow:Weight    body/encoder/layer_3/conv_hidden_relu/conv1_single/kernel                             shape    (1, 1, 512, 2048)      size    1048576
INFO:tensorflow:Weight    body/encoder/layer_3/conv_hidden_relu/conv2_single/bias                               shape    (512,)                 size    512
INFO:tensorflow:Weight    body/encoder/layer_3/conv_hidden_relu/conv2_single/kernel                             shape    (1, 1, 2048, 512)      size    1048576
INFO:tensorflow:Weight    body/encoder/layer_3/encoder_self_attention/output_transform_single/bias              shape    (512,)                 size    512
INFO:tensorflow:Weight    body/encoder/layer_3/encoder_self_attention/output_transform_single/kernel            shape    (1, 1, 512, 512)       size    262144
INFO:tensorflow:Weight    body/encoder/layer_3/encoder_self_attention/qkv_transform_single/bias                 shape    (1536,)                size    1536
INFO:tensorflow:Weight    body/encoder/layer_3/encoder_self_attention/qkv_transform_single/kernel               shape    (1, 1, 512, 1536)      size    786432
INFO:tensorflow:Weight    body/encoder/layer_3/layer_norm/layer_norm_bias                                       shape    (512,)                 size    512
INFO:tensorflow:Weight    body/encoder/layer_3/layer_norm/layer_norm_scale                                      shape    (512,)                 size    512
INFO:tensorflow:Weight    body/encoder/layer_3/layer_norm_1/layer_norm_bias                                     shape    (512,)                 size    512
INFO:tensorflow:Weight    body/encoder/layer_3/layer_norm_1/layer_norm_scale                                    shape    (512,)                 size    512
INFO:tensorflow:Weight    body/encoder/layer_4/conv_hidden_relu/conv1_single/bias                               shape    (2048,)                size    2048
INFO:tensorflow:Weight    body/encoder/layer_4/conv_hidden_relu/conv1_single/kernel                             shape    (1, 1, 512, 2048)      size    1048576
INFO:tensorflow:Weight    body/encoder/layer_4/conv_hidden_relu/conv2_single/bias                               shape    (512,)                 size    512
INFO:tensorflow:Weight    body/encoder/layer_4/conv_hidden_relu/conv2_single/kernel                             shape    (1, 1, 2048, 512)      size    1048576
INFO:tensorflow:Weight    body/encoder/layer_4/encoder_self_attention/output_transform_single/bias              shape    (512,)                 size    512
INFO:tensorflow:Weight    body/encoder/layer_4/encoder_self_attention/output_transform_single/kernel            shape    (1, 1, 512, 512)       size    262144
INFO:tensorflow:Weight    body/encoder/layer_4/encoder_self_attention/qkv_transform_single/bias                 shape    (1536,)                size    1536
INFO:tensorflow:Weight    body/encoder/layer_4/encoder_self_attention/qkv_transform_single/kernel               shape    (1, 1, 512, 1536)      size    786432
INFO:tensorflow:Weight    body/encoder/layer_4/layer_norm/layer_norm_bias                                       shape    (512,)                 size    512
INFO:tensorflow:Weight    body/encoder/layer_4/layer_norm/layer_norm_scale                                      shape    (512,)                 size    512
INFO:tensorflow:Weight    body/encoder/layer_4/layer_norm_1/layer_norm_bias                                     shape    (512,)                 size    512
INFO:tensorflow:Weight    body/encoder/layer_4/layer_norm_1/layer_norm_scale                                    shape    (512,)                 size    512
INFO:tensorflow:Weight    body/encoder/layer_5/conv_hidden_relu/conv1_single/bias                               shape    (2048,)                size    2048
INFO:tensorflow:Weight    body/encoder/layer_5/conv_hidden_relu/conv1_single/kernel                             shape    (1, 1, 512, 2048)      size    1048576
INFO:tensorflow:Weight    body/encoder/layer_5/conv_hidden_relu/conv2_single/bias                               shape    (512,)                 size    512
INFO:tensorflow:Weight    body/encoder/layer_5/conv_hidden_relu/conv2_single/kernel                             shape    (1, 1, 2048, 512)      size    1048576
INFO:tensorflow:Weight    body/encoder/layer_5/encoder_self_attention/output_transform_single/bias              shape    (512,)                 size    512
INFO:tensorflow:Weight    body/encoder/layer_5/encoder_self_attention/output_transform_single/kernel            shape    (1, 1, 512, 512)       size    262144
INFO:tensorflow:Weight    body/encoder/layer_5/encoder_self_attention/qkv_transform_single/bias                 shape    (1536,)                size    1536
INFO:tensorflow:Weight    body/encoder/layer_5/encoder_self_attention/qkv_transform_single/kernel               shape    (1, 1, 512, 1536)      size    786432
INFO:tensorflow:Weight    body/encoder/layer_5/layer_norm/layer_norm_bias                                       shape    (512,)                 size    512
INFO:tensorflow:Weight    body/encoder/layer_5/layer_norm/layer_norm_scale                                      shape    (512,)                 size    512
INFO:tensorflow:Weight    body/encoder/layer_5/layer_norm_1/layer_norm_bias                                     shape    (512,)                 size    512
INFO:tensorflow:Weight    body/encoder/layer_5/layer_norm_1/layer_norm_scale                                    shape    (512,)                 size    512
INFO:tensorflow:Weight    body/target_space_embedding/kernel                                                    shape    (32, 512)              size    16384
INFO:tensorflow:Weight    symbol_modality_31022_512/shared/weights_0                                            shape    (1939, 512)            size    992768
INFO:tensorflow:Weight    symbol_modality_31022_512/shared/weights_10                                           shape    (1939, 512)            size    992768
INFO:tensorflow:Weight    symbol_modality_31022_512/shared/weights_11                                           shape    (1939, 512)            size    992768
INFO:tensorflow:Weight    symbol_modality_31022_512/shared/weights_12                                           shape    (1939, 512)            size    992768
INFO:tensorflow:Weight    symbol_modality_31022_512/shared/weights_13                                           shape    (1939, 512)            size    992768
INFO:tensorflow:Weight    symbol_modality_31022_512/shared/weights_14                                           shape    (1938, 512)            size    992256
INFO:tensorflow:Weight    symbol_modality_31022_512/shared/weights_15                                           shape    (1938, 512)            size    992256
INFO:tensorflow:Weight    symbol_modality_31022_512/shared/weights_1                                            shape    (1939, 512)            size    992768
INFO:tensorflow:Weight    symbol_modality_31022_512/shared/weights_2                                            shape    (1939, 512)            size    992768
INFO:tensorflow:Weight    symbol_modality_31022_512/shared/weights_3                                            shape    (1939, 512)            size    992768
INFO:tensorflow:Weight    symbol_modality_31022_512/shared/weights_4                                            shape    (1939, 512)            size    992768
INFO:tensorflow:Weight    symbol_modality_31022_512/shared/weights_5                                            shape    (1939, 512)            size    992768
INFO:tensorflow:Weight    symbol_modality_31022_512/shared/weights_6                                            shape    (1939, 512)            size    992768
INFO:tensorflow:Weight    symbol_modality_31022_512/shared/weights_7                                            shape    (1939, 512)            size    992768
INFO:tensorflow:Weight    symbol_modality_31022_512/shared/weights_8                                            shape    (1939, 512)            size    992768
INFO:tensorflow:Weight    symbol_modality_31022_512/shared/weights_9                                            shape    (1939, 512)            size    992768
INFO:tensorflow:Total trainable variables size: 60038144
INFO:tensorflow:Computing gradients for global model_fn.
INFO:tensorflow:Global model_fn finished.
INFO:tensorflow:Create CheckpointSaverHook.
2020-06-01 07:59:36.595618: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
INFO:tensorflow:Saving checkpoints for 1 into /home/yongqiang/t2t_train/wmt_ende_tokens_32k/transformer-transformer_base_single_gpu/model.ckpt.
INFO:tensorflow:loss = 9.617761, step = 1
INFO:tensorflow:global_step/sec: 0.184352
INFO:tensorflow:loss = 8.638795, step = 101 (542.447 sec)
INFO:tensorflow:Saving checkpoints for 111 into /home/yongqiang/t2t_train/wmt_ende_tokens_32k/transformer-transformer_base_single_gpu/model.ckpt.
INFO:tensorflow:global_step/sec: 0.182636
INFO:tensorflow:loss = 8.1433325, step = 201 (547.531 sec)
INFO:tensorflow:Saving checkpoints for 221 into /home/yongqiang/t2t_train/wmt_ende_tokens_32k/transformer-transformer_base_single_gpu/model.ckpt.
INFO:tensorflow:global_step/sec: 0.184702
INFO:tensorflow:loss = 7.774977, step = 301 (541.413 sec)
INFO:tensorflow:Saving checkpoints for 332 into /home/yongqiang/t2t_train/wmt_ende_tokens_32k/transformer-transformer_base_single_gpu/model.ckpt.
INFO:tensorflow:global_step/sec: 0.183971
INFO:tensorflow:loss = 7.6570582, step = 401 (543.564 sec)
INFO:tensorflow:Saving checkpoints for 444 into /home/yongqiang/t2t_train/wmt_ende_tokens_32k/transformer-transformer_base_single_gpu/model.ckpt.
INFO:tensorflow:global_step/sec: 0.183122
INFO:tensorflow:loss = 7.5161967, step = 501 (546.084 sec)
INFO:tensorflow:Saving checkpoints for 555 into /home/yongqiang/t2t_train/wmt_ende_tokens_32k/transformer-transformer_base_single_gpu/model.ckpt.
INFO:tensorflow:global_step/sec: 0.185353
INFO:tensorflow:loss = 7.1285443, step = 601 (539.511 sec)
INFO:tensorflow:Saving checkpoints for 665 into /home/yongqiang/t2t_train/wmt_ende_tokens_32k/transformer-transformer_base_single_gpu/model.ckpt.
INFO:tensorflow:global_step/sec: 0.182844
INFO:tensorflow:loss = 6.6412644, step = 701 (546.914 sec)
INFO:tensorflow:Saving checkpoints for 776 into /home/yongqiang/t2t_train/wmt_ende_tokens_32k/transformer-transformer_base_single_gpu/model.ckpt.
INFO:tensorflow:global_step/sec: 0.183772
INFO:tensorflow:loss = 6.9026318, step = 801 (544.153 sec)
INFO:tensorflow:Saving checkpoints for 887 into /home/yongqiang/t2t_train/wmt_ende_tokens_32k/transformer-transformer_base_single_gpu/model.ckpt.
INFO:tensorflow:global_step/sec: 0.183551
INFO:tensorflow:loss = 6.866945, step = 901 (544.809 sec)
INFO:tensorflow:Saving checkpoints for 997 into /home/yongqiang/t2t_train/wmt_ende_tokens_32k/transformer-transformer_base_single_gpu/model.ckpt.
INFO:tensorflow:global_step/sec: 0.182267
INFO:tensorflow:loss = 6.583713, step = 1001 (548.645 sec)
INFO:tensorflow:global_step/sec: 0.187032
......
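
The size column in the weight listing above is simply the product of the shape entries, and the 16 symbol_modality_31022_512/shared/weights_* shards together form one 31022 x 512 embedding table. A minimal Python sketch that checks this against values copied from the log. The helper names num_elements and shard_rows are illustrative, and the sharding rule (remainder rows go to the first shards) is inferred from the logged shapes, not taken from the T2T source:

```python
from functools import reduce
from operator import mul


def num_elements(shape):
    """Number of scalars in a tensor of the given shape."""
    return reduce(mul, shape, 1)


def shard_rows(vocab_size, num_shards):
    """Split vocab_size rows into num_shards pieces, larger pieces first.

    Assumption: this rule is inferred from the shapes printed in the log.
    """
    base, extra = divmod(vocab_size, num_shards)
    return [base + 1 if i < extra else base for i in range(num_shards)]


# (shape, size) pairs copied from the log above.
logged = [
    ((1, 1, 512, 2048), 1048576),  # conv_hidden_relu/conv1_single/kernel
    ((1, 1, 512, 1536), 786432),   # qkv_transform_single/kernel
    ((1, 1, 512, 512), 262144),    # output_transform_single/kernel
    ((1939, 512), 992768),         # symbol_modality shards 0..13
    ((1938, 512), 992256),         # symbol_modality shards 14 and 15
]
for shape, size in logged:
    assert num_elements(shape) == size

rows = shard_rows(31022, 16)
print(rows.count(1939), rows.count(1938))  # 14 2
embedding_params = sum(r * 512 for r in rows)
print(embedding_params)                    # 15883264
```

By this count the shared embedding accounts for roughly a quarter of the 60038144 trainable parameters reported in the log.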

3.5 Decoding

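For a translation task, decoding is the translation step itself. You can create two files by hand for testing (or, of course, use the data produced by t2t-datagen).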

https://github.com/tensorflow/tensor2tensor/tree/v1.0.12

# Decode

DECODE_FILE=$DATA_DIR/decode_this.txt
echo "Hello world" >> $DECODE_FILE
echo "Goodbye world" >> $DECODE_FILE

https://github.com/tensorflow/tensor2tensor

# Decode

DECODE_FILE=$DATA_DIR/decode_this.txt
echo "Hello world" >> $DECODE_FILE
echo "Goodbye world" >> $DECODE_FILE
echo -e 'Hallo Welt\nAuf Wiedersehen Welt' > ref-translation.de

Decode (translate) the file you just created.

https://github.com/tensorflow/tensor2tensor/tree/v1.0.12

BEAM_SIZE=4
ALPHA=0.6

t2t-trainer --data_dir=$DATA_DIR --problems=$PROBLEM --model=$MODEL --hparams_set=$HPARAMS --output_dir=$TRAIN_DIR --train_steps=0 --eval_steps=0 --decode_beam_size=$BEAM_SIZE --decode_alpha=$ALPHA --decode_from_file=$DECODE_FILE

cat $DECODE_FILE.$MODEL.$HPARAMS.beam$BEAM_SIZE.alpha$ALPHA.decodes
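
The cat command above relies on the name of the decodes file being assembled from the input file, the model, the hyperparameter set, and the beam settings. A small sketch of that naming pattern in Python. The function decodes_path and the /tmp/t2t_data directory are placeholders, and the model and hyperparameter set names are the ones suggested by the training directory in the log:

```python
def decodes_path(decode_file, model, hparams_set, beam_size, alpha):
    """Mirror the shell pattern
    $DECODE_FILE.$MODEL.$HPARAMS.beam$BEAM_SIZE.alpha$ALPHA.decodes
    """
    return "{}.{}.{}.beam{}.alpha{}.decodes".format(
        decode_file, model, hparams_set, beam_size, alpha)


# Placeholder data directory; substitute your own $DATA_DIR.
path = decodes_path(
    "/tmp/t2t_data/decode_this.txt",
    "transformer",
    "transformer_base_single_gpu",
    4,
    0.6,
)
print(path)
```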

https://github.com/tensorflow/tensor2tensor

BEAM_SIZE=4
ALPHA=0.6

t2t-decoder \
  --data_dir=$DATA_DIR \
  --problem=$PROBLEM \
  --model=$MODEL \
  --hparams_set=$HPARAMS \
  --output_dir=$TRAIN_DIR \
  --decode_hparams="beam_size=$BEAM_SIZE,alpha=$ALPHA" \
  --decode_from_file=$DECODE_FILE \
  --decode_to_file=translation.en

# See the translations
cat translation.en

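Note which files --decode_from_file and --decode_to_file refer to: the former is the input to be translated, the latter receives the translations.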

3.6 Scoring

https://github.com/tensorflow/tensor2tensor

# Evaluate the BLEU score
# Report this BLEU score in papers, not the internal approx_bleu metric.
t2t-bleu --translation=translation.en --reference=ref-translation.de
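
To make the idea behind the score concrete, here is a simplified corpus-level BLEU in Python: clipped n-gram precision combined with a brevity penalty. This is a sketch under stated assumptions (whitespace tokenization, one reference per sentence, no smoothing), not the t2t-bleu implementation, so its numbers are not comparable with what t2t-bleu reports. The function names are illustrative:

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(references, hypotheses, max_n=4):
    """Unsmoothed corpus BLEU, one reference per hypothesis."""
    matches = [0] * max_n
    totals = [0] * max_n
    ref_len = hyp_len = 0
    for ref, hyp in zip(references, hypotheses):
        ref_tok, hyp_tok = ref.split(), hyp.split()
        ref_len += len(ref_tok)
        hyp_len += len(hyp_tok)
        for n in range(1, max_n + 1):
            ref_counts = ngrams(ref_tok, n)
            hyp_counts = ngrams(hyp_tok, n)
            # Clipping: an n-gram is credited at most as often as it
            # occurs in the reference.
            matches[n - 1] += sum((hyp_counts & ref_counts).values())
            totals[n - 1] += sum(hyp_counts.values())
    if hyp_len == 0 or min(matches) == 0:
        return 0.0
    log_precision = sum(
        math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # Penalize hypotheses that are shorter than the references.
    brevity = 1.0 if hyp_len > ref_len else math.exp(1.0 - ref_len / hyp_len)
    return brevity * math.exp(log_precision)


refs = ["Hallo Welt", "Auf Wiedersehen Welt"]
# max_n=2 because these toy sentences are too short to contain 4-grams.
print(corpus_bleu(refs, refs, max_n=2))                              # 1.0
print(corpus_bleu(refs, ["Hallo Welt", "Auf Wiedersehen"], max_n=2))  # ~0.7788
```

In the second call every n-gram matches, so the score below 1.0 comes entirely from the brevity penalty (4 hypothesis tokens against 5 reference tokens).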

4. Installation

# Assumes tensorflow or tensorflow-gpu installed
pip install tensor2tensor

# Installs with tensorflow-gpu requirement
pip install tensor2tensor[tensorflow_gpu]

# Installs with tensorflow (cpu) requirement
pip install tensor2tensor[tensorflow]

Binaries:

# Data generator
t2t-datagen

# Trainer
t2t-trainer --registry_help

Library usage:

python -c "from tensor2tensor.models.transformer import Transformer"

5. T2T overview

5.1 Datasets

Datasets are all standardized on TFRecord files with tensorflow.Example protocol buffers. All datasets are registered and generated with the data generator and many common sequence datasets are already available for generation and use.

https://github.com/tensorflow/tensor2tensor/blob/master/tensor2tensor/bin/t2t-datagen
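
For readers curious what a TFRecord file looks like on disk, the sketch below writes and reads the record framing in plain Python: a little-endian uint64 length, a masked CRC-32C of the length, the payload, and a masked CRC-32C of the payload. The payloads here are stand-in byte strings. In a real T2T dataset each payload would be a serialized tensorflow.Example, which this sketch does not build. All function names are illustrative:

```python
import struct


def _make_crc32c_table():
    """Lookup table for CRC-32C (Castagnoli), reflected polynomial."""
    table = []
    for i in range(256):
        crc = i
        for _ in range(8):
            crc = (crc >> 1) ^ 0x82F63B78 if crc & 1 else crc >> 1
        table.append(crc)
    return table


_TABLE = _make_crc32c_table()


def crc32c(data):
    crc = 0xFFFFFFFF
    for byte in data:
        crc = _TABLE[(crc ^ byte) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF


def masked_crc32c(data):
    """CRC-32C rotated and offset, as used in the TFRecord framing."""
    crc = crc32c(data)
    return (((crc >> 15) | (crc << 17)) + 0xA282EAD8) & 0xFFFFFFFF


def encode_record(payload):
    header = struct.pack("<Q", len(payload))
    return (header + struct.pack("<I", masked_crc32c(header))
            + payload + struct.pack("<I", masked_crc32c(payload)))


def decode_records(buf):
    """Yield payloads from concatenated records, verifying both checksums."""
    offset = 0
    while offset < len(buf):
        header = buf[offset:offset + 8]
        (length,) = struct.unpack("<Q", header)
        (len_crc,) = struct.unpack("<I", buf[offset + 8:offset + 12])
        if len_crc != masked_crc32c(header):
            raise ValueError("corrupt length field")
        start = offset + 12
        payload = buf[start:start + length]
        (data_crc,) = struct.unpack("<I", buf[start + length:start + length + 4])
        if data_crc != masked_crc32c(payload):
            raise ValueError("corrupt payload")
        yield payload
        offset = start + length + 4


# Stand-in payloads; real files hold serialized tensorflow.Example messages.
blob = encode_record(b"Hello world") + encode_record(b"Goodbye world")
print(list(decode_records(blob)))
```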

5.2 Problems and Modalities

Problems define training-time hyperparameters for the dataset and task, mainly by setting input and output modalities (e.g. symbol, image, audio, label) and vocabularies, if applicable. All problems are defined in problem_hparams.py. Modalities, defined in modality.py, abstract away the input and output data types so that models may deal with modality-independent tensors.

https://github.com/tensorflow/tensor2tensor/blob/v1.0.12/tensor2tensor/data_generators/problem_hparams.py
https://github.com/tensorflow/tensor2tensor/tree/master/tensor2tensor/utils/modality.py
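
The division of labour described above can be illustrated with a toy example: a modality turns symbol ids into dense vectors on the way in and dense vectors into per-symbol logits on the way out, while the model body in between never sees raw ids. The class ToySymbolModality and the function identity_body are hypothetical stand-ins written in plain Python, not code from modality.py. The method names bottom and top are chosen to echo T2T's naming as far as I understand it:

```python
class ToySymbolModality:
    """Toy stand-in for a symbol modality: ids <-> dense vectors."""

    def __init__(self, vocab_size, hidden_size):
        self.vocab_size = vocab_size
        # Deterministic toy embedding table, one row per symbol id.
        self.embedding = [
            [((i * hidden_size + j) % 7) / 7.0 for j in range(hidden_size)]
            for i in range(vocab_size)
        ]

    def bottom(self, ids):
        """Input side: symbol ids -> dense vectors."""
        return [self.embedding[i] for i in ids]

    def top(self, vectors):
        """Output side: dense vectors -> one logit per symbol."""
        return [
            [sum(v * e for v, e in zip(vec, row)) for row in self.embedding]
            for vec in vectors
        ]


def identity_body(dense):
    """Placeholder model body: works on dense vectors only."""
    return dense


modality = ToySymbolModality(vocab_size=5, hidden_size=4)
dense_in = modality.bottom([3, 1, 4])
logits = modality.top(identity_body(dense_in))
print(len(dense_in), len(dense_in[0]))  # 3 4
print(len(logits), len(logits[0]))      # 3 5
```

Swapping in a different modality (say, one that maps image patches to vectors) would leave identity_body untouched, which is the point of the abstraction.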

5.3 Models

T2TModels define the core tensor-to-tensor transformation, independent of input/output modality or task. Models take dense tensors in and produce dense tensors that may then be transformed in a final step by a modality depending on the task (e.g. fed through a final linear transform to produce logits for a softmax over classes). All models are imported in models.py, inherit from T2TModel (defined in t2t_model.py), and are registered with @registry.register_model.

https://github.com/tensorflow/tensor2tensor/tree/master/tensor2tensor/models/models.py
https://github.com/tensorflow/tensor2tensor/blob/master/tensor2tensor/utils/t2t_model.py
https://github.com/tensorflow/tensor2tensor/blob/master/tensor2tensor/utils/registry.py
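
The registry.py link above points at the mechanism that lets a flag value such as --model=transformer resolve to a class. A toy sketch of a decorator-based name registry in plain Python follows. The names register_model, lookup_model, ToyModel, and Double are simplified, hypothetical stand-ins for illustration and do not reproduce T2T's actual code:

```python
_MODELS = {}


def register_model(name):
    """Class decorator that stores the class under the given name."""
    def decorator(cls):
        if name in _MODELS:
            raise ValueError("Model %s already registered" % name)
        _MODELS[name] = cls
        return cls
    return decorator


def lookup_model(name):
    """Resolve a registered name (e.g. from a command-line flag) to a class."""
    if name not in _MODELS:
        raise KeyError("Model %s not registered" % name)
    return _MODELS[name]


class ToyModel:
    """Toy base class: subclasses implement a dense-to-dense body."""

    def body(self, dense):
        raise NotImplementedError

    def __call__(self, dense):
        return self.body(dense)


@register_model("double")
class Double(ToyModel):
    def body(self, dense):
        return [2 * x for x in dense]


model_cls = lookup_model("double")
print(model_cls()([1.0, 2.5]))  # [2.0, 5.0]
```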

5.4 Hyperparameter Sets

Hyperparameter sets are defined and registered in code with @registry.register_hparams and are encoded in tf.contrib.training.HParams objects. The HParams are available to both the problem specification and the model. A basic set of hyperparameters is defined in common_hparams.py, and hyperparameter set functions can compose other hyperparameter set functions.

https://github.com/tensorflow/tensor2tensor/tree/master/tensor2tensor/utils/registry.py
https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/training/python/training/hparam.py
https://github.com/tensorflow/tensor2tensor/tree/master/tensor2tensor/models/common_hparams.py

5.5 Trainer

The trainer binary is the main entrypoint for training, evaluation, and inference. Users can easily switch between problems, models, and hyperparameter sets by using the --model, --problems, and --hparams_set flags. Specific hyperparameters can be overridden with the --hparams flag. --schedule and related flags control local and distributed training/evaluation (see the distributed training documentation).

https://github.com/tensorflow/tensor2tensor/tree/master/tensor2tensor/docs/distributed_training.md
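
The effect of an override string such as the one passed to --hparams can be illustrated with a toy parser that applies comma-separated name=value pairs on top of a set of defaults. This is a sketch only: apply_overrides and the default values are hypothetical, and the real parsing is done by the HParams object, which handles more cases (lists, for example) than this does:

```python
def apply_overrides(hparams, overrides):
    """Apply 'name=value,name=value' overrides to a dict of defaults."""
    updated = dict(hparams)
    if not overrides:
        return updated
    for item in overrides.split(","):
        name, _, raw = item.partition("=")
        name, raw = name.strip(), raw.strip()
        if name not in updated:
            raise KeyError("Unknown hyperparameter: %s" % name)
        default = updated[name]
        # Cast the string to the type of the existing default.
        if isinstance(default, bool):
            updated[name] = raw.lower() in ("1", "true")
        else:
            updated[name] = type(default)(raw)
    return updated


# Illustrative defaults, not T2T's actual values.
defaults = {"batch_size": 2048, "learning_rate": 0.1, "optimizer": "Adam"}
print(apply_overrides(defaults, "batch_size=1024,learning_rate=0.05"))
```

Rejecting unknown names, as this sketch does, catches typos in the override string early instead of silently ignoring them.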
