如何利用TensorFlow Hub 讓BERT開發更簡單?

{"type":"doc","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"在自然語言處理領域,"},{"type":"link","attrs":{"href":"https:\/\/ai.googleblog.com\/2018\/11\/open-sourcing-bert-state-of-art-pre.html","title":null,"type":null},"content":[{"type":"text","text":"BERT"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 和其他 "},{"type":"link","attrs":{"href":"https:\/\/ai.googleblog.com\/2017\/08\/transformer-novel-neural-network.html","title":null,"type":null},"content":[{"type":"text","text":"Transformer"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 編碼器架構都非常成功,無論是推進學術基準的技術水平,還是在 "},{"type":"link","attrs":{"href":"https:\/\/blog.google\/products\/search\/search-language-understanding-bert\/","title":null,"type":null},"content":[{"type":"text","text":"Google Search"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 這樣的大規模應用中,均是如此。BERT 自 TensorFlow 創建以來一直可用,但它最初依賴於非 TensorFlow 的 Python 代碼,以將原始文本轉換爲模型輸入。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"如今,在 TensorFlow 中構建 BERT 會更加簡單。開發者可在 "},{"type":"link","attrs":{"href":"https:\/\/tfhub.dev\/google\/collections\/bert\/1","title":null,"type":null},"content":[{"type":"text","text":"TensorFlow Hub"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 上使用"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"預訓練編碼器"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"和匹配的"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"文本預處理"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"模型。在 TensorFlow 中運行 BERT 對文本輸入的操作只需要幾行代碼:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"# Load BERT and the preprocessing model from TF Hub.\npreprocess = hub.load('https:\/\/tfhub.dev\/tensorflow\/bert_en_uncased_preprocess\/1')\nencoder = hub.load('https:\/\/tfhub.dev\/tensorflow\/bert_en_uncased_L-12_H-768_A-12\/3')\n\n\n# Use BERT on a batch of raw text inputs.\ninput = preprocess(['Batch of inputs', 'TF Hub makes BERT easy!', 'More text.'])\npooled_output = encoder(input)[\"pooled_output\"]\nprint(pooled_output)\n\n\ntf.Tensor(\n[[-0.8384154 -0.26902363 -0.3839138 ... -0.3949695 -0.58442086 0.8058556 ]\n [-0.8223734 -0.2883956 -0.09359277 ... -0.13833837 -0.6251748 0.88950026]\n [-0.9045408 -0.37877116 -0.7714909 ... -0.5112085 -0.70791864 0.92950743]],\nshape=(3, 768), dtype=float32"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"這些編碼器和預處理模型已經用 "},{"type":"link","attrs":{"href":"https:\/\/github.com\/tensorflow\/models\/tree\/master\/official\/nlp","title":null,"type":null},"content":[{"type":"text","text":"TensorFlow Model Garden"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 的 NLP 庫構建,並以 "},{"type":"link","attrs":{"href":"https:\/\/www.tensorflow.org\/hub\/tf2_saved_model","title":null,"type":null},"content":[{"type":"text","text":"SavedModel 格式"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"導出到 TensorFlow Hub。實際上,預處理使用 "},{"type":"link","attrs":{"href":"https:\/\/blog.tensorflow.org\/2019\/06\/introducing-tftext.html","title":null,"type":null},"content":[{"type":"text","text":"TF.text"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 庫中的 TensorFlow ops 輸入文本進行標記化:允許開發者建立自己的 TensorFlow 模型,將原始文本輸入到預測輸出,而無需使用 Python 的循環。這樣可以提高計算速度,去除樣板代碼,減少出錯的可能性,並且可以將整個文本序列化爲輸出模型,使得 BERT 在生產環境中更容易使用。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"爲了詳細說明這些模型的具體作用,我們發佈了兩個新的教程:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"link","attrs":{"href":"https:\/\/www.tensorflow.org\/tutorials\/text\/classify_text_with_bert","title":null,"type":null},"content":[{"type":"text","text":"初級"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"教程:解決一項情感分析任務,不需要任何特殊定製,就能得到很好的模型質量。這是最簡單的使用 BERT 和預處理模型的方法。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"link","attrs":{"href":"https:\/\/www.tensorflow.org\/tutorials\/text\/solve_glue_tasks_using_bert_on_tpu","title":null,"type":null},"content":[{"type":"text","text":"高級"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"教程:解決了在 TPU 上運行 "},{"type":"link","attrs":{"href":"http:\/\/gluebenchmark.com\/","title":null,"type":null},"content":[{"type":"text","text":"GLUE 基準"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"中的自然語言處理分類任務。它還說明了如何在需要多段輸入的情況下使用預處理模型。"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/e3\/e35307fe47b1a513a0fc27aef2cb932d.jpeg","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"選擇 BERT 模型"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"BERT 模型是在大型文本語料庫(例如,Wikipedia 文章的歸檔)上使用自我監督任務進行預訓練的,比如根據上下文預測句子中的單詞。這種類型的訓練使模型能夠在沒有標記數據的情況下學習文本語義的強大表示。但是訓練它需要大量的計算:在 16 個 TPU 上花費 4 天的時間(如 2018 年 "},{"type":"link","attrs":{"href":"https:\/\/arxiv.org\/abs\/1810.04805","title":null,"type":null},"content":[{"type":"text","text":"BERT 論文"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"所報道的)。所幸的是,在這種昂貴的預訓練完成一次後,我們就可以爲許多不同的任務高效地重用這種豐富的表示了。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"八個 "},{"type":"link","attrs":{"href":"https:\/\/tfhub.dev\/google\/collections\/bert\/1","title":null,"type":null},"content":[{"type":"text","text":"BERT 模型"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"是與 BERT 原始作者發佈的訓練權重一起提供的。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"24 個 "},{"type":"link","attrs":{"href":"https:\/\/tfhub.dev\/google\/collections\/bert\/1","title":null,"type":null},"content":[{"type":"text","text":"Small BERT"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 具有相同的通用架構,但 Transformer 會更少或更小,這讓你可以探索速度、尺寸和質量之間的權衡。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"link","attrs":{"href":"https:\/\/tfhub.dev\/google\/collections\/albert\/1","title":null,"type":null},"content":[{"type":"text","text":"ALBERT"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":":這是四種不同大小的“A Lite Bert”,通過在層之間共享參數來減少模型大小(但不是計算時間)。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"8 個 "},{"type":"link","attrs":{"href":"https:\/\/tfhub.dev\/google\/collections\/experts\/bert\/1","title":null,"type":null},"content":[{"type":"text","text":"BERT Experts"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 都具有相同的 BERT 架構和大小,但是爲預訓練域和中間微調任務提供了不同的選擇,以便更好地配合目標任務。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"link","attrs":{"href":"https:\/\/tfhub.dev\/google\/collections\/electra\/1","title":null,"type":null},"content":[{"type":"text","text":"Electra"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 具有與 BERT 相同的架構(有三種不同的大小),但在預訓練時作爲判別器,類似於生成對抗網絡(Generative Adversarial Network,GAN)。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"BERT 與 Talking-Heads Attention 和 Gated GELU "},{"type":"link","attrs":{"href":"https:\/\/tfhub.dev\/tensorflow\/talkheads_ggelu_bert_en_base\/1","title":null,"type":null},"content":[{"type":"text","text":"[base"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":", "},{"type":"link","attrs":{"href":"https:\/\/tfhub.dev\/tensorflow\/talkheads_ggelu_bert_en_large\/1","title":null,"type":null},"content":[{"type":"text","text":"large"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"] 對 Transformer 架構的核心進行了兩個改進。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"link","attrs":{"href":"https:\/\/tfhub.dev\/tensorflow\/lambert_en_uncased_L-24_H-1024_A-16\/1","title":null,"type":null},"content":[{"type":"text","text":"Lambert"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 已經接受了一些由 LAMB 優化器和 Roberta 提供的技術訓練。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"......."}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"這些模型是 BERT 編碼器。上述鏈接將可以訪問 TF Hub 上的文檔,其中提到了各自所使用的正確的預處理模型。我們建議開發者訪問這些模型頁面,以便了解更多關於每個模型所針對的不同應用場景。基於其通用界面,通過更改編碼器模型及其預處理的 URL,可以方便地對不同編碼器進行特定任務的性能實驗和比較。"}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"預處理模型"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"對於每個 BERT 編碼器,都有一個匹配的預處理模型。它使用 "},{"type":"codeinline","content":[{"type":"text","text":"TF.text"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 庫提供的 TensorFlow ops,它可以將原始文本轉換爲編碼器所期望的數字輸入時序。不像純 Python 的預處理那樣,這些操作可以作爲 TensorFlow 模型的一部分,用於直接從文本輸入中提供服務。每個 TF Hub 的預處理模型都已經配置了詞彙表及其相關的文本歸一化邏輯,無需進行進一步的設置。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"前面我們已經介紹了最簡單的預處理模型的使用方法,接下來讓我們仔細看看。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"preprocess = hub.load('https:\/\/tfhub.dev\/tensorflow\/bert_en_uncased_preprocess\/1')\ninput = preprocess([\"This is an amazing movie!\"])\n \n{'input_word_ids': ,\n 'input_mask': ,\n 'input_type_ids': }"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"像這樣調用 "},{"type":"codeinline","content":[{"type":"text","text":"preprocess()"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 可以將原始文本輸入轉換爲固定長度的 BERT 編碼器輸入序列。你可以看到,它由一個張量 "},{"type":"codeinline","content":[{"type":"text","text":"input_word_ids"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 組成,其中包含了每個標記化輸入的數字 id,包括開始、結束和填充標記,再加上兩個輔助張量:一個 "},{"type":"codeinline","content":[{"type":"text","text":"input_mask"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"(用於區分非填充和填充標記)和每個標記的 "},{"type":"codeinline","content":[{"type":"text","text":"input_type_ids"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"(可以區分每個輸入的多個文本段,我們將在下面討論)。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"相同的預處理 SavedModel 還提供了更細粒度的 API,支持在編碼器的一個輸入序列中使用一個或兩個不同的文本段。下面我們來看一個句子蘊含任務:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"text_premises = [\"The fox jumped over the lazy dog.\",\n \"Good day.\"]\ntokenized_premises = preprocess.tokenize(text_premises)\n \n\n \ntext_hypotheses = [\"The dog was lazy.\", # Entailed.\n \"Axe handle!\"] # Not entailed.\ntokenized_hypotheses = preprocess.tokenize(text_hypotheses)\n \n"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"每個標記化的結果是一個數字 "},{"type":"codeinline","content":[{"type":"text","text":"token id"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 的 "},{"type":"link","attrs":{"href":"https:\/\/www.tensorflow.org\/guide\/ragged_tensor","title":null,"type":null},"content":[{"type":"text","text":"RaggedTensor"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":",完整地表示每一個文本輸入。如果某些前提和假設對太長,無法在下一步用於 BERT 輸入的 "},{"type":"codeinline","content":[{"type":"text","text":"seq_length"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 內適應,則可以在這裏進行額外的預處理,比如修剪文本段或將其分割成多個編碼器輸入。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"然後,將標記化的輸入打包爲用於 BERT 編碼器的固定長度的輸入序列:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"encoder_inputs = preprocess.bert_pack_inputs(\n [tokenized_premises, tokenized_hypotheses],\n seq_length=18) # Optional argument, defaults to 128.\n \n{'input_word_ids': ,\n 'input_mask': ,\n 'input_type_ids': }"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"打包的結果是已經熟悉的 "},{"type":"codeinline","content":[{"type":"text","text":"input_word_ids"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"、"},{"type":"codeinline","content":[{"type":"text","text":"input_mask"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 和 "},{"type":"codeinline","content":[{"type":"text","text":"input_type_ids"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"(第一個和第二個輸入分別爲 0 和 1)。所有輸出都有一個公共的 "},{"type":"codeinline","content":[{"type":"text","text":"seq_length"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"(默認爲 128)。在打包過程中,超過 "},{"type":"codeinline","content":[{"type":"text","text":"seq_length"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 的輸入被截斷爲大致相等的大小。 "}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"加速模型訓練"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"TensorFlow Hub 將 BERT 編碼器和預處理模型作爲獨立的部分,用於加速訓練,特別是在 TPU 上。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"張量處理單元(Tensor Processing Units,TPU)是 Google 定製開發的加速器硬件,它擅長於大規模機器學習計算,比如對 BERT 所需的計算進行微調。TPU 工作在密集的張量上,並期望像字符串這樣的可變長度數據,已由主機 CPU 轉換爲固定大小的張量。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"由於 BERT 編碼器模型與其相關的預處理模型之間的解耦,可以將編碼器微調計算作爲模型訓練的一部分分配給 TPU,而預處理模型則在主機 CPU 上執行。通過使用 "},{"type":"codeinline","content":[{"type":"text","text":"tf.data.Dataset.map()"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":",可以在數據集中異步運行預處理計算,並且TPU上的編碼器模型可以消耗密集的輸出。這種異步預處理還可以改善其他加速器的性能。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"我們的"},{"type":"link","attrs":{"href":"https:\/\/www.tensorflow.org\/tutorials\/text\/solve_glue_tasks_using_bert_on_tpu","title":null,"type":null},"content":[{"type":"text","text":"高級"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" BERT 教程可以在使用 TPU 工作器的 Colab 運行時中運行,並演示了這種端到端的方式。"}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"總結"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"在 TensorFlow 中使用 BERT 和類似的模型已經變得更加簡單了。TensorFlow Hub 提供了"},{"type":"link","attrs":{"href":"https:\/\/tfhub.dev\/google\/collections\/transformer_encoders_text\/1","title":null,"type":null},"content":[{"type":"text","text":"大量"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"預訓練 BERT 編碼器"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"和"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"文本預處理模型"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":",只需幾行代碼就能很容易地使用。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"作者介紹:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Arno Eigenwillig,軟件工程師。 Luiz GUStavo Martins,開發技術推廣工程師。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"原文鏈接:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"https:\/\/blog.tensorflow.org\/2020\/12\/making-bert-easier-with-preprocessing-models-from-tensorflow-hub.html"}]}]}
發表評論
所有評論
還沒有人評論,想成為第一個評論的人麼? 請在上方評論欄輸入並且點擊發布.
相關文章