# How to Use BERT for Natural Language Processing?

{"type":"doc","content":[{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"迄今爲止,在我們的 ML.NET 之旅中,我們主要關注計算機視覺問題,例如圖像分類和目標檢測。在本文中,我們將轉向自然語言處理,並探索一些我們可以用機器學習來解決的問題。"}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"自然語言處理(Natural language processing,NLP)是人工智能的一個子領域,其主要目的是幫助程序理解和處理自然語言數據。這一過程的輸出是一個計算機程序,它可以“理解”語言。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/4b\/4b344d1d98601df47119a1d96d8727ec.jpeg","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"如果你擔心人工智能會奪走你的飯碗,那麼一定要成爲它的創造者,並與不斷上升的人工智能產業保持緊密聯繫。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"追溯到 2018 年,谷歌發表了一篇論文,其中有一個深度神經網絡叫做 "},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"B"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"idirectional "},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"E"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"ncoder "},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"R"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"epresentations from "},{"type":"link","attrs":{"href":"https:\/\/rubikscode.net\/2019\/07\/29\/introduction-to-transformers-architecture\/","title":null,"type":null},"content":[{"type":"text","text":"Transformers"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 或 "},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"BERT"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"。因爲它的簡單性,它成爲目前最流行的一種自然語言處理算法。使用這種算法,任何人都能在短短的幾個小時內訓練自己最先進的問答系統(或其他各種模型)。在本文中,我們將使用 BERT 來創建一個問答系統。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 
"}]},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/00\/00af69f6afbc6e4bf2889578a9248e6e.jpeg","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"BERT 是基於 Transformer 架構的"},{"type":"link","attrs":{"href":"https:\/\/rubikscode.net\/deep-learning-for-programmers\/","title":null,"type":null},"content":[{"type":"text","text":"神經網絡"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"。正因爲如此,在本文中,我們將首先探索這個架構,然後再進一步瞭解 BERT:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"numberedlist","attrs":{"start":null,"normalizeStart":1},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"前提"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"理解 Transformer 架構"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":3,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"BERT 直覺"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":4,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"ONNX 模型"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":5,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"用 ML.NET 實現"}]}]}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"1. 
## 1. Prerequisites

The implementation in this article is done in *C#*, using the latest **.NET 5**, so make sure you have that SDK installed. If you are using *Visual Studio*, it comes with version 16.8.3. Also, make sure you have the following packages installed:

```
$ dotnet add package Microsoft.ML
$ dotnet add package Microsoft.ML.OnnxRuntime
$ dotnet add package Microsoft.ML.OnnxTransformer
```

You can do the same from the *Package Manager Console*:

```
Install-Package Microsoft.ML
Install-Package Microsoft.ML.OnnxRuntime
Install-Package Microsoft.ML.OnnxTransformer
```

You can do something similar with Visual Studio's *Manage NuGet Packages* option:
![](https://static001.geekbang.org/infoq/7c/7ca268ffbdd18423593c2df9fc85a9f9.jpeg)

If you want to learn the basics of machine learning with ML.NET, check out the article [Machine Learning with ML.NET – Introduction](https://rubikscode.net/2021/01/04/machine-learning-with-ml-net-introduction/).

## 2. Understanding the Transformer Architecture

Language is sequential data. Fundamentally, you can think of it as a stream of words, where the meaning of each word depends on the words that came before it and those that come after it. This is why language is so hard for computers to understand: to understand a single word, you need **context**.

In addition, the output sometimes needs to be a **sequence** of data (words) as well. Translating English into Serbian is a good example: we feed a sequence of words into the algorithm, and we need a sequence as output too.
"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"本例中,一種算法要求我們理解英語,並理解如何將英語單詞映射到塞爾維亞語單詞(實質上,這意味着對塞爾維亞語也有某種程度的理解)。在過去的幾年裏,已經有很多"},{"type":"link","attrs":{"href":"https:\/\/rubikscode.net\/deep-learning-for-programmers\/","title":null,"type":null},"content":[{"type":"text","text":"深度學習"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"的架構用於這種目的,例如"},{"type":"link","attrs":{"href":"https:\/\/rubikscode.net\/deep-learning-for-programmers\/","title":null,"type":null},"content":[{"type":"text","text":"遞歸神經網絡"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"(Recurrent Neural Network,RNN)和"},{"type":"link","attrs":{"href":"https:\/\/rubikscode.net\/deep-learning-for-programmers\/","title":null,"type":null},"content":[{"type":"text","text":"長短期記憶網絡"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"(LSTM)。但是,"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"Transformer"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 架構的使用改變了一切。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/15\/15ce49a76154637acfa5950c5ed22943.gif","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"由於 RNN 和 LSTM 難以訓練,且已出現梯度"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"消失"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"(和爆炸),因此不能完全滿足需求。Transformer 的目的就是解決這些問題,帶來更好的性能和更好的語言"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"理解"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"。它們於 2017 年推出,並被髮表在一篇名爲《"},{"type":"link","attrs":{"href":"https:\/\/arxiv.org\/pdf\/1706.03762.pdf","title":null,"type":null},"content":[{"type":"text","text":"注意力就是你所需要的一切"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"》("},{"type":"text","marks":[{"type":"italic"},{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Attention is all you need"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":")的傳奇性論文上。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 
"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"簡而言之,他們使用編碼器 - 解碼器結構和自注意力層來更好地理解語言。如果我們回到翻譯的例子,編碼器負責"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"理解"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"英語,解碼器負責"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"理解"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"塞爾維亞語,並將英語"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"映射"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"到塞爾維亞語。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/ff\/ffae9cad877ca1b4568ddf2c2dbb417e.jpeg","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"在訓練過程中,使用過程編碼器從英語語言中提取詞"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"嵌入"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"。計算機並不理解單詞,它們理解的是數字和矩陣(一組數字)。這就是爲什麼我們要將詞轉換成"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"向量空間"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":",也就是說,我們爲語言中的每個詞分配某些向量(將它們映射到某些潛在的向量空間)。這些就是詞嵌入。有許多可用的詞嵌入,如 Word2Vec。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"但是,該詞在句子中的位置也是影響上下文的重要因素,所以纔會有位置編碼。編碼器就是這樣獲取關於單詞和它的上下文信息的。編碼器的自注意力層確定了詞之間的關係,併爲我們提供了句子中每一個詞"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"相互關係"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"的信息。編碼器就是這樣理解英語的。接着,數據進入深度神經網絡,再進入解碼器的映射 - 注意力層。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"不過,在此之前,解碼器已獲取有關塞爾維亞語的同樣信息。用同樣的方法學習如何"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"理解"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"塞爾維亞語,使用詞嵌入、位置編碼和自注意力。解碼器的映射 - 注意力層既有英語也有塞爾維亞語的"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"信息"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":",它只是學習如何從一種語言轉換到另一種語言的詞。如需有關 Transformer 
Before that happens, however, the decoder has already acquired the same kind of information about Serbian. It learns to understand Serbian the same way, using word embeddings, positional encoding, and self-attention. The decoder's mapping attention layer has **information** about both English and Serbian, and it simply learns how to map words from one language to the other. For more information about Transformers, see the article [Introduction to Transformers Architecture](https://rubikscode.net/2019/07/29/introduction-to-transformers-architecture/).

## 3. BERT Intuition

BERT uses this Transformer architecture to understand language; more precisely, it uses the encoder. This architecture achieved two milestones. First, it achieved **bidirectionality**: every sentence is learned in both directions, so the model learns context better, both the preceding context and the following one. BERT was the first deeply bidirectional, unsupervised language representation pre-trained using only a plain-text corpus ([Wikipedia](https://www.wikipedia.org/)). It was also one of the first **pre-trained models** applied to natural language processing. In computer vision, we learned about transfer learning; before BERT, however, that concept had not gained traction in NLP.

This matters a lot, because you can train a model on a huge amount of data and, once it understands language, **fine-tune** it for a more specific task. Consequently, BERT is trained in two phases: pre-training and fine-tuning.

![](https://static001.geekbang.org/infoq/2a/2a60cd79b1803d641c5aaab0d77ff544.jpeg)

BERT pre-training achieves bidirectionality using two approaches:
- Masked Language Modeling (MLM)
- Next Sentence Prediction (NSP)

**Masked language modeling** uses masked input: some of the words in a sentence are masked, and BERT's job is to fill in these **blanks**. **Next sentence prediction** gives BERT two sentences as input and expects it to predict whether the second sentence follows the first. In reality, both approaches happen **at the same time**.

![](https://static001.geekbang.org/infoq/db/db2c8eac16e4e80c169a8d127ec46ac2.jpeg)
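To make the two objectives concrete, here is a hand-written illustration of what the training pairs look like (the sentences are made up; only the shape of the task matters):

```csharp
// Masked Language Modeling: a masked token has to be filled in.
var mlmInput = "Jim is walking through the [MASK].";
var mlmLabel = "woods";

// Next Sentence Prediction: does the second sentence follow the first?
var sentenceA = "Jim is walking through the woods.";
var sentenceB = "The weather is nice today.";
var isNext = true; // false when B is sampled from an unrelated document
```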
的"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"額外"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"層。這正是我們在本教程中所做的。所有我們需要做的就是將網絡的輸出層替換爲爲我們特定目的設計的新層集。我們有文本"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"段"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"(或上下文)和"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"問題"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"作爲輸入,而作爲輸出,我們想要問題的"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"答案"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/cb\/cb712c3971ff7be4de229e6c651c44d5.jpeg","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"舉例來說,我們的系統,應該使用兩個句子。爲了提供答案“Jim”,可以使用“Jim is walking through the woods.”(段落或上下文)和“What is his name?” (問題)。"}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"4. ONNX 模型"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"在進一步探討利用 ML.NET 實現對象檢測應用之前,我們還需要介紹一個理論上的內容。那就是"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"開放神經網絡交換"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"( Open Neural Network Exchange,ONNX)文件格式。這種文件格式是人工智能模型的一種開源格式,它支持框架之間的"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"互操作性"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"你可以用機器學習的框架(比如 "},{"type":"text","marks":[{"type":"italic"},{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"PyTorch"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":")來訓練模型,保存模型,並將其轉換爲 ONNX 格式。那麼你就可以將 ONNX 模型用於另一個框架,比如 ML.NET。這正是我們在本教程中所做的內容。你可以在 "},{"type":"link","attrs":{"href":"https:\/\/onnx.ai\/","title":null,"type":null},"content":[{"type":"text","text":"ONNX 
You can train a model in one machine learning framework (such as *PyTorch*), save it, and convert it into ONNX format; the ONNX model can then be consumed from another framework, such as ML.NET. That is exactly what we do in this tutorial. You can find more details on the [ONNX website](https://onnx.ai/).

![](https://static001.geekbang.org/infoq/3b/3b0b5bff03cedc7b27a465c3b87d9223.jpeg)

In this tutorial, we use a pre-trained BERT model, BERT SQUAD, which can be found [here](https://github.com/onnx/models/tree/master/text/machine_comprehension/bert-squad). In short, we import this model into ML.NET and run it inside our application.

One very interesting and useful thing about ONNX models is that there is a range of tools we can use to get a visual representation of a model. This comes in handy when, as in this tutorial, we work with a **pre-trained** model.

We often need to know the names of the input and output layers, and this is exactly where such tools shine. So, after downloading the BERT model, we can load it into one of them and get a **visual representation**. In this guide we use [Netron](https://netron.app/); here is just a portion of the output:

![](https://static001.geekbang.org/infoq/e2/e2ff7cfd4058a0681a510d1f1ed9cd5b.jpeg)

I know, it's crazy; BERT is a big model. You may be wondering how you could ever use this, and why you would need to. But in order to use an ONNX model, we usually need to know the names of its input and output layers.
BERT's input and output layers look like this:

![](https://static001.geekbang.org/infoq/31/31df1e39887d6b2eee1c23c702774b38.jpeg)
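If you prefer to stay in code, the same information can be read programmatically with the OnnxRuntime API. This is a minimal sketch, not part of the solution below; for the bertsquad model it prints the same layer names we wire up later:

```csharp
using System;
using Microsoft.ML.OnnxRuntime;

class OnnxInspector
{
    static void Main()
    {
        // Load the downloaded model and dump the graph's input/output names.
        using var session = new InferenceSession("bertsquad-10.onnx");

        foreach (var input in session.InputMetadata)
            Console.WriteLine($"input:  {input.Key}");

        foreach (var output in session.OutputMetadata)
            Console.WriteLine($"output: {output.Key}");
    }
}
```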
## 5. Implementation with ML.NET

In the BERT-Squad repository from which we downloaded the [model](https://github.com/onnx/models/tree/master/text/machine_comprehension/bert-squad), you will notice an interesting section about dependencies. More precisely, you will notice a dependency on [tokenization.py](https://github.com/onnx/models/blob/master/text/machine_comprehension/bert-squad/dependencies/tokenization.py0). This means that we need to handle tokenization ourselves. Tokenization is the process of splitting a large sample of text into words; in natural language processing, every word has to be captured for further analysis. There are many ways to do this.

In practice, we perform word encoding using WordPiece tokenization, as described in [this paper](https://paperswithcode.com/method/wordpiece). Our version is ported from `tokenization.py`. To handle this complexity, the solution is structured like this:

![](https://static001.geekbang.org/infoq/61/615acd331e43743596b23019eaf553b0.jpeg)

In the `Assets` folder, you can find the downloaded .onnx model and a folder with the vocabulary on which the model was trained. The `Machine Learning` folder contains all the code we need in this application. The `Trainer` and `Predictor` classes are located there, along with the classes that model the data. In a separate folder, we find the helper class for loading files, plus extension classes for *Softmax* over enumerables and for string splitting.

![](https://static001.geekbang.org/infoq/97/976d3b3cddd3b2188a65987744f2d11b.jpeg)

This solution was inspired by Gerjan Vlot's implementation, which you can find [here](https://github.com/GerjanVlot/BERT-ML.NET).

### 5.1 Data Models

As you may have noticed, in the DataModel folder there are two classes, for BERT's input and predictions. The BertInput class represents the input; its fields are named and sized after the model's input layers:

```csharp
using Microsoft.ML.Data;

namespace BertMlNet.MachineLearning.DataModel
{
    public class BertInput
    {
        [VectorType(1)]
        [ColumnName("unique_ids_raw_output___9:0")]
        public long[] UniqueIds { get; set; }

        [VectorType(1, 256)]
        [ColumnName("segment_ids:0")]
        public long[] SegmentIds { get; set; }

        [VectorType(1, 256)]
        [ColumnName("input_mask:0")]
        public long[] InputMask { get; set; }

        [VectorType(1, 256)]
        [ColumnName("input_ids:0")]
        public long[] InputIds { get; set; }
    }
}
```
The BertPredictions class maps BERT's output layers:

```csharp
using Microsoft.ML.Data;

namespace BertMlNet.MachineLearning.DataModel
{
    public class BertPredictions
    {
        [VectorType(1, 256)]
        [ColumnName("unstack:1")]
        public float[] EndLogits { get; set; }

        [VectorType(1, 256)]
        [ColumnName("unstack:0")]
        public float[] StartLogits { get; set; }

        [VectorType(1)]
        [ColumnName("unique_ids:0")]
        public long[] UniqueIds { get; set; }
    }
}
```

### 5.2 Trainer

The `Trainer` class is quite simple. It has only one method, `BuildAndTrain`, which takes the path to the pre-trained model.

```csharp
using BertMlNet.MachineLearning.DataModel;
using Microsoft.ML;
using System.Collections.Generic;

namespace BertMlNet.MachineLearning
{
    public class Trainer
    {
        private readonly MLContext _mlContext;

        public Trainer()
        {
            _mlContext = new MLContext(11);
        }

        public ITransformer BuildAndTrain(string bertModelPath, bool useGpu)
        {
            var pipeline = _mlContext.Transforms
                            .ApplyOnnxModel(modelFile: bertModelPath,
                                            outputColumnNames: new[] { "unstack:1",
                                                                       "unstack:0",
                                                                       "unique_ids:0" },
                                            inputColumnNames: new[] { "unique_ids_raw_output___9:0",
                                                                      "segment_ids:0",
                                                                      "input_mask:0",
                                                                      "input_ids:0" },
                                            gpuDeviceId: useGpu ? 0 : (int?)null);

            return pipeline.Fit(_mlContext.Data.LoadFromEnumerable(new List<BertInput>()));
        }
    }
}
```
In the method above, we build the pipeline: we apply the ONNX model and connect our data models to the layers of the BERT ONNX model. Note the flag that lets us run this model on either the CPU or the GPU. Finally, we fit the model to an empty list. We do this only to load the data schema, i.e. to load the model.
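For example, loading the downloaded model on the CPU looks like this (a minimal sketch; the relative path assumes the Assets layout shown earlier):

```csharp
var trainer = new Trainer();

// The fit on an empty list happens inside, so this just loads the ONNX model.
var trainedModel = trainer.BuildAndTrain(
    "..\\BertMlNet\\Assets\\Model\\bertsquad-10.onnx", useGpu: false);
```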
### 5.3 Predictor

The Predictor class is even simpler. It receives the trained and loaded model and creates a prediction engine, which it then uses to create predictions for new inputs.

```csharp
using BertMlNet.MachineLearning.DataModel;
using Microsoft.ML;

namespace BertMlNet.MachineLearning
{
    public class Predictor
    {
        private MLContext _mLContext;
        private PredictionEngine<BertInput, BertPredictions> _predictionEngine;

        public Predictor(ITransformer trainedModel)
        {
            _mLContext = new MLContext();
            _predictionEngine = _mLContext.Model
                        .CreatePredictionEngine<BertInput, BertPredictions>(trainedModel);
        }

        public BertPredictions Predict(BertInput encodedInput)
        {
            return _predictionEngine.Predict(encodedInput);
        }
    }
}
```

### 5.4 Helpers and Extensions

There is one helper class and two extension classes. The helper class, FileReader, has a single method for reading text files; we use it later to load the vocabulary from a file. It is very simple:

```csharp
using System.Collections.Generic;
using System.IO;

namespace BertMlNet.Helpers
{
    public static class FileReader
    {
        public static List<string> ReadFile(string filename)
        {
            var result = new List<string>();

            using (var reader = new StreamReader(filename))
            {
                string line;

                while ((line = reader.ReadLine()) != null)
                {
                    if (!string.IsNullOrWhiteSpace(line))
                    {
                        result.Add(line);
                    }
                }
            }

            return result;
        }
    }
}
```

There are two extension classes: one performs a Softmax operation on a collection of elements, the other splits a string while keeping the delimiters.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

namespace BertMlNet.Extensions
{
    public static class SoftmaxEnumerableExtension
    {
        public static IEnumerable<(T Item, float Probability)> Softmax<T>(
                                            this IEnumerable<T> collection,
                                            Func<T, float> scoreSelector)
        {
            var maxScore = collection.Max(scoreSelector);
            var sum = collection.Sum(r => Math.Exp(scoreSelector(r) - maxScore));

            return collection.Select(r => (r,
                        (float)(Math.Exp(scoreSelector(r) - maxScore) / sum)));
        }
    }
}
```
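A quick sanity check of the Softmax extension with made-up scores (subtracting the maximum score keeps the exponentials numerically stable without changing the result):

```csharp
var scored = new[] { ("start", 1.0f), ("end", 2.0f) };

// Probability of "end" = e^(2-2) / (e^(1-2) + e^(2-2)) ≈ 0.731.
var (best, probability) = scored
    .Softmax(pair => pair.Item2)
    .OrderByDescending(o => o.Probability)
    .First();
```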
The second extension, used by the tokenizer below, splits a string while keeping the delimiters as separate tokens:

```csharp
using System.Collections.Generic;

namespace BertMlNet.Extensions
{
    static class StringExtension
    {
        public static IEnumerable<string> SplitAndKeep(
                            this string inputString, params char[] delimiters)
        {
            int start = 0, index;

            while ((index = inputString.IndexOfAny(delimiters, start)) != -1)
            {
                if (index - start > 0)
                    yield return inputString.Substring(start, index - start);

                yield return inputString.Substring(index, 1);

                start = index + 1;
            }

            if (start < inputString.Length)
            {
                yield return inputString.Substring(start);
            }
        }
    }
}
```

### 5.5 Tokenizer

So far, we have explored the simple parts of the solution. Now let's look at how tokenization is implemented, the more complex and important part. First, we define the list of default BERT tokens. For example, two sentences should be separated with the [SEP] token, while the [CLS] token always appears at the start of the text and is specific to classification tasks.

```csharp
namespace BertMlNet.Tokenizers
{
    public class Tokens
    {
        public const string Padding = "";
        public const string Unknown = "[UNK]";
        public const string Classification = "[CLS]";
        public const string Separation = "[SEP]";
        public const string Mask = "[MASK]";
    }
}
```

The tokenization process itself lives in the `Tokenizer` class, which has two public methods: `Tokenize` and `Untokenize`. The first method splits the received text into sentences and then converts every word of every sentence into embeddings. Note that a single word may be represented by more than one token.

For example, the word "embeddings" is represented by the token array ['em', '##bed', '##ding', '##s']. The word has been split into smaller subwords and characters, and some of those subwords are prefixed with two hashes. This is just the tokenizer's way of indicating that the subword or character is part of a larger word, preceded by another subword.

So, for example, the '##bed' token is distinct from the 'bed' token. Another thing the Tokenize method does is return the vocabulary index and the segment index; both are BERT inputs. To learn more about why, check out the article [BERT Word Embeddings Tutorial](https://mccormickml.com/2019/05/14/BERT-word-embeddings-tutorial/).

```csharp
using BertMlNet.Extensions;
using System;
using System.Collections.Generic;
using System.Linq;

namespace BertMlNet.Tokenizers
{
    public class Tokenizer
    {
        private readonly List<string> _vocabulary;

        public Tokenizer(List<string> vocabulary)
        {
            _vocabulary = vocabulary;
        }

        public List<(string Token, int VocabularyIndex, long SegmentIndex)> Tokenize(params string[] texts)
        {
            IEnumerable<string> tokens = new string[] { Tokens.Classification };

            foreach (var text in texts)
            {
                tokens = tokens.Concat(TokenizeSentence(text));
                tokens = tokens.Concat(new string[] { Tokens.Separation });
            }

            var tokenAndIndex = tokens
                .SelectMany(TokenizeSubwords)
                .ToList();

            var segmentIndexes = SegmentIndex(tokenAndIndex);

            return tokenAndIndex.Zip(segmentIndexes, (tokenindex, segmentindex)
                                => (tokenindex.Token, tokenindex.VocabularyIndex, segmentindex)).ToList();
        }

        public List<string> Untokenize(List<string> tokens)
        {
            var currentToken = string.Empty;
            var untokens = new List<string>();
            tokens.Reverse();

            tokens.ForEach(token =>
            {
                if (token.StartsWith("##"))
                {
                    currentToken = token.Replace("##", "") + currentToken;
                }
                else
                {
                    currentToken = token + currentToken;
                    untokens.Add(currentToken);
                    currentToken = string.Empty;
                }
            });

            untokens.Reverse();

            return untokens;
        }

        public IEnumerable<long> SegmentIndex(List<(string Token, int VocabularyIndex)> tokens)
        {
            var segmentIndex = 0;
            var segmentIndexes = new List<long>();

            foreach (var (token, index) in tokens)
            {
                segmentIndexes.Add(segmentIndex);

                if (token == Tokens.Separation)
                {
                    segmentIndex++;
                }
            }

            return segmentIndexes;
        }

        private IEnumerable<(string Token, int VocabularyIndex)> TokenizeSubwords(string word)
        {
            if (_vocabulary.Contains(word))
            {
                return new (string, int)[] { (word, _vocabulary.IndexOf(word)) };
            }

            var tokens = new List<(string, int)>();
            var remaining = word;

            while (!string.IsNullOrEmpty(remaining) && remaining.Length > 2)
            {
                var prefix = _vocabulary.Where(remaining.StartsWith)
                    .OrderByDescending(o => o.Count())
                    .FirstOrDefault();

                if (prefix == null)
                {
                    tokens.Add((Tokens.Unknown, _vocabulary.IndexOf(Tokens.Unknown)));

                    return tokens;
                }

                remaining = remaining.Replace(prefix, "##");

                tokens.Add((prefix, _vocabulary.IndexOf(prefix)));
            }

            if (!string.IsNullOrWhiteSpace(word) && !tokens.Any())
            {
                tokens.Add((Tokens.Unknown, _vocabulary.IndexOf(Tokens.Unknown)));
            }

            return tokens;
        }

        private IEnumerable<string> TokenizeSentence(string text)
        {
            // remove spaces and split on , . : ; etc.
            return text.Split(new string[] { " ", "   ", "\r\n" }, StringSplitOptions.None)
                .SelectMany(o => o.SplitAndKeep(".,;:\\/?!#$%()=+-*\"'–_`<>&^@{}[]|~'".ToArray()))
                .Select(o => o.ToLower());
        }
    }
}
```
The other public method is `Untokenize`, which reverses the process. Essentially, BERT's output produces lots of embedding information, and the goal of this method is to turn that information back into meaningful sentences.

The class contains several more private methods that make this process work.
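Putting these pieces together, a hypothetical call (using the vocabulary file from the Assets folder) would look like this; the exact indexes depend on the vocabulary file:

```csharp
var vocabulary = FileReader.ReadFile("..\\BertMlNet\\Assets\\Vocabulary\\vocab.txt");
var tokenizer = new Tokenizer(vocabulary);

// Question first, then context, mirroring how Bert.Predict calls it below.
var tokens = tokenizer.Tokenize("What is his name?", "Jim is walking through the woods.");

// tokens is a list of (Token, VocabularyIndex, SegmentIndex) triples:
// it starts with "[CLS]", and "[SEP]" closes each of the two segments,
// so the question gets segment index 0 and the context segment index 1.
```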
### 5.6 BERT

The `Bert` class puts all of these pieces together. In the constructor, we read the vocabulary file and instantiate the `Trainer`, `Tokenizer`, and `Predictor` objects. There is only one public method, Predict, which receives a context and a question; as output, it returns the answer along with its probability:

```csharp
using BertMlNet.Extensions;
using BertMlNet.Helpers;
using BertMlNet.MachineLearning;
using BertMlNet.MachineLearning.DataModel;
using BertMlNet.Tokenizers;
using System.Collections.Generic;
using System.Linq;

namespace BertMlNet
{
    public class Bert
    {
        private List<string> _vocabulary;

        private readonly Tokenizer _tokenizer;
        private Predictor _predictor;

        public Bert(string vocabularyFilePath, string bertModelPath)
        {
            _vocabulary = FileReader.ReadFile(vocabularyFilePath);
            _tokenizer = new Tokenizer(_vocabulary);

            var trainer = new Trainer();
            var trainedModel = trainer.BuildAndTrain(bertModelPath, false);
            _predictor = new Predictor(trainedModel);
        }

        public (List<string> tokens, float probability) Predict(string context, string question)
        {
            var tokens = _tokenizer.Tokenize(question, context);
            var input = BuildInput(tokens);

            var predictions = _predictor.Predict(input);

            var contextStart = tokens.FindIndex(o => o.Token == Tokens.Separation);

            var (startIndex, endIndex, probability) = GetBestPrediction(predictions, contextStart, 20, 30);

            var predictedTokens = input.InputIds
                .Skip(startIndex)
                .Take(endIndex + 1 - startIndex)
                .Select(o => _vocabulary[(int)o])
                .ToList();

            var connectedTokens = _tokenizer.Untokenize(predictedTokens);

            return (connectedTokens, probability);
        }

        private BertInput BuildInput(List<(string Token, int VocabularyIndex, long SegmentIndex)> tokens)
        {
            var padding = Enumerable.Repeat(0L, 256 - tokens.Count).ToList();

            var tokenIndexes = tokens.Select(token => (long)token.VocabularyIndex).Concat(padding).ToArray();
            var segmentIndexes = tokens.Select(token => token.SegmentIndex).Concat(padding).ToArray();
            var inputMask = tokens.Select(o => 1L).Concat(padding).ToArray();

            return new BertInput()
            {
                InputIds = tokenIndexes,
                SegmentIds = segmentIndexes,
                InputMask = inputMask,
                UniqueIds = new long[] { 0 }
            };
        }

        private (int StartIndex, int EndIndex, float Probability) GetBestPrediction(BertPredictions result, int minIndex, int topN, int maxLength)
        {
            var bestStartLogits = result.StartLogits
                .Select((logit, index) => (Logit: logit, Index: index))
                .OrderByDescending(o => o.Logit)
                .Take(topN);

            var bestEndLogits = result.EndLogits
                .Select((logit, index) => (Logit: logit, Index: index))
                .OrderByDescending(o => o.Logit)
                .Take(topN);

            var bestResultsWithScore = bestStartLogits
                .SelectMany(startLogit =>
                    bestEndLogits
                        .Select(endLogit =>
                            (
                                StartLogit: startLogit.Index,
                                EndLogit: endLogit.Index,
                                Score: startLogit.Logit + endLogit.Logit
                            )
                        )
                )
                .Where(entry => !(entry.EndLogit < entry.StartLogit || entry.EndLogit - entry.StartLogit > maxLength || entry.StartLogit == 0 && entry.EndLogit == 0 || entry.StartLogit < minIndex))
                .Take(topN);

            var (item, probability) = bestResultsWithScore
                .Softmax(o => o.Score)
                .OrderByDescending(o => o.Probability)
                .FirstOrDefault();

            return (StartIndex: item.StartLogit, EndIndex: item.EndLogit, probability);
        }
    }
}
```

The `Predict` method performs several steps. Let's discuss them in detail.

```csharp
public (List<string> tokens, float probability) Predict(string context, string question)
{
    var tokens = _tokenizer.Tokenize(question, context);
    var input = BuildInput(tokens);

    var predictions = _predictor.Predict(input);

    var contextStart = tokens.FindIndex(o => o.Token == Tokens.Separation);

    var (startIndex, endIndex, probability) = GetBestPrediction(predictions,
                                                                contextStart,
                                                                20,
                                                                30);

    var predictedTokens = input.InputIds
        .Skip(startIndex)
        .Take(endIndex + 1 - startIndex)
        .Select(o => _vocabulary[(int)o])
        .ToList();

    var connectedTokens = _tokenizer.Untokenize(predictedTokens);

    return (connectedTokens, probability);
}
```
First, the method tokenizes the question and the given context (the passage from which BERT should extract the answer). From that information we build the `BertInput`; this happens in the `BuildInput` method, where all the tokenized information is padded so it can serve as BERT's input and is used to initialize a `BertInput` object.

Then we get the model's predictions from the `Predictor`. This information is processed further, and the best predictions are picked based on the context. That is, BERT selects the words from the context that are most likely to be the answer, and we choose the best among them. Finally, those words are untokenized.
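In effect, GetBestPrediction (shown in full above) scores every candidate answer span by the sum of its start and end logits and normalizes the scores of the surviving candidate set $C$ with the Softmax extension:

$$
P\big(\text{span} = (i, j)\big) = \frac{\exp(s_i + e_j)}{\sum_{(k,\,l) \in C} \exp(s_k + e_l)}
$$

where $s$ and $e$ are the model's StartLogits and EndLogits.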
### 5.7 Program

The `Program` class ties together what we implemented in the `Bert` class. First, let's define the launch settings:

```json
{
  "profiles": {
    "BERT.Console": {
      "commandName": "Project",
      "commandLineArgs": "\"Jim is walking through the woods.\" \"What is his name?\""
    }
  }
}
```

We defined two command-line **arguments**: "Jim is walking through the woods." and "What is his name?". As we already mentioned, the first argument is the **context** and the second one is the **question**. The `Main` method is minimal:

```csharp
using System;
using System.Text.Json;

namespace BertMlNet
{
    class Program
    {
        static void Main(string[] args)
        {
            var model = new Bert("..\\BertMlNet\\Assets\\Vocabulary\\vocab.txt",
                                 "..\\BertMlNet\\Assets\\Model\\bertsquad-10.onnx");

            var (tokens, probability) = model.Predict(args[0], args[1]);

            Console.WriteLine(JsonSerializer.Serialize(new
            {
                Probability = probability,
                Tokens = tokens
            }));
        }
    }
}
```

Technically, we create a `Bert` object with the path to the vocabulary file and the path to the model, and then call the `Predict` method with the two command-line arguments. The output we get looks like this:

```
{"Probability":0.9111285,"Tokens":["jim"]}
```

We can see that BERT is 91% sure that the answer to the question is "Jim", and it is correct.
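The same `Bert` object can be reused for any context/question pair in this SQuAD style. The following snippet is purely illustrative; the context, the question, and the output shown in the comment are hypothetical and not taken from an actual run:

```csharp
using System;
using System.Text.Json;

namespace BertMlNet
{
    static class Demo
    {
        static void Run()
        {
            // Same model and vocabulary as in Main above.
            var model = new Bert("..\\BertMlNet\\Assets\\Vocabulary\\vocab.txt",
                                 "..\\BertMlNet\\Assets\\Model\\bertsquad-10.onnx");

            // Every pair goes through the same pipeline:
            // tokenize -> build input -> predict -> pick best span -> untokenize.
            var (tokens, probability) = model.Predict(
                "Ana lives in Novi Sad and works as a teacher.",
                "Where does Ana live?");

            // The result always has the same shape, e.g. (illustrative):
            // {"Probability":0.9,"Tokens":["novi","sad"]}
            Console.WriteLine(JsonSerializer.Serialize(
                new { Probability = probability, Tokens = tokens }));
        }
    }
}
```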
## Conclusion

In this article we learned how BERT works. More specifically, we had a chance to explore how the Transformer architecture works and to see how BERT makes use of that architecture to understand language. Finally, we learned about the ONNX model format and how to use it with ML.NET.

**About the author:**

Nikola M. Zivkovic is the Chief AI Officer at Rubik's Code and the author of the book Deep Learning for Programmers. He loves sharing knowledge, is an experienced speaker, and is a guest lecturer at the University of Novi Sad, Serbia.

**Original link:**

https://rubikscode.net/2021/04/19/machine-learning-with-ml-net-nlp-with-bert/