Google Open-Sources Model Search: Automatically Optimizing and Identifying AI Models, with the Best Model Within Easy Reach

{"type":"doc","content":[{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Google recently open-sourced Model Search to help researchers and developers build the best machine learning models."}]}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Model Search: An Open-Source Platform for Finding the Best Machine Learning Models"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"On February 19, Google "},{"type":"link","attrs":{"href":"https:\/\/ai.googleblog.com\/2021\/02\/introducing-model-search-open-source.html","title":"","type":null},"content":[{"type":"text","text":"announced"}]},{"type":"text","text":" the release of "},{"type":"link","attrs":{"href":"http:\/\/github.com\/google\/model_search","title":"","type":null},"content":[{"type":"text","text":"Model Search"}]},{"type":"text","text":", an open-source platform designed to help researchers develop machine learning models efficiently and automatically."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The success of a neural network (NN) often depends on how well it generalizes across a variety of tasks. Designing networks that generalize well is extremely difficult, however, and the research community has yet to reach a consensus on how neural networks generalize:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"What should an appropriate neural network look like for a given problem?"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"How deep should it be?"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Which types of layers should be used?"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragr
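aph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Are LSTMs enough, or would Transformer layers be better? Or should the two be combined?"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Would techniques such as ensembling or distillation improve performance?"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Thorny questions like these are why machine learning remains a field that relies heavily on engineers' individual understanding and intuition."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In recent years, AutoML algorithms have emerged rapidly to help researchers find suitable neural networks quickly without manual experimentation."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Techniques such as neural architecture search (NAS) use methods like reinforcement learning (RL), evolutionary algorithms, and combinatorial search to build neural networks within a given search space. With proper setup, these techniques have been shown to outperform manually designed architectures."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"However, these algorithms are typically compute-intensive and often need to train thousands of models before converging. They also explore search spaces that are specific to one domain and rely on substantial prior knowledge that does not transfer well across domains. In image classification, for example, traditional NAS searches for two good building blocks (convolutional and downsampling blocks) and then arranges them according to established conventions to create the full network."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"To overcome these shortcomings and extend AutoML solutions to the broader research community, Google has open-sourced Model Search, a platform that helps researchers develop the best machine learning models efficiently and automatically."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Model 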
Search is domain-agnostic and can flexibly find the architecture best suited to a given dataset and problem while minimizing coding time, effort, and compute."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Model Search is built on TensorFlow and can run on a single machine or in a distributed setting."}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Overview"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The Model Search system consists of multiple trainers, a search algorithm, a transfer learning algorithm, and a database that stores the many evaluated models."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Model Search runs training and evaluation experiments for a variety of machine learning models (with different architectures and training techniques) in an adaptive yet asynchronous fashion. Each trainer conducts its experiments independently, while all trainers share the knowledge gained from them."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"At the start of every cycle, the search algorithm looks up all completed experiments and uses beam search to decide what to try next. It then applies a “mutation” to one of the best architectures found so far and sends the resulting model back to a trainer."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"embedcomp","attrs":{"type":"table","data":{"content":""}}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"size","attrs":{"size":10}}],"text":"Model 
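Search schematic illustrating the distributed search and ensembling process. Each trainer runs independently to train and evaluate a given model. The results are shared with the search algorithm. The search algorithm then invokes a mutation of one of the best architectures and sends the new model back to a trainer for the next iteration. S denotes the set of training and validation examples, and A denotes all the candidates used during training and search."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The system builds neural network models from a set of predefined blocks, each representing a known micro-architecture such as an LSTM, ResNet, or Transformer layer. By using blocks of pre-existing architectural components, Model Search can leverage the best existing knowledge from NAS research across domains. This approach is also more efficient because it searches over structures rather than their more basic, fine-grained components, which greatly reduces the size of the search space."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"embedcomp","attrs":{"type":"table","data":{"content":"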
\"Image\"
"}}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"size","attrs":{"size":10}}],"text":"Examples of neural network micro-architecture blocks that work well, such as a ResNet block."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Because the Model Search framework is built on TensorFlow, each block can be any function that takes a tensor as input. For example, if one wants to introduce a new search space built from a set of micro-architecture blocks, the framework incorporates the newly defined blocks into the search process, so the algorithm can build the best neural network from them."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A building block can even be a fully defined neural network that already solves a particular problem on its own. In that case, Model Search can serve as a powerful ensembling machine."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The search algorithms implemented in Model Search are adaptive, greedy, and incremental, which makes them converge faster than RL algorithms."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Importantly, Model Search still mimics the “explore and exploit” nature of RL algorithms: it separates the search for a good candidate (the explore step) from boosting accuracy by ensembling the good candidates it found (the exploit step)."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The main search algorithm adaptively modifies one of the top k performing experiments (where k can be specified by the user) by applying random changes to the architecture or the training technique, for example making the architecture deeper."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"embedcomp","attrs":{"type":"table","data":{"content":"
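\"Image\"
"}}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"size","attrs":{"size":10}}],"text":"An example of a network evolving over many experiments. Each color represents a different type of architecture block. The final network is formed through mutations of high-performing candidate networks, in this case by adding depth."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"To further improve efficiency and accuracy, transfer learning can be applied between internal experiments. Model Search does this in two ways, through knowledge distillation or weight sharing, reusing what previously trained models have learned in subsequent models. This significantly speeds up training and opens the way to discovering more and better architectures."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Knowledge distillation improves a candidate's accuracy by adding a loss term that matches the predictions of high-performing models, in addition to the loss against the ground truth."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Weight sharing, on the other hand, initializes some of the parameters of a mutated network from previously trained candidates: it copies suitable weights from earlier models and randomly initializes the rest. This speeds up training and, in turn, makes it possible to discover more and better architectures."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Once a Model Search run completes, users can compare the many models found during the search. Users can also define their own search space to customize the architectural components of their models."}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Experimental Results"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Model Search improved on production models with a minimal number of iterations. In a recently published paper, using keyword spotting and language identification models as examples, Google demonstrated Model 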
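Search's effectiveness in the speech domain. In fewer than 200 iterations, the resulting model slightly outperformed the internal state-of-the-art production model designed by experts, while the number of trainable parameters dropped from 315K to 184K."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"embedcomp","attrs":{"type":"table","data":{"content":"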
\"Image\"
"}}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"size","attrs":{"size":10}}],"text":"Compared with the previous production model for keyword spotting, the model produced by the system's iterations achieves higher accuracy. The same paper reports similar results for language identification."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Google also used Model Search to find an architecture suitable for image classification on the widely used CIFAR-10 image dataset."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Using a set of known convolutional blocks, including convolutions, ResNet blocks (two convolutions and a skip connection), NAS-A cells, and fully connected layers, Google observed that Model Search quickly reached a benchmark accuracy of 91.83 after only 209 trials (that is, after exploring only 209 models)."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"By comparison, reaching the same accuracy took 5,807 trials for the NasNet algorithm (RL) and 1,160 trials for PNAS (RL plus progressive search)."}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Conclusion"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Google research engineer Hanna Mazzawi and research scientist Xavi Gonzalvo wrote in the blog post, “We hope the Model 
Search code will provide researchers with a flexible, domain-agnostic framework that makes it easy to discover good machine learning models. By building on existing knowledge of a given domain, we believe this framework is powerful enough to select the best-performing models for a wide range of real-world problems within a search space composed of standard building blocks.”"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic"},{"type":"size","attrs":{"size":10}}],"text":"Original links:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"link","attrs":{"href":"https:\/\/ai.googleblog.com\/2021\/02\/introducing-model-search-open-source.html","title":"","type":null},"content":[{"type":"text","marks":[{"type":"italic"}],"text":"https:\/\/ai.googleblog.com\/2021\/02\/introducing-model-search-open-source.html"}],"marks":[{"type":"italic"},{"type":"size","attrs":{"size":10}}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"link","attrs":{"href":"https:\/\/venturebeat.com\/2021\/02\/19\/googles-model-search-automatically-optimizes-and-identifies-ai-models\/","title":"","type":null},"content":[{"type":"text","marks":[{"type":"italic"}],"text":"https:\/\/venturebeat.com\/2021\/02\/19\/googles-model-search-automatically-optimizes-and-identifies-ai-models\/"}],"marks":[{"type":"italic"},{"type":"size","attrs":{"size":10}}]}]},{"type":"heading","attrs":{"align":null,"level":4}}]}