Five GANs for Better Image Processing

{"type":"doc","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic"},{"type":"strong"}],"text":"This article was originally published on the Towards Data Science blog and was translated and shared by InfoQ China with permission from PerceptiLabs, the original author's company."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"For image processing, machine learning practitioners are increasingly turning to the power of generative adversarial networks (GANs)."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Applications that benefit from GANs include generating art and photos from text-based descriptions, upscaling images, translating images across domains (for example, changing a daytime scene to nighttime), and many others. To achieve such results, a number of enhanced GAN architectures have been devised, each with unique features for solving a particular image processing problem."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In this article, we take a closer look at five GANs, chosen because together they cover a wide range of functionality, from upscaling images to creating entirely new images from text."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Conditional GAN"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Stacked GAN"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Information Maximizing GAN"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Super Resolution 
GAN"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Pix2Pix"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"If you need a quick refresher on GANs, see the blog post "},{"type":"link","attrs":{"href":"https:\/\/blog.perceptilabs.com\/exploring-generative-adversarial-networks-gans?fileGuid=GkVCpHfGVegOPUZN","title":"","type":null},"content":[{"type":"text","text":"Exploring Generative Adversarial Networks"}]},{"type":"text","text":" ("},{"type":"link","attrs":{"href":"https:\/\/blog.perceptilabs.com\/exploring-generative-adversarial-networks-gans?fileGuid=GkVCpHfGVegOPUZN","title":"","type":null},"content":[{"type":"text","text":"https:\/\/blog.perceptilabs.com\/exploring-generative-adversarial-networks-gans"}]},{"type":"text","text":"), which explains how a "},{"type":"link","attrs":{"href":"https:\/\/en.wikipedia.org\/wiki\/Generative_adversarial_network?fileGuid=GkVCpHfGVegOPUZN","title":"","type":null},"content":[{"type":"text","text":"generative adversarial network"}]},{"type":"text","text":" trains two neural networks, a generator and a discriminator, so that the generator learns to produce increasingly realistic images while the discriminator improves at classifying images as real or fake."}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Conditional GAN"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"One of the challenges with standard GANs is that there is no way to control what type of image gets generated. The generator simply starts from random noise and repeatedly creates images that, over the course of training, hopefully come to resemble the training images."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A Conditional GAN (cGAN) solves this by making use of additional information such as label data (that is, class labels). This can also make training more stable and faster while improving the quality of the generated images. For example, a cGAN 
shown images of different types of mushrooms along with their labels can be trained to generate and recognize only those mushrooms that are ready to pick. The resulting model could serve as the basis for computer vision in an industrial robot programmed to find and pick mushrooms. Without such conditioning, a standard GAN (sometimes called an unconditional GAN) relies solely on mapping data from the latent space to generated images."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"There are several ways to implement a cGAN. One approach is to condition both the discriminator and the generator by feeding class labels into each. The example below shows a standard GAN for generating images of handwritten digits that has been augmented with label data so that it generates only images of the digits 8 and 0."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/resource\/image\/86\/17\/866d8f1ff42b7e1a97d2797ebc1b4917.jpg","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","text":"Figure 1: A cGAN in which class labels are fed to both the generator and the discriminator to control the output."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Here, the labels can be "},{"type":"link","attrs":{"href":"https:\/\/hackernoon.com\/what-is-one-hot-encoding-why-and-when-do-you-have-to-use-it-e3c6186d008f?fileGuid=GkVCpHfGVegOPUZN","title":"","type":null},"content":[{"type":"text","text":"one-hot"}]},{"type":"text","text":" encoded to remove ordinality, fed into the discriminator and the generator as an additional layer, and then concatenated with each network's image input (that is, with the noise for the generator and with the training set for the discriminator). Both neural networks are therefore conditioned on the image class labels during training."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Summary"},{"type":"text","text":": Use a cGAN when you need to control what gets generated (for example, to generate a subset of the training data)."}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Stacked 
GAN"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Wouldn't it be cool if we could simply ask a computer to draw a picture? That is the inspiration behind the Stacked GAN (StackGAN), which the authors describe in the paper StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"StackGAN is essentially a two-stage sketch-refinement process, similar to the way a painter works: the general elements are drawn first and then refined:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Stage-I GAN: conditioned on the given text description, it sketches the primitive shape and basic colors of the object and draws the background layout from a random noise vector, yielding a low-resolution image. Stage-II GAN: it corrects defects in the low-resolution image from Stage I and completes the details of the object by reading the text description again, producing a high-resolution, photo-realistic image."}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The authors give the following overview of their model's architecture:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/resource\/image\/27\/29\/272f5fa64a5871090feebf10062eb129.jpg","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","text":"Figure 2: StackGAN 
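model architecture overview."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Although a plain GAN can also be applied to this problem, its output images may lack detail and may be limited to lower resolutions. StackGAN's two-stage architecture builds on the idea of cGANs to address this. As the authors explain in their paper, by conditioning on the Stage-I result and the text again, the Stage-II GAN learns to capture the text information that the Stage-I GAN omitted and draws more details for the object. The support of the model distribution generated from a roughly aligned low-resolution image has a better probability of intersecting with the support of the image distribution. This is the underlying reason why the Stage-II GAN is able to generate better high-resolution images."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"To learn more about StackGAN, see the authors' "},{"type":"link","attrs":{"href":"https:\/\/github.com\/hanzhanggit\/StackGAN?fileGuid=GkVCpHfGVegOPUZN","title":"","type":null},"content":[{"type":"text","text":"GitHub repository"}]},{"type":"text","text":" ("},{"type":"link","attrs":{"href":"https:\/\/github.com\/hanzhanggit\/StackGAN?fileGuid=GkVCpHfGVegOPUZN","title":"","type":null},"content":[{"type":"text","text":"https:\/\/github.com\/hanzhanggit\/StackGAN"}]},{"type":"text","text":"), which provides several models along with images of birds and flowers."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Summary"},{"type":"text","text":": Use StackGAN when you need to generate images from an entirely different representation (for example, a text-based description)."}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Information Maximizing GAN"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Like a cGAN, the Information Maximizing GAN (InfoGAN) uses additional information to gain more control over what is generated. In doing so, it can learn to disentangle aspects of an image, such as a person's hairstyle, objects, or emotion, all through unsupervised training. That information can then be used to control certain aspects of the generated images. For example, given face images in which some of the people wear glasses, an InfoGAN 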
can be trained to disentangle the pixels that make up the glasses and then use them to generate new faces with glasses."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In InfoGAN, one or more control variables are fed to the generator along with the noise. The generator is trained using "},{"type":"link","attrs":{"href":"https:\/\/en.wikipedia.org\/wiki\/Mutual_information?fileGuid=GkVCpHfGVegOPUZN","title":"","type":null},"content":[{"type":"text","text":"mutual information"}]},{"type":"text","text":" contained in an additional model, called the auxiliary model, which shares its weights with the discriminator but predicts the values of the control variables that were used to generate the image. This mutual information is obtained by observing the images that the generator produces. Together with the discriminator, the auxiliary model trains the generator so that the InfoGAN learns both to generate and identify fake versus real images and to capture salient properties of the generated images, thereby improving image generation."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"This architecture is summarized in the figure below:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/resource\/image\/86\/17\/866d8f1ff42b7e1a97d2797ebc1b4917.jpg","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","text":"Figure 3: Overview of the InfoGAN architecture."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"To learn more about InfoGAN, see "},{"type":"link","attrs":{"href":"https:\/\/towardsdatascience.com\/infogan-generative-adversarial-networks-part-iii-380c0c6712cd?fileGuid=GkVCpHfGVegOPUZN","title":"","type":null},"content":[{"type":"text","text":"this blog post"}]},{"type":"text","text":" ("},{"type":"text","marks":[{"type":"italic"}],"text":"InfoGAN — Generative 
Adversarial Networks Part III"},{"type":"text","text":", "},{"type":"link","attrs":{"href":"https:\/\/towardsdatascience.com\/infogan-generative-adversarial-networks-part-iii-380c0c6712cd?fileGuid=GkVCpHfGVegOPUZN","title":"","type":null},"content":[{"type":"text","text":"https:\/\/towardsdatascience.com\/infogan-generative-adversarial-networks-part-iii-380c0c6712cd"}]},{"type":"text","text":")."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Summary"},{"type":"text","text":": Use InfoGAN when you need to disentangle certain features of images so that they can be synthesized into newly generated images."}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Super Resolution GAN"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The field of image enhancement is evolving to rely increasingly on machine learning algorithms rather than traditional statistical methods such as bicubic interpolation. The Super Resolution GAN (SRGAN) is one such machine learning method, capable of upscaling images to super-high resolutions."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"An SRGAN combines the adversarial nature of GANs with deep neural networks to learn how to generate upscaled images (up to four times the original resolution). The resulting super-resolution images are more accurate and generally receive high "},{"type":"link","attrs":{"href":"https:\/\/en.wikipedia.org\/wiki\/Mean_opinion_score?fileGuid=GkVCpHfGVegOPUZN","title":"","type":null},"content":[{"type":"text","text":"mean opinion scores"}]},{"type":"text","text":" (MOS)."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"When an SRGAN 
is trained, a high-resolution image is first downsampled to a low-resolution image and fed into the generator. The generator then tries to upsample that image to super resolution. The discriminator compares the generated super-resolution image with the original high-resolution image. The GAN loss from the discriminator is then backpropagated into both the discriminator and the generator, as shown here:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/resource\/image\/e4\/48\/e4627e29bbfb7bf7400832b781730548.jpg","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","text":"Figure 4: SRGAN architecture. LR = low-resolution image, HR = high-resolution image, SR = super-resolution image, X = input to the discriminator, D(X) = the discriminator's classification of HR versus SR."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The generator uses a number of "},{"type":"link","attrs":{"href":"https:\/\/www.perceptilabs.com\/docs\/convolution_tutorial?fileGuid=GkVCpHfGVegOPUZN","title":"","type":null},"content":[{"type":"text","text":"convolutional neural networks"}]},{"type":"text","text":" (CNNs) and "},{"type":"link","attrs":{"href":"https:\/\/www.perceptilabs.com\/docs\/textile_usecase?fileGuid=GkVCpHfGVegOPUZN","title":"","type":null},"content":[{"type":"text","text":"ResNets"}]},{"type":"text","text":", along with batch-normalization layers and ParametricReLU activations. These first downsample the image and then upsample it to produce a super-resolution image. Similarly, the discriminator uses a series of CNNs along with dense layers, Leaky ReLU, and sigmoid activations to determine whether an image is the original high-resolution image or the super-resolution image output by the generator."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"To learn more about SRGAN, 
see the blog post "},{"type":"link","attrs":{"href":"https:\/\/jonathan-hui.medium.com\/gan-super-resolution-gan-srgan-b471da7270ec?fileGuid=GkVCpHfGVegOPUZN","title":"","type":null},"content":[{"type":"text","text":"GAN — Super Resolution GAN (SRGAN)"}]},{"type":"text","text":"."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Summary"},{"type":"text","text":": Use an SRGAN when you need to upscale images while recovering or preserving fine-grained, high-fidelity detail."}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Pix2Pix"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"As we discussed in "},{"type":"link","attrs":{"href":"https:\/\/blog.perceptilabs.com\/top-five-ways-that-machine-learning-is-being-used-for-image-processing-and-computer-vision#Object_Instance?fileGuid=GkVCpHfGVegOPUZN","title":"","type":null},"content":[{"type":"text","text":"an earlier blog post"}]},{"type":"text","text":" ("},{"type":"text","marks":[{"type":"italic"}],"text":"Top Five Ways That Machine Learning is Being Used for Image Processing and Computer 
Vision"},{"type":"text","text":", "},{"type":"link","attrs":{"href":"https:\/\/blog.perceptilabs.com\/top-five-ways-that-machine-learning-is-being-used-for-image-processing-and-computer-vision#Object_Instance?fileGuid=GkVCpHfGVegOPUZN","title":"","type":null},"content":[{"type":"text","text":"https:\/\/blog.perceptilabs.com\/top-five-ways-that-machine-learning-is-being-used-for-image-processing-and-computer-vision#Object_Instance"}]},{"type":"text","text":"), object segmentation is a method that partitions groups of pixels in a digital image into segments, which can then be labeled, located, and even tracked as objects across one or more images."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Segmentation can also be used to translate an input image into an output image for a variety of purposes, such as synthesizing photos from label maps, reconstructing objects from edge maps, and colorizing black-and-white images."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Segmentation can be done with Pix2Pix, a type of cGAN for image-to-image translation in which a "},{"type":"link","attrs":{"href":"https:\/\/paperswithcode.com\/method\/patchgan?fileGuid=GkVCpHfGVegOPUZN","title":"","type":null},"content":[{"type":"text","text":"PatchGAN"}]},{"type":"text","text":" discriminator is first trained to classify translated images as real or fake and is then used to train a "},{"type":"link","attrs":{"href":"https:\/\/perceptilabs.com\/docs\/u-net_usecase?fileGuid=GkVCpHfGVegOPUZN","title":"","type":null},"content":[{"type":"text","text":"U-Net"}]},{"type":"text","text":"-based generator to produce increasingly believable translations. Using a cGAN means the model can be applied to a wide variety of translations, whereas an unconditional GAN requires additional elements, such as L2 
regression, to condition the output for different types of translations."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/resource\/image\/a4\/78\/a4bb6f2e1962f3bb3f1d2ec445543578.jpg","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","text":"Figure 5: An example of colorization with Pix2Pix. Shown here are a black-and-white drawing of a shoe (input), its training data (ground truth), and the image generated by Pix2Pix (output)."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The following diagram shows how the discriminator in Pix2Pix is first trained in the case of colorizing black-and-white images."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/df\/df15495cc1b75b67c0d7b9b75e611904.jpeg","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","text":"Figure 6: In the Pix2Pix 
architecture, the discriminator is trained first."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Here, a black-and-white image is provided as input to the generator, which produces a colorized version (the output). The discriminator then performs two comparisons: the first compares the input with the target image (that is, the training data representing the ground truth), and the second compares the input with the output (that is, the generated image). An optimizer then adjusts the discriminator's weights based on the classification errors from both comparisons."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"With the discriminator now trained, it can be used to train the generator."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/resource\/image\/ab\/49\/abb7ae0162c41a1265fd15992bb32c49.jpg","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","text":"Figure 7: Training the generator in a Pix2Pix GAN using the trained discriminator."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Here, the input image is fed to both the generator and the discriminator. The (trained) discriminator compares the input image with the generator's output, and the output is also compared against the target image. The optimizer then adjusts the generator's weights until the generator can fool the discriminator most of the time."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"To learn more about Pix2Pix, see "},{"type":"link","attrs":{"href":"https:\/\/neurohive.io\/en\/popular-networks\/pix2pix-image-to-image-translation\/?fileGuid=GkVCpHfGVegOPUZN","title":"","type":null},"content":[{"type":"text","text":"this article"}]},{"type":"text","text":" ("},{"type":"text","marks":[{"type":"italic"}],"text":"Pix2Pix – 
Image-to-Image Translation Neural Network"},{"type":"text","text":"). Also, be sure to check out this "},{"type":"link","attrs":{"href":"https:\/\/github.com\/phillipi\/pix2pix?fileGuid=GkVCpHfGVegOPUZN","title":"","type":null},"content":[{"type":"text","text":"GitHub repository"}]},{"type":"text","text":"."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Summary"},{"type":"text","text":": Use a Pix2Pix GAN when you need to translate some aspect of a source image into a generated image."}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Conclusion"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"GANs, and more specifically their discriminators and generators, can be architected in a variety of ways to solve a wide range of image processing problems. The following summary can help you choose the right GAN for your application."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"cGAN"},{"type":"text","text":": Control (for example, limit) the classes that a GAN is trained on."}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"StackGAN"},{"type":"text","text":": Use text-based descriptions as commands for creating images."}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"InfoGAN"},{"type":"text","text":": Disentangle specific aspects of the images you want to generate."}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"ty
pe":"strong"}],"text":"SRGAN"},{"type":"text","text":": Upscale images while preserving fine-grained detail."}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Pix2Pix"},{"type":"text","text":": Segment and translate images (for example, colorize images)."}]}]}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"About the author:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Martin Isaksson is co-founder and CEO of PerceptiLabs, a startup focused on making machine learning easy."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Original article:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"https:\/\/towardsdatascience.com\/five-gans-for-better-image-processing-fabab88b370b"}]}]}
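Supplementary sketch for the Conditional GAN section: the article describes one-hot encoding the class labels and concatenating them with the noise (for the generator) and with the training images (for the discriminator). The snippet below illustrates only that input-preparation step in plain Python. It is a minimal sketch, not the article's implementation: the function names, the 100-dimensional noise vector, the 28x28 image size, and the 10 classes are all assumptions chosen for illustration.

```python
import random

def one_hot(label, num_classes):
    """Return a one-hot list for an integer class label (removes ordinality)."""
    vec = [0.0] * num_classes
    vec[label] = 1.0
    return vec

def generator_input(noise, label, num_classes):
    """Generator side: concatenate the latent noise with the one-hot label."""
    return noise + one_hot(label, num_classes)

def discriminator_input(image, label, num_classes):
    """Discriminator side: flatten the image and append the one-hot label."""
    flat = [pixel for row in image for pixel in row]
    return flat + one_hot(label, num_classes)

# Hypothetical sizes, chosen only for this illustration.
NUM_CLASSES = 10
NOISE_DIM = 100

random.seed(0)
label = 8  # one of the two digits (8 and 0) mentioned for Figure 1
noise = [random.gauss(0.0, 1.0) for _ in range(NOISE_DIM)]
image = [[random.random() for _ in range(28)] for _ in range(28)]  # stand-in image

g_in = generator_input(noise, label, NUM_CLASSES)
d_in = discriminator_input(image, label, NUM_CLASSES)
print(len(g_in), len(d_in))  # 110 794
```

In a real cGAN these concatenated vectors would be the inputs to trainable generator and discriminator networks; restricting the labels seen during training and generation to 8 and 0 is what limits the output to those digits.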