PatchGAN (pix2pix): Image-to-Image Translation with Conditional Adversarial Networks

Paper: https://arxiv.org/pdf/1611.07004.pdf
Code: https://github.com/affinelayer/Pix2Pix-tensorflow
Note: a CVPR 2017 paper. (Reading notes)

1. Main idea

  • Uses a conditional GAN to solve image-to-image translation: "a general-purpose solution to image-to-image translation problems."
  • Learns the loss function for the image-to-image mapping rather than hand-designing it: "learn a loss function to train this mapping."

2. Intro

  • By analogy with language translation, the paper defines the problem: "we define automatic image-to-image translation as the task of translating one possible representation of a scene into another."
  • Although CNNs have already achieved excellent results, an objective function must still be specified by hand: "In other words, we still have to tell the CNN what we wish it to minimize."
    Thanks to GANs, the loss function can instead be learned directly.
  • Most prior related work learned structured losses between images; the paper then reviews the development of conditional GANs.

3. Details

  • The objective is essentially the original conditional-GAN objective with an added L1 term. Writing the cGAN loss as
    $\mathcal{L}_{cGAN}(G,D) = \mathbb{E}_{x,y}\left[\log D(x,y)\right] + \mathbb{E}_{x,z}\left[\log \left(1 - D(x, G(x,z))\right)\right]$
    the L1 term and the full objective are:
    $\mathcal{L}_{L1}(G) = \mathbb{E}_{x,y,z}\left[\, \|y - G(x,z)\|_1 \,\right]$
    $G^* = \arg \min_G \max_D \; \mathcal{L}_{cGAN}(G,D) + \lambda\, \mathcal{L}_{L1}(G)$
    Note that without the noise $z$, the generator would learn only a deterministic function (always producing the same output for a given input $x$), which is not good enough.
  • The generator is U-Net-like: an encoder-decoder with skip connections.
    The discriminator is Markovian (a PatchGAN): instead of judging the whole image at once, it judges region by region (patch by patch) and averages the resulting scores. "This discriminator tries to classify if each $N \times N$ patch in an image is real or fake."
    As a result it has fewer parameters and runs faster, yet still produces high-quality results and "can be applied to arbitrarily large images."
  • The code, however, is implemented just like any other GAN discriminator, with no explicit patch setting to be found, hence:
    "The difference between a PatchGAN and regular GAN discriminator is that the regular GAN maps from a 256x256 image to a single scalar output, which signifies 'real' or 'fake', whereas the PatchGAN maps from 256x256 to an $N \times N$ array of outputs $X$, where each $X_{ij}$ signifies whether the patch $(i, j)$ in the image is real or fake."
    Reference: https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix/issues/39
    "Maybe it would have been better if we called it a 'Fully Convolutional GAN'; like in FCNs, it is the same idea."
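Since the patch size never appears explicitly in the code, it is implicit: the "patch" is simply the receptive field of one unit in the discriminator's output grid. A minimal sketch of that arithmetic, assuming the layer configuration commonly used in the pix2pix reference implementations (4×4 convolutions with strides 2, 2, 2, 1, 1 and padding 1; these specific settings are an assumption here, not stated in the notes above):

```python
# Sketch: the "patch" in a PatchGAN is the receptive field of a single
# unit in the discriminator's output grid, not an explicit crop.
# Assumed layer config (kernel, stride): 4x4 convs with strides
# 2,2,2,1,1 and padding 1, as in common pix2pix implementations.
LAYERS = [(4, 2), (4, 2), (4, 2), (4, 1), (4, 1)]

def output_grid(n, layers, pad=1):
    """Spatial size of the discriminator output for an n x n input."""
    for k, s in layers:
        n = (n + 2 * pad - k) // s + 1
    return n

def receptive_field(layers):
    """Input pixels seen by one output unit, i.e. the effective patch size."""
    rf, jump = 1, 1
    for k, s in layers:
        rf += (k - 1) * jump  # each layer widens the field by (k-1) strides
        jump *= s             # accumulated stride between adjacent units
    return rf

print(output_grid(256, LAYERS))   # -> 30: a 30x30 grid of real/fake scores
print(receptive_field(LAYERS))    # -> 70: each score judges a 70x70 patch
```

Under these assumed settings, a 256×256 input yields a 30×30 grid of scores, each covering a 70×70 patch, which is why the discriminator looks like an ordinary fully convolutional network with no patch extraction anywhere.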