A brief survey of style transfer

style_transfer_survey

A survey of style transfer, from the original seminal paper to the present.

Contents:

Papers

  • A Neural Algorithm of Artistic Style
    • arxiv: 1508.06576
    • github: https://github.com/jcjohnson/neural-style
    • translation: https://www.jianshu.com/p/9f03b61fdeac
  • Texture Networks: Feed-forward Synthesis of Textures and Stylized Images
  • Perceptual Losses for Real-Time Style Transfer and Super-Resolution
  • Incorporating long-range consistency in CNN-based texture generation
  • Instance Normalization: The Missing Ingredient for Fast Stylization
  • Image Style Transfer Using Convolutional Neural Networks
  • A Learned Representation For Artistic Style
  • Controlling Perceptual Factors in Neural Style Transfer
  • Fast Patch-based Style Transfer of Arbitrary Style
    • arxiv: 1612.04337
    • github: author's Torch implementation
    • reference
      • http://blog.csdn.net/wyl1987527/article/details/70476044
      • http://blog.csdn.net/Hungryof/article/details/61195783
      • http://mathworld.wolfram.com/FrobeniusNorm.html
  • Demystifying Neural Style Transfer
  • Arbitrary Style Transfer in Real-time with Adaptive Instance Normalization
  • Deep Photo Style Transfer
    • arxiv: 1703.07511
    • github: author's Torch implementation; TensorFlow port: https://github.com/LouieYang/deep-photo-styletransfer-tf
    • translation: http://blog.csdn.net/cicibabe/article/details/70868746
  • Neural Style Transfer: A Review
    • arxiv: 1705.04058
    • github: https://github.com/ycjing/Neural-Style-Transfer-Papers
  • Universal Style Transfer via Feature Transforms

Practice

First, PyTorch has an official example, fast_neural_style.

Points:

  • training phase
    • content image x
    • style image s
    • pretrained model F[1,2,3,4]: four middle-level feature representations in a
      high-dimensional space, taken from VGG16; its weights stay frozen
    • style transfer model T, a fully convolutional network (FCN), so it accepts
      inputs of any size
    • loss = weight_content * loss_content + weight_style * loss_style
      (a training-step sketch follows this list)
  • evaluating phase
    • content image x
    • trained style transfer model T
    • stylized image y = T(x)
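
A minimal sketch of one training step under these definitions. The names transformer, vgg, and gm_s are illustrative assumptions, not the exact code of the official example; gram_matrix is the function defined under More Details below.

from torch.nn.functional import mse_loss

def train_step(transformer, vgg, optimizer, x, gm_s, weight_content, weight_style):
    # x: a batch of content images; gm_s: Gram matrices of the style image,
    # precomputed with gram_matrix and expanded to the batch size
    optimizer.zero_grad()
    y = transformer(x)    # the FCN model T
    feats_x = vgg(x)      # [F1(x), F2(x), F3(x), F4(x)] from the frozen VGG16
    feats_y = vgg(y)
    loss_content = mse_loss(feats_y[1], feats_x[1])   # content matched at F2
    loss_style = sum(mse_loss(gram_matrix(fy), gs)
                     for fy, gs in zip(feats_y, gm_s))
    loss = weight_content * loss_content + weight_style * loss_style
    loss.backward()
    optimizer.step()
    return loss.item()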

More Details:

  • The primary criterion is MSELoss.
  • loss_content is criterion(F2(x), F2(y)), i.e. content is matched at a single
    middle-level representation.
  • The Gram matrix G is independent of the image size:
def gram_matrix(y):
    # y: feature map of shape (batch, channels, height, width)
    (b, ch, h, w) = y.size()
    features = y.view(b, ch, w * h)        # flatten the spatial dimensions
    features_t = features.transpose(1, 2)  # (b, h*w, ch)
    gram = features.bmm(features_t) / (ch * h * w)  # (b, ch, ch), normalized
    return gram
  • gm_s = [G(F1(s)), G(F2(s)), G(F3(s)), G(F4(s))]
  • gm_y = [G(F1(y)), G(F2(y)), G(F3(y)), G(F4(y))]
  • loss_style = sum([MSELoss(gm_s[i], gm_y[i]) for i in range(len(gm_s))])
  • padding is reflection padding, not constant zero padding (see the sketch below)
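
A sketch of that padding choice; the class below is illustrative, though it follows the same pattern as the official example. Each convolution is preceded by nn.ReflectionPad2d, which mirrors the input at the borders instead of padding with zeros and so avoids dark edge artifacts in the stylized output.

import torch.nn as nn

class ConvLayer(nn.Module):
    # Convolution preceded by reflection padding: the input is mirrored at the
    # borders rather than padded with constant zeros.
    def __init__(self, in_channels, out_channels, kernel_size, stride):
        super().__init__()
        self.pad = nn.ReflectionPad2d(kernel_size // 2)
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size, stride)

    def forward(self, x):
        return self.conv(self.pad(x))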

Size Analysis:

  • x.shape=(m1, n1, 3)
  • s.shape=(m2, n2, 3)
  • batch_size = b
  • T downsamples twice, each time to int(ceil(size / 2)), then upsamples twice by
    a factor of 2, so the output size can differ from the input size.
    For example, an input of size (3, 33, 33) gives an output of size (3, 36, 36);
    the arithmetic is worked out after this list. The size difference propagates
    into F as well, so input sizes should be chosen properly (a multiple of 4
    round-trips exactly).
  • gm_s[i] has size (b, ch[i], ch[i]), independent of the image size
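
The size arithmetic for the (3, 33, 33) example, as a minimal sketch:

from math import ceil

size = 33
for _ in range(2):                # two stride-2 downsamplings in T
    size = int(ceil(size / 2.0))  # 33 -> 17 -> 9, rounding up each time
for _ in range(2):                # two x2 upsamplings in T
    size *= 2                     # 9 -> 18 -> 36
print(size)                       # 36, no longer the original 33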

Think About The Model:

  • VGG16 is just one high-dimensional feature representation; it can be replaced
    by any other similar pretrained model.
  • The four middle-level representations can also be chosen from other layers
    (a slicing sketch follows this list).
  • The initial convolution layer uses a big kernel to get a bigger receptive field.
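
A sketch of pulling the four representations out of a torchvision VGG16. The split points below (relu1_2, relu2_2, relu3_3, relu4_3) are one reasonable choice, not the only one; the backbone or the layers can be swapped as noted above.

import torch
from torchvision import models

class Vgg16Features(torch.nn.Module):
    def __init__(self):
        super().__init__()
        vgg = models.vgg16(pretrained=True).features
        # Four slices of the feature stack; other layers or another pretrained
        # backbone could be substituted here.
        self.slices = torch.nn.ModuleList([
            vgg[:4],     # up to relu1_2 -> F1
            vgg[4:9],    # up to relu2_2 -> F2
            vgg[9:16],   # up to relu3_3 -> F3
            vgg[16:23],  # up to relu4_3 -> F4
        ])
        for p in self.parameters():
            p.requires_grad = False  # frozen weights, feature extraction only

    def forward(self, x):
        outputs = []
        for s in self.slices:
            x = s(x)
            outputs.append(x)
        return outputs  # [F1(x), F2(x), F3(x), F4(x)]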

Paper Reading Notes

Code Myself

References

For more revision history, see github oneTaken/style_transfer_survey.
