Universal Style Transfer via Feature Transforms (WCT, style transfer, NIPS 2017)

Li Y, Fang C, Yang J, et al. Universal Style Transfer via Feature Transforms. NIPS 2017

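The key problem in style transfer is how to extract effective style features and make the input content image match that style. Prior work has shown that the covariance matrix and the Gram matrix of deep features capture visual style well.

Optimization-based style transfer methods handle arbitrary styles with satisfying results, but they are computationally expensive (slow). Feed-forward methods run efficiently, but they are limited to a fixed set of styles or give compromised visual quality.

Achieving generalization, quality, and efficiency at the same time for arbitrary style transfer is a challenging task.

Generalization: this presumably refers to the ability to handle many, or even arbitrary, style images.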

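The method proposed in this work performs style transfer with whitening and coloring transforms. The idea is to treat style transfer as an image reconstruction task with a feature transform in the middle. The WCT operation takes the content image features and the style features as input and outputs adjusted content features whose covariance matrix matches the covariance matrix of the style features.

A VGG network used as the encoder and a decoder symmetric to it must be trained in advance. Their parameters stay fixed in all later experiments.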

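The input content image C and style image S are passed through VGG to extract high-level features (Relu5_1), the Whitening and Coloring Transform (WCT) is applied, and the decoder maps the result back to the original image size.

The transformed image C and the style image S are then passed through VGG to extract features one level lower (Relu4_1), WCT is applied, and the decoder maps the result back to the original image size.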
…
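Finally, the transformed image C and the style image S are passed through VGG to extract low-level features (Relu1_1), WCT is applied, and the decoder maps the result back to the original image size.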

The result obtained by matching higher level statistics of the style is treated as the new content to continue to match lower-level information of the style.

In other words, the output of each level becomes the content input of the next lower level, so the statistics of every level end up matched (multi-level coarse-to-fine stylization).
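The coarse-to-fine loop can be sketched in a few lines. This is only an illustration of the control flow: `encoders`, `decoders`, and `transform` are hypothetical callables standing in for the per-level VGG encoders, their decoders, and WCT, and none of these names come from the paper.

```python
def multi_level_stylize(content, style, encoders, decoders, transform):
    """Coarse-to-fine stylization sketch.

    encoders/decoders are lists ordered from the highest level
    (e.g. Relu5_1) down to the lowest (e.g. Relu1_1). All callables
    are hypothetical placeholders supplied by the caller.
    """
    image = content
    for encode, decode in zip(encoders, decoders):
        f_c = encode(image)   # features of the current (partly stylized) image
        f_s = encode(style)   # features of the style image at the same level
        image = decode(transform(f_c, f_s))  # result is the next level's content
    return image


# Toy usage with scalar "images" and invertible toy encoders.
toy_encoders = [lambda x: x * 2, lambda x: x + 1]
toy_decoders = [lambda f: f / 2, lambda f: f - 1]
toy_transform = lambda f_c, f_s: 0.5 * (f_c + f_s)
result = multi_level_stylize(0.0, 4.0, toy_encoders, toy_decoders, toy_transform)
```

Reversing the two lists would give the fine-to-coarse order, which the notes below report as losing low-level detail.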

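Reconstruction loss used to train the decoder:

$L = \|I_o - I_i\|_2^2 + \lambda \|\Phi(I_o) - \Phi(I_i)\|_2^2$

where $I_i$ is the input image, $I_o$ is the reconstructed output, $\Phi$ is the VGG encoder, and $\lambda$ balances the two terms.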
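A minimal sketch of such a reconstruction loss, assuming it is the sum of a pixel term and a feature term as in the paper's decoder training. Here `phi` is a hypothetical stand-in for the fixed VGG encoder and `lam` is an assumed name for the balancing weight.

```python
import numpy as np


def reconstruction_loss(image_in, image_out, phi, lam=1.0):
    """Squared pixel error plus lam times squared feature error."""
    pixel_term = np.sum((image_out - image_in) ** 2)
    feature_term = np.sum((phi(image_out) - phi(image_in)) ** 2)
    return pixel_term + lam * feature_term


# Toy usage: a 2x2 "image" and a toy encoder that doubles its input.
toy_phi = lambda x: 2.0 * x
loss = reconstruction_loss(np.zeros((2, 2)), np.ones((2, 2)), toy_phi, lam=0.5)
```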

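Effect of whitening the features and then decoding them:

The global content of the image is preserved while style-related information is removed. Fine stroke patterns in the details disappear.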

It clearly shows that the higher layer features capture more complicated local structures, while lower layer features carry more low-level information (e.g., colors).

This motivates the matching order used in this work: high-level statistics first, low-level statistics afterwards.

which indicates that the higher layer features first capture salient patterns of the style and lower layer features further improve details
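The experiments also show that the reverse order fails: if low-level statistics are matched first and high-level statistics afterwards, the low-level details do not survive the later manipulation of the high-level features.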

Controlling the degree of stylization:

$\hat{f_{cs}} = \alpha \hat{f_{cs}} + (1 - \alpha) f_c$

where $\alpha$ serves as the style weight for users to control the transfer effect.
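As a sketch of how the style weight acts, assume the transformed features are linearly blended with the original content features before decoding. The function name `blend` and the range check are illustrative choices, not part of the paper.

```python
import numpy as np


def blend(f_cs, f_c, alpha):
    """Blend transformed features f_cs with content features f_c."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * f_cs + (1.0 - alpha) * f_c


# Toy features: alpha = 1 keeps the full stylization, alpha = 0 keeps the content.
f_c = np.zeros((2, 3))
f_cs = np.ones((2, 3))
partly_stylized = blend(f_cs, f_c, 0.6)
```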

Extension to texture synthesis:

Replace the input content image with a random noise image.

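Interpolating between two textures to generate a new one:

$\hat{f_{cs}} = \beta \hat{f_{cs}}^{1} + (1 - \beta) \hat{f_{cs}}^{2}$

where $\hat{f_{cs}}^{1}$ and $\hat{f_{cs}}^{2}$ are the WCT results of the noise features with each of the two textures. The blended $\hat{f_{cs}}$ is then fed into the decoder.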
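The interpolation step can be sketched as follows, assuming the noise features are transformed against each texture separately and the two results are mixed with a weight `beta`. The `transform` argument is a hypothetical placeholder for WCT; the toy transform below only shifts channel means and is not WCT itself.

```python
import numpy as np


def interpolate_textures(noise_feat, tex_feat_1, tex_feat_2, beta, transform):
    """Mix the features obtained by transforming noise toward two textures."""
    f_cs_1 = transform(noise_feat, tex_feat_1)
    f_cs_2 = transform(noise_feat, tex_feat_2)
    return beta * f_cs_1 + (1.0 - beta) * f_cs_2


def toy_transform(f_c, f_s):
    """Toy stand-in for WCT: move f_c's channel means onto those of f_s."""
    return f_c - f_c.mean(axis=1, keepdims=True) + f_s.mean(axis=1, keepdims=True)


rng = np.random.default_rng(0)
noise = rng.standard_normal((3, 50))  # 3 channels, 50 spatial positions
tex_1 = np.full((3, 50), 2.0)
tex_2 = np.zeros((3, 50))
mixed = interpolate_textures(noise, tex_1, tex_2, beta=0.25, transform=toy_transform)
```

Because a fresh noise sample gives a different `mixed` each time, this is also where the diversity discussed next would come from.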

In contrast, our approach explains each input noise better because the network is unlikely to absorb the variations in input noise since it is never trained for learning textures.
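Diversity is an important evaluation criterion for texture synthesis. [27] generates different results by feeding in different noise, but the experiments show that the noise is absorbed and has a negligible effect, so it can hardly drive the network to produce large visual variation. By contrast, the proposed method needs no training for texture synthesis, so it is far less prone to absorbing the noise.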

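The WCT operation:

Goal:

$\hat{f_{cs}} \hat{f_{cs}}^T = f_s f_s^T$

where $\hat{f_{cs}}$ denotes the transformed features and $f_s$ denotes the style features.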

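1) Whitening:

First center $f_c$ by subtracting its mean vector $m_c$.

Then:

$\hat{f_c} = E_c D_c^{-\frac{1}{2}} E_c^T f_c$

where $E_c$ is the square matrix whose columns are the eigenvectors from the eigendecomposition of $f_c f_c^T$, and $D_c$ is the diagonal matrix of the eigenvalues.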

After this transform the channels (rows) of $\hat{f_c}$ are uncorrelated with unit scale:

$\hat{f_c} \hat{f_c}^T= I$

Proof:

For every eigenvector $x$ of $f_c f_c^T$ with eigenvalue $\lambda$,
$f_c f_c^T x=\lambda x$.

Hence $f_c f_c^T E_c=E_c D_c$.

Since $E_c$ is an orthogonal matrix, $E_c E_c^T=I$ and $E_c^T E_c=I$.

Therefore $f_c f_c^T = E_c D_c E_c^T$,
and so $E_c^T f_c f_c^T E_c = D_c$.

It follows that
$\begin{aligned} \hat{f_c}\hat{f_c}^T&=E_c D_c^{-\frac{1}{2}} E_c^T f_c f_c^T E_c D_c^{-\frac{1}{2}T}E_c^T \\ &=E_c D_c^{-\frac{1}{2}} D_c D_c^{-\frac{1}{2}T}E_c^T \\ &=E_c D_c^{-\frac{1}{2}} D_c D_c^{-\frac{1}{2}}E_c^T \\ &=E_c E_c^T \\ &= I \end{aligned}$
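The identity can be checked numerically. The sketch below assumes features are stored as a (C, N) matrix with C channels and N = H*W spatial positions; the `eps` clamp on the eigenvalues is an added safeguard for rank-deficient features and is not part of the derivation above.

```python
import numpy as np


def whiten(f_c, eps=1e-8):
    """Whiten a (C, N) feature matrix so that f_hat @ f_hat.T is the identity."""
    # Center each channel by subtracting its mean over spatial positions.
    f_c = f_c - f_c.mean(axis=1, keepdims=True)
    # Eigendecomposition of the symmetric matrix f_c f_c^T = E_c D_c E_c^T.
    d_c, e_c = np.linalg.eigh(f_c @ f_c.T)
    # Clamp tiny eigenvalues so that D_c^{-1/2} stays finite (assumption).
    d_c = np.clip(d_c, eps, None)
    return e_c @ np.diag(d_c ** -0.5) @ e_c.T @ f_c


rng = np.random.default_rng(0)
f_c = rng.standard_normal((4, 1000))  # toy features: 4 channels, 1000 positions
f_hat = whiten(f_c)
is_identity = np.allclose(f_hat @ f_hat.T, np.eye(4), atol=1e-6)
```

`np.linalg.eigh` is used instead of a general eigensolver because $f_c f_c^T$ is symmetric, so the eigenvectors it returns are orthonormal, which is exactly the property the proof relies on.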

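2) Coloring:

First center $f_s$ by subtracting its mean vector $m_s$, then apply the transform:

$\hat{f_{cs}} = E_s D_s^{\frac{1}{2}} E_s^T \hat{f_c}$

where $E_s$ is the square matrix whose columns are the eigenvectors from the eigendecomposition of $f_s f_s^T$, and $D_s$ is the diagonal matrix of the eigenvalues.

Finally, re-center $\hat{f_{cs}}$ by adding back the style mean: $\hat{f_{cs}} = \hat{f_{cs}} + m_s$.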

Applying whitening and then coloring makes the covariance matrix of the transformed features match that of the style features:
Proof (using $\hat{f_c} \hat{f_c}^T = I$ and $E_s^T E_s = I$):
$\begin{aligned} \hat{f_{cs}}\hat{f_{cs}}^T &= E_s D_s^{\frac{1}{2}} E_s^T \hat{f_c} \hat{f_c}^T E_s D_s^{\frac{1}{2}T} E_s^T \\ &=E_s D_s^{\frac{1}{2}} E_s^T E_s D_s^{\frac{1}{2}T} E_s^T \\ &=E_s D_s^{\frac{1}{2}} D_s^{\frac{1}{2}T} E_s^T \\ &=E_s D_s^{\frac{1}{2}} D_s^{\frac{1}{2}} E_s^T \\ &=E_s D_s E_s^T \\ &= f_s f_s^T \end{aligned}$
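Putting both steps together, the following is a minimal NumPy sketch of WCT on (C, N) feature matrices, with a numerical check that the centered output reproduces $f_s f_s^T$. It follows the unnormalized formulas used in these notes; the eigenvalue clamping and the function name `wct` are assumptions made for the example.

```python
import numpy as np


def wct(f_c, f_s, eps=1e-8):
    """Whitening and coloring transform on (C, N_c) and (C, N_s) features."""
    m_c = f_c.mean(axis=1, keepdims=True)
    m_s = f_s.mean(axis=1, keepdims=True)
    f_c, f_s = f_c - m_c, f_s - m_s  # center both feature sets

    d_c, e_c = np.linalg.eigh(f_c @ f_c.T)
    d_s, e_s = np.linalg.eigh(f_s @ f_s.T)
    d_c = np.clip(d_c, eps, None)   # keep D_c^{-1/2} finite (assumption)
    d_s = np.clip(d_s, 0.0, None)   # guard against tiny negative round-off

    f_hat_c = e_c @ np.diag(d_c ** -0.5) @ e_c.T @ f_c       # whitening
    f_hat_cs = e_s @ np.diag(d_s ** 0.5) @ e_s.T @ f_hat_c   # coloring
    return f_hat_cs + m_s                                    # add the style mean back


rng = np.random.default_rng(0)
f_c = rng.standard_normal((4, 1200))
f_s = rng.standard_normal((4, 900)) * 3.0 + 1.0
out = wct(f_c, f_s)

out_centered = out - out.mean(axis=1, keepdims=True)
f_s_centered = f_s - f_s.mean(axis=1, keepdims=True)
matches = np.allclose(out_centered @ out_centered.T,
                      f_s_centered @ f_s_centered.T, atol=1e-6)
```

Note that the check compares the unnormalized products $f f^T$. When the content and style feature maps have different numbers of positions, an implementation that wants the normalized covariances to agree would likely divide each product by its own number of positions first.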
