Hands-On Project 2: A Step-by-Step Guide to Image Style Transfer

I. Introduction

Image style transfer takes the content of a content image and blends it with one or more style images to generate new, interesting pictures.

If you are interested, see Leon A. Gatys' paper, A Neural Algorithm of Artistic Style:

https://arxiv.org/abs/1508.06576 (note: access may require an external-network connection; see the instructions in Part IV below)

Example:

II. Method 1: Image Style Transfer with TensorFlow

1. Preparation

Install the required packages (the code in this section uses the TensorFlow 1.x API, e.g. tf.Session, so a 1.x release of TensorFlow is assumed):

pip install numpy scipy tensorflow keras

Prepare the images:

Have a content image and a style image ready.

2. How it works

To fuse the style of the style image with the content of the content image, the generated picture should be as close as possible to the content image in content, and as close as possible to the style image in style.

We therefore define a content loss function and a style loss function, and combine them with weights into a total loss function.

The implementation steps are:

   a. Start from a randomly generated image

   b. In each iteration, adjust the image's pixel values according to the total loss

   c. After many iterations, obtain the optimized image

 

3. Content loss function

Content similarity between two images cannot be judged by simple pixel-by-pixel comparison alone.

A CNN can abstract and understand images, so the outputs of its convolutional layers can be taken as a representation of an image's content.

Take VGG19 as an example: it contains multiple convolutional layers and pooling layers, followed by fully connected layers.

Here the output of conv4_2 is used as the content representation, and the content loss function is defined as follows.
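
Consistent with the implementation in subsection 6 below (where p is the conv4_2 response of the content image, x the conv4_2 response of the generated image, N the number of feature maps, and M the number of positions in each map), the content loss can be written as:

$$ L_{content}(p, x) = \frac{1}{4NM} \sum_{i,j} \left( x_{ij} - p_{ij} \right)^2 $$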

4. Style loss function

Style is a concept that is hard to pin down; it may involve brushwork, texture, structure, composition, color choices, and so on.

Here the cross-correlations between the feature maps of a convolutional layer are used as the image's style representation. Taking conv1_1 as an example:

  • It contains 64 feature maps, i.e. the depth, or number of channels, of the layer's output
  • Each feature map is one interpretation of the previous layer's output; think of 64 people each interpreting the same painting differently
  • These people may favor different styles: Impressionism, Modernism, Surrealism, Expressionism, and so on
  • When the image is in one style, some of them may admire it while others dislike it
  • When the image is in another style, those preferences flip
  • The differences among the 64 interpretations can be captured by the cross-correlations between the feature maps, computed here with a Gram matrix
  • Different styles lead to different cross-correlation patterns

The Gram matrix is computed as follows: with 64 feature maps, the Gram matrix is 64 × 64, and the entry in row i, column j is the cross-correlation between the i-th and j-th feature maps, computed as an inner product.
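
Writing F for the layer's flattened feature maps, so that F_{ik} is the activation of the i-th feature map at position k (this is what _gram_matrix computes in the code below), the entries are:

$$ G_{ij} = \sum_{k} F_{ik} \, F_{jk} $$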

The style loss function is defined as follows, as a weighted sum of the differences between the style representations of several convolutional layers.
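
Matching the implementation below, with A^l and G^l the Gram matrices of the style image and the generated image at layer l, N_l the number of feature maps, M_l the size of each map, and w_l the layer weights given in STYLE_LAYERS:

$$ E_l = \frac{1}{4 N_l^2 M_l^2} \sum_{i,j} \left( G^l_{ij} - A^l_{ij} \right)^2, \qquad L_{style} = \sum_l w_l \, E_l $$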

 

5. Total loss function

The total loss is the weighted combination of the content loss and the style loss; different weights lead to different transfer effects.
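
In the code below the content weight is BETA = 5 and the style weight is ALPHA = 100, so the total loss is:

$$ L_{total} = \mathrm{BETA} \cdot L_{content} + \mathrm{ALPHA} \cdot L_{style} $$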

6. TensorFlow implementation

# -*- coding: utf-8 -*-
# Load libraries
import tensorflow as tf
import numpy as np
import scipy.io
import scipy.misc
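# Note: scipy.misc.imread / imresize / imsave were removed in SciPy 1.2;
# this script assumes an older SciPy (< 1.2) with Pillow installed.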
import os
import time

def the_current_time():
	print(time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(int(time.time()))))

# Define constants
CONTENT_IMG = 'content.jpg'
STYLE_IMG = 'style5.jpg'
OUTPUT_DIR = 'neural_style_transfer_tensorflow/'

if not os.path.exists(OUTPUT_DIR):
	os.mkdir(OUTPUT_DIR)

IMAGE_W = 800
IMAGE_H = 600
COLOR_C = 3

NOISE_RATIO = 0.7
BETA = 5
ALPHA = 100
VGG_MODEL = 'imagenet-vgg-verydeep-19.mat'
MEAN_VALUES = np.array([123.68, 116.779, 103.939]).reshape((1, 1, 1, 3))

# Load the VGG19 model
def load_vgg_model(path):
	'''
	Details of the VGG19 model:
	- 0 is conv1_1 (3, 3, 3, 64)
	- 1 is relu
	- 2 is conv1_2 (3, 3, 64, 64)
	- 3 is relu    
	- 4 is maxpool
	- 5 is conv2_1 (3, 3, 64, 128)
	- 6 is relu
	- 7 is conv2_2 (3, 3, 128, 128)
	- 8 is relu
	- 9 is maxpool
	- 10 is conv3_1 (3, 3, 128, 256)
	- 11 is relu
	- 12 is conv3_2 (3, 3, 256, 256)
	- 13 is relu
	- 14 is conv3_3 (3, 3, 256, 256)
	- 15 is relu
	- 16 is conv3_4 (3, 3, 256, 256)
	- 17 is relu
	- 18 is maxpool
	- 19 is conv4_1 (3, 3, 256, 512)
	- 20 is relu
	- 21 is conv4_2 (3, 3, 512, 512)
	- 22 is relu
	- 23 is conv4_3 (3, 3, 512, 512)
	- 24 is relu
	- 25 is conv4_4 (3, 3, 512, 512)
	- 26 is relu
	- 27 is maxpool
	- 28 is conv5_1 (3, 3, 512, 512)
	- 29 is relu
	- 30 is conv5_2 (3, 3, 512, 512)
	- 31 is relu
	- 32 is conv5_3 (3, 3, 512, 512)
	- 33 is relu
	- 34 is conv5_4 (3, 3, 512, 512)
	- 35 is relu
	- 36 is maxpool
	- 37 is fullyconnected (7, 7, 512, 4096)
	- 38 is relu
	- 39 is fullyconnected (1, 1, 4096, 4096)
	- 40 is relu
	- 41 is fullyconnected (1, 1, 4096, 1000)
	- 42 is softmax
	'''
	vgg = scipy.io.loadmat(path)
	vgg_layers = vgg['layers']

	def _weights(layer, expected_layer_name):
		W = vgg_layers[0][layer][0][0][2][0][0]
		b = vgg_layers[0][layer][0][0][2][0][1]
		layer_name = vgg_layers[0][layer][0][0][0][0]
		assert layer_name == expected_layer_name
		return W, b

	def _conv2d_relu(prev_layer, layer, layer_name):
		W, b = _weights(layer, layer_name)
		W = tf.constant(W)
		b = tf.constant(np.reshape(b, (b.size)))
		return tf.nn.relu(tf.nn.conv2d(prev_layer, filter=W, strides=[1, 1, 1, 1], padding='SAME') + b)

	def _avgpool(prev_layer):
		return tf.nn.avg_pool(prev_layer, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')

	graph = {}
	graph['input']    = tf.Variable(np.zeros((1, IMAGE_H, IMAGE_W, COLOR_C)), dtype='float32')
	graph['conv1_1']  = _conv2d_relu(graph['input'], 0, 'conv1_1')
	graph['conv1_2']  = _conv2d_relu(graph['conv1_1'], 2, 'conv1_2')
	graph['avgpool1'] = _avgpool(graph['conv1_2'])
	graph['conv2_1']  = _conv2d_relu(graph['avgpool1'], 5, 'conv2_1')
	graph['conv2_2']  = _conv2d_relu(graph['conv2_1'], 7, 'conv2_2')
	graph['avgpool2'] = _avgpool(graph['conv2_2'])
	graph['conv3_1']  = _conv2d_relu(graph['avgpool2'], 10, 'conv3_1')
	graph['conv3_2']  = _conv2d_relu(graph['conv3_1'], 12, 'conv3_2')
	graph['conv3_3']  = _conv2d_relu(graph['conv3_2'], 14, 'conv3_3')
	graph['conv3_4']  = _conv2d_relu(graph['conv3_3'], 16, 'conv3_4')
	graph['avgpool3'] = _avgpool(graph['conv3_4'])
	graph['conv4_1']  = _conv2d_relu(graph['avgpool3'], 19, 'conv4_1')
	graph['conv4_2']  = _conv2d_relu(graph['conv4_1'], 21, 'conv4_2')
	graph['conv4_3']  = _conv2d_relu(graph['conv4_2'], 23, 'conv4_3')
	graph['conv4_4']  = _conv2d_relu(graph['conv4_3'], 25, 'conv4_4')
	graph['avgpool4'] = _avgpool(graph['conv4_4'])
	graph['conv5_1']  = _conv2d_relu(graph['avgpool4'], 28, 'conv5_1')
	graph['conv5_2']  = _conv2d_relu(graph['conv5_1'], 30, 'conv5_2')
	graph['conv5_3']  = _conv2d_relu(graph['conv5_2'], 32, 'conv5_3')
	graph['conv5_4']  = _conv2d_relu(graph['conv5_3'], 34, 'conv5_4')
	graph['avgpool5'] = _avgpool(graph['conv5_4'])
	return graph

# Define the content loss function
def content_loss_func(sess, model):
	def _content_loss(p, x):
		N = p.shape[3]
		M = p.shape[1] * p.shape[2]
		return (1 / (4 * N * M)) * tf.reduce_sum(tf.pow(x - p, 2))
	return _content_loss(sess.run(model['conv4_2']), model['conv4_2'])

# Define the style loss function
STYLE_LAYERS = [('conv1_1', 0.5), ('conv2_1', 1.0), ('conv3_1', 1.5), ('conv4_1', 3.0), ('conv5_1', 4.0)]

def style_loss_func(sess, model):
	def _gram_matrix(F, N, M):
		Ft = tf.reshape(F, (M, N))
		return tf.matmul(tf.transpose(Ft), Ft)

	def _style_loss(a, x):
		N = a.shape[3]
		M = a.shape[1] * a.shape[2]
		A = _gram_matrix(a, N, M)
		G = _gram_matrix(x, N, M)
		return (1 / (4 * N ** 2 * M ** 2)) * tf.reduce_sum(tf.pow(G - A, 2))

	return sum([_style_loss(sess.run(model[layer_name]), model[layer_name]) * w for layer_name, w in STYLE_LAYERS])

# Generate a random initial image
def generate_noise_image(content_image, noise_ratio=NOISE_RATIO):
	noise_image = np.random.uniform(-20, 20, (1, IMAGE_H, IMAGE_W, COLOR_C)).astype('float32')
	input_image = noise_image * noise_ratio + content_image * (1 - noise_ratio)
	return input_image

# Load an image
def load_image(path):
	image = scipy.misc.imread(path)
	image = scipy.misc.imresize(image, (IMAGE_H, IMAGE_W))
	image = np.reshape(image, ((1, ) + image.shape))
	image = image - MEAN_VALUES
	return image

# Save an image
def save_image(path, image):
	image = image + MEAN_VALUES
	image = image[0]
	image = np.clip(image, 0, 255).astype('uint8')
	scipy.misc.imsave(path, image)

# Call the functions above and run the optimization
the_current_time()

with tf.Session() as sess:
	content_image = load_image(CONTENT_IMG)
	style_image = load_image(STYLE_IMG)
	model = load_vgg_model(VGG_MODEL)

	input_image = generate_noise_image(content_image)
	sess.run(tf.global_variables_initializer())

	sess.run(model['input'].assign(content_image))
	content_loss = content_loss_func(sess, model)

	sess.run(model['input'].assign(style_image))
	style_loss = style_loss_func(sess, model)

	total_loss = BETA * content_loss + ALPHA * style_loss
	optimizer = tf.train.AdamOptimizer(2.0)
	train = optimizer.minimize(total_loss)

	sess.run(tf.global_variables_initializer())
	sess.run(model['input'].assign(input_image))

	ITERATIONS = 2000
	for i in range(ITERATIONS):
		sess.run(train)
		if i % 100 == 0:
			output_image = sess.run(model['input'])
			the_current_time()
			print('Iteration %d' % i)
			print('Cost: ', sess.run(total_loss))

			save_image(os.path.join(OUTPUT_DIR, 'output_%d.jpg' % i), output_image)

7. Results

(1) Run

python main_tf.py

(2) Training progress

As you can see, this runs very slowly on a local CPU.

(3) Result

III. Method 2: Fast Neural Style Transfer with TensorFlow

1. Overview

Use TensorFlow to perform fast image style transfer (Fast Neural Style Transfer).

2. Preparation

A style image and a content image.

3. How it works

In the style transfer method introduced above, the input image is optimized against the content image and the style image so that the content loss and the style loss become as small as possible.

Fast style transfer greatly reduces the time needed to generate a stylized image. The model consists of a transformation network and a loss network:

  • The style image is fixed while the content image is the variable input, so the model converts an arbitrary image into an image of the specified style
  • Transformation network: its parameters are trained; it converts a content image into a stylized image
  • Loss network: computes the style loss between the stylized image and the style image, and the content loss between the stylized image and the original content image
  • After training, the stylized images produced by the transformation network are similar in content to the input content image and similar in style to the chosen style image
  • At inference time only the transformation network is used: feed in a content image and out comes the stylized image
  • For multiple style images, train a separate model for each style (a minimal training sketch follows this list)
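
The code in the next subsection only restores a pretrained checkpoint and runs inference. For orientation, here is a minimal sketch, assuming TensorFlow 1.x with eager execution, of how such a transformation network could be trained against a frozen VGG19 loss network. The tiny three-layer transform net, the layer choices, and the loss weights are illustrative assumptions, not the architecture behind the pretrained 'fast_style_transfer' model used below.

# -*- coding: utf-8 -*-
# Minimal training sketch for a transformation network (illustrative only).
import tensorflow as tf
tf.enable_eager_execution()

# Transformation network: maps an image to an image (a real model would use
# downsampling, residual blocks, and upsampling; this toy stack is a placeholder).
transform_net = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, 9, padding='same', activation='relu'),
    tf.keras.layers.Conv2D(64, 3, padding='same', activation='relu'),
    tf.keras.layers.Conv2D(3, 9, padding='same'),
])

# Loss network: frozen pretrained VGG19 exposing style layers plus one content layer.
vgg = tf.keras.applications.VGG19(include_top=False, weights='imagenet')
vgg.trainable = False
style_layers = ['block1_conv1', 'block2_conv1', 'block3_conv1', 'block4_conv1']
content_layer = 'block4_conv2'
outputs = [vgg.get_layer(n).output for n in style_layers + [content_layer]]
loss_net = tf.keras.Model(vgg.input, outputs)

def gram(x):
    # Gram matrix of a batch of feature maps, normalized by the number of activations.
    shape = tf.shape(x)
    f = tf.reshape(x, [shape[0], -1, shape[3]])
    n = tf.cast(shape[1] * shape[2] * shape[3], tf.float32)
    return tf.matmul(f, f, transpose_a=True) / n

optimizer = tf.train.AdamOptimizer(1e-3)

def train_step(content_batch, style_grams, content_weight=1.0, style_weight=1e-2):
    # content_batch: float32 images in [0, 255]; style_grams: Gram matrices of the
    # fixed style image at each style layer, precomputed once before training.
    with tf.GradientTape() as tape:
        generated = transform_net(content_batch)
        gen_feats = loss_net(tf.keras.applications.vgg19.preprocess_input(generated))
        tgt_feats = loss_net(tf.keras.applications.vgg19.preprocess_input(content_batch))
        content_loss = tf.reduce_mean(tf.square(gen_feats[-1] - tgt_feats[-1]))
        style_loss = tf.add_n([tf.reduce_mean(tf.square(gram(g) - s))
                               for g, s in zip(gen_feats[:-1], style_grams)])
        loss = content_weight * content_loss + style_weight * style_loss
    grads = tape.gradient(loss, transform_net.trainable_weights)
    optimizer.apply_gradients(list(zip(grads, transform_net.trainable_weights)))
    return loss

Training would loop train_step over batches from a large content dataset; at inference time only transform_net(image) is needed, which is why the code below simply restores a saved transformer graph and feeds it one image.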

4. Inference code

# -*- coding: utf-8 -*-

import tensorflow as tf
import numpy as np
from imageio import imread, imsave
import os
import time

def the_current_time():
    print(time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(int(time.time()))))

style = 'udnie'
model = 'samples_%s' % style
content_image = 'sjtu.jpg'
result_image = 'sjtu_%s.jpg' % style
X_image = imread(content_image)

sess = tf.Session()
sess.run(tf.global_variables_initializer())

saver = tf.train.import_meta_graph(os.path.join(model, 'fast_style_transfer.meta'))
saver.restore(sess, tf.train.latest_checkpoint(model))

graph = tf.get_default_graph()
X = graph.get_tensor_by_name('X:0')
g = graph.get_tensor_by_name('transformer/g:0')

the_current_time()

gen_img = sess.run(g, feed_dict={X: [X_image]})[0]
gen_img = np.clip(gen_img, 0, 255) / 255.
imsave(result_image, gen_img)

the_current_time()

 

5. Results

(1) Running it

It is fast: on a local CPU it finishes in about 5 seconds.

(2) Result

IV. Method 3: Image Style Transfer with TensorFlow on a Free GPU (Google Colab)

The previous methods all ran on the CPU. What if you want to try running on a GPU?

Google provides Google Colab, which offers free GPU time for running code, so let's try it out.

1. External network access

Google Access Helper (a patched version that does not require setting the promoted site as your homepage):

https://github.com/haotian-wang/google-access-helper

2. Tutorial on using Google Colab's free GPU servers

See this blogger's article for reference:

https://blog.csdn.net/cocoaqin/article/details/79184540

3. Example

We proceed through the following steps:

a. Visualize the data

b. Basic preprocessing / prepare the data

c. Set up the loss functions

d. Create the model

e. Optimize the loss function

4. Code

(1) Download the images

# Download the images to a directory
import os
img_dir = '/tmp/nst'
if not os.path.exists(img_dir):
    os.makedirs(img_dir)
!wget --quiet -P /tmp/nst/ https://upload.wikimedia.org/wikipedia/commons/d/d7/Green_Sea_Turtle_grazing_seagrass.jpg
!wget --quiet -P /tmp/nst/ https://upload.wikimedia.org/wikipedia/commons/0/0a/The_Great_Wave_off_Kanagawa.jpg
!wget --quiet -P /tmp/nst/ https://upload.wikimedia.org/wikipedia/commons/b/b4/Vassily_Kandinsky%2C_1913_-_Composition_7.jpg
!wget --quiet -P /tmp/nst/ https://upload.wikimedia.org/wikipedia/commons/0/00/Tuebingen_Neckarfront.jpg
!wget --quiet -P /tmp/nst/ https://upload.wikimedia.org/wikipedia/commons/6/68/Pillars_of_creation_2014_HST_WFC3-UVIS_full-res_denoised.jpg
!wget --quiet -P /tmp/nst/ https://upload.wikimedia.org/wikipedia/commons/thumb/e/ea/Van_Gogh_-_Starry_Night_-_Google_Art_Project.jpg/1024px-Van_Gogh_-_Starry_Night_-_Google_Art_Project.jpg

(2) Run the code

# Import and configure modules
import matplotlib.pyplot as plt
import matplotlib as mpl
mpl.rcParams['figure.figsize'] = (10,10)
mpl.rcParams['axes.grid'] = False

import numpy as np
from PIL import Image
import time
import functools

import IPython.display

import tensorflow as tf
import tensorflow.contrib.eager as tfe

from tensorflow.python.keras.preprocessing import image as kp_image
from tensorflow.python.keras import models 
from tensorflow.python.keras import losses
from tensorflow.python.keras import layers
from tensorflow.python.keras import backend as K


tf.enable_eager_execution()
print("Eager execution: {}".format(tf.executing_eagerly()))

# Define global variables
content_path = '/tmp/nst/Green_Sea_Turtle_grazing_seagrass.jpg'
style_path = '/tmp/nst/The_Great_Wave_off_Kanagawa.jpg'

# Define the image-loading function
def load_img(path_to_img):
  max_dim = 512
  img = Image.open(path_to_img)
  long = max(img.size)
  scale = max_dim/long
  img = img.resize((round(img.size[0]*scale), round(img.size[1]*scale)), Image.ANTIALIAS)
  
  img = kp_image.img_to_array(img)
  
  # We need to broadcast the image array such that it has a batch dimension 
  img = np.expand_dims(img, axis=0)
  return img

# Define the image display function
def imshow(img, title=None):
  # Remove the batch dimension
  out = np.squeeze(img, axis=0)
  # Normalize for display 
  out = out.astype('uint8')
  plt.imshow(out)
  if title is not None:
    plt.title(title)
  plt.imshow(out)
  
  
# Define a function to display the original content and style images
def display():
  plt.figure(figsize=(10,10))

  content = load_img(content_path).astype('uint8')
  style = load_img(style_path).astype('uint8')

  plt.subplot(1, 2, 1)
  imshow(content, 'Content Image')

  plt.subplot(1, 2, 2)
  imshow(style, 'Style Image')
  plt.show()

  
# Define the image loading and preprocessing function
def load_and_process_img(path_to_img):
  img = load_img(path_to_img)
  img = tf.keras.applications.vgg19.preprocess_input(img)
  return img


# Define a function to undo the VGG19 preprocessing for display
def deprocess_img(processed_img):
  x = processed_img.copy()
  if len(x.shape) == 4:
    x = np.squeeze(x, 0)
  assert len(x.shape) == 3, ("Input to deprocess image must be an image of "
                             "dimension [1, height, width, channel] or [height, width, channel]")
  if len(x.shape) != 3:
    raise ValueError("Invalid input to deprocessing image")
  
  # perform the inverse of the preprocessiing step
  x[:, :, 0] += 103.939
  x[:, :, 1] += 116.779
  x[:, :, 2] += 123.68
  x = x[:, :, ::-1]

  x = np.clip(x, 0, 255).astype('uint8')
  return x

# Content layer
# Content layer where will pull our feature maps
content_layers = ['block5_conv2'] 
# Style layers
# Style layer we are interested in
style_layers = ['block1_conv1',
                'block2_conv1',
                'block3_conv1', 
                'block4_conv1', 
                'block5_conv1'
               ]

num_content_layers = len(content_layers)
num_style_layers = len(style_layers)

# Build the model
def get_model():
  """ Creates our model with access to intermediate layers. 
  
  This function will load the VGG19 model and access the intermediate layers. 
  These layers will then be used to create a new model that will take input image
  and return the outputs from these intermediate layers from the VGG model. 
  
  Returns:
    returns a keras model that takes image inputs and outputs the style and 
      content intermediate layers. 
  """
  # Load our model. We load pretrained VGG, trained on imagenet data
  vgg = tf.keras.applications.vgg19.VGG19(include_top=False, weights='imagenet')
  vgg.trainable = False
  # Get output layers corresponding to style and content layers 
  style_outputs = [vgg.get_layer(name).output for name in style_layers]
  content_outputs = [vgg.get_layer(name).output for name in content_layers]
  model_outputs = style_outputs + content_outputs
  # Build model 
  return models.Model(vgg.input, model_outputs)


# Define the content loss function
def get_content_loss(base_content, target):
  return tf.reduce_mean(tf.square(base_content - target))

# Define the Gram matrix function
def gram_matrix(input_tensor):
  # We make the image channels first 
  channels = int(input_tensor.shape[-1])
  a = tf.reshape(input_tensor, [-1, channels])
  n = tf.shape(a)[0]
  gram = tf.matmul(a, a, transpose_a=True)
  return gram / tf.cast(n, tf.float32)

# Define the style loss function
def get_style_loss(base_style, gram_target):
  """Expects two images of dimension h, w, c"""
  # height, width, num filters of each layer
  # We scale the loss at a given layer by the size of the feature map and the number of filters
  height, width, channels = base_style.get_shape().as_list()
  gram_style = gram_matrix(base_style)
  
  return tf.reduce_mean(tf.square(gram_style - gram_target))# / (4. * (channels ** 2) * (width * height) ** 2)


# Define a helper to extract the content and style feature representations
def get_feature_representations(model, content_path, style_path):
  """Helper function to compute our content and style feature representations.

  This function will simply load and preprocess both the content and style 
  images from their path. Then it will feed them through the network to obtain
  the outputs of the intermediate layers. 
  
  Arguments:
    model: The model that we are using.
    content_path: The path to the content image.
    style_path: The path to the style image
    
  Returns:
    returns the style features and the content features. 
  """
  # Load our images in 
  content_image = load_and_process_img(content_path)
  style_image = load_and_process_img(style_path)
  
  # batch compute content and style features
  style_outputs = model(style_image)
  content_outputs = model(content_image)
  
  
  # Get the style and content feature representations from our model  
  style_features = [style_layer[0] for style_layer in style_outputs[:num_style_layers]]
  content_features = [content_layer[0] for content_layer in content_outputs[num_style_layers:]]
  return style_features, content_features

# Define the loss computation function
def compute_loss(model, loss_weights, init_image, gram_style_features, content_features):
  """This function will compute the loss total loss.
  
  Arguments:
    model: The model that will give us access to the intermediate layers
    loss_weights: The weights of each contribution of each loss function. 
      (style weight, content weight, and total variation weight)
    init_image: Our initial base image. This image is what we are updating with 
      our optimization process. We apply the gradients wrt the loss we are 
      calculating to this image.
    gram_style_features: Precomputed gram matrices corresponding to the 
      defined style layers of interest.
    content_features: Precomputed outputs from defined content layers of 
      interest.
      
  Returns:
    returns the total loss, style loss, content loss, and total variational loss
  """
  style_weight, content_weight = loss_weights
  
  # Feed our init image through our model. This will give us the content and 
  # style representations at our desired layers. Since we're using eager
  # our model is callable just like any other function!
  model_outputs = model(init_image)
  
  style_output_features = model_outputs[:num_style_layers]
  content_output_features = model_outputs[num_style_layers:]
  
  style_score = 0
  content_score = 0
  
  # Accumulate style losses from all layers
  # Here, we equally weight each contribution of each loss layer
  weight_per_style_layer = 1.0 / float(num_style_layers)
  for target_style, comb_style in zip(gram_style_features, style_output_features):
    style_score += weight_per_style_layer * get_style_loss(comb_style[0], target_style)
    
  # Accumulate content losses from all layers 
  weight_per_content_layer = 1.0 / float(num_content_layers)
  for target_content, comb_content in zip(content_features, content_output_features):
    content_score += weight_per_content_layer* get_content_loss(comb_content[0], target_content)
  
  style_score *= style_weight
  content_score *= content_weight

  # Get total loss
  loss = style_score + content_score 
  return loss, style_score, content_score

# Define the gradient computation function
def compute_grads(cfg):
  with tf.GradientTape() as tape: 
    all_loss = compute_loss(**cfg)
  # Compute gradients wrt input image
  total_loss = all_loss[0]
  return tape.gradient(total_loss, cfg['init_image']), all_loss

# Define the optimization loop
def run_style_transfer(content_path, 
                       style_path,
                       num_iterations=1000,
                       content_weight=1e3, 
                       style_weight=1e-2): 
  # We don't need to (or want to) train any layers of our model, so we set their
  # trainable to false. 
  model = get_model() 
  for layer in model.layers:
    layer.trainable = False
  
  # Get the style and content feature representations (from our specified intermediate layers) 
  style_features, content_features = get_feature_representations(model, content_path, style_path)
  gram_style_features = [gram_matrix(style_feature) for style_feature in style_features]
  
  # Set initial image
  init_image = load_and_process_img(content_path)
  init_image = tfe.Variable(init_image, dtype=tf.float32)
  # Create our optimizer
  opt = tf.train.AdamOptimizer(learning_rate=5, beta1=0.99, epsilon=1e-1)

  # For displaying intermediate images 
  iter_count = 1
  
  # Store our best result
  best_loss, best_img = float('inf'), None
  
  # Create a nice config 
  loss_weights = (style_weight, content_weight)
  cfg = {
      'model': model,
      'loss_weights': loss_weights,
      'init_image': init_image,
      'gram_style_features': gram_style_features,
      'content_features': content_features
  }
    
  # For displaying
  num_rows = 2
  num_cols = 5
  display_interval = num_iterations/(num_rows*num_cols)
  start_time = time.time()
  global_start = time.time()
  
  norm_means = np.array([103.939, 116.779, 123.68])
  min_vals = -norm_means
  max_vals = 255 - norm_means   
  
  imgs = []
  for i in range(num_iterations):
    grads, all_loss = compute_grads(cfg)
    loss, style_score, content_score = all_loss
    opt.apply_gradients([(grads, init_image)])
    clipped = tf.clip_by_value(init_image, min_vals, max_vals)
    init_image.assign(clipped)
    end_time = time.time() 
    
    if loss < best_loss:
      # Update best loss and best image from total loss. 
      best_loss = loss
      best_img = deprocess_img(init_image.numpy())

    if i % display_interval== 0:
      start_time = time.time()
      
      # Use the .numpy() method to get the concrete numpy array
      plot_img = init_image.numpy()
      plot_img = deprocess_img(plot_img)
      imgs.append(plot_img)
      IPython.display.clear_output(wait=True)
      IPython.display.display_png(Image.fromarray(plot_img))
      print('Iteration: {}'.format(i))        
      print('Total loss: {:.4e}, ' 
            'style loss: {:.4e}, '
            'content loss: {:.4e}, '
            'time: {:.4f}s'.format(loss, style_score, content_score, time.time() - start_time))
  print('Total time: {:.4f}s'.format(time.time() - global_start))
  IPython.display.clear_output(wait=True)
  plt.figure(figsize=(14,4))
  for i,img in enumerate(imgs):
      plt.subplot(num_rows,num_cols,i+1)
      plt.imshow(img)
      plt.xticks([])
      plt.yticks([])
      
  return best_img, best_loss

# Define the result display function
def show_results(best_img, content_path, style_path, show_large_final=True):
  plt.figure(figsize=(10, 5))
  content = load_img(content_path) 
  style = load_img(style_path)

  plt.subplot(1, 2, 1)
  imshow(content, 'Content Image')

  plt.subplot(1, 2, 2)
  imshow(style, 'Style Image')

  if show_large_final: 
    plt.figure(figsize=(10, 10))

    plt.imshow(best_img)
    plt.title('Output Image')
    plt.show()
    

# Run style transfer and display the results
best, best_loss = run_style_transfer(content_path, 
                                     style_path, num_iterations=1000)

Image.fromarray(best)
show_results(best, content_path, style_path)

5. Results

(1) Progress

(2) Final result

V. Project files

Link: https://pan.baidu.com/s/1MydlfdfLRYUigZVhKwj7kw

Extraction code: utmt
