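Loading images with tf.data

This tutorial provides a simple example of how to load images with tf.data.

The dataset used in this example is distributed as directories of images, with one class of image per directory.

1. Setup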

import tensorflow as tf

AUTOTUNE = tf.data.experimental.AUTOTUNE

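2. Download and inspect the dataset

2.1 Retrieve the images

Before you start any training, you will need a set of images to teach the network about the new classes you want it to recognize. An archive of Creative Commons licensed flower photos is provided for initial use.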

import pathlib
data_root_orig = tf.keras.utils.get_file(origin='https://storage.googleapis.com/download.tensorflow.org/example_images/flower_photos.tgz',
                                         fname='flower_photos', untar=True)
data_root = pathlib.Path(data_root_orig)
print(data_root)

Downloading data from https://storage.googleapis.com/download.tensorflow.org/example_images/flower_photos.tgz
228818944/228813984 [==============================] - 55s 0us/step
C:\Users\lenovo\.keras\datasets\flower_photos

After downloading 218 MB, you should now have a copy of the flower photos:

for item in data_root.iterdir():
  print(item)

C:\Users\lenovo\.keras\datasets\flower_photos\daisy
C:\Users\lenovo\.keras\datasets\flower_photos\dandelion
C:\Users\lenovo\.keras\datasets\flower_photos\LICENSE.txt
C:\Users\lenovo\.keras\datasets\flower_photos\roses
C:\Users\lenovo\.keras\datasets\flower_photos\sunflowers
C:\Users\lenovo\.keras\datasets\flower_photos\tulips

import random
all_image_paths = list(data_root.glob('*/*'))
all_image_paths = [str(path) for path in all_image_paths]
random.shuffle(all_image_paths)

image_count = len(all_image_paths)
image_count

all_image_paths[:10]

['C:\\Users\\lenovo\\.keras\\datasets\\flower_photos\\dandelion\\4714026966_93846ddb74_m.jpg',
 'C:\\Users\\lenovo\\.keras\\datasets\\flower_photos\\sunflowers\\8265023280_713f2c69d0_m.jpg',
 'C:\\Users\\lenovo\\.keras\\datasets\\flower_photos\\daisy\\3711892138_b8c953fdc1_z.jpg',
 'C:\\Users\\lenovo\\.keras\\datasets\\flower_photos\\daisy\\2641979584_2b21c3fe29_m.jpg',
 'C:\\Users\\lenovo\\.keras\\datasets\\flower_photos\\dandelion\\2569516382_9fd7097b9b.jpg',
 'C:\\Users\\lenovo\\.keras\\datasets\\flower_photos\\dandelion\\3372748508_e5a4eacfcb_n.jpg',
 'C:\\Users\\lenovo\\.keras\\datasets\\flower_photos\\tulips\\142235017_07816937c6.jpg',
 'C:\\Users\\lenovo\\.keras\\datasets\\flower_photos\\roses\\12395698413_c0388278f7.jpg',
 'C:\\Users\\lenovo\\.keras\\datasets\\flower_photos\\tulips\\3511776685_3635087b12_n.jpg',
 'C:\\Users\\lenovo\\.keras\\datasets\\flower_photos\\sunflowers\\20621698991_dcb323911d.jpg']

2.2 Inspect the images

Now take a quick look at a few of the images, so you know what you are dealing with:

import os
attributions = (data_root/"LICENSE.txt").open(encoding='utf-8').readlines()[4:]
attributions = [line.split(' CC-BY') for line in attributions]
attributions = dict(attributions)

import IPython.display as display

def caption_image(image_path):
    image_rel = pathlib.Path(image_path).relative_to(data_root)
    return "Image (CC BY 2.0) " + ' - '.join(attributions[str(image_rel)].split(' - ')[:-1])

for n in range(3):
  image_path = random.choice(all_image_paths)
  display.display(display.Image(image_path))
#   print(caption_image(image_path))
  print()

2.3 Determine the label for each image

List the available labels:

label_names = sorted(item.name for item in data_root.glob('*/') if item.is_dir())
label_names

['daisy', 'dandelion', 'roses', 'sunflowers', 'tulips']

Assign an index to each label:

label_to_index = dict((name, index) for index, name in enumerate(label_names))
label_to_index

{'daisy': 0, 'dandelion': 1, 'roses': 2, 'sunflowers': 3, 'tulips': 4}

Create a list containing the label index of every file:

all_image_labels = [label_to_index[pathlib.Path(path).parent.name]
                    for path in all_image_paths]

print("First 10 labels indices: ", all_image_labels[:10])

First 10 labels indices:  [1, 3, 0, 0, 1, 1, 4, 2, 4, 3]
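Before moving on, it can be worth checking how many images each class contains, since a heavily skewed class distribution affects training. The following is a minimal sketch in plain Python. `example_paths` is a hypothetical stand-in for all_image_paths (the file names are made up), and `count_per_label` is a helper introduced here for illustration only.

```python
from collections import Counter
from pathlib import PurePosixPath

# Hypothetical stand-in for all_image_paths; these file names are made up.
example_paths = [
    "flower_photos/daisy/a.jpg",
    "flower_photos/roses/b.jpg",
    "flower_photos/daisy/c.jpg",
    "flower_photos/tulips/d.jpg",
]

def count_per_label(paths):
    """Count images per class, taking the parent folder name as the label."""
    return Counter(PurePosixPath(p).parent.name for p in paths)

label_counts = count_per_label(example_paths)
for name, count in sorted(label_counts.items()):
    print(name, count)
```

With the real data you would pass all_image_paths instead. On Windows, where the paths use backslashes, pathlib.Path (as used in the code above) should replace PurePosixPath.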

2.4 Load and format the images

TensorFlow includes all the tools you need to load and process images:

img_path = all_image_paths[0]
img_path

'C:\\Users\\lenovo\\.keras\\datasets\\flower_photos\\dandelion\\4714026966_93846ddb74_m.jpg'

Here is the raw data:

img_raw = tf.io.read_file(img_path)
print(repr(img_raw)[:100]+"...")

<tf.Tensor: shape=(), dtype=string, numpy=b'\xff\xd8\xff\xe0\x00\x10JFIF\x00\x01\x01\x00\x00\x01\x00...

Decode it into an image tensor:

img_tensor = tf.image.decode_image(img_raw)

print(img_tensor.shape)
print(img_tensor.dtype)

(240, 180, 3)
<dtype: 'uint8'>

Resize it for your model and scale the pixel values to the [0,1] range:

img_final = tf.image.resize(img_tensor, [192, 192])
img_final = img_final/255.0
print(img_final.shape)
print(img_final.numpy().min())
print(img_final.numpy().max())

(192, 192, 3)
0.0
0.9996783

Wrap these steps in simple functions for later use.

def preprocess_image(image):
  image = tf.image.decode_jpeg(image, channels=3)
  image = tf.image.resize(image, [192, 192])
  image /= 255.0  # normalize to [0,1] range

  return image

def load_and_preprocess_image(path):
  image = tf.io.read_file(path)
  return preprocess_image(image)

import matplotlib.pyplot as plt

image_path = all_image_paths[0]
label = all_image_labels[0]

plt.imshow(load_and_preprocess_image(image_path))
plt.grid(False)
# plt.xlabel(caption_image(img_path))
plt.title(label_names[label].title())
print()

3. Build a `tf.data.Dataset`

3.1 A dataset of images

The easiest way to build a tf.data.Dataset is with the from_tensor_slices method.

Slicing the array of strings results in a dataset of strings:

path_ds = tf.data.Dataset.from_tensor_slices(all_image_paths)

The shapes and types describe the content of each item in the dataset. In this case each item is a scalar binary string.

print(path_ds)

<TensorSliceDataset shapes: (), types: tf.string>

Now create a new dataset that loads and formats images on the fly by mapping load_and_preprocess_image over the dataset of paths.

image_ds = path_ds.map(load_and_preprocess_image, num_parallel_calls=AUTOTUNE)

import matplotlib.pyplot as plt

plt.figure(figsize=(8,8))
for n, image in enumerate(image_ds.take(4)):
  plt.subplot(2,2,n+1)
  plt.imshow(image)
  plt.grid(False)
  plt.xticks([])
  plt.yticks([])
#   plt.xlabel(caption_image(all_image_paths[n]))
plt.show()

3.2 A dataset of `(image, label)` pairs

Using the same from_tensor_slices method you can build a dataset of labels:

label_ds = tf.data.Dataset.from_tensor_slices(tf.cast(all_image_labels, tf.int64))

for label in label_ds.take(10):
  print(label_names[label.numpy()])

dandelion
sunflowers
daisy
daisy
dandelion
dandelion
tulips
roses
tulips
sunflowers

Since the two datasets are in the same order, you can zip them together to get a dataset of (image, label) pairs:

image_label_ds = tf.data.Dataset.zip((image_ds, label_ds))

The shapes and types of the new dataset are themselves tuples of shapes and types, describing each field:

print(image_label_ds)

<ZipDataset shapes: ((192, 192, 3), ()), types: (tf.float32, tf.int64)>

Note: When you have arrays like all_image_labels and all_image_paths, an alternative to tf.data.Dataset.zip is to slice the pair of arrays.

ds = tf.data.Dataset.from_tensor_slices((all_image_paths, all_image_labels))

# The tuples are unpacked into the positional arguments of the mapped function
def load_and_preprocess_from_path_label(path, label):
  return load_and_preprocess_image(path), label

image_label_ds = ds.map(load_and_preprocess_from_path_label)
image_label_ds

<MapDataset shapes: ((192, 192, 3), ()), types: (tf.float32, tf.int32)>

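3.3 Basic methods for training

To train a model with this dataset you will want the data:

To be well shuffled.
To be batched.
To repeat forever.
To be available in batches as soon as possible.

These features can be added easily using the tf.data API.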

BATCH_SIZE = 32

# Setting a shuffle buffer size as large as the dataset ensures that the data is
# completely shuffled.
ds = image_label_ds.shuffle(buffer_size=image_count)
ds = ds.repeat()
ds = ds.batch(BATCH_SIZE)
# `prefetch` lets the dataset fetch batches in the background while the model is training.
ds = ds.prefetch(buffer_size=AUTOTUNE)
ds

<PrefetchDataset shapes: ((None, 192, 192, 3), (None,)), types: (tf.float32, tf.int32)>

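There are a few things to note here:

The order is important.

A .shuffle after a .repeat would shuffle items across epoch boundaries (some items would be seen twice before others are seen at all).
A .shuffle after a .batch would shuffle the order of the batches, but not shuffle the items across batches.

A buffer_size equal to the dataset size is used here for a full shuffle. Up to the dataset size, larger values give better randomization but use more memory.
The shuffle buffer is filled before any elements are pulled from it, so a large buffer_size may cause a delay when your Dataset is starting.
The shuffled dataset does not report the end of the dataset until the shuffle buffer is completely empty. The Dataset is then restarted by .repeat, causing another wait for the shuffle buffer to be filled.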
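The buffer behaviour described in these notes can be illustrated with a small simulation. This is a simplified model in plain Python, not TensorFlow's actual implementation; `buffered_shuffle` is a hypothetical helper written for this illustration.

```python
import random

def buffered_shuffle(items, buffer_size, seed=None):
    """Yield items in an order produced by a fixed-size shuffle buffer."""
    if buffer_size < 1:
        raise ValueError("buffer_size must be at least 1")
    rng = random.Random(seed)
    buffer = []
    for item in items:
        if len(buffer) < buffer_size:
            # Still filling: nothing is emitted until the buffer is full.
            buffer.append(item)
            continue
        # Buffer is full: emit a random element and put the new item in its slot.
        idx = rng.randrange(buffer_size)
        yield buffer[idx]
        buffer[idx] = item
    # Input exhausted: drain what is left in random order.
    while buffer:
        yield buffer.pop(rng.randrange(len(buffer)))

data = list(range(10))
no_shuffle = list(buffered_shuffle(data, buffer_size=1))
small_buffer = list(buffered_shuffle(data, buffer_size=3, seed=0))
full_buffer = list(buffered_shuffle(data, buffer_size=len(data), seed=0))
print(no_shuffle)    # a buffer of 1 leaves the order unchanged
print(small_buffer)  # items can only move a limited distance forward
print(full_buffer)   # any permutation is possible
```

In this model the element emitted at position k always comes from the first k + buffer_size inputs, which is why a buffer much smaller than the dataset only shuffles locally.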

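This last point can be addressed by using the tf.data.Dataset.apply method with the fused tf.data.experimental.shuffle_and_repeat function. As the warning in the output below indicates, this function is deprecated in favor of calling shuffle followed by repeat: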

ds = image_label_ds.apply(
  tf.data.experimental.shuffle_and_repeat(buffer_size=image_count))
ds = ds.batch(BATCH_SIZE)
ds = ds.prefetch(buffer_size=AUTOTUNE)
ds

WARNING:tensorflow:From <ipython-input-34-4dc713bd4d84>:2: shuffle_and_repeat (from tensorflow.python.data.experimental.ops.shuffle_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.data.Dataset.shuffle(buffer_size, seed)` followed by `tf.data.Dataset.repeat(count)`. Static tf.data optimizations will take care of using the fused implementation.



<PrefetchDataset shapes: ((None, 192, 192, 3), (None,)), types: (tf.float32, tf.int32)>

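3.4 Pipe the dataset to a model

Fetch a copy of MobileNet v2 from tf.keras.applications.

This copy of the model will be used for a simple transfer learning example.

Set the MobileNet weights to be non-trainable: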

mobile_net = tf.keras.applications.MobileNetV2(input_shape=(192, 192, 3), include_top=False)
mobile_net.trainable=False

Downloading data from https://github.com/JonathanCMitchell/mobilenet_v2_keras/releases/download/v1.1/mobilenet_v2_weights_tf_dim_ordering_tf_kernels_1.0_192_no_top.h5
9412608/9406464 [==============================] - 3s 0us/step

This model expects its input to be normalized to the [-1,1] range:

help(tf.keras.applications.mobilenet_v2.preprocess_input)

Before you pass the input to the MobileNet model, you need to convert it from the [0,1] range to [-1,1]:

def change_range(image,label):
  """
  Rescale an image from the [0, 1] range produced by
  preprocess_image to the [-1, 1] range MobileNet expects.
  """
  return 2*image-1, label

keras_ds = ds.map(change_range)
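The arithmetic behind change_range can be checked on plain numbers. The sketch below uses two hypothetical helpers: `unit_to_symmetric` mirrors the 2*image-1 step, and `uint8_to_symmetric` shows the equivalent single-step mapping from raw [0, 255] pixel values, assuming the earlier division by 255.

```python
def unit_to_symmetric(x):
    """Map a value from [0, 1] to [-1, 1], the same arithmetic as change_range."""
    return 2 * x - 1

def uint8_to_symmetric(pixel):
    """Map a raw pixel value in [0, 255] to [-1, 1] in one step."""
    return pixel / 127.5 - 1

for value in (0.0, 0.5, 1.0):
    print(value, "->", unit_to_symmetric(value))

# Dividing by 255 first and then rescaling agrees with the one-step version.
print(unit_to_symmetric(255 / 255.0), uint8_to_symmetric(255))
```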

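MobileNet returns a 6x6 spatial grid of features for each image.

Pass it a batch of images to see the result:

# The dataset may take a few seconds to start, as it fills its shuffle buffer.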
image_batch, label_batch = next(iter(keras_ds))

feature_map_batch = mobile_net(image_batch)
print(feature_map_batch.shape)

(32, 6, 6, 1280)

Build a model wrapped around MobileNet, and use tf.keras.layers.GlobalAveragePooling2D to average over the spatial dimensions before the tf.keras.layers.Dense output layer:

model = tf.keras.Sequential([
  mobile_net,
  tf.keras.layers.GlobalAveragePooling2D(),
  tf.keras.layers.Dense(len(label_names), activation = 'softmax')])

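Now it produces outputs of the expected shape. Because the Dense layer uses a softmax activation, the values labelled "logit" below are class probabilities rather than raw logits: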

logit_batch = model(image_batch).numpy()

print("min logit:", logit_batch.min())
print("max logit:", logit_batch.max())
print()

print("Shape:", logit_batch.shape)

min logit: 0.013189439
max logit: 0.80941063

Shape: (32, 5)
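Since the Dense layer above uses a softmax activation, every row of the output lies between 0 and 1 and sums to 1. The following plain-Python sketch shows the computation on made-up scores; `softmax` here is a hypothetical stand-in written for illustration, not the Keras implementation.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)
    # Subtracting the maximum keeps exp() from overflowing; the result is unchanged.
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

example_logits = [2.0, 1.0, 0.1, -1.0, 0.5]  # made-up scores for 5 classes
probs = softmax(example_logits)
print(probs)
print(sum(probs))
```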

Compile the model to describe the training procedure:

model.compile(optimizer=tf.keras.optimizers.Adam(),
              loss='sparse_categorical_crossentropy',
              metrics=["accuracy"])

There are two trainable variables, the weights and the bias of the Dense layer:

len(model.trainable_variables)

model.summary()

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
mobilenetv2_1.00_192 (Model) (None, 6, 6, 1280)        2257984   
_________________________________________________________________
global_average_pooling2d (Gl (None, 1280)              0         
_________________________________________________________________
dense (Dense)                (None, 5)                 6405      
=================================================================
Total params: 2,264,389
Trainable params: 6,405
Non-trainable params: 2,257,984
_________________________________________________________________

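You are now ready to train the model.

Note that for demonstration purposes you will only run 3 steps per epoch, but normally you would specify the real number of steps, computed as shown below, before passing it to model.fit():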

steps_per_epoch=tf.math.ceil(len(all_image_paths)/BATCH_SIZE).numpy()
steps_per_epoch

115.0
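The value above is a float because tf.math.ceil returns a floating-point result. If an integer step count is preferred, the same quantity can be computed with the standard library. This is a minimal sketch; `steps_per_epoch_for` is a hypothetical helper, and 3670 is an assumed image count chosen to be consistent with the 115.0 shown above.

```python
import math

def steps_per_epoch_for(num_examples, batch_size):
    """Number of batches needed to see every example once (the last batch may be partial)."""
    return math.ceil(num_examples / batch_size)

# 3670 is an assumed image count, consistent with the 115.0 reported above.
print(steps_per_epoch_for(3670, 32))  # 115
```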

model.fit(ds, epochs=1, steps_per_epoch=3)

Train for 3 steps
3/3 [==============================] - 36s 12s/step - loss: 1.7119 - accuracy: 0.2812



<tensorflow.python.keras.callbacks.History at 0x1e9266d4108>

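4. Performance

Note: This section only shows a couple of easy tricks that may help performance. For an in-depth guide, see Input Pipeline Performance.

The simple pipeline used above reads each file individually on every epoch. This is fine for local training on a CPU, but it may not be sufficient for GPU training and is entirely unsuitable for any form of distributed training.

To investigate, first build a simple function to check the performance of a dataset: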

import time
default_timeit_steps = 2*steps_per_epoch+1

def timeit(ds, steps=default_timeit_steps):
  overall_start = time.time()
  # Fetch a single batch to prime the pipeline (fill the shuffle buffer)
  # before starting the timer
  it = iter(ds.take(steps+1))
  next(it)

  start = time.time()
  for i,(images,labels) in enumerate(it):
    if i%10 == 0:
      print('.',end='')
  print()
  end = time.time()

  duration = end-start
  print("{} batches: {} s".format(steps, duration))
  print("{:0.5f} Images/s".format(BATCH_SIZE*steps/duration))
  print("Total time: {}s".format(end-overall_start))

The performance of the current dataset is:

ds = image_label_ds.apply(
  tf.data.experimental.shuffle_and_repeat(buffer_size=image_count))
ds = ds.batch(BATCH_SIZE).prefetch(buffer_size=AUTOTUNE)
ds

<PrefetchDataset shapes: ((None, 192, 192, 3), (None,)), types: (tf.float32, tf.int32)>

timeit(ds)

........................
231.0 batches: 38.075591802597046 s
194.14012 Images/s
Total time: 55.05409812927246s

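4.1 Cache

Use tf.data.Dataset.cache to easily cache computations across epochs. This is very efficient, especially when the data fits in memory.

Here the images are cached after being preprocessed (decoded and resized):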

ds = image_label_ds.cache()
ds = ds.apply(
  tf.data.experimental.shuffle_and_repeat(buffer_size=image_count))
ds = ds.batch(BATCH_SIZE).prefetch(buffer_size=AUTOTUNE)
ds

<PrefetchDataset shapes: ((None, 192, 192, 3), (None,)), types: (tf.float32, tf.int32)>

timeit(ds)

........................
231.0 batches: 1.7550482749938965 s
4211.84996 Images/s
Total time: 18.165388107299805s
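The idea behind caching can be mimicked in plain Python: the expensive preprocessing runs once per element on the first pass, and later passes replay the stored results. The sketch below is only an analogy under that assumption; `CachedIterable` and `expensive_preprocess` are hypothetical names and do not reflect how tf.data implements its cache.

```python
class CachedIterable:
    """Compute each element once on the first full pass, then replay from memory."""

    def __init__(self, source, fn):
        self._source = source
        self._fn = fn
        self._cache = None

    def __iter__(self):
        if self._cache is not None:
            # Later passes: replay the stored results without calling fn again.
            yield from self._cache
            return
        results = []
        for item in self._source:
            value = self._fn(item)
            results.append(value)
            yield value
        # The cache only becomes usable once a full pass has completed.
        self._cache = results

call_log = []

def expensive_preprocess(x):
    """Stand-in for decoding and resizing; records every call."""
    call_log.append(x)
    return x * x

cached = CachedIterable(range(5), expensive_preprocess)
first_epoch = list(cached)
second_epoch = list(cached)
print(first_epoch, second_epoch, len(call_log))
```

The second pass returns the same values while the preprocessing function is not called again, which is the source of the speedup measured above.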

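One disadvantage of an in-memory cache is that it must be rebuilt on every run, so the same startup delay recurs each time the dataset is started from scratch. Within a single session the populated cache is reused, which is why the second timing below starts almost immediately: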

timeit(ds)

........................
231.0 batches: 2.4259941577911377 s
3046.99827 Images/s
Total time: 2.4670004844665527s

If the data does not fit in memory, use a cache file:

ds = image_label_ds.cache(filename='./cache.tf-data')
ds = ds.apply(
  tf.data.experimental.shuffle_and_repeat(buffer_size=image_count))
ds = ds.batch(BATCH_SIZE).prefetch(1)
ds

<PrefetchDataset shapes: ((None, 192, 192, 3), (None,)), types: (tf.float32, tf.int32)>

timeit(ds)

........................
231.0 batches: 19.167855262756348 s
385.64565 Images/s
Total time: 52.678998708724976s

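The cache file also has the advantage that it can be used to restart the dataset quickly without rebuilding the cache. Note how much shorter the total time is the second time: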

timeit(ds)

........................
231.0 batches: 18.943477630615234 s
390.21346 Images/s
Total time: 26.635408639907837s

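4.2 TFRecord files

4.2.1 Raw image data

A TFRecord file is a simple format for storing a sequence of binary blobs. By packing multiple examples into the same file, TensorFlow is able to read multiple examples at once, which is especially important for performance when using a remote storage service such as GCS.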
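To make the idea of "a sequence of binary blobs in one file" concrete, here is a simplified stand-in written in plain Python. It is not the real TFRecord layout, which also stores checksums alongside each length and payload; `write_records` and `read_records` are hypothetical helpers for illustration only.

```python
import io
import struct

def write_records(stream, blobs):
    """Write each blob as an 8-byte little-endian length followed by its bytes."""
    for blob in blobs:
        stream.write(struct.pack("<Q", len(blob)))
        stream.write(blob)

def read_records(stream):
    """Yield the blobs back in order by reading one length prefix at a time."""
    while True:
        header = stream.read(8)
        if not header:
            return  # clean end of stream
        if len(header) < 8:
            raise ValueError("truncated length header")
        (length,) = struct.unpack("<Q", header)
        blob = stream.read(length)
        if len(blob) < length:
            raise ValueError("truncated record")
        yield blob

buffer = io.BytesIO()
write_records(buffer, [b"first image bytes", b"", b"third"])
buffer.seek(0)
restored = list(read_records(buffer))
print(restored)
```

Because every record is read sequentially from a single stream, many examples can be fetched with one open file instead of one request per image.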

First, build a TFRecord file from the raw image data:

image_ds = tf.data.Dataset.from_tensor_slices(all_image_paths).map(tf.io.read_file)
tfrec = tf.data.experimental.TFRecordWriter('images.tfrec')
tfrec.write(image_ds)

Next, build a dataset that reads from the TFRecord file and decodes and reformats the images using the preprocess_image function you defined earlier:

image_ds = tf.data.TFRecordDataset('images.tfrec').map(preprocess_image)

Zip that dataset with the label dataset you defined earlier to get the expected (image, label) pairs:

ds = tf.data.Dataset.zip((image_ds, label_ds))
ds = ds.apply(
  tf.data.experimental.shuffle_and_repeat(buffer_size=image_count))
ds=ds.batch(BATCH_SIZE).prefetch(AUTOTUNE)
ds

<PrefetchDataset shapes: ((None, 192, 192, 3), (None,)), types: (tf.float32, tf.int64)>

timeit(ds)

........................
231.0 batches: 36.933589696884155 s
200.14302 Images/s
Total time: 53.03426218032837s

This is slower than the cache version because the preprocessing has not been cached.

4.2.2 Serialized Tensors

To save some of the preprocessing to the TFRecord file, first make a dataset of the processed images, as before:

paths_ds = tf.data.Dataset.from_tensor_slices(all_image_paths)
image_ds = paths_ds.map(load_and_preprocess_image)
image_ds

<MapDataset shapes: (192, 192, 3), types: tf.float32>

Now, instead of a dataset of .jpeg strings, you have a dataset of tensors.

To serialize this to a TFRecord file, you first convert the dataset of tensors to a dataset of strings:

ds = image_ds.map(tf.io.serialize_tensor)
ds

<MapDataset shapes: (), types: tf.string>

tfrec = tf.data.experimental.TFRecordWriter('images.tfrec')
tfrec.write(ds)

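With the preprocessing stored in the file, data can be loaded from the TFRecord file quite efficiently. Just remember to deserialize each tensor before using it: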

ds = tf.data.TFRecordDataset('images.tfrec')

def parse(x):
  result = tf.io.parse_tensor(x, out_type=tf.float32)
  result = tf.reshape(result, [192, 192, 3])
  return result

ds = ds.map(parse, num_parallel_calls=AUTOTUNE)
ds

<ParallelMapDataset shapes: (192, 192, 3), types: tf.float32>

Now add the labels and apply the same standard operations as before:

ds = tf.data.Dataset.zip((ds, label_ds))
ds = ds.apply(
  tf.data.experimental.shuffle_and_repeat(buffer_size=image_count))
ds=ds.batch(BATCH_SIZE).prefetch(AUTOTUNE)
ds

<PrefetchDataset shapes: ((None, 192, 192, 3), (None,)), types: (tf.float32, tf.int64)>

timeit(ds)

........................
231.0 batches: 24.437746047973633 s
302.48289 Images/s
Total time: 35.59574770927429s
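To compare the variants side by side, the throughput of each run can be recomputed from the durations reported above. This is a small sketch in plain Python; the durations are copied from the first timeit output of each variant, and `images_per_second` is a hypothetical helper that repeats the formula used inside timeit.

```python
BATCH_SIZE = 32
STEPS = 231

# Durations in seconds, copied from the timeit outputs reported above.
durations = {
    "no cache": 38.075591802597046,
    "in-memory cache": 1.7550482749938965,
    "file cache": 19.167855262756348,
    "TFRecord, raw images": 36.933589696884155,
    "TFRecord, serialized tensors": 24.437746047973633,
}

def images_per_second(duration, steps=STEPS, batch_size=BATCH_SIZE):
    """Throughput as computed in timeit: images processed divided by elapsed time."""
    return batch_size * steps / duration

baseline = images_per_second(durations["no cache"])
for name, duration in durations.items():
    rate = images_per_second(duration)
    print("{:<30} {:>10.1f} images/s  {:>5.1f}x".format(name, rate, rate / baseline))
```

These figures come from a single machine and a single run each, so they indicate relative ordering rather than precise speedups.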
