[CV04] How to Visualize Classic Computer Vision Datasets with Keras

2020-06-22 05:30

This article uses the MNIST, Fashion-MNIST, CIFAR-10, and CIFAR-100 computer vision datasets to show how to load image data with Keras and visualize it.

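1. Keras Computer Vision Datasets

Keras provides access to four standard computer vision datasets:

MNIST: classify images of handwritten digits into 10 classes;
Fashion-MNIST: classify images of clothing items into 10 classes;
CIFAR-10: classify images of objects into 10 classes;
CIFAR-100: classify images of common objects into 100 classes.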

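Each dataset is available through its own load_data() function in the keras.datasets module. The first time a load function is called, the dataset is downloaded and stored in compressed form under the .keras/datasets directory in the user's home directory (on the author's machine, C:\Users\34123\.keras\datasets). After that first download, later calls load the dataset from disk instead of downloading it again.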
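
To check what has already been cached, one can list the contents of the cache directory. The sketch below assumes the default location of .keras/datasets under the user's home directory; the helper name list_cached_datasets is hypothetical and is not part of Keras.

```python
from pathlib import Path


def list_cached_datasets(cache_dir=None):
    """Return the sorted names of entries in the Keras dataset cache directory."""
    # Assumption: the default cache location is ~/.keras/datasets.
    if cache_dir is None:
        cache_dir = Path.home() / ".keras" / "datasets"
    cache_dir = Path(cache_dir)
    if not cache_dir.is_dir():
        return []  # nothing has been downloaded yet
    return sorted(entry.name for entry in cache_dir.iterdir())


# Prints an empty list if no dataset has been downloaded yet.
print(list_cached_datasets())
```

Deleting a file from this directory should make the next load call download it again.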

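Each load function returns two tuples: the first holds the input and output elements of the training samples, and the second holds the input and output elements of the test samples. The split between training and test data generally follows the standard split used when benchmarking algorithms on the dataset.

The standard idiom for loading a dataset is as follows: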

...
# load dataset
(trainX, trainy), (testX, testy) = load_data()

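X and y are NumPy arrays of pixel values and class values, respectively.

Two of the datasets contain grayscale images and two contain color images. The grayscale images are loaded as two-dimensional arrays and must be reshaped to three-dimensional arrays with an explicit channel dimension to match Keras's preferred channel ordering. For example: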

# add a channel dimension to the grayscale images
# axis 1 holds the rows (height) and axis 2 the columns (width)
height, width, channels = trainX.shape[1], trainX.shape[2], 1
trainX = trainX.reshape((trainX.shape[0], height, width, channels))
testX = testX.reshape((testX.shape[0], height, width, channels))
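
The reshape can be tried without downloading anything. The following is a minimal sketch that uses a synthetic array of zeros as a stand-in for the image data (the array and its size are assumptions for illustration), and shows that np.expand_dims gives the same result as the explicit reshape.

```python
import numpy as np

# Stand-in for a batch of grayscale images shaped (samples, rows, cols).
images = np.zeros((5, 28, 28), dtype=np.uint8)

# Option 1: reshape, spelling out every dimension.
reshaped = images.reshape((images.shape[0], images.shape[1], images.shape[2], 1))

# Option 2: append a new axis of length 1 at the end.
expanded = np.expand_dims(images, axis=-1)

print(images.shape)    # (5, 28, 28)
print(reshaped.shape)  # (5, 28, 28, 1)
print(expanded.shape)  # (5, 28, 28, 1)
```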

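Pixel data for both grayscale and color images is stored as unsigned integers between 0 and 255. Before modeling, the image data should be rescaled, for example by normalizing it to the range 0-1: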

# normalize pixel values
trainX = trainX.astype('float32') / 255
testX = testX.astype('float32') / 255
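
As a quick check of what the normalization does, here is a small sketch on three synthetic pixel values (assumed for illustration, not taken from any dataset). Casting to float32 before dividing fixes the dtype of the result explicitly.

```python
import numpy as np

# Synthetic uint8 pixels standing in for image data.
pixels = np.array([[0, 128, 255]], dtype=np.uint8)

# Cast first, then divide, so the result is float32 in the range 0-1.
scaled = pixels.astype("float32") / 255.0

print(scaled.dtype)                # float32
print(scaled.min(), scaled.max())  # 0.0 1.0
```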

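The output y of each sample is stored as an integer class value. For multi-class classification problems, it is common practice to one-hot encode the class values before modeling, which can be done with the Keras to_categorical() function. For example: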

...
# one hot encode target values
trainy = to_categorical(trainy)
testy = to_categorical(testy)
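
To make the encoding concrete, the sketch below reproduces the idea with NumPy alone. The one_hot helper is hypothetical and is not the Keras implementation; like to_categorical(), it infers the number of classes from the largest label when none is given.

```python
import numpy as np


def one_hot(labels, num_classes=None):
    """One-hot encode integer class labels with NumPy (illustrative helper)."""
    labels = np.asarray(labels, dtype=int).ravel()  # accept shape (n,) or (n, 1)
    if num_classes is None:
        num_classes = int(labels.max()) + 1  # infer width from the largest label
    # Row i of the identity matrix is the one-hot vector for class i.
    return np.eye(num_classes, dtype="float32")[labels]


encoded = one_hot([0, 2, 1])
print(encoded)
# [[1. 0. 0.]
#  [0. 0. 1.]
#  [0. 1. 0.]]
```

Because the width is inferred from the data, it can be safer to pass the number of classes explicitly when a subset of the labels might not contain the highest class.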

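Next, we look at each dataset in detail.

2. MNIST Dataset

The dataset contains 28×28 pixel grayscale images of handwritten digits from 0 to 9, with 60,000 images in the training set and 10,000 in the test set. The best-performing models are deep learning convolutional neural networks, which reach a classification accuracy above 99%, with an error rate between 0.4% and 0.2% on the hold-out test dataset.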

from keras.datasets import mnist
import matplotlib.pyplot as plt
plt.rcParams['figure.dpi'] = 200

(trainX, trainy), (testX, testy) = mnist.load_data()
print('Train: X=%s, y=%s' % (trainX.shape, trainy.shape))
print('Test: X=%s, y=%s' % (testX.shape, testy.shape))

for i in range(9):
    plt.subplot(330 + 1 + i)
    plt.imshow(trainX[i], cmap=plt.get_cmap('gray'))
    
plt.tight_layout()
plt.show()

Output:

Train: X=(60000, 28, 28), y=(60000,)
Test: X=(10000, 28, 28), y=(10000,)

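3. Fashion-MNIST Dataset

The dataset contains 28×28 pixel grayscale images of 10 types of clothing, with 60,000 images in the training set and 10,000 in the test set. It is a more challenging classification problem than MNIST; the best results, obtained with deep learning convolutional neural networks, reach a classification accuracy of about 95% to 96% on the hold-out test dataset.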

from keras.datasets import fashion_mnist

(trainX, trainy), (testX, testy) = fashion_mnist.load_data()
print('Train: X=%s, y=%s' % (trainX.shape, trainy.shape))
print('Test: X=%s, y=%s' % (testX.shape, testy.shape))

for i in range(9):
    plt.subplot(330 + 1 + i)
    plt.imshow(trainX[i], cmap=plt.get_cmap('gray'))
    
plt.tight_layout()
plt.show()

Output:

Downloading data from http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-labels-idx1-ubyte.gz
8192/5148 [===============================================] - 1s 100us/step
Downloading data from http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-images-idx3-ubyte.gz
4423680/4422102 [==============================] - 71s 16us/step
Train: X=(60000, 28, 28), y=(60000,)
Test: X=(10000, 28, 28), y=(10000,)
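
The labels in y are integers from 0 to 9, which are hard to read next to a plot. The sketch below maps them to the class names published with Fashion-MNIST; the label_names helper and the sample labels are hypothetical and only serve as illustration.

```python
# Class names as documented for Fashion-MNIST, indexed by integer label.
FASHION_MNIST_CLASSES = [
    "T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
    "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot",
]


def label_names(labels, class_names=FASHION_MNIST_CLASSES):
    """Translate integer labels into human-readable class names."""
    return [class_names[int(label)] for label in labels]


# Arbitrary sample labels, not taken from the dataset.
print(label_names([9, 0, 3]))  # ['Ankle boot', 'T-shirt/top', 'Dress']
```

In the plotting loop, a name looked up this way could be passed to plt.title() so that each subplot shows its class.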

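4. CIFAR-10 Dataset

CIFAR is an acronym for the Canadian Institute For Advanced Research. The CIFAR-10 dataset was developed by researchers at the CIFAR institute together with the CIFAR-100 dataset.

The dataset consists of 60,000 32×32 pixel color photographs of objects from 10 classes, such as frogs, birds, cats, and ships.

CIFAR-10 is widely used in machine learning for benchmarking computer vision algorithms. The best performance on the problem is achieved by deep learning convolutional neural networks, with a classification accuracy above 96% or 97% on the test dataset.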

from keras.datasets import cifar10

(trainX, trainy), (testX, testy) = cifar10.load_data()
print('Train: X=%s, y=%s' % (trainX.shape, trainy.shape))
print('Test: X=%s, y=%s' % (testX.shape, testy.shape))

for i in range(9):
    plt.subplot(330 + 1 + i)
    plt.imshow(trainX[i])  # color images: a colormap would be ignored
    
plt.tight_layout()
plt.show()

Output:

Downloading data from https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz
170500096/170498071 [==============================] - 135s 1us/step
Train: X=(50000, 32, 32, 3), y=(50000, 1)
Test: X=(10000, 32, 32, 3), y=(10000, 1)
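
Note that the CIFAR labels have shape (samples, 1), whereas the MNIST labels have shape (samples,). The sketch below flattens such a column of labels and maps it to the CIFAR-10 class names; the three sample labels are synthetic stand-ins assumed for illustration.

```python
import numpy as np

# CIFAR-10 class names, indexed by integer label.
CIFAR10_CLASSES = ["airplane", "automobile", "bird", "cat", "deer",
                   "dog", "frog", "horse", "ship", "truck"]

# Stand-in for CIFAR-10 labels, shaped (samples, 1) like the loaded y arrays.
labels = np.array([[6], [9], [1]], dtype=np.uint8)

flat = labels.ravel()  # drop the trailing axis: shape (3,)
names = [CIFAR10_CLASSES[i] for i in flat]

print(labels.shape, flat.shape)  # (3, 1) (3,)
print(names)                     # ['frog', 'truck', 'automobile']
```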

5. CIFAR-100 Dataset

The dataset consists of 60,000 32×32 pixel color photographs of objects from 100 classes, such as fish, flowers, and insects.

from keras.datasets import cifar100

(trainX, trainy), (testX, testy) = cifar100.load_data()
print('Train: X=%s, y=%s' % (trainX.shape, trainy.shape))
print('Test: X=%s, y=%s' % (testX.shape, testy.shape))

for i in range(9):
    plt.subplot(330 + 1 + i)
    plt.imshow(trainX[i])  # color images: a colormap would be ignored
    
plt.tight_layout()
plt.show()

Output:

Downloading data from https://www.cs.toronto.edu/~kriz/cifar-100-python.tar.gz
169009152/169001437 [==============================] - 148s 1us/step
Train: X=(50000, 32, 32, 3), y=(50000, 1)
Test: X=(10000, 32, 32, 3), y=(10000, 1)

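By default, Keras returns labels for the 100 fine-grained classes, but the label_mode argument can be set to 'coarse' to return labels for only 20 superclasses instead. The difference is clear when the labels are one-hot encoded with the to_categorical() function: each output vector has 20 elements rather than 100. For example: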

from keras.datasets import cifar100
from keras.utils import to_categorical

(trainX, trainy), (testX, testy) = cifar100.load_data(label_mode='coarse')

trainy = to_categorical(trainy)
testy = to_categorical(testy)

print('Train: X=%s, y=%s' % (trainX.shape, trainy.shape))
print('Test: X=%s, y=%s' % (testX.shape, testy.shape))

Output:

Train: X=(50000, 32, 32, 3), y=(50000, 20)
Test: X=(10000, 32, 32, 3), y=(10000, 20)

Running the example loads the CIFAR-100 dataset as before, but the images are now labeled as belonging to one of 20 classes.

Reference:
https://machinelearningmastery.com/how-to-load-and-visualize-standard-computer-vision-datasets-with-keras/
