CNN-Based Image Defect Classification

1. Introduction

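In industrial product inspection, defect classification based on traditional hand-crafted image features does not reach the accuracy that real production requires, so I decided to try a CNN for defect classification.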

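  Traditional defect classification pipeline:

  1. Defect image separation: apply fairly complex image processing to separate the defect from the captured image.

  2. Feature vector construction: analyze the characteristics of each defect type and define the n-dimensional features to extract (for example defect length, width, contrast, texture features, entropy, gradient), which together form a feature vector describing the defect. Building this feature vector requires a thorough analysis of the actual problem and solid image processing knowledge; it is the hardest part of the traditional classification approach.

  3. Feature vector normalization: the dimensions of the feature vector have very different scales (for example, a defect length of 50 pixels versus a contrast of 0.03), so the features must be scaled and normalized.

  4. Manual defect labeling: store each defect image in a folder named after its manually assigned label.

  5. Classify the defects with an SVM; classification accuracy is around 85%.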
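
Steps 3 and 5 above can be sketched with scikit-learn. This is only an illustration under stated assumptions: the two features (defect length in pixels and contrast) are synthetic stand-ins generated at random, and `StandardScaler` plus an RBF-kernel `SVC` is one reasonable choice, not necessarily the scaling method or SVM configuration used in the original pipeline.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for hand-crafted defect features (an assumption, not real data):
# column 0 = defect length in pixels, column 1 = contrast.
rng = np.random.default_rng(0)
n_per_class = 50
class_a = np.column_stack([rng.normal(50, 5, n_per_class),
                           rng.normal(0.03, 0.005, n_per_class)])
class_b = np.column_stack([rng.normal(50, 5, n_per_class),
                           rng.normal(0.06, 0.005, n_per_class)])
features = np.vstack([class_a, class_b])
labels = np.array([0] * n_per_class + [1] * n_per_class)

# StandardScaler rescales every feature dimension to zero mean and unit variance,
# so the pixel-scale length no longer dominates the tiny contrast values.
classifier = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
classifier.fit(features, labels)

scaled = classifier.named_steps["standardscaler"].transform(features)
train_accuracy = classifier.score(features, labels)
print("per-feature mean after scaling:", scaled.mean(axis=0).round(6))
print("per-feature std after scaling: ", scaled.std(axis=0).round(6))
print("training accuracy:", train_accuracy)
```

Putting the scaler inside the pipeline means the same scaling statistics learned on the training data are reused at prediction time, which avoids leaking test-set statistics into training.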

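2. Building the CNN

  After the defect images are separated and manually labeled, the next step is to build the CNN model. Industrial inspection has strict real-time requirements, so I chose a fairly simple network structure to speed up both training and inference.

  Network construction: this article follows the basic idea of the LeNet architecture to build a simple network.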

  

  

Figure 1: Network model as printed by TensorFlow
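
As a rough cross-check of the model summary, the layer output shapes and parameter counts can be worked out by hand. The sketch below takes its layer sizes (28x28x1 input, 20 and 50 "same"-padded convolution filters, 2x2 pooling, a 500-unit dense layer, 3 classes) from the LeNet listing in section 4; the helper names are hypothetical, and the numbers are computed, not copied from Figure 1.

```python
def conv_same(shape, filters, kernel):
    """Output shape and parameter count of a stride-1 'same' convolution."""
    height, width, channels = shape
    params = kernel * kernel * channels * filters + filters  # weights + biases
    return (height, width, filters), params


def max_pool(shape, size=2):
    """Output shape of a non-overlapping max pooling layer (no parameters)."""
    height, width, channels = shape
    return (height // size, width // size, channels), 0


def dense(units_in, units_out):
    """Parameter count of a fully connected layer (weights + biases)."""
    return units_in * units_out + units_out


def summarize(kernel, input_shape=(28, 28, 1), classes=3):
    """Walk through CONV => POOL => CONV => POOL => FC => softmax."""
    shape, total, rows = input_shape, 0, []
    steps = [
        ("conv1", lambda s: conv_same(s, 20, kernel)),
        ("pool1", max_pool),
        ("conv2", lambda s: conv_same(s, 50, kernel)),
        ("pool2", max_pool),
    ]
    for name, step in steps:
        shape, params = step(shape)
        rows.append((name, shape, params))
        total += params
    flat = shape[0] * shape[1] * shape[2]
    for name, units_in, units_out in [("fc1", flat, 500), ("softmax", 500, classes)]:
        params = dense(units_in, units_out)
        rows.append((name, (units_out,), params))
        total += params
    return rows, total


for kernel in (5, 3):
    rows, total = summarize(kernel)
    print(f"kernel {kernel}x{kernel}")
    for name, shape, params in rows:
        print(f"  {name:8s} output={shape} params={params}")
    print(f"  total params={total}")
```

Almost all parameters sit in the first dense layer (2450 x 500 weights), so the kernel size has little effect on model size; its main effect is on what each filter sees.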

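3. Model Training and Testing

3.1 Testing the Original Model

  At first I expected the model to overfit, but the accuracy and loss curves show no sign of overfitting. Instead, the model got stuck in a local loop during its initial iterations. Possible causes are that it had not yet learned particularly good features, that the randomly selected training data was not well spread across the classes, or that it simply had not been trained for enough epochs. Accuracy on the training set is somewhat low, so a better model is needed, but how should the model be changed? Although a CNN learns its filters on its own, it is still hard to see clearly what an image looks like after filtering (Figures 2 and 3), which is rather frustrating for someone who has always worked on low-level image algorithms.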
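
One of the suspicions above, that the random split did not spread the classes evenly, is cheap to check. The sketch below uses made-up label lists purely for illustration; `class_proportions` is a hypothetical helper, and in the training script it would be applied to the integer labels of each split before they are one-hot encoded.

```python
from collections import Counter


def class_proportions(labels):
    """Return the fraction of samples belonging to each class label."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: count / total for label, count in sorted(counts.items())}


# Hypothetical integer labels (0, 1, 2) for a training and a validation split.
train_labels = [0, 0, 0, 0, 1, 1, 1, 2, 2, 2]
val_labels = [0, 0, 0, 1, 2]

print("train:", class_proportions(train_labels))
print("val:  ", class_proportions(val_labels))
```

If the proportions differ a lot between the splits, scikit-learn's `train_test_split` accepts a `stratify=labels` argument that keeps the class ratios the same in both splits.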

  

Figure 2: First convolution layer

Figure 3: ReLU activation layer

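  Figure 2 shows that the filtering works reasonably well overall: the defect regions all stand out clearly. However, the input defects are concave (dented downward), while many of the filtered defects appear convex (raised upward), which does not match reality.

  Figure 3 shows that after the ReLU activation only the feature maps with clearly concave defects remain, and there are too few useful feature maps: only 2.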
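
Counting useful feature maps by eye is error-prone, so a small helper can do it numerically. The sketch below is an assumption-laden illustration: `count_active_maps` and its `min_fraction` threshold are hypothetical, "useful" is simplified to "not almost entirely zero", and the convolution output is synthetic, built so that only 2 of 20 maps have positive responses.

```python
import numpy as np


def relu(x):
    """Element-wise rectified linear unit: negative responses become zero."""
    return np.maximum(x, 0.0)


def count_active_maps(activation, min_fraction=0.01):
    """Count feature maps in which more than `min_fraction` of pixels are non-zero.

    `activation` has shape (1, height, width, channels), the layout Keras
    returns for a single "channels last" input image.
    """
    maps = activation[0]
    nonzero_fraction = (maps > 0).mean(axis=(0, 1))  # one value per channel
    return int((nonzero_fraction > min_fraction).sum())


# Synthetic convolution output (an assumption, not real data): 20 maps,
# of which only the first 2 contain positive responses.
rng = np.random.default_rng(0)
conv_output = -np.abs(rng.normal(size=(1, 28, 28, 20)))
conv_output[..., :2] = np.abs(conv_output[..., :2])

activated = relu(conv_output)
print("active maps after ReLU:", count_active_maps(activated))
```

In the training script in section 4, `activations[1]` holds the output of the first ReLU layer, so `count_active_maps(activations[1])` would give the corresponding count for a real test image.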

 

Figure 4: Convex image data

Figure 5: Concave image data

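  To obtain more feature maps that match the real defects, the defect edges need to stand out more so that they are not swamped by the large surrounding image area. I therefore reduced the convolution kernel size from the default 5x5 to 3x3.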
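
One way to quantify how much surrounding area each unit sees is its receptive field. The sketch below uses the standard recurrence (each layer grows the field by (kernel - 1) times the current spacing between units, and each stride multiplies that spacing) for a CONV => POOL => CONV => POOL stack with 2x2, stride-2 pooling, as in the listing in section 4. The helper names are hypothetical.

```python
def receptive_field(layers):
    """Receptive field (in input pixels) of one unit after a stack of layers.

    `layers` is a list of (kernel_size, stride) pairs, applied in order.
    """
    field, jump = 1, 1  # jump = distance in input pixels between adjacent units
    for kernel_size, stride in layers:
        field += (kernel_size - 1) * jump
        jump *= stride
    return field


def lenet_like(kernel_size):
    """CONV => POOL => CONV => POOL with 2x2, stride-2 pooling."""
    return [(kernel_size, 1), (2, 2), (kernel_size, 1), (2, 2)]


for kernel_size in (5, 3):
    stack = lenet_like(kernel_size)
    first_conv = receptive_field(stack[:1])
    last_pool = receptive_field(stack)
    print(f"{kernel_size}x{kernel_size} kernels: first conv sees {first_conv} px, "
          f"last pool sees {last_pool} px")
```

On a 28x28 input, a unit after the second pooling layer covers 16 pixels per side with 5x5 kernels but only 10 with 3x3 kernels, so each response mixes in less of the background around a defect.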

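3.2 After Optimizing the Kernel Size

  Overall model accuracy rises noticeably, and the number of useful feature maps after ReLU increases. One open question: accuracy on the validation set is 5 to 8 points higher than on the training set.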
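
A possible explanation, not verified against this data set: the training accuracy Keras reports is measured on the augmented batches (rotated, shifted, sheared, zoomed, flipped) and averaged over the epoch while the weights are still changing, whereas validation accuracy is measured on the unmodified images at the end of the epoch. A small validation set adds noise as well. To compare like with like, both splits can be evaluated without augmentation. The helper below is a hypothetical sketch that assumes a compiled Keras model with `metrics=["accuracy"]`, so that `model.evaluate` returns `[loss, accuracy]`.

```python
def compare_clean_accuracy(model, train_x, train_y, val_x, val_y):
    """Evaluate both splits on un-augmented data with the final weights.

    `model` is assumed to be a compiled Keras model whose `evaluate` method
    returns [loss, accuracy].
    """
    _, train_acc = model.evaluate(train_x, train_y, verbose=0)
    _, val_acc = model.evaluate(val_x, val_y, verbose=0)
    return {"train_acc": train_acc, "val_acc": val_acc, "gap": train_acc - val_acc}
```

After training, `compare_clean_accuracy(model, trainX, trainY, testX, testY)` would show whether the gap persists once augmentation is out of the picture. If the clean training accuracy is at or above the validation accuracy, the difference in the curves is most likely a measurement artifact.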

  

 

  

4. Code

# -*- coding: utf-8 -*-
# @Time    : 18-7-25 2:33 PM
# @Author  : DuanBin
# @Email   : [email protected]
# @File    : catl_train.py
# @Software: PyCharm

# USAGE
# python catl_train.py --dataset data --model catl.model

# import the necessary packages
from keras.preprocessing.image import ImageDataGenerator
from keras.optimizers import Adam
from sklearn.model_selection import train_test_split
from keras.preprocessing.image import img_to_array
from keras.utils import to_categorical
from keras.models import Model
from keras.models import load_model
from lenet import LeNet
from imutils import paths
import matplotlib.pyplot as plt
import numpy as np
import argparse
import random
import cv2
import os

# set the matplotlib backend so figures can be saved in the background
import matplotlib
matplotlib.use("Agg")

dataPath = "data"
modelPath = "catl_5_5.model"
plotPath = "catl_plot_5_5_blog.png"

# initialize the number of epochs to train for, initial learning rate,
# batch size, number of classes, and image depth (1 = grayscale)
EPOCHS = 50
INIT_LR = 0.001
BS = 3
classNumber = 3
imageDepth = 1

# initialize the data and labels
print("[INFO] loading images...")
data = []
labels = []

# grab the image paths and randomly shuffle them
imagePaths = sorted(list(paths.list_images(dataPath)))  # args["dataset"])))
random.seed(42)
random.shuffle(imagePaths)

# loop over the input images
for imagePath in imagePaths:
    # load the image, pre-process it, and store it in the data list
    image = cv2.imread(imagePath, 0)
    image = cv2.resize(image, (28, 28))
    image = img_to_array(image)
    data.append(image)

    # extract the class label from the image path and update the
    # labels list
    label = imagePath.split(os.path.sep)[-2]
    if label == "dity":
        label = 0
    elif label == "tan":
        label = 1
    elif label == "valley":
        label = 2
    labels.append(label)

# scale the raw pixel intensities to the range [0, 1]
data = np.array(data, dtype="float") / 255.0
labels = np.array(labels)

# partition the data into training and testing splits using 70% of
# the data for training and the remaining 30% for testing
(trainX, testX, trainY, testY) = train_test_split(data, labels, test_size=0.3, random_state=42)
print(trainX.shape)

# convert the labels from integers to vectors
trainY = to_categorical(trainY, num_classes=classNumber)
testY = to_categorical(testY, num_classes=classNumber)
print(trainY.shape)
print(testX.shape)


# construct the image generator for data augmentation
aug = ImageDataGenerator(rotation_range=30, width_shift_range=0.1,
                         height_shift_range=0.1, shear_range=0.2, zoom_range=0.2,
                         horizontal_flip=True, fill_mode="nearest")

# initialize the model
print("[INFO] compiling model...")
model = LeNet.build(width=28, height=28, depth=imageDepth, classes=classNumber)
opt = Adam(lr=INIT_LR, decay=INIT_LR / EPOCHS)
model.compile(loss="categorical_crossentropy", optimizer=opt,
              metrics=["accuracy"])

model.summary()

# train the network
print("[INFO] training network...")
H = model.fit_generator(aug.flow(trainX, trainY, batch_size=BS),
                        validation_data=(testX, testY), steps_per_epoch=len(trainX) // BS,
                        epochs=EPOCHS, verbose=1)

# save the model to disk
print("[INFO] serializing network...")
model.save(modelPath)  # args["model"])
model.save_weights("catl_5_5_wight.h5")

# plot the training loss and accuracy
plt.style.use("ggplot")
plt.figure()
N = EPOCHS
plt.plot(np.arange(0, N), H.history["loss"], label="train_loss")
plt.plot(np.arange(0, N), H.history["val_loss"], label="val_loss")
plt.plot(np.arange(0, N), H.history["acc"], label="train_acc")
plt.plot(np.arange(0, N), H.history["val_acc"], label="val_acc")
plt.title("Training Loss and Accuracy")
plt.xlabel("Epoch #")
plt.ylabel("Loss/Accuracy")
plt.legend(loc="lower left")
plt.savefig(plotPath)  # args["plot"])
plt.show()

layer_outputs = [layer.output for layer in model.layers]
activation_model = Model(inputs=model.input, outputs=layer_outputs)
activations = activation_model.predict(testX[0].reshape(1, 28, 28, 1))


def display_activation(activations, col_size, row_size, act_index):
    activation = activations[act_index]
    activation_index = 0
    fig, ax = plt.subplots(row_size, col_size, figsize=(row_size * 2.5, col_size * 1.5))
    for row in range(0, row_size):
        for col in range(0, col_size):
            ax[row][col].imshow(activation[0, :, :, activation_index], cmap='gray')
            activation_index += 1

    plt.show()


display_activation(activations, 4, 5, 1)

# @File    : lenet.py

# import the necessary packages
from keras.models import Sequential
from keras.layers.convolutional import Conv2D
from keras.layers.convolutional import MaxPooling2D
from keras.layers.core import Activation
from keras.layers.core import Flatten
from keras.layers.core import Dense
from keras.layers.core import Dropout
from keras import backend as K  # same keras package as the layers above


class LeNet:
    @staticmethod
    def build(width, height, depth, classes):
        # initialize the model
        model = Sequential()
        # default to the "channels last" ordering: (rows, cols, channels)
        inputShape = (height, width, depth)

        # if we are using "channels first", update the input shape
        if K.image_data_format() == "channels_first":
            inputShape = (depth, height, width)

        # first set of CONV => RELU => POOL layers
        model.add(Conv2D(20, (3, 3), padding="same", input_shape=inputShape))
        model.add(Activation("relu"))
        model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))

        # second set of CONV => RELU => POOL layers
        model.add(Conv2D(50, (3, 3), padding="same"))
        model.add(Activation("relu"))
        model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))


        # first (and only) set of FC => RELU layers
        model.add(Flatten())
        model.add(Dense(500))
        model.add(Activation("relu"))

        # softmax classifier
        model.add(Dense(classes))
        model.add(Activation("softmax"))

        # return the constructed network architecture
        return model

 

  
