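I have recently been learning how ImageAI handles images, and this post records what I learned along the way.
Goal: use my own image library to predict LCD screen anomalies.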
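1. Resize your images to 224×224 pixels (the model's default input size). Create a directory named tf-train (any name works) with two subdirectories, train and test. In each of them, put the normal images in a folder named normal and the images that show a defect in a folder named abnormal. The folder names are used as the class names.
Both train and test contain these two folders. The train directory holds 512 images in total and test holds 64.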
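The folder layout described in step 1 can be created and checked with a short script. This is only a sketch using the standard library; the helper names `create_layout` and `count_images` and the list of accepted file extensions are my own assumptions, not part of ImageAI.

```python
import tempfile
from pathlib import Path

SPLITS = ("train", "test")
CLASSES = ("normal", "abnormal")
# Assumed set of image extensions; adjust to match your own files.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def create_layout(root):
    """Create root/<split>/<class> folders if they do not exist yet."""
    root = Path(root)
    for split in SPLITS:
        for cls in CLASSES:
            (root / split / cls).mkdir(parents=True, exist_ok=True)
    return root


def count_images(root):
    """Return {split: {class: number of image files}} for the layout."""
    root = Path(root)
    counts = {}
    for split in SPLITS:
        counts[split] = {}
        for cls in CLASSES:
            folder = root / split / cls
            if not folder.is_dir():
                counts[split][cls] = 0
                continue
            counts[split][cls] = sum(
                1
                for p in folder.iterdir()
                if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
            )
    return counts


if __name__ == "__main__":
    # Demo in a temporary directory so nothing real is touched.
    with tempfile.TemporaryDirectory() as tmp:
        root = create_layout(Path(tmp) / "tf-train")
        (root / "train" / "normal" / "demo.jpg").touch()
        print(count_images(root))
```

Running `count_images("tf-train")` on the real data before training is a quick way to confirm that the totals match the expected 512 training and 64 test images.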
2. Create a .py file for model training
from imageai.Prediction.Custom import ModelTraining
model_trainer = ModelTraining()
model_trainer.setModelTypeAsResNet()
model_trainer.setDataDirectory("tf-train")
model_trainer.trainModel(num_objects=2, num_experiments=100, enhance_data=True, batch_size=4, show_network_summary=True)
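Since this project has only two classes, num_objects = 2. Training stops after 100 epochs. batch_size can be set according to the available GPU memory; I first tried 32, but the GPU could not run it, so I lowered it to 4.
Then wait for training to finish: 100 epochs in total, with steps per epoch = number of images / batch_size = 512 / 4 = 128.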
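The arithmetic above can be written as a small helper to see how batch_size changes the number of steps. This is a sketch only: `steps_per_epoch` is a hypothetical function of mine, and rounding down when the image count is not a multiple of the batch size is an assumption (it makes no difference for 512 images).

```python
def steps_per_epoch(num_images, batch_size):
    """Number of batches processed in one epoch (partial batch ignored)."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return num_images // batch_size


if __name__ == "__main__":
    epochs = 100
    for batch_size in (4, 32):
        steps = steps_per_epoch(512, batch_size)
        print(batch_size, steps, steps * epochs)
```

With batch_size=4 this gives 128 steps per epoch and 12,800 steps over 100 epochs; with batch_size=32 it would be 16 steps per epoch.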
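3. When training finishes, two folders are created under tf-train: the json folder holds the class labels, and the models folder holds the models saved over the 100 training epochs.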
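To choose which saved model to load for prediction, the file names can be parsed. The sketch below assumes every model file follows the pattern of the one used later in this post, model_ex-100_acc-1.000000.h5 (epoch number, then accuracy). The helper names and the other file names in the demo are made up for illustration.

```python
import re

# Assumed pattern: model_ex-<epoch>_acc-<accuracy>.h5
MODEL_NAME = re.compile(r"model_ex-(\d+)_acc-(\d+\.\d+)\.h5$")


def parse_model_name(filename):
    """Return (epoch, accuracy) or None if the name does not match."""
    match = MODEL_NAME.search(filename)
    if match is None:
        return None
    return int(match.group(1)), float(match.group(2))


def best_model(filenames):
    """Pick the highest accuracy; ties go to the later epoch."""
    scored = []
    for name in filenames:
        parsed = parse_model_name(name)
        if parsed is not None:
            epoch, accuracy = parsed
            scored.append((accuracy, epoch, name))
    if not scored:
        return None
    return max(scored)[2]


if __name__ == "__main__":
    # Hypothetical directory listing, for illustration only.
    names = [
        "model_ex-001_acc-0.750000.h5",
        "model_ex-050_acc-1.000000.h5",
        "model_ex-100_acc-1.000000.h5",
        "readme.txt",
    ]
    print(best_model(names))
```

In practice the list would come from `os.listdir` on the models folder, and the chosen file would be passed to `setModelPath`.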
4. Run prediction
from imageai.Prediction.Custom import CustomImagePrediction
import os
import glob
import numpy as np
execution_path = os.getcwd()
# Load the trained model and its class labels
prediction = CustomImagePrediction()
prediction.setModelTypeAsResNet()
prediction.setModelPath(os.path.join(execution_path, "model_ex-100_acc-1.000000.h5"))
prediction.setJsonPath(os.path.join(execution_path, "model_class.json"))
prediction.loadModel(num_objects=2)
# Three ways to predict (options 2 and 3 are commented out)
# 1. Predict a specific image in the current path
predictions, probabilities = prediction.predictImage(os.path.join(execution_path, "t4.jpg"), result_count=1)
for eachPrediction, eachProbability in zip(predictions, probabilities):
    print(eachPrediction, " : ", eachProbability)
# 2. Predict every image in a given folder under the current path, using predictImage
# for files in glob.glob(os.path.join(execution_path, r"tf-train\test\normal\*.jpg")):
#     filepath, filename = os.path.split(files)
#     predictions, probabilities = prediction.predictImage(files, result_count=1)
#     for eachPrediction, eachProbability in zip(predictions, probabilities):
#         print(filename, " predicted is -->", eachPrediction, " : ", eachProbability)
# 3. Predict every image in a given folder under the current path, using predictMultipleImages
# imgpathls = []
# imgnamels = []
# for img in glob.glob(os.path.join(execution_path, r"tf-train\test\normal\*.jpg")):
#     imgpathls.append(img)
#     imgnamels.append(os.path.basename(img))
# output = prediction.predictMultipleImages(np.array(imgpathls))
# for k in np.arange(0, len(output)):
#     print(imgnamels[k], " predicted is -->", output[k]["predictions"][0], " : ", output[k]["percentage_probabilities"][0])
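When predicting a whole folder (options 2 and 3), reading the printed lines one by one gets tedious. The sketch below shows one way to tally the results for a folder whose true class is known. It works on plain (filename, predicted label) pairs, so it does not call ImageAI itself; the `summarize` helper and the demo data are hypothetical.

```python
from collections import Counter


def summarize(results, expected_label):
    """Summarize (filename, predicted_label) pairs for one class folder.

    Returns (accuracy, label counts, list of misclassified filenames).
    """
    results = list(results)
    counts = Counter(label for _, label in results)
    wrong = [name for name, label in results if label != expected_label]
    accuracy = counts[expected_label] / len(results) if results else 0.0
    return accuracy, counts, wrong


if __name__ == "__main__":
    # Made-up predictions for images taken from the "normal" folder.
    demo = [
        ("a.jpg", "normal"),
        ("b.jpg", "normal"),
        ("c.jpg", "abnormal"),
        ("d.jpg", "normal"),
    ]
    accuracy, counts, wrong = summarize(demo, "normal")
    print(accuracy, dict(counts), wrong)
```

The pairs can be collected inside the loop of option 2 by appending `(filename, predictions[0])` to a list instead of printing.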
Running the prediction script from step 4 prints each predicted class together with its probability.
Some other good resources worth studying:
- Somshubra Majumdar, DenseNet Implementation of the paper, Densely Connected Convolutional Networks in Keras
https://github.com/titu1994/DenseNet/
- Broad Institute of MIT and Harvard, Keras package for deep residual networks
https://github.com/broadinstitute/keras-resnet
- Fizyr, Keras implementation of RetinaNet object detection
https://github.com/fizyr/keras-retinanet
- Francois Chollet, Keras code and weights files for popular deep learning models
https://github.com/fchollet/deep-learning-models
- Forrest N. et al, SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size
https://arxiv.org/abs/1602.07360
- Kaiming H. et al, Deep Residual Learning for Image Recognition
https://arxiv.org/abs/1512.03385
- Szegedy et al, Rethinking the Inception Architecture for Computer Vision
https://arxiv.org/abs/1512.00567
- Gao et al, Densely Connected Convolutional Networks
https://arxiv.org/abs/1608.06993
- Tsung-Yi et al, Focal Loss for Dense Object Detection
https://arxiv.org/abs/1708.02002
- O Russakovsky et al, ImageNet Large Scale Visual Recognition Challenge
https://arxiv.org/abs/1409.0575
- TY Lin et al, Microsoft COCO: Common Objects in Context
https://arxiv.org/abs/1405.0312
- Moses & John Olafenwa, A collection of images of identifiable professionals.
https://github.com/OlafenwaMoses/IdenProf