COVID-19: Face Mask Detector with OpenCV, Keras/TensorFlow, and Deep Learning

In this tutorial, you will learn how to train a COVID-19 face mask detector with OpenCV, Keras/TensorFlow, and deep learning.

  • Detect COVID-19 face masks in images

  • Detect face masks in real-time video streams



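  • Training: here we load the face mask detection dataset from disk, train a model on it (using Keras/TensorFlow), and then serialize the face mask detector to disk

  • Deployment: once the face mask detector is trained, we can load it, perform face detection, and then classify each face as with_mask or without_mask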


  • with_mask: 690 images

  • without_mask: 686 images
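A quick way to confirm class counts like these is to tally the files in each class subdirectory. The sketch below assumes a layout with one folder per class (for example dataset/with_mask and dataset/without_mask); the helper name count_images and the list of image extensions are assumptions made for illustration.

```python
from collections import Counter
from pathlib import Path

# File extensions treated as images (an assumption for this sketch).
IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".bmp"}


def count_images(dataset_dir):
    """Return a Counter mapping each class folder name to its image count."""
    counts = Counter()
    for path in Path(dataset_dir).rglob("*"):
        if path.is_file() and path.suffix.lower() in IMAGE_EXTS:
            # The class label is the name of the directory holding the file.
            counts[path.parent.name] += 1
    return counts
```

Calling count_images("dataset") on the dataset described here should report one entry per class folder.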






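We then use dlib to detect facial landmarks (which localize the eyes, nose, mouth, and so on) so that we know where on the face to place the mask:

Next, we need an image of a mask with a transparent background, such as the following example of a COVID-19 face mask/shield. Because we know the facial landmark positions, the mask is automatically overlaid on the original face ROI: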
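Overlaying a mask image with a transparent background amounts to alpha compositing: each output pixel is a weighted blend of the mask pixel and the face pixel, with the mask's alpha channel as the weight. The following is a minimal plain-Python sketch on single pixels; the function name blend_pixel and the sample values are illustrative and are not taken from the code that generated the dataset.

```python
def blend_pixel(face_rgb, mask_rgba):
    """Alpha-composite one RGBA mask pixel over one RGB face pixel.

    Channels are integers in [0, 255]; alpha 0 is fully transparent.
    """
    alpha = mask_rgba[3] / 255.0
    return tuple(
        round(alpha * m + (1.0 - alpha) * f)
        for f, m in zip(face_rgb, mask_rgba[:3])
    )


face = (200, 150, 120)                          # hypothetical face pixel
print(blend_pixel(face, (255, 255, 255, 0)))    # transparent: face unchanged
print(blend_pixel(face, (255, 255, 255, 255)))  # opaque: mask pixel wins
```

In practice the same blend would be applied to every pixel of the face ROI, typically with array operations rather than a Python loop.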







$ tree --dirsfirst --filelimit 10
.
├── dataset
│   ├── with_mask [690 entries]
│   └── without_mask [686 entries]
├── examples
│   ├── example_01.png
│   ├── example_02.png
│   └── example_03.png
├── face_detector
│   ├── deploy.prototxt
│   └── res10_300x300_ssd_iter_140000.caffemodel
├──
├──
├── mask_detector.model
├── plot.png
└── train_mask_detector.py

5 directories, 10 files






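Next, we use Keras and TensorFlow to train a classifier that automatically detects whether a person is wearing a mask. Open the train_mask_detector.py file and insert the following code: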

# import the necessary packages
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.applications import MobileNetV2
from tensorflow.keras.layers import AveragePooling2D
from tensorflow.keras.layers import Dropout
from tensorflow.keras.layers import Flatten
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import Input
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.applications.mobilenet_v2 import preprocess_input
from tensorflow.keras.preprocessing.image import img_to_array
from tensorflow.keras.preprocessing.image import load_img
from tensorflow.keras.utils import to_categorical
from sklearn.preprocessing import LabelBinarizer
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report
from imutils import paths
import matplotlib.pyplot as plt
import numpy as np
import argparse
import os


  • Data augmentation

  • Loading the MobileNetV2 classifier (we will fine-tune this model with pre-trained ImageNet weights)

  • Preprocessing

  • Loading image data
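For MobileNetV2, the Keras preprocess_input helper rescales pixel intensities from [0, 255] to [-1, 1]. The plain-Python sketch below mirrors that arithmetic on single values; scale_pixel is a hypothetical name used only for illustration, not part of Keras.

```python
def scale_pixel(value):
    """Map an 8-bit pixel value in [0, 255] to the range [-1, 1]."""
    return value / 127.5 - 1.0


for v in (0, 127.5, 255):
    print(v, "->", scale_pixel(v))
```

Black maps to -1.0, mid-gray to 0.0, and white to 1.0, so the inputs are centered around zero in the same way as during ImageNet pre-training.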

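We use scikit-learn (sklearn) to binarize the class labels, split the dataset, and print a classification report. The imutils paths implementation helps us find and list the images in the dataset. We use matplotlib to plot the training curves.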
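To make the label encoding concrete, here is a small plain-Python stand-in for the combination of label binarization and one-hot encoding. It is a sketch of the idea only, not the scikit-learn or Keras implementation, and one_hot is a hypothetical helper name.

```python
def one_hot(labels):
    """Encode string labels as one-hot rows, with classes in sorted order."""
    classes = sorted(set(labels))
    index = {name: i for i, name in enumerate(classes)}
    rows = [
        [1.0 if index[label] == i else 0.0 for i in range(len(classes))]
        for label in labels
    ]
    return classes, rows


classes, encoded = one_hot(["with_mask", "without_mask", "with_mask"])
print(classes)   # ['with_mask', 'without_mask']
print(encoded)   # [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
```

With two classes, with_mask becomes [1, 0] and without_mask becomes [0, 1].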


# construct the argument parser and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-d", "--dataset", required=True,
    help="path to input dataset")
ap.add_argument("-p", "--plot", type=str, default="plot.png",
    help="path to output loss/accuracy plot")
ap.add_argument("-m", "--model", type=str,
    default="mask_detector.model",
    help="path to output face mask detector model")
args = vars(ap.parse_args())




# initial learning rate, number of epochs to train for, and batch size
INIT_LR = 1e-4
EPOCHS = 20
BS = 32



# grab the list of images in our dataset directory, then initialize
# the list of data (i.e., images) and class images
print("[INFO] loading images...")
imagePaths = list(paths.list_images(args["dataset"]))
data = []
labels = []

# loop over the image paths
for imagePath in imagePaths:
    # extract the class label from the filename
    label = imagePath.split(os.path.sep)[-2]

    # load the input image (224x224) and preprocess it
    image = load_img(imagePath, target_size=(224, 224))
    image = img_to_array(image)
    image = preprocess_input(image)

    # update the data and labels lists, respectively
    data.append(image)
    labels.append(label)

# convert the data and labels to NumPy arrays
data = np.array(data, dtype="float32")
labels = np.array(labels)


# perform one-hot encoding on the labels
lb = LabelBinarizer()
labels = lb.fit_transform(labels)
labels = to_categorical(labels)

# partition the data into training and testing splits using 80% of
# the data for training and the remaining 20% for testing
(trainX, testX, trainY, testY) = train_test_split(data, labels,
    test_size=0.20, stratify=labels, random_state=42)

# construct the training image generator for data augmentation
aug = ImageDataGenerator(
    rotation_range=20,
    zoom_range=0.15,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.15,
    horizontal_flip=True,
    fill_mode="nearest")


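$ python train_mask_detector.py --dataset dataset
[INFO] loading images...
-> (trainX, testX, trainY, testY) = train_test_split(data, labels,
(Pdb) labels[500:]
array([[1., 0.],
       [1., 0.],
       [1., 0.],
       ...,
       [0., 1.],
       [0., 1.],
       [0., 1.]], dtype=float32)
(Pdb)

Each element of our labels array is itself an array, a one-hot vector as the output above shows. During training, we apply on-the-fly mutations to our images to improve generalization. This is data augmentation, where the random rotation, zoom, shear, shift, and flip parameters are set. We use the aug object at training time. Next, we need to prepare MobileNetV2 for fine-tuning: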
# load the MobileNetV2 network, ensuring the head FC layer sets are
# left off
baseModel = MobileNetV2(weights="imagenet", include_top=False,
    input_tensor=Input(shape=(224, 224, 3)))

# construct the head of the model that will be placed on top of the
# the base model
headModel = baseModel.output
headModel = AveragePooling2D(pool_size=(7, 7))(headModel)
headModel = Flatten(name="flatten")(headModel)
headModel = Dense(128, activation="relu")(headModel)
headModel = Dropout(0.5)(headModel)
headModel = Dense(2, activation="softmax")(headModel)

# place the head FC model on top of the base model (this will become
# the actual model we will train)
model = Model(inputs=baseModel.input, outputs=headModel)

# loop over all layers in the base model and freeze them so they will
# *not* be updated during the first training process
for layer in baseModel.layers:
    layer.trainable = False
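Because the base model is frozen, only the new head is trained. As a rough sanity check, the arithmetic below counts the head's trainable parameters, assuming the standard MobileNetV2 output of a 7x7x1280 feature map for 224x224 inputs; dense_params is a hypothetical helper, and the pooling, flatten, and dropout layers contribute no weights.

```python
def dense_params(n_inputs, n_units):
    """Weights plus biases of a fully connected layer."""
    return n_inputs * n_units + n_units


# 7x7 average pooling over a 7x7x1280 map, followed by Flatten,
# leaves a vector of 1280 features.
features = 1 * 1 * 1280

head_params = dense_params(features, 128) + dense_params(128, 2)
print(head_params)  # 164226
```

That is a small fraction of the full network, which is part of why training the head alone is fast.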


# compile our model
print("[INFO] compiling model...")
opt = Adam(lr=INIT_LR, decay=INIT_LR / EPOCHS)
model.compile(loss="binary_crossentropy", optimizer=opt,
    metrics=["accuracy"])

# train the head of the network
print("[INFO] training head...")
H = model.fit(
    aug.flow(trainX, trainY, batch_size=BS),
    steps_per_epoch=len(trainX) // BS,
    validation_data=(testX, testY),
    validation_steps=len(testX) // BS,
    epochs=EPOCHS)
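The decay argument passed to Adam shrinks the learning rate as training progresses. The sketch below assumes the time-based schedule used by the legacy Keras optimizers, in which the rate after a given number of update steps is the initial rate divided by (1 + decay * step); decayed_lr is a hypothetical helper, and newer Keras versions may handle this argument differently.

```python
INIT_LR = 1e-4
EPOCHS = 20
DECAY = INIT_LR / EPOCHS


def decayed_lr(step, init_lr=INIT_LR, decay=DECAY):
    """Time-based decay: the rate shrinks as the update count grows."""
    return init_lr / (1.0 + decay * step)


steps_per_epoch = 34  # assumed here: about 1,100 training images, batch size 32
for epoch in (0, 10, 20):
    print(epoch, decayed_lr(epoch * steps_per_epoch))
```

With these values the decay is very gentle, so the learning rate stays close to INIT_LR throughout the 20 epochs.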


# make predictions on the testing set
print("[INFO] evaluating network...")
predIdxs = model.predict(testX, batch_size=BS)

# for each image in the testing set we need to find the index of the
# label with corresponding largest predicted probability
predIdxs = np.argmax(predIdxs, axis=1)

# show a nicely formatted classification report
print(classification_report(testY.argmax(axis=1), predIdxs,
    target_names=lb.classes_))

# serialize the model to disk
print("[INFO] saving mask detector model...")
model.save(args["model"], save_format="h5")
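The classification report summarizes each class with precision, recall, and F1-score. As a reminder of how those figures are derived, here is a minimal sketch computing them from true-positive, false-positive, and false-negative counts. The counts are hypothetical and chosen only for illustration; they are not taken from the trained model.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F1 from raw prediction counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts for one class.
p, r, f = precision_recall_f1(tp=137, fp=1, fn=1)
print("{:.2f} {:.2f} {:.2f}".format(p, r, f))
```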


# plot the training loss and accuracy
N = EPOCHS
plt.style.use("ggplot")
plt.figure()
plt.plot(np.arange(0, N), H.history["loss"], label="train_loss")
plt.plot(np.arange(0, N), H.history["val_loss"], label="val_loss")
plt.plot(np.arange(0, N), H.history["accuracy"], label="train_acc")
plt.plot(np.arange(0, N), H.history["val_accuracy"], label="val_acc")
plt.title("Training Loss and Accuracy")
plt.xlabel("Epoch #")
plt.ylabel("Loss/Accuracy")
plt.legend(loc="lower left")
plt.savefig(args["plot"])


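We are now ready to train our face mask detector with Keras, TensorFlow, and deep learning. Make sure you have downloaded the source code and face mask dataset from the beginning of the article. From there, open a terminal and execute the following command: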

$ python train_mask_detector.py --dataset dataset
[INFO] loading images...
[INFO] compiling model...
[INFO] training head...
Train for 34 steps, validate on 276 samples
Epoch 1/20
34/34 [==============================] - 30s 885ms/step - loss: 0.6431 - accuracy: 0.6676 - val_loss: 0.3696 - val_accuracy: 0.8242
Epoch 2/20
34/34 [==============================] - 29s 853ms/step - loss: 0.3507 - accuracy: 0.8567 - val_loss: 0.1964 - val_accuracy: 0.9375
Epoch 3/20
34/34 [==============================] - 27s 800ms/step - loss: 0.2792 - accuracy: 0.8820 - val_loss: 0.1383 - val_accuracy: 0.9531
Epoch 4/20
34/34 [==============================] - 28s 814ms/step - loss: 0.2196 - accuracy: 0.9148 - val_loss: 0.1306 - val_accuracy: 0.9492
Epoch 5/20
34/34 [==============================] - 27s 792ms/step - loss: 0.2006 - accuracy: 0.9213 - val_loss: 0.0863 - val_accuracy: 0.9688
...
Epoch 16/20
34/34 [==============================] - 27s 801ms/step - loss: 0.0767 - accuracy: 0.9766 - val_loss: 0.0291 - val_accuracy: 0.9922
Epoch 17/20
34/34 [==============================] - 27s 795ms/step - loss: 0.1042 - accuracy: 0.9616 - val_loss: 0.0243 - val_accuracy: 1.0000
Epoch 18/20
34/34 [==============================] - 27s 796ms/step - loss: 0.0804 - accuracy: 0.9672 - val_loss: 0.0244 - val_accuracy: 0.9961
Epoch 19/20
34/34 [==============================] - 27s 793ms/step - loss: 0.0836 - accuracy: 0.9710 - val_loss: 0.0440 - val_accuracy: 0.9883
Epoch 20/20
34/34 [==============================] - 28s 838ms/step - loss: 0.0717 - accuracy: 0.9710 - val_loss: 0.0270 - val_accuracy: 0.9922
[INFO] evaluating network...
              precision    recall  f1-score   support

   with_mask       0.99      1.00      0.99       138
without_mask       1.00      0.99      0.99       138

    accuracy                           0.99       276
   macro avg       0.99      0.99      0.99       276
weighted avg       0.99      0.99      0.99       276
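The "34 steps" and "276 samples" in the log follow directly from the dataset size, the 80/20 split, and the batch size. The short calculation below reproduces them, assuming scikit-learn rounds the test split up to a whole number of samples.

```python
import math

n_images = 690 + 686   # with_mask + without_mask
test_size = 0.20
batch_size = 32

n_test = math.ceil(test_size * n_images)   # test split rounded up
n_train = n_images - n_test
steps_per_epoch = n_train // batch_size

print(n_train, n_test, steps_per_epoch)    # 1100 276 34
```

The 276 test images also match the support column of the classification report, with 138 images per class because the split is stratified.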



  • Load an input image from disk

  • Detect faces in the image

  • Apply our face mask detector to classify each face as with_mask or without_mask


# import the necessary packages
from tensorflow.keras.applications.mobilenet_v2 import preprocess_input
from tensorflow.keras.preprocessing.image import img_to_array
from tensorflow.keras.models import load_model
import numpy as np
import argparse
import cv2
import os


# construct the argument parser and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--image", required=True,
    help="path to input image")
ap.add_argument("-f", "--face", type=str,
    default="face_detector",
    help="path to face detector model directory")
ap.add_argument("-m", "--model", type=str,
    default="mask_detector.model",
    help="path to trained face mask detector model")
ap.add_argument("-c", "--confidence", type=float, default=0.5,
    help="minimum probability to filter weak detections")
args = vars(ap.parse_args())



--face: the path to the face detector model directory (we need to localize faces before classifying them)




# load our serialized face detector model from disk
print("[INFO] loading face detector model...")
prototxtPath = os.path.sep.join([args["face"], "deploy.prototxt"])
weightsPath = os.path.sep.join([args["face"],
    "res10_300x300_ssd_iter_140000.caffemodel"])
net = cv2.dnn.readNet(prototxtPath, weightsPath)

# load the face mask detector model from disk
print("[INFO] loading face mask detector model...")
model = load_model(args["model"])


# load the input image from disk, clone it, and grab the image spatial
# dimensions
image = cv2.imread(args["image"])
orig = image.copy()
(h, w) = image.shape[:2]

# construct a blob from the image
blob = cv2.dnn.blobFromImage(image, 1.0, (300, 300),
    (104.0, 177.0, 123.0))

# pass the blob through the network and obtain the face detections
print("[INFO] computing face detections...")
net.setInput(blob)
detections = net.forward()

Once we know where each face is predicted to be, we make sure it meets the --confidence threshold before extracting the face ROIs:

# loop over the detections
for i in range(0, detections.shape[2]):
    # extract the confidence (i.e., probability) associated with
    # the detection
    confidence = detections[0, 0, i, 2]

    # filter out weak detections by ensuring the confidence is
    # greater than the minimum confidence
    if confidence > args["confidence"]:
        # compute the (x, y)-coordinates of the bounding box for
        # the object
        box = detections[0, 0, i, 3:7] * np.array([w, h, w, h])
        (startX, startY, endX, endY) = box.astype("int")

        # ensure the bounding boxes fall within the dimensions of
        # the frame
        (startX, startY) = (max(0, startX), max(0, startY))
        (endX, endY) = (min(w - 1, endX), min(h - 1, endY))
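The bounding-box arithmetic in this step can be tried in isolation. The detector returns box corners normalized to [0, 1], so they are scaled by the image width and height and then clamped to the image borders. The sketch below uses plain Python instead of NumPy, and scale_and_clamp and the sample box are hypothetical.

```python
def scale_and_clamp(box, w, h):
    """Convert a normalized (x1, y1, x2, y2) box to clamped pixel coordinates."""
    start_x, start_y = int(box[0] * w), int(box[1] * h)
    end_x, end_y = int(box[2] * w), int(box[3] * h)
    # Keep the box inside the image so that slicing the ROI stays valid.
    start_x, start_y = max(0, start_x), max(0, start_y)
    end_x, end_y = min(w - 1, end_x), min(h - 1, end_y)
    return start_x, start_y, end_x, end_y


# Hypothetical detection that spills past the left and right borders.
print(scale_and_clamp((-0.05, 0.125, 1.02, 0.75), w=400, h=300))
```

Without the clamping, a box that extends past the frame would produce negative or out-of-range indices when the face ROI is sliced out.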


        # extract the face ROI, convert it from BGR to RGB channel
        # ordering, resize it to 224x224, and preprocess it
        face = image[startY:endY, startX:endX]
        face = cv2.cvtColor(face, cv2.COLOR_BGR2RGB)
        face = cv2.resize(face, (224, 224))
        face = img_to_array(face)
        face = preprocess_input(face)
        face = np.expand_dims(face, axis=0)

        # pass the face through the model to determine if the face
        # has a mask or not
        (mask, withoutMask) = model.predict(face)[0]

        # determine the class label and color we'll use to draw
        # the bounding box and text
        label = "Mask" if mask > withoutMask else "No Mask"
        color = (0, 255, 0) if label == "Mask" else (0, 0, 255)

        # include the probability in the label
        label = "{}: {:.2f}%".format(label, max(mask, withoutMask) * 100)

        # display the label and bounding box rectangle on the output
        # frame
        cv2.putText(image, label, (startX, startY - 10),
            cv2.FONT_HERSHEY_SIMPLEX, 0.45, color, 2)
        cv2.rectangle(image, (startX, startY), (endX, endY), color, 2)

# show the output image
cv2.imshow("Output", image)
cv2.waitKey(0)
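The labeling logic is easy to check on its own. The sketch below wraps it in a hypothetical make_label helper and feeds it made-up probabilities; note that OpenCV colors are in BGR order, so (0, 255, 0) is green and (0, 0, 255) is red.

```python
def make_label(mask, without_mask):
    """Return the display text and BGR color for one prediction."""
    label = "Mask" if mask > without_mask else "No Mask"
    color = (0, 255, 0) if label == "Mask" else (0, 0, 255)
    text = "{}: {:.2f}%".format(label, max(mask, without_mask) * 100)
    return text, color


print(make_label(0.9731, 0.0269))   # made-up probabilities
print(make_label(0.25, 0.75))
```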




$ python --image examples/example_01.png
[INFO] loading face detector model...
[INFO] loading face mask detector model...
[INFO] computing face detections...


$ python --image examples/example_02.png
[INFO] loading face detector model...
[INFO] loading face mask detector model...
[INFO] computing face detections...


$ python --image examples/example_03.png
[INFO] loading face detector model...
[INFO] loading face mask detector model...
[INFO] computing face detections...



Open the detect_mask_video.py file in your directory structure and insert the following code:

# import the necessary packages
from tensorflow.keras.applications.mobilenet_v2 import preprocess_input
from tensorflow.keras.preprocessing.image import img_to_array
from tensorflow.keras.models import load_model
from imutils.video import VideoStream
import numpy as np
import argparse
import imutils
import time
import cv2
import os

Our face detection/mask prediction logic for this script lives in the detect_and_predict_mask function:

def detect_and_predict_mask(frame, faceNet, maskNet):
    # grab the dimensions of the frame and then construct a blob
    # from it
    (h, w) = frame.shape[:2]
    blob = cv2.dnn.blobFromImage(frame, 1.0, (300, 300),
        (104.0, 177.0, 123.0))

    # pass the blob through the network and obtain the face detections
    faceNet.setInput(blob)
    detections = faceNet.forward()

    # initialize our list of faces, their corresponding locations,
    # and the list of predictions from our face mask network
    faces = []
    locs = []
    preds = []



  • frame: a frame from our stream

  • faceNet: the model used to detect where faces are in the image

  • maskNet: our COVID-19 face mask classifier model

    # loop over the detections
    for i in range(0, detections.shape[2]):
        # extract the confidence (i.e., probability) associated with
        # the detection
        confidence = detections[0, 0, i, 2]

        # filter out weak detections by ensuring the confidence is
        # greater than the minimum confidence
        if confidence > args["confidence"]:
            # compute the (x, y)-coordinates of the bounding box for
            # the object
            box = detections[0, 0, i, 3:7] * np.array([w, h, w, h])
            (startX, startY, endX, endY) = box.astype("int")

            # ensure the bounding boxes fall within the dimensions of
            # the frame
            (startX, startY) = (max(0, startX), max(0, startY))
            (endX, endY) = (min(w - 1, endX), min(h - 1, endY))

            # extract the face ROI, convert it from BGR to RGB channel
            # ordering, resize it to 224x224, and preprocess it
            face = frame[startY:endY, startX:endX]
            face = cv2.cvtColor(face, cv2.COLOR_BGR2RGB)
            face = cv2.resize(face, (224, 224))
            face = img_to_array(face)
            face = preprocess_input(face)
            face = np.expand_dims(face, axis=0)

            # add the face and bounding boxes to their respective
            # lists
            faces.append(face)
            locs.append((startX, startY, endX, endY))


    # only make a predictions if at least one face was detected
    if len(faces) > 0:
        # for faster inference we'll make batch predictions on *all*
        # faces at the same time rather than one-by-one predictions
        # in the above `for` loop; each face already carries a leading
        # batch axis, so stack them into one (N, 224, 224, 3) array
        faces = np.vstack(faces)
        preds = maskNet.predict(faces)

    # return a 2-tuple of the face locations and their corresponding
    # predictions
    return (locs, preds)
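The function relies on locs and preds staying in the same order, so that each bounding box can later be paired with its prediction. The toy sketch below shows that contract with a stand-in classifier; fake_predict, the placeholder faces, and the coordinates are all hypothetical and merely stand in for the real model and face ROIs.

```python
def fake_predict(batch):
    """Stand-in classifier: one (mask, withoutMask) pair per input item."""
    return [(0.9, 0.1) if item == "masked" else (0.2, 0.8) for item in batch]


faces = ["masked", "bare"]                        # placeholders for face ROIs
locs = [(10, 20, 110, 140), (200, 30, 290, 150)]  # matching bounding boxes

# One call handles the whole batch; skip it when no faces were found.
preds = fake_predict(faces) if faces else []

for box, (mask, without_mask) in zip(locs, preds):
    print(box, "Mask" if mask > without_mask else "No Mask")
```

Batching this way means the classifier runs once per frame rather than once per face, which matters more as the number of faces in view grows.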


# construct the argument parser and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-f", "--face", type=str,
    default="face_detector",
    help="path to face detector model directory")
ap.add_argument("-m", "--model", type=str,
    default="mask_detector.model",
    help="path to trained face mask detector model")
ap.add_argument("-c", "--confidence", type=float, default=0.5,
    help="minimum probability to filter weak detections")
args = vars(ap.parse_args())


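  • --face: the path to the face detector directory

  • --model: the path to our trained face mask classifier

  • --confidence: the minimum probability threshold used to filter weak face detections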
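To see how these defaults and overrides behave without launching the full script, the parser can be exercised with an explicit argument list. This is a small sketch using only the standard library; build_parser is a hypothetical wrapper around the same three arguments.

```python
import argparse


def build_parser():
    """Build a parser with the three arguments the video script accepts."""
    ap = argparse.ArgumentParser()
    ap.add_argument("-f", "--face", type=str, default="face_detector",
                    help="path to face detector model directory")
    ap.add_argument("-m", "--model", type=str, default="mask_detector.model",
                    help="path to trained face mask detector model")
    ap.add_argument("-c", "--confidence", type=float, default=0.5,
                    help="minimum probability to filter weak detections")
    return ap


# Parse an explicit argument list instead of sys.argv.
args = vars(build_parser().parse_args(["--confidence", "0.7"]))
print(args)
```

Only --confidence is overridden here, so --face and --model fall back to their defaults.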

With our imports, convenience function, and command line args ready, we just have a few initializations to handle before looping over frames:

# load our serialized face detector model from disk
print("[INFO] loading face detector model...")
prototxtPath = os.path.sep.join([args["face"], "deploy.prototxt"])
weightsPath = os.path.sep.join([args["face"],
    "res10_300x300_ssd_iter_140000.caffemodel"])
faceNet = cv2.dnn.readNet(prototxtPath, weightsPath)

# load the face mask detector model from disk
print("[INFO] loading face mask detector model...")
maskNet = load_model(args["model"])

# initialize the video stream and allow the camera sensor to warm up
print("[INFO] starting video stream...")
vs = VideoStream(src=0).start()
time.sleep(2.0)


  • Face detector

  • COVID-19 face mask detector

  • Webcam video stream


# loop over the frames from the video stream
while True:
    # grab the frame from the threaded video stream and resize it
    # to have a maximum width of 400 pixels
    frame = vs.read()
    frame = imutils.resize(frame, width=400)

    # detect faces in the frame and determine if they are wearing a
    # face mask or not
    (locs, preds) = detect_and_predict_mask(frame, faceNet, maskNet)

    # loop over the detected face locations and their corresponding
    # predictions
    for (box, pred) in zip(locs, preds):
        # unpack the bounding box and predictions
        (startX, startY, endX, endY) = box
        (mask, withoutMask) = pred

        # determine the class label and color we'll use to draw
        # the bounding box and text
        label = "Mask" if mask > withoutMask else "No Mask"
        color = (0, 255, 0) if label == "Mask" else (0, 0, 255)

        # include the probability in the label
        label = "{}: {:.2f}%".format(label, max(mask, withoutMask) * 100)

        # display the label and bounding box rectangle on the output
        # frame
        cv2.putText(frame, label, (startX, startY - 10),
            cv2.FONT_HERSHEY_SIMPLEX, 0.45, color, 2)
        cv2.rectangle(frame, (startX, startY), (endX, endY), color, 2)


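  • Unpack the face bounding box and the mask/no-mask prediction

  • Determine the label and color

  • Annotate the label and the face bounding box

Finally, we display the results and perform cleanup: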

    # show the output frame
    cv2.imshow("Frame", frame)
    key = cv2.waitKey(1) & 0xFF

    # if the `q` key was pressed, break from the loop
    if key == ord("q"):
        break

# do a bit of cleanup
cv2.destroyAllWindows()
vs.stop()
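The video loop resizes every frame to a width of 400 pixels while keeping the aspect ratio, which keeps per-frame processing cheap. The sketch below shows the dimension arithmetic behind such a resize in plain Python; resized_dims is a hypothetical helper, and the frame sizes are examples only.

```python
def resized_dims(w, h, width=400):
    """Output (width, height) when scaling to a target width, keeping aspect ratio."""
    ratio = width / float(w)
    return width, int(h * ratio)


print(resized_dims(640, 480))    # (400, 300)
print(resized_dims(1280, 720))   # (400, 225)
```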



$ python detect_mask_video.py
[INFO] loading face detector model...
[INFO] loading face mask detector model...
[INFO] starting video stream...


