Design and Practice of MaskFAN, a High-Performance Face Alignment Model for Masked Faces

{"type":"doc","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic"},{"type":"size","attrs":{"size":10}},{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"Note: the original paper was published at ICMEW2021"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic"},{"type":"size","attrs":{"size":10}},{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"EFFICIENT FACE ALIGNMENT NETWORK FOR MASKED FACE"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"Abstract: "},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"The sudden global outbreak of COVID-19 in 2020 severely disrupted everyday life. Wearing a mask became an important way to curb the spread of the disease, and masks gradually became a daily necessity. Their widespread use, however, poses a serious challenge to face-based algorithms such as face recognition and video surveillance."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Face alignment underlies many face analysis tasks, and its performance has also degraded severely. To make face alignment models more robust to masks and other occlusions, this paper proposes an efficient face alignment model for masked faces, named MaskFAN."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 
"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"We build a lightweight feature extraction network from depthwise separable convolutions and grouped convolutions. To improve robustness to occluded data, we design a new loss function to assist training. We also explore a 3D data augmentation method that generates large numbers of masked face images. Experiments show that the proposed method clearly outperforms existing methods while keeping model size and computation small."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"Keywords: "},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"face alignment, lightweight model design, mask occlusion"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"1. 
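Introduction"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"In early 2020, COVID-19 spread rapidly around the world, and masks came into wide use as an effective way to limit transmission. Widespread mask wearing, however, poses a serious challenge to face-related algorithms; existing face recognition systems, for example, can barely function. Face alignment models underpin many face algorithms, and their performance is likewise badly affected by masks. To address this, we improve localization accuracy by making the model more robust to occlusion."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"The goal of face alignment is to locate points with specific semantic meaning in an input face image, such as the nose tip, eyebrow ends, and mouth corners. Face alignment plays an important role in applications such as ID photo verification, 3D face reconstruction, and video beautification; it has therefore drawn wide attention from academia and industry in recent years, and its performance has improved greatly."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"As artificial intelligence has advanced, deep learning and convolutional neural networks have been widely adopted in computer vision, far surpassing traditional methods in both efficiency and accuracy. Likewise, deep learning methods have become the mainstream in face alignment since DCNN. Some studies point out that existing face alignment models still perform well when the face is not heavily occluded."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 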
"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"However, once a mask is worn, most of the face is covered and general-purpose face alignment methods can barely function. Some work tries to handle occlusion with more complex feature extraction models. These networks, however, consume substantial resources at run time and are therefore hard to deploy on embedded devices. Both the occlusion robustness and the run-time efficiency of face alignment need to be addressed."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"To address these problems, we improve existing face alignment methods in two ways. First, to make the algorithm more robust to occlusion (masks), we design a new loss function and a data augmentation strategy. Second, we use depthwise separable convolutions and grouped convolutions to design a lightweight feature extraction network suited to face alignment."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"2. 
Method"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"This section describes the design and training of MaskFAN in detail. A face alignment model typically takes an RGB image as input and outputs a set of features, where "},{"type":"text","marks":[{"type":"italic"},{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"H"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" and "},{"type":"text","marks":[{"type":"italic"},{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"W"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" denote the height and width of the input image, and "},{"type":"text","marks":[{"type":"italic"},{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"N"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" is the number of points the model must predict. Figure 1 shows the overall pipeline of the algorithm."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/27\/e7\/27b439caff0c44bdebefe6dcb437e7e7.png","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Figure 1. Overall pipeline of the algorithm"}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"2.1 
Lightweight Model Design"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Designing a suitable feature extraction network has long been a core problem in deep learning; the model structure must be adjusted for each task and application scenario. When a model is deployed in the cloud or on large servers, prediction accuracy matters most, so relatively complex models such as ResNet, VGG, and DenseNet can be used."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"When the algorithm must run on embedded devices (smartphones, robots, cameras, and so on), run-time efficiency matters more, so the parameter count and computation should be reduced as far as possible while keeping accuracy within project requirements."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"We found that common high-accuracy face alignment models rely on very complex feature extraction networks such as HRNet. These models contain many repeated blocks and channels, which makes them too large and too computationally expensive to run on embedded devices. A masked face alignment model, however, has to be deployed on small smart terminals, so its parameter count and computation must be strictly controlled."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 
"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Inspired by lightweight models such as MobileNet and ShuffleNet, we believe an accurate, fast, and compact feature extraction model can also be designed for face alignment. Depthwise separable convolution is a common operation for reducing parameters and computation and is widely used in lightweight model design. Grouped convolution originated in AlexNet, where it split the model so that it could be trained across multiple GPUs; today it is mainly used to reduce the parameter count."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"We therefore use depthwise separable convolutions and grouped convolutions to build an efficient feature extraction structure suited to occluded scenes. In our model, the number of groups in each grouped convolution is set equal to the number of channels."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"We also introduce the Receptive Field Block module into the feature extraction structure to strengthen the model's ability to model information. The model structure is shown in Table 1."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/f6\/a7\/f66cbefd733a241f6ffe373b5faf6fa7.jpg","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Table 1. Detailed model structure"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 
"}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"2.2 Enhanced WingLoss Function (E-Wing)"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Loss function design is another important research topic in deep learning, and a well-chosen loss function can greatly improve model performance. Deep learning based face alignment models are mostly trained with an L1 or L2 loss."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"In masked face alignment, part of the face is occluded, so many of the facial landmarks to be detected cannot be located accurately. Continuing to train with an L1 or L2 loss in this setting makes the model focus more on the occluded regions, and training fails to converge."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Inspired by WingLoss, we want the model to pay more attention to unoccluded regions in occluded face alignment. WingLoss achieves good detection performance, but its gradient can become zero during training, which harms stability and convergence speed."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"To address these problems, we propose an enhanced WingLoss (E-Wing). It amplifies the gradient where the error is small and uses a fixed gradient where the error is large, which forces the model to pay more attention to points with small errors. E-Wing is defined in Equation 1."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/19\/2e\/194d34f9ed758268919ca8db6b4ac42e.jpg","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Here "},{"type":"text","marks":[{"type":"italic"},{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"r"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" is a fixed constant that limits the curvature of the loss; a second constant joins the linear and nonlinear parts of E-Wing."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"2.3 Data Augmentation Module"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Training data is the most important part of a deep learning task; its quantity and quality have a decisive effect on model performance. For masked face alignment, large amounts of annotated data are hard to obtain, and building a dataset from scratch costs considerable labor and money."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 
"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"To solve this problem, we propose a data augmentation module based on 3DMM and generative adversarial networks. It generates large numbers of masked images while keeping the relative positions on the face unchanged. Some results are shown in Figure 2."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/03\/36\/0325daa69b93611b002d0fab7ce01336.jpg","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Figure 2. Data augmentation results: (a) input images, (b) images generated by the module"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"In face alignment datasets, frontal face images far outnumber profile images. This imbalance can cause the model to overfit severely to frontal faces."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"To mitigate this, we use a data balancing strategy. It first computes the Euler angles of the face orientation (pitch, yaw, roll), and then, according to the angle distribution, augments the under-represented images with rotation, mirroring, and other transforms."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"3. 
Experiments"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"3.1 Dataset"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"To evaluate the proposed method, we ran comparative experiments on the FLL2021 dataset. FLL2021 is a recently released dataset for masked face alignment. It contains 24,386 images, each annotated with 106 landmarks, and covers varied scenarios such as large poses and exaggerated expressions. We use 18,384 images for training and 2,038 for testing."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"3.2 Evaluation Metrics"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Normalized Mean Error (NME): NME is a widely used evaluation metric in face alignment. It is defined in Equation 2."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 
"}]},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/e8\/f1\/e83d1133bf738f6c570cc8a85dd920f1.jpg","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Here N is the number of landmarks to detect and L is the normalization distance; in this task we normalize by the length of the image diagonal."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Failure Rate (FR): like NME, FR characterizes the performance of the algorithm. We set the FR threshold to 0.08."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"3.3 Results"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"We compare MaskFAN with commonly used face alignment models on the FLL2021 dataset and analyze the results. Because different methods use different input resolutions, we report two sets of comparisons for fairness: Table 2 uses 256x256 input images and Table 3 uses 128x128 input images."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 
"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Table 2. Face alignment results with 256x256 input images"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"embedcomp","attrs":{"type":"table","data":{"content":"

Model | NME (%) | FR0.08 (%) | Param (M)
ShuffleNet-v2 | 2.13 | 3.11 | 2.28
MobileNet-v3 | 2.05 | 2.89 | 5.48
VGG-16bn | 1.81 | 1.91 | 138.37
ResNet-50 | 1.87 | 2.07 | 25.56
Wing-Loss | 1.75 | 1.85 | 12.31
HRNet-W18 | 1.64 | 1.27 | 9.65
HRNet-W32 | 1.59 | 1.09 | 38.10
MaskFAN | 1.47 | 1.17 | 0.32"}}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Table 3. Face alignment results with 128x128 input images"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"embedcomp","attrs":{"type":"table","data":{"content":"

Model | NME (%) | FR0.08 (%) | Param (M)
ShuffleNet-v2 | 2.79 | 3.12 | 2.28
MobileNet-v2 | 2.73 | 3.06 | 3.50
MobileNet-v3 | 2.39 | 2.70 | 5.48
VGG-16bn | 1.99 | 1.95 | 138.37
ResNet-50 | 1.96 | 1.83 | 25.56
Wing-Loss | 1.86 | 1.61 | 12.31
MaskFAN | 1.53 | 1.23 | 0.32"}}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"The results show that, with a very small parameter count (0.32M), the proposed algorithm achieves the lowest NME in both tables and a failure rate comparable to or lower than that of the existing methods, which validates the modules we designed. Some visualization results are shown in Figure 3."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/cd\/7c\/cd8479d015e8741c90a1278821dde67c.jpg","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Figure 3. Visualization results"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"4. 
Conclusion"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"In this paper, we systematically studied several core problems in masked face alignment, including lightweight model design, occlusion robustness, and loss function optimization. To meet the challenges posed by mask occlusion, we proposed MaskFAN, a high-performance face alignment model for masked faces."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"To improve its performance, we introduced depthwise separable convolutions and grouped convolutions into the model design, and proposed E-Wing to make the model more robust to occlusion. Experiments show that our method achieves better accuracy than existing methods with a very small parameter count."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"About the Author"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Sha Yuyang (沙宇洋) is an engineer at the Institute of Computing Technology, Chinese Academy of Sciences, and holds a master's degree from Beijing University of Posts and Telecommunications. Sha currently works on research and product development in face recognition, autonomous driving, and related areas."}]}]}
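
The two metrics from Section 3.2 can be sketched in a few lines. This is a minimal illustration, not the authors' evaluation code: the function names `nme` and `failure_rate` are hypothetical, NME is taken as the mean Euclidean landmark error divided by the normalization length L (the image diagonal, as stated in the article), and FR is assumed to be the fraction of images whose NME exceeds the 0.08 threshold.

```python
import math


def nme(pred, gt, norm_len):
    """Mean Euclidean landmark error divided by a normalization length.

    pred, gt: sequences of (x, y) landmark coordinates, same length N.
    norm_len: normalization distance L (here, the image diagonal).
    """
    if len(pred) != len(gt) or not pred:
        raise ValueError("pred and gt must be non-empty and of equal length")
    total = sum(math.dist(p, g) for p, g in zip(pred, gt))
    return total / (len(pred) * norm_len)


def failure_rate(nme_values, threshold=0.08):
    """Fraction of images whose NME exceeds the threshold (assumed definition)."""
    return sum(v > threshold for v in nme_values) / len(nme_values)


# Toy example: two landmarks on a 3x4 image, so the diagonal is 5.
diagonal = math.hypot(3, 4)
predicted = [(0.0, 0.0), (1.0, 1.0)]
ground_truth = [(3.0, 4.0), (1.0, 1.0)]
score = nme(predicted, ground_truth, diagonal)  # (5 + 0) / (2 * 5) = 0.5
rate = failure_rate([0.05, 0.10, 0.02, 0.09])   # 2 of 4 images fail
```

The tables report NME and FR as percentages, so the raw values returned here would be multiplied by 100 before comparison.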

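
The parameter savings that Section 2.1 attributes to depthwise separable and grouped convolutions can be checked with simple arithmetic. The sketch below is only an illustration with hypothetical helper names and example layer sizes (64 input channels, 128 output channels, 3x3 kernel) that do not come from the article; bias terms are ignored. Setting the number of groups equal to the number of channels, as the article describes, turns a grouped convolution into a depthwise one.

```python
def conv_params(c_in, c_out, kernel, groups=1):
    """Weight count of a 2D convolution (bias ignored)."""
    if c_in % groups or c_out % groups:
        raise ValueError("channels must be divisible by groups")
    return kernel * kernel * (c_in // groups) * c_out


def depthwise_separable_params(c_in, c_out, kernel):
    """Depthwise conv (groups == channels) followed by a 1x1 pointwise conv."""
    depthwise = conv_params(c_in, c_in, kernel, groups=c_in)
    pointwise = conv_params(c_in, c_out, 1)
    return depthwise + pointwise


# Example sizes are assumptions chosen for illustration only.
standard = conv_params(64, 128, 3)                  # 3*3*64*128 = 73728
separable = depthwise_separable_params(64, 128, 3)  # 576 + 8192 = 8768
ratio = standard / separable
```

For these example sizes the separable layer needs roughly one eighth of the weights of the standard layer, which is the kind of reduction that makes a sub-1M-parameter backbone plausible.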