[KD][ReID] Distilled Person Re-identification: Towards a More Scalable System

Comment: the log distance is of reference value for us; the other multi-teacher studies are aimed at unsupervised learning.

 

Motivation:

Targeting practical deployment, the paper starts from the following requirements:

  1. Low annotation cost. A scalable Re-ID system should be able to learn from unlabelled and semi-labelled data.
  2. Low scene-extension cost. When extending to a new scene, the cross-domain problem should be solved at low cost.
  3. Low testing computation (inference) cost. To match the trend of on-chip computation, a more lightweight model is needed.

 

Contribution:

(1) Re-ID is an open-set recognition task; for KD in this setting, the paper proposes the Log-Euclidean Similarity Distillation Loss.

(2) An adaptive knowledge gate is proposed to aggregate multiple teacher models when learning a lightweight student network.
(3) These components are further combined into a Multi-teacher Adaptive Similarity Distillation Framework, which reduces annotation cost, scene-extension cost, and testing computation cost. (In practice this is cross-domain work, reaching SOTA among unsupervised and semi-supervised methods.)

 

Paper framework

Teacher models: five teacher models T1, T2, T3, T4, T5 were trained with labelled data from the training sets of MSMT17 [53], CUHK03 [28], VIPeR [18], DukeMTMC [70] and Market-1501, respectively.

 

3. Similarity Knowledge Distillation

3.1. Construction of Similarity Matrices

Goal: minimize the distance between the student similarity matrix A_S and the teacher similarity matrix A_T.

Constraints on the similarity matrices:

Before distillation, the student and teacher similarity matrices are constructed from the respective feature vectors as follows:

1. Normalization: ReLU keeps the features non-negative, and feature normalization bounds the similarities, so the range of similarities in A_S is [0, 1].

2. A_S is made symmetric positive definite (SPD).

Rationale: restricting the eigenvalues to be positive enables convex optimization and makes the eigendecomposition, and hence the matrix logarithm, well defined.
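The construction above can be sketched in NumPy. This is a minimal reading-notes sketch, not the paper's exact recipe: the function name, the L2 row normalization, and the small `eps * I` shift used to guarantee positive definiteness are my assumptions.

```python
import numpy as np

def similarity_matrix(feats, eps=1e-5):
    """Hypothetical sketch: build an SPD similarity matrix from features.

    ReLU keeps entries non-negative, L2 row normalization bounds the
    cosine similarities in [0, 1], and eps * I enforces positive
    definiteness (eps is an assumed implementation detail).
    """
    f = np.maximum(feats, 0.0)                                   # ReLU
    f = f / (np.linalg.norm(f, axis=1, keepdims=True) + 1e-12)   # unit rows
    A = f @ f.T                                                  # similarities in [0, 1]
    return A + eps * np.eye(A.shape[0])                          # make SPD
```

With unit rows the Gram matrix f f^T is symmetric PSD with entries in [0, 1]; the eps shift lifts its eigenvalues strictly above zero so that log(A_S) exists.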

 

3.2. Log-Euclidean Similarity Distillation

The distance is measured in a log-Euclidean Riemannian framework [4] instead of with the Euclidean metric:

d(A_T, A_S) = || log(A_T) - log(A_S) ||_F

where log(A) is well defined for any symmetric positive definite matrix through its eigendecomposition A = U Λ U^T:

log(A) = U log(Λ) U^T

Purpose: the log map flattens the SPD manifold into a Euclidean space, so that distances between similarity matrices can be computed with ordinary linear operations.

 

The final Log-Euclidean distillation loss:

We distill the knowledge embedded in the similarities from teacher to student by minimizing the Log-Euclidean distance:

L_LE = || log(A_T) - log(A_S) ||_F^2
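The matrix logarithm and the distillation distance can be sketched as follows (NumPy; the eigendecomposition-based log is standard, but the 1/n^2 normalization of the loss is my assumption, not taken from the paper):

```python
import numpy as np

def matrix_log(A):
    """Matrix logarithm of an SPD matrix via eigendecomposition:
    A = U diag(lam) U^T  =>  log(A) = U diag(log(lam)) U^T."""
    lam, U = np.linalg.eigh(A)          # lam > 0 for an SPD input
    return (U * np.log(lam)) @ U.T      # U diag(log lam) U^T

def log_euclidean_loss(A_s, A_t):
    """Sketch of the Log-Euclidean Similarity Distillation Loss:
    squared Frobenius distance between matrix logs (the 1/n^2
    scaling is an assumption)."""
    d = matrix_log(A_s) - matrix_log(A_t)
    return float(np.sum(d * d)) / A_s.shape[0] ** 2
```

Because the log of the identity is the zero matrix and the loss vanishes when student and teacher similarities coincide, minimizing it pulls A_S toward A_T along the SPD manifold rather than in raw matrix space.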

 

4. Learning to learn from Multiple Teachers

4.1. Multi-teacher Adaptive Aggregated Distillation

The weight coefficients are learned: the α_i are learned dynamically so that the aggregated loss L_TA = Σ_i α_i L_i adapts to the target domain. L_TA is called the Adaptive Aggregated Distillation Loss.
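A hedged sketch of the aggregation step. The softmax parametrization of the α_i (to keep them positive and summing to one) is my assumption; the paper learns the weights by minimizing a validation risk on the target domain (Sec. 4.2).

```python
import numpy as np

def adaptive_aggregated_loss(teacher_losses, alpha_logits):
    """Sketch of L_TA = sum_i alpha_i * L_i, with alpha = softmax(logits)
    so the teacher weights form a convex combination (the softmax
    parametrization is an assumption, not the paper's exact scheme)."""
    z = np.exp(alpha_logits - np.max(alpha_logits))  # numerically stable softmax
    alpha = z / z.sum()
    return float(np.dot(alpha, teacher_losses)), alpha
```

With equal logits every teacher contributes equally; training the logits shifts weight toward the teachers whose similarity structure transfers best to the target domain.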

 

4.2. Adaptive Knowledge Aggregation

Goal: be able to train while labelling only a small number of identities in the target domain.

Let x^U_{S,k} and x^L_{S,i} denote the student features of unlabelled sample I^U_k and labelled sample I^L_i.

As there is no identity overlap between the labelled set D_L and the unlabelled set D_U, any labelled/unlabelled pair can be treated as a negative pair.

The optimization objective then makes positive-pair similarities large and negative-pair (labelled vs. unlabelled) similarities small; this objective serves as a validation empirical risk, which is minimized to learn the teacher weights α_i.
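A sketch of such a validation risk. The exact loss form in the paper is not reproduced here; this hypothetical version simply rewards high similarity on same-identity labelled pairs and low similarity on labelled-vs-unlabelled pairs.

```python
import numpy as np

def validation_risk(sim, pos_pairs, neg_pairs):
    """Hedged sketch of the validation empirical risk used to learn the
    teacher weights. sim: full similarity matrix over the validation
    samples; pos_pairs: same-identity labelled index pairs; neg_pairs:
    labelled-vs-unlabelled index pairs (identities never overlap).
    Lower is better: positives pulled up, negatives pushed down."""
    pos = np.mean([sim[i, j] for i, j in pos_pairs])
    neg = np.mean([sim[i, j] for i, j in neg_pairs])
    return float(neg - pos)
```

Evaluating this risk under each candidate weighting α lets the framework pick teacher combinations whose aggregated similarity structure actually separates identities in the target domain.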

 

Experimental results

Comparison of the Log-Euclidean distance against the Euclidean distance.

 

Setting:

Implementation Details. For the teacher models of the source scenes, the strong Re-ID model PCB [47] was adopted. For the student model of the target scene, the lightweight MobileNetV2 [46] was adopted, and a convolution layer was applied to reduce the channel number of the last feature map to 256.
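The channel-reduction layer is just a 1x1 convolution, i.e. a per-pixel linear map. A toy NumPy sketch (random weights stand in for the learned ones; 1280 input channels matches MobileNetV2's last feature map, but the function name and shapes here are illustrative assumptions):

```python
import numpy as np

def reduce_channels(feature_map, out_channels=256, seed=0):
    """Sketch of a 1x1 convolution reducing a (C, H, W) feature map to
    out_channels. Weights are random placeholders; in the paper this
    layer is learned jointly with the student."""
    c, h, w = feature_map.shape
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((out_channels, c)) / np.sqrt(c)  # 1x1 kernel = linear map per pixel
    return np.einsum('oc,chw->ohw', W, feature_map)          # apply at every spatial position
```

Since the kernel has spatial size 1, the same linear map is applied independently at each (h, w) location, which is why it only changes the channel dimension.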
