【KD】【ReID】Distilled Person Re-identification: Towards a More Scalable System

Comment: the log distance is the part most relevant for us; the other multi-teacher studies are aimed at unsupervised learning.

 

Motivation:

With real applications in mind, the paper starts from the following requirements:

  1. Low annotation cost: a scalable Re-ID system should be able to learn from unlabelled and semi-labelled data.
  2. Low scene-extension cost: when extending to a new scene, the cross-domain problem should be solved cheaply.
  3. Low testing computation (inference) cost: to match the trend toward on-chip computation, a more lightweight model is needed.

 

Contribution:

(1) Re-ID is an open-set recognition task; to distill it, a Log-Euclidean Similarity Distillation Loss is proposed.

(2) An adaptive knowledge gate is proposed to aggregate multiple teacher models for learning a lightweight student network.

(3) These are further combined into a Multi-teacher Adaptive Similarity Distillation Framework, which lowers the annotation cost, the scene-extension cost, and the testing computation cost. (In practice the work is cross-domain, reaching SOTA among unsupervised and semi-supervised methods.)

 

Paper framework

Teacher models: five teacher models T1, T2, T3, T4, T5 were trained with labelled data from the training sets of MSMT17 [53], CUHK03 [28], VIPeR [18], DukeMTMC [70] and Market-1501.

 

3. Similarity Knowledge Distillation

3.1. Construction of Similarity Matrices

Goal: minimize the distance between the student similarity matrix A_S and the teacher similarity matrix A_T.

Constraints on the similarity matrices:

Before distillation, the student and teacher feature vectors are each processed so that:

1. Normalization: ReLU makes the features x_s non-negative, so after normalization the pairwise similarities fall in [0, 1].

2. A_S is symmetric positive definite.

Rationale: restricting the eigenvalues to be positive permits convex optimization and guarantees that the eigenvalues (needed below for the matrix logarithm) can be computed. The range of similarities in A_S is [0, 1]. A minimal construction sketch follows.
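A minimal PyTorch sketch of this construction, assuming an (n, d) batch of features; the εI diagonal jitter that enforces strict positive definiteness is an assumption, since the note does not record the paper's exact regularizer:

```python
import torch
import torch.nn.functional as F

def similarity_matrix(features: torch.Tensor, eps: float = 1e-4) -> torch.Tensor:
    """Build a symmetric positive definite similarity matrix from features."""
    # ReLU makes entries non-negative; L2 normalization then bounds every
    # pairwise inner product to [0, 1].
    x = F.normalize(F.relu(features), p=2, dim=1)
    A = x @ x.t()  # (n, n) Gram matrix, positive semi-definite by construction
    # Small diagonal jitter turns PSD into strictly positive definite
    # (assumption: the paper's exact regularization may differ).
    return A + eps * torch.eye(A.size(0), device=A.device)
```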

 

3.2. Log-Euclidean Similarity Distillation

The distance is measured in a log-Euclidean Riemannian framework [4] instead of with the Euclidean metric:

d_LE(A_S, A_T) = || log(A_S) − log(A_T) ||_F

where the matrix logarithm log(A) is well defined for any symmetric positive definite matrix: given the eigen-decomposition A = U diag(λ_1, …, λ_n) U^T,

log(A) = U diag(log λ_1, …, log λ_n) U^T

Purpose: the log map flattens the curved manifold of SPD matrices into a Euclidean space, where the distance reduces to a simple matrix norm.

 

The final log distance:

The knowledge embedded in the similarities is distilled from teacher to student by minimizing the Log-Euclidean distance d_LE(A_S, A_T) defined above, as sketched below.
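A sketch of the matrix logarithm and the resulting distillation loss, following the eigen-decomposition above; the eigenvalue clamp is a numerical-stability assumption, not part of the paper:

```python
import torch

def spd_log(A: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """Matrix logarithm of a symmetric positive definite matrix."""
    evals, evecs = torch.linalg.eigh(A)   # A = U diag(evals) U^T
    evals = evals.clamp_min(eps)          # guard against numerical round-off
    return evecs @ torch.diag(torch.log(evals)) @ evecs.t()

def log_euclidean_distillation_loss(A_s: torch.Tensor, A_t: torch.Tensor) -> torch.Tensor:
    """Log-Euclidean distance between student and teacher similarity matrices."""
    return torch.linalg.norm(spd_log(A_s) - spd_log(A_t), ord='fro')
```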

 

4. Learning to learn from Multiple Teachers

4.1. Multi-teacher Adaptive Aggregated Distillation

The weight coefficients are learned: the aim is to learn the α_i dynamically so that the aggregated loss L_TA (a weighted sum of the per-teacher distillation losses, L_TA = Σ_i α_i · L_i) adapts to the target domain. L_TA is called the Adaptive Aggregated Distillation Loss; a sketch follows.
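A sketch of the aggregation, assuming a softmax over learnable logits to keep the teacher weights α_i positive and summing to one (the paper's exact parameterization of α_i is not recorded in this note):

```python
import torch
import torch.nn as nn

class AdaptiveAggregatedDistillation(nn.Module):
    """Weighted sum of per-teacher distillation losses with learnable weights."""

    def __init__(self, num_teachers: int):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(num_teachers))

    def forward(self, per_teacher_losses):
        alphas = torch.softmax(self.logits, dim=0)   # alpha_i > 0, sum to 1
        return (alphas * torch.stack(per_teacher_losses)).sum()
```

In training, each per-teacher loss would be the log-Euclidean distillation loss against that teacher's similarity matrix, while the logits are updated on the small labelled set described in 4.2.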

 

4.2. Adaptive Knowledge Aggregation

Goal: enable training with only a small amount of labelled identities in the target domain.

Let x^U_{S,k} and x^L_{S,i} denote the student features of unlabelled sample I^U_k and labelled sample I^L_i.

Since there is no identity overlap between D^L and D^U (the labelled and unlabelled identity sets are disjoint), every labelled-unlabelled pair is a negative pair.

The optimization objective of the loss is then to make the similarities of positive (same-identity labelled) pairs large and those of the (unlabelled) negative pairs small.

This objective is evaluated as a validation empirical risk, which is minimized to learn the weights α_i; a hypothetical sketch follows.
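A hypothetical sketch of this objective; the names and the margin-free form are assumptions, since the note only records "positive pairs large, negative pairs small":

```python
import torch

def validation_empirical_risk(sim_ll: torch.Tensor,
                              same_id_mask: torch.Tensor,
                              sim_lu: torch.Tensor) -> torch.Tensor:
    """Risk that rewards high similarity for labelled positive pairs and low
    similarity for labelled-unlabelled pairs.

    sim_ll: (m, m) similarities among labelled samples; same_id_mask marks
            pairs sharing an identity (positive pairs).
    sim_lu: (m, n) similarities between labelled and unlabelled samples;
            since the identity sets are disjoint, every such pair is negative.
    """
    pos = sim_ll[same_id_mask].mean()   # pull same-identity pairs together
    neg = sim_lu.mean()                 # push cross-set pairs apart
    return neg - pos                    # minimized when pos is high, neg is low
```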

 

Experimental results

Comparison between the log-Euclidean distance and the Euclidean distance.

 

Setting:

Implementation Details. For the teacher models of the source scenes, the advanced Re-ID model PCB [47] was adopted. For the student model of the target scene, the lightweight MobileNetV2 [46] was adopted, with a convolution layer applied to reduce the channel number of the last feature map to 256. A sketch of such a student follows.
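A sketch of the student under these settings, assuming the torchvision MobileNetV2 backbone (1280-channel final feature map) with a 1x1 convolution reducing it to 256 channels; the pooling head is an assumption to produce the feature vector used in the similarity matrices:

```python
import torch.nn as nn
from torchvision.models import mobilenet_v2

backbone = mobilenet_v2(weights=None).features      # final feature map: 1280 channels
student = nn.Sequential(
    backbone,
    nn.Conv2d(1280, 256, kernel_size=1),            # reduce channels to 256
    nn.AdaptiveAvgPool2d(1),                        # pool to a 256-d embedding
    nn.Flatten(),                                   # (assumed head; paper may differ)
)
```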
