[Paper Reading] Universal Domain Adaptation

Universal Domain Adaptation

SUMMARY@2020/3/27


Motivation

This paper focuses on the setting of universal domain adaptation, where

  • no prior information about the target label set is provided, and
  • the source domain comes with labeled data.

The following figure shows the motivation of this setting,

and the next figure illustrates the different label-set settings covered by universal domain adaptation:

Related Work

This work partly builds on earlier work on partial domain adaptation from Mingsheng Long's group, such as:

  • SAN (Partial Transfer Learning with Selective Adversarial Networks)
    • utilizes multiple domain discriminators with a class-level and instance-level weighting mechanism to achieve per-class adversarial distribution matching
  • PADA (Partial Adversarial Domain Adaptation)
    • uses only one adversarial network and jointly applies class-level weighting on the source classifier
    • not yet read

and some related work from other groups:

  • IWAN (Importance weighted adversarial nets for partial domain adaptation)
    • constructs an auxiliary domain discriminator to quantify the probability of a source sample being similar to the target domain.
    • not yet read

All of these works partly apply the idea of adversarial networks (GAN) and their domain adaptation counterpart:

  • GAN (Generative Adversarial Nets)
  • DANN (Domain-Adversarial Training of Neural Networks)
    • an adversarial, deep domain adaptation method

Challenges / Aims / Contribution

Under the universal domain adaptation setting, the goal is to match the common categories between the source and target domains. The main challenges in solving this universal problem are:

  • how to deal with the source-private part $\bar{C}_s$ of the source label set, which is unrelated to the target domain, so as to circumvent negative transfer to the target domain

  • how to achieve effective domain adaptation between the shared part of the source domain and the target domain

  • how to learn a model (feature extractor & classifier) that minimizes the target risk on the common label set $C$

Method Proposed

UAN (Universal Adaptation Network) is composed of four parts in the training phase, as the following figure shows.
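To make the structure concrete, below is a minimal PyTorch-style sketch of the trainable heads on top of the backbone features $\mathrm{z} = F(\mathrm{x})$. The feature dimension (2048, matching a ResNet-50) and the number of source classes (31, as in Office-31) are illustrative assumptions, not values fixed by the method:

```python
# Minimal sketch (not the authors' implementation): G, D and D' as small
# PyTorch modules operating on backbone features z = F(x).
import torch.nn as nn

class Classifier(nn.Module):
    """G: maps a feature vector to logits over the source label set C_s."""
    def __init__(self, feat_dim=2048, num_source_classes=31):  # 31 is illustrative
        super().__init__()
        self.fc = nn.Linear(feat_dim, num_source_classes)

    def forward(self, z):
        return self.fc(z)  # logits; softmax gives the prediction ŷ

class DomainDiscriminator(nn.Module):
    """Shared architecture sketch for D (adversarial) and D' (non-adversarial):
    outputs the probability that a feature comes from the source domain."""
    def __init__(self, feat_dim=2048, hidden_dim=1024):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feat_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, 1), nn.Sigmoid(),
        )

    def forward(self, z):
        return self.net(z).squeeze(-1)  # d̂ in (0, 1)
```

The backbone $F$ itself would be a pretrained ResNet-50 whose pooled output provides the 2048-dimensional features assumed above.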

Feature extractor $F$

  • finds features that match the source and target domains
  • provides good features for the classifier

Label classifier $G$

  • computes the predicted label $\hat{y} = G(F(\mathrm{x})) \in C_s$ (the source label set)

  • the classification loss is minimized with respect to the parameters of $F$ and $G$ (a minimal code sketch follows this list):
    $$ E_G = \mathbb{E}_{(\mathrm{x}, y)\sim p}\, L(y, G(F(\mathrm{x}))) $$
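In code, $E_G$ is simply a cross-entropy over labeled source batches; a minimal sketch (the function name is mine):

```python
import torch
import torch.nn.functional as F_nn  # aliased to avoid clashing with the feature extractor F

def classification_loss(source_logits: torch.Tensor,
                        source_labels: torch.Tensor) -> torch.Tensor:
    """E_G: standard cross-entropy on labeled source samples, averaged over the batch."""
    return F_nn.cross_entropy(source_logits, source_labels)
```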

Non-adversarial domain discriminator $D'$

  • computes the similarity of each sample $\mathrm{x}$ to the source domain

    • $\hat{d}' = D'(\mathrm{z}) \in [0, 1]$, where $\mathrm{z} = F(\mathrm{x})$
    • $\hat{d}' \rightarrow 1$ if $\mathrm{x}$ is more similar to the source domain
  • its domain classification loss is minimized, so that $\hat{d}'$ becomes a good similarity estimate for every sample from both source and target domains (see the sketch after this list):
    $$ E_{D'} = -\mathbb{E}_{\mathrm{x}\sim p} \log D'(F(\mathrm{x})) - \mathbb{E}_{\mathrm{x}\sim q} \log\bigl(1 - D'(F(\mathrm{x}))\bigr) $$

  • hypothesis: the expected similarity value differs across the label-set distributions and will be used for weighting the adversarial domain discriminator $D$:
    $$ \mathbb{E}_{\mathrm{x}\sim p_{\bar{C}_s}} \hat{d}' > \mathbb{E}_{\mathrm{x}\sim p_{C}} \hat{d}' > \mathbb{E}_{\mathrm{x}\sim q_{C}} \hat{d}' > \mathbb{E}_{\mathrm{x}\sim q_{\bar{C}_t}} \hat{d}' $$

  • $D'$ is not trained adversarially: that would make it the same as in DANN, which aims at matching exactly identical source and target label spaces and may therefore cause negative transfer in the universal setting
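A minimal sketch of $E_{D'}$, assuming `d_prime_src` and `d_prime_tgt` hold $D'(F(\mathrm{x}))$ for a source and a target batch; computing these on detached features (so $D'$ does not influence $F$) is an implementation choice I am assuming here:

```python
import torch

def non_adversarial_domain_loss(d_prime_src: torch.Tensor,
                                d_prime_tgt: torch.Tensor,
                                eps: float = 1e-8) -> torch.Tensor:
    """E_{D'}: binary cross-entropy pushing d̂' -> 1 on source and d̂' -> 0 on target.
    Inputs are D'(F(x)) values in (0, 1), assumed computed on detached features."""
    loss_src = -torch.log(d_prime_src + eps).mean()
    loss_tgt = -torch.log(1.0 - d_prime_tgt + eps).mean()
    return loss_src + loss_tgt
```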

Adversarial domain discriminator $D$

  • aims to discriminate between source and target samples within the common label set $C$

  • the domain discrimination loss is minimized with respect to $D$ (a good discriminator) and maximized with respect to the feature extractor $F$ (a domain-invariant representation):
    $$ E_D = -\mathbb{E}_{\mathrm{x}\sim p}\, w^s(\mathrm{x}) \log D(F(\mathrm{x})) - \mathbb{E}_{\mathrm{x}\sim q}\, w^t(\mathrm{x}) \log\bigl(1 - D(F(\mathrm{x}))\bigr) $$

  • larger weights are assigned to samples from the common label set in both domains, so that the source and target distributions are matched mainly within the common label set $C$

  • the weights (called the “sample-level transferability criterion”) should satisfy:
    $$ \mathbb{E}_{\mathrm{x}\sim p_{C}} w^s(\mathrm{x}) > \mathbb{E}_{\mathrm{x}\sim p_{\bar{C}_s}} w^s(\mathrm{x}), \qquad \mathbb{E}_{\mathrm{x}\sim q_{C}} w^t(\mathrm{x}) > \mathbb{E}_{\mathrm{x}\sim q_{\bar{C}_t}} w^t(\mathrm{x}) $$

  • the entropy of the predicted probability vector measures the uncertainty of the prediction and is expected to satisfy:
    $$ \mathbb{E}_{\mathrm{x}\sim q_{\bar{C}_t}} H(\hat{y}) > \mathbb{E}_{\mathrm{x}\sim q_{C}} H(\hat{y}) > \mathbb{E}_{\mathrm{x}\sim p_{C}} H(\hat{y}) > \mathbb{E}_{\mathrm{x}\sim p_{\bar{C}_s}} H(\hat{y}) $$

  • the domain similarity $\hat{d}'$ and the prediction uncertainty $H(\hat{y})$ of each sample are combined into a weighting mechanism that discovers the label set shared by both domains and promotes common-class adaptation (see the sketch after this list):
    $$ w^s(\mathrm{x}) = \frac{H(\hat{y})}{\log|C_s|} - \hat{d}'(\mathrm{x}), \qquad w^t(\mathrm{x}) = \hat{d}'(\mathrm{x}) - \frac{H(\hat{y})}{\log|C_s|} $$

    • $H(\hat{y})$ is normalized by its maximum value $\log|C_s|$
    • the weights are normalized together within each mini-batch during training
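Putting the two criteria together, here is a minimal sketch of the transferability weights and the weighted adversarial loss $E_D$; the per-batch weight normalization mentioned above is omitted, and treating the weights as constants via `detach()` is my assumption:

```python
import math
import torch

def normalized_entropy(logits, eps=1e-8):
    """H(ŷ) / log|C_s|: prediction uncertainty scaled to [0, 1]."""
    probs = torch.softmax(logits, dim=1)
    entropy = -(probs * torch.log(probs + eps)).sum(dim=1)
    return entropy / math.log(logits.size(1))

def source_weight(logits_src, d_prime_src):
    """w^s(x) = H(ŷ)/log|C_s| - d̂'(x), computed on a source batch."""
    return normalized_entropy(logits_src) - d_prime_src

def target_weight(logits_tgt, d_prime_tgt):
    """w^t(x) = d̂'(x) - H(ŷ)/log|C_s|, computed on a target batch."""
    return d_prime_tgt - normalized_entropy(logits_tgt)

def weighted_adversarial_loss(d_src, d_tgt, w_s, w_t, eps=1e-8):
    """E_D: weighted binary cross-entropy for the adversarial discriminator D.
    Weights are detached so they only re-weight samples (an assumption)."""
    w_s, w_t = w_s.detach(), w_t.detach()
    loss_src = -(w_s * torch.log(d_src + eps)).mean()
    loss_tgt = -(w_t * torch.log(1.0 - d_tgt + eps)).mean()
    return loss_src + loss_tgt
```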

Training

  • the objective can be written as a GAN-style two-stage (minimax) optimization, but in the neural network it is implemented end-to-end using the gradient reversal layer from DANN ($\lambda$ below is a trade-off hyperparameter):

$$ \max_{D} \min_{F, G}\; E_G - \lambda E_D $$
$$ \min_{D'}\; E_{D'} $$
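The gradient reversal layer that makes this minimax objective trainable in a single backward pass is the standard one from DANN; a minimal sketch:

```python
import torch

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass, multiplies the gradient by -lambda in the
    backward pass, so D minimizes E_D while F (behind the layer) maximizes it."""
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None  # no gradient w.r.t. lambd

def grad_reverse(x, lambd=1.0):
    """Insert between F(x) and the adversarial discriminator D."""
    return GradReverse.apply(x, lambd)
```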

Testing

see the figure below:

  • the adversarial discriminator $D$ is no longer used
  • the weight $w^t(\mathrm{x})$ is computed for each target sample $\mathrm{x}$
  • a validated threshold on $w^t(\mathrm{x})$ decides whether $\mathrm{x}$ comes from the common label set or should be predicted as “unknown” (see the sketch after this list)
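A minimal inference sketch following the rules above; the function name and the threshold value `w0` are placeholders of mine, since the paper selects the threshold by validation:

```python
import math
import torch

def predict_target(logits_tgt, d_prime_tgt, w0=0.0):
    """Predict a class in C_s or 'unknown' for each target sample.
    w0 is a placeholder threshold on w^t(x); index |C_s| marks 'unknown'."""
    num_classes = logits_tgt.size(1)
    probs = torch.softmax(logits_tgt, dim=1)
    entropy = -(probs * torch.log(probs + 1e-8)).sum(dim=1) / math.log(num_classes)
    w_t = d_prime_tgt - entropy              # sample-level transferability
    preds = probs.argmax(dim=1)
    preds[w_t < w0] = num_classes            # below threshold -> "unknown"
    return preds
```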

Experiment

  • $F$ is a pretrained ResNet-50
  • all target classes outside the source label set are grouped into a single “unknown” class
  • UAN performs better than methods designed for the prior settings (closed-set, partial, and open-set domain adaptation)