Zero-Shot Deep Domain Adaptation [Reading Notes]

Reference (original): Joselynzhao.top & 夏木青 | Zero-Shot Deep Domain Adaptation

Abstract

Domain adaptation (DA) is an important tool for transferring knowledge about a task.

Current approaches:
They assume that task-relevant target-domain data is available during training.
The paper shows how to achieve domain adaptation when such data is unavailable.

To address this problem, the authors propose zero-shot deep domain adaptation (ZDDA).

ZDDA uses privileged information from task-irrelevant dual-domain pairs to learn a source-domain representation that is not only suitable for the task of interest (TOI) but also close to the target-domain representation.

The TOI solution trained jointly with the source-domain representation can therefore operate on both the source and target representations.

Datasets:
MNIST, Fashion-MNIST, NIST, EMNIST, and SUN RGB-D

Results:
Domain adaptation for classification tasks is achieved without access to task-relevant target-domain training data.
ZDDA is also extended to perform sensor fusion in the SUN RGB-D scene classification task by simulating the task-relevant target-domain representation with the task-relevant source-domain data.

ZDDA is the first domain adaptation and sensor fusion method that requires no task-relevant target-domain data.
The underlying principle is not specific to computer vision data and should be extensible to other fields.

Introduction

Domain shift [17] causes a performance drop when a solution is transferred to another domain.
The goal of a DA task is to derive a solution to the TOI for both the source and target domains.

The state-of-the-art DA methods [1, 14–16, 25, 30, 35, 37, 39–41, 43, 44, 47, 50] assume that task-relevant data (data directly applicable and related to the TOI) is available in the target domain at training time, but this assumption often does not hold in reality.

The paper also considers sensor fusion [31, 48].

ZDDA learns from the task-irrelevant dual-domain training pairs without using the task-relevant target-domain training data, where the term task-irrelevant data refers to data which is not task-relevant.
In the rest of these notes, T-R stands for task-relevant and T-I for task-irrelevant.

Fig. 1: When the task-relevant target-domain training data is unavailable, ZDDA learns from task-irrelevant dual-domain pairs.

(Note: I did not fully understand this figure.)

Example DA task: MNIST [27] → MNIST-M [13], with a grayscale source domain and an RGB target domain.
TOI: digit classification, tested on both MNIST [27] and MNIST-M [13].
Assumption: the MNIST-M [13] training data cannot be used.

In this example:
ZDDA uses the MNIST [27] training data and
the T-I gray-RGB pairs from the Fashion-MNIST [46] dataset and the Fashion-MNIST-M dataset to train digit classifiers for MNIST [27] and MNIST-M [13] images (a sketch of how such pairs might be constructed follows).
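Where Fashion-MNIST-M comes from is not spelled out in these notes; assuming it is derived from Fashion-MNIST the same way MNIST-M is derived from MNIST in [13] (blending each grayscale image with a random color patch, e.g. a BSDS500 crop), a pixel-aligned gray-RGB pair could be built roughly like this (a minimal sketch; `make_rgb_variant` and its inputs are illustrative names, not from the paper):

```python
import numpy as np

def make_rgb_variant(gray, patch):
    """Blend a grayscale image with a color patch, MNIST-M style [13]:
    out = |patch - gray| per channel.

    gray:  (H, W)    uint8 grayscale image (e.g. a Fashion-MNIST sample)
    patch: (H, W, 3) uint8 color background patch (e.g. a BSDS500 crop)
    """
    # widen to 3 channels and a signed dtype so the subtraction
    # below cannot wrap around
    gray3 = np.repeat(gray[..., None], 3, axis=2).astype(np.int16)
    return np.abs(patch.astype(np.int16) - gray3).astype(np.uint8)

# A T-I training pair is then (gray, make_rgb_variant(gray, patch));
# the two views are pixel-aligned, which is what ZDDA relies on.
```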

ZDDA achieves this by simulating the RGB representation with grayscale images and building a joint network with the supervision of the TOI in the grayscale domain. The details of ZDDA are presented in Sec. 3 of the paper.
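Sec. 3 gives the exact multi-step training procedure; the following is only a rough single-objective sketch of the two ingredients just described (simulating the RGB representation from grayscale images, plus TOI supervision in the grayscale domain). `src_cnn`, `tgt_cnn`, `clf`, and the weight `lam` are assumed placeholder names, and PyTorch is used for illustration:

```python
import torch
import torch.nn.functional as F

def zdda_step(src_cnn, tgt_cnn, clf, gray_ti, rgb_ti, gray_tr, labels_tr,
              opt, lam=1.0):
    """One hypothetical training step combining ZDDA's two losses.
    tgt_cnn is a frozen RGB network (pretrained on T-I RGB data);
    src_cnn and clf are being trained."""
    opt.zero_grad()
    # simulation loss on T-I gray-RGB pairs: push the grayscale
    # embedding toward the frozen RGB embedding of the same scene
    with torch.no_grad():
        z_rgb = tgt_cnn(rgb_ti)
    sim_loss = F.mse_loss(src_cnn(gray_ti), z_rgb)
    # TOI supervision (digit classification) in the grayscale domain only
    cls_loss = F.cross_entropy(clf(src_cnn(gray_tr)), labels_tr)
    loss = cls_loss + lam * sim_loss
    loss.backward()
    opt.step()
    return loss.item()
```

At test time in the target domain, an RGB image would go through `tgt_cnn` and the same classifier `clf`; because the two embeddings were pulled together, the classifier trained on grayscale embeddings should transfer.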
The paper makes the following two contributions:

  • ZDDA, the first deep-learning-based domain adaptation method that adapts from one image modality to another (not just between different datasets in the same modality, such as the Office dataset [32]) without using the task-relevant target-domain training data. ZDDA's efficacy is shown on the MNIST [27], Fashion-MNIST [46], NIST [18], EMNIST [9], and SUN RGB-D [36] datasets with cross validation.

  • Given no task-relevant target-domain training data, ZDDA can perform sensor fusion, and compared with a naive fusion approach it is more robust to noisy testing data in the source domain, the target domain, or both, in the scene classification task from the SUN RGB-D [36] dataset (see the fusion sketch after this list).
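These notes do not record what the naive fusion baseline is; one common reading (an assumption here, not necessarily the paper's exact baseline) is late fusion by averaging per-modality softmax scores, which makes its noise sensitivity easy to see:

```python
import torch

def naive_late_fusion(logits_a, logits_b):
    """Average the softmax scores of two single-modality classifiers.
    A single noisy modality corrupts the averaged score directly,
    which is one reason such a baseline degrades under noise."""
    p = 0.5 * (logits_a.softmax(dim=1) + logits_b.softmax(dim=1))
    return p.argmax(dim=1)
```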

Related work

Domain adaptation is widely applied in computer vision, e.g., image classification [1, 14–16, 25, 30, 35, 37, 39–41, 43, 44, 47, 50], as well as semantic segmentation [45, 51] and image captioning [8].

Combined with deep neural networks:
the state-of-the-art methods successfully perform DA with (fully or partially) labeled [8, 15, 25, 30, 39] or unlabeled [1, 14–16, 35, 37, 39–41, 43–45, 47, 50] T-R target-domain data.

Strategies used to improve performance on DA tasks (a minimal sketch of the adversarial variant follows the list):

  • domain adversarial loss [40]
  • domain confusion loss [39]
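For reference, the domain adversarial idea of [14, 40] is typically implemented with a gradient reversal layer. A minimal PyTorch sketch follows; `disc` (the domain discriminator) and `feats` (extracted features) are placeholder names:

```python
import torch
import torch.nn.functional as F

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; flips (and scales) the gradient
    in the backward pass, as in domain-adversarial training [14]."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_out):
        # one gradient per forward input: reversed for x, none for lam
        return -ctx.lam * grad_out, None

def domain_adversarial_loss(disc, feats, domain_labels, lam=1.0):
    # the discriminator learns to tell domains apart, while the reversed
    # gradient trains the feature extractor to make them indistinguishable
    logits = disc(GradReverse.apply(feats, lam))
    return F.cross_entropy(logits, domain_labels)
```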

Most existing methods require T-R target-domain training data, which is often unavailable in real-world settings.

ZDDA learns from T-I dual-domain pairs without using T-R target-domain training data.

ZDDA involves simulating the target-domain representation with source-domain data; a similar concept appears in [19, 21], but [19, 21] require access to T-R dual-domain training pairs.

Table 1 in the paper shows that the ZDDA problem setting is different from those of UDA (unsupervised domain adaptation), MVL (multi-view learning), and DG (domain generalization).

MVL and DG are given T-R training data in multiple domains.

In ZDDA, however, the T-R target-domain training data is unavailable, and the only available T-R training data is in a single source domain.

A comparison table in the paper shows that only ZDDA can work under all four of the conditions it lists.

On sensor fusion:

[…]

Our Proposed Method — ZDDA

ZDDA is designed to achieve two goals:

  • Domain adaptation
  • Sensor fusion

Domain adaptation

[Figures from the paper illustrating ZDDA's training procedure]

References

  1. Aljundi, R., Tuytelaars, T.: Lightweight unsupervised domain adaptation by convolutional filter reconstruction. In: Hua, G., Jégou, H. (eds.) ECCV 2016. LNCS, vol. 9915, pp. 508–515. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-49409-8_43

  2. Arbelaez, P., Maire, M., Fowlkes, C., Malik, J.: Contour detection and hierarchical
    image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 33, 898–916 (2011)

  3. BAIR/BVLC: BAIR/BVLC AlexNet model. http://dl.caffe.berkeleyvision.org/bvlc_alexnet.caffemodel. Accessed 02 March 2017

  4. BAIR/BVLC: BAIR/BVLC GoogLeNet model. http://dl.caffe.berkeleyvision.org/bvlc_googlenet.caffemodel. Accessed 02 March 2017

  5. BAIR/BVLC: Lenet architecture in the Caffe tutorial. https://github.com/BVLC/
    caffe/blob/master/examples/mnist/lenet.prototxt

  6. Blitzer, J., Foster, D.P., Kakade, S.M.: Zero-shot domain adaptation: a multi-view
    approach. In: Technical Report TTI-TR-2009-1. Technological institute Toyota
    (2009)

  7. Bousmalis, K., Silberman, N., Dohan, D., Erhan, D., Krishnan, D.: Unsupervised
    pixel-level domain adaptation with generative adversarial networks. In: The IEEE
    Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3722–3731.
    IEEE (2017)

  8. Chen, T.H., Liao, Y.H., Chuang, C.Y., Hsu, W.T., Fu, J., Sun, M.: Show, adapt
    and tell: adversarial training of cross-domain image captioner. In: The IEEE International Conference on Computer Vision (ICCV), pp. 521–530. IEEE (2017)

  9. Cohen, G., Afshar, S., Tapson, J., van Schaik, A.: EMNIST: An extension of MNIST to handwritten letters. arXiv preprint arXiv:1702.05373 (2017)

  10. Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Fei-Fei, L.: ImageNet: A large-scale
    hierarchical image database. In: The IEEE Conference on Computer Vision and
    Pattern Recognition (CVPR), pp. 248–255. IEEE (2009)

  11. Ding, Z., Shao, M., Fu, Y.: Missing modality transfer learning via latent low-rank constraint. IEEE Trans. Image Process. 24, 4322–4334 (2015)

  12. Fu, Z., Xiang, T., Kodirov, E., Gong, S.: Zero-shot object recognition by semantic
    manifold distance. In: The IEEE Conference on Computer Vision and Pattern
    Recognition (CVPR). IEEE (2015)

  13. Ganin, Y., Lempitsky, V.: Unsupervised domain adaptation by backpropagation.
    In: Bach, F., Blei, D. (eds.) Proceedings of the 32nd International Conference on
    Machine Learning (ICML-2015), vol. 37, pp. 1180–1189. PMLR (2015)

  14. Ganin, Y., Ustinova, E., Ajakan, H., Germain, P., Larochelle, H., Laviolette, F.,
    Marchand, M., Lempitsky, V.: Domain-adversarial training of neural networks. J.
    Mach. Learn. Res. (JMLR) 17(59), 1–35 (2016)

  15. Gebru, T., Hoffman, J., Li, F.F.: Fine-grained recognition in the wild: A multi-task
    domain adaptation approach. In: The IEEE International Conference on Computer
    Vision (ICCV), pp. 1349–1358. IEEE (2017)

  16. Ghifary, M., Kleijn, W.B., Zhang, M., Balduzzi, D., Li, W.: Deep reconstruction-classification networks for unsupervised domain adaptation. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9908, pp. 597–613. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46493-0_36

  17. Gretton, A., Smola, A.J., Huang, J., Schmittfull, M., Borgwardt, K.M., Schölkopf, B.: Covariate shift and local learning by distribution matching, pp. 131–160. MIT Press, Cambridge (2009)

  18. Grother, P., Hanaoka, K.: NIST special database 19 handprinted forms and characters database. National Institute of Standards and Technology (2016)

  19. Gupta, S., Hoffman, J., Malik, J.: Cross modal distillation for supervision transfer.
    In: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR),
    pp. 2827–2836. IEEE (2016)

  20. Haeusser, P., Frerix, T., Mordvintsev, A., Cremers, D.: Associative domain adaptation. In: The IEEE International Conference on Computer Vision (ICCV), pp.
    2765–2773. IEEE (2017)

  21. Hoffman, J., Gupta, S., Darrell, T.: Learning with side information through modality hallucination. In: The IEEE Conference on Computer Vision and Pattern
    Recognition (CVPR), pp. 826–834. IEEE (2016)

  22. Iandola, F.N., Han, S., Moskewicz, M.W., Ashraf, K., Dally, W.J., Keutzer, K.: SqueezeNet v1.1 model. https://github.com/DeepScale/SqueezeNet/blob/master/SqueezeNet_v1.1/squeezenet_v1.1.caffemodel. Accessed 11 Feb 2017

  23. Iandola, F.N., Han, S., Moskewicz, M.W., Ashraf, K., Dally, W.J., Keutzer, K.: SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size. arXiv preprint arXiv:1602.07360 (2016)

  24. Jia, Y., et al.: Caffe: Convolutional architecture for fast feature embedding. arXiv preprint arXiv:1408.5093 (2014)

  25. Koniusz, P., Tas, Y., Porikli, F.: Domain adaptation by mixture of alignments of
    second- or higher-order scatter tensors. In: The IEEE Conference on Computer
    Vision and Pattern Recognition (CVPR), pp. 4478–4487. IEEE (2017)

  26. Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. In: Pereira, F., Burges, C.J.C., Bottou, L., Weinberger,
    K.Q. (eds.) Advances in Neural Information Processing Systems (NIPS), vol. 25,
    pp. 1097–1105. Curran Associates, Inc. (2012)

  27. LeCun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to
    document recognition. Proc. IEEE 86(11), 2278–2324 (1998)

  28. Li, D., Yang, Y., Song, Y.Z., Hospedales, T.M.: Deeper, broader and artier
    domain generalization. In: The IEEE International Conference on Computer Vision
    (ICCV). IEEE (2017)

  29. Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. In: Burges, C.J.C.,
    Bottou, L., Welling, M., Ghahramani, Z., Weinberger, K.Q. (eds.) Advances in
    Neural Information Processing Systems, vol. 26, pp. 3111–3119. Curran Associates
    Inc. (2013)

  30. Motiian, S., Piccirilli, M., Adjeroh, D.A., Doretto, G.: Unified deep supervised
    domain adaptation and generalization. In: The IEEE International Conference on
    Computer Vision (ICCV), pp. 5715–5725. IEEE (2017)

  31. Ngiam, J., Khosla, A., Kim, M., Nam, J., Lee, H., Ng, A.Y.: Multimodal deep
    learning. In: Getoor, L., Scheffer, T. (eds.) Proceedings of the 28th International
    Conference on Machine Learning (ICML-2011), pp. 689–696. Omnipress (2011)

  32. Saenko, K., Kulis, B., Fritz, M., Darrell, T.: Adapting visual category models to new domains. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010. LNCS, vol. 6314, pp. 213–226. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-15561-1_16

  33. Saito, K., Ushiku, Y., Harada, T.: Asymmetric tri-training for unsupervised domain
    adaptation. In: Precup, D., Teh, Y.W. (eds.) Proceedings of the 34th International
    Conference on Machine Learning (ICML-2017), vol. 70, pp. 2988–2997. PMLR
    (2017)

  34. Sener, O., Song, H.O., Saxena, A., Savarese, S.: Learning transferrable representations for unsupervised domain adaptation. In: Lee, D.D., Sugiyama, M., Luxburg,
    U.V., Guyon, I., Garnett, R. (eds.) Advances in Neural Information Processing
    Systems (NIPS), vol. 29, pp. 2110–2118. Curran Associates, Inc. (2016)

  35. Sohn, K., Liu, S., Zhong, G., Yu, X., Yang, M.H., Chandraker, M.: Unsupervised
    domain adaptation for face recognition in unlabeled videos. In: The IEEE International Conference on Computer Vision (ICCV), pp. 3210–3218. IEEE (2017)

  36. Song, S., Lichtenberg, S., Xiao, J.: SUN RGB-D: a RGB-D scene understanding benchmark suite. In: The IEEE Conference on Computer Vision and Pattern
    Recognition (CVPR), pp. 567–576. IEEE (2015)

  37. Sun, B., Saenko, K.: Deep CORAL: correlation alignment for deep domain adaptation. In: Hua, G., Jégou, H. (eds.) ECCV 2016. LNCS, vol. 9915, pp. 443–450. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-49409-8_35

  38. Szegedy, C., et al.: Going deeper with convolutions. In: The IEEE Conference on
    Computer Vision and Pattern Recognition (CVPR), pp. 1–9. IEEE (2015)

  39. Tzeng, E., Hoffman, J., Darrell, T., Saenko, K.: Simultaneous deep transfer across
    domains and tasks. In: The IEEE International Conference on Computer Vision
    (ICCV), pp. 4068–4076. IEEE (2015)

  40. Tzeng, E., Hoffman, J., Saenko, K., Darrell, T.: Adversarial discriminative domain
    adaptation. In: The IEEE Conference on Computer Vision and Pattern Recognition
    (CVPR), pp. 7167–7176. IEEE (2017)

  41. Venkateswara, H., Eusebio, J., Chakraborty, S., Panchanathan, S.: Deep hashing
    network for unsupervised domain adaptation. In: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 5018–5027. IEEE (2017)

  42. Wang, W., Arora, R., Livescu, K., Bilmes, J.: On deep multi-view representation learning. In: Bach, F., Blei, D. (eds.) Proceedings of the 32nd International Conference on Machine Learning (ICML-2015), vol. 37, pp. 1083–1092. PMLR (2015)

  43. Wang, Y., Li, W., Dai, D., Gool, L.V.: Deep domain adaptation by geodesic distance minimization. In: The IEEE International Conference on Computer Vision
    (ICCV), pp. 2651–2657. IEEE (2017)

  44. Wu, C., Wen, W., Afzal, T., Zhang, Y., Chen, Y., Li, H.: A compact DNN:
    approaching GoogLeNet-level accuracy of classification and domain adaptation.
    In: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR),
    pp. 5668–5677. IEEE (2017)

  45. Wulfmeier, M., Bewley, A., Posner, I.: Addressing appearance change in outdoor
    robotics with adversarial domain adaptation. In: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 1551–1558. IEEE (2017)

  46. Xiao, H., Rasul, K., Vollgraf, R.: Fashion-MNIST: A novel image dataset for benchmarking machine learning algorithms. arXiv preprint arXiv:1708.07747 (2017)

  47. Yan, H., Ding, Y., Li, P., Wang, Q., Xu, Y., Zuo, W.: Mind the class weight bias:
    Weighted maximum mean discrepancy for unsupervised domain adaptation. In:
    The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.
    2272–2281. IEEE (2017)

  48. Yang, X., Ramesh, P., Chitta, R., Madhvanath, S., Bernal, E.A., Luo, J.: Deep multimodal representation learning from temporal data. In: The IEEE Conference on
    Computer Vision and Pattern Recognition (CVPR), pp. 5447–5455. IEEE (2017)

  49. Yang, Y., Hospedales, T.M.: Zero-shot domain adaptation via kernel regression on
    the grassmannian. In: Drira, H., Kurtek, S., Turaga, P. (eds.) BMVC Workshop
    on Differential Geometry in Computer Vision. BMVA Press (2015)

  50. Zhang, J., Li, W., Ogunbona, P.: Joint geometrical and statistical alignment for
    visual domain adaptation. In: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1859–1867. IEEE (2017)

  51. Zhang, Y., David, P., Gong, B.: Curriculum domain adaptation for semantic segmentation of urban scenes. In: The IEEE International Conference on Computer
    Vision (ICCV), pp. 2020–2030. IEEE (2017)
