Zero-Shot Deep Domain Adaptation [Reading Notes]

Reference (original): Joselynzhao.top & 夏木青 | Zero-Shot Deep Domain Adaptation

Abstract

Domain adaptation (DA) is an important tool for transferring knowledge about a task.

Current approaches:
They assume that task-relevant target-domain data is available during training.
The paper shows how to achieve domain adaptation when such data is unavailable.

To address this problem, the authors propose zero-shot deep domain adaptation (ZDDA).

ZDDA uses privileged information from task-irrelevant dual-domain pairs to learn a source-domain representation that is not only suitable for the task of interest (TOI) but also close to the target-domain representation.

The TOI solution trained jointly with the source-domain representation can therefore operate on both the source and target representations.

Datasets:
MNIST, Fashion-MNIST, NIST, EMNIST, and SUN RGB-D

Results:
Domain adaptation for classification tasks is achieved without access to task-relevant target-domain training data.
ZDDA is also extended to perform sensor fusion in the SUN RGB-D scene classification task by simulating the task-relevant target-domain representation with the task-relevant source-domain data.

ZDDA is the first domain adaptation and sensor fusion method that requires no task-relevant target-domain data.
The underlying principle is not specific to computer vision data and should be extensible to other fields.

Introduction

Domain shift [17] causes a performance drop when a solution is transferred to another domain.
The goal of a DA task is to derive a solution to the TOI for both the source and target domains.

The state-of-the-art DA methods [1, 14–16, 25, 30, 35, 37, 39–41, 43, 44, 47, 50] assume that task-relevant data (data directly applicable and related to the TOI) is available in the target domain at training time, but this assumption often does not hold in reality.

The paper also considers sensor fusion [31, 48].

ZDDA learns from the task-irrelevant dual-domain training pairs without using the task-relevant target-domain training data, where the term task-irrelevant data refers to data which is not task-relevant.
In the rest of these notes, T-R stands for task-relevant and T-I for task-irrelevant.

Fig. 1: When the task-relevant target-domain training data is unavailable, ZDDA learns from task-irrelevant dual-domain pairs.

(Note: I did not fully understand this figure.)

Example DA task: MNIST [27] → MNIST-M [13], with a grayscale source domain and an RGB target domain.
TOI: digit classification, tested on both MNIST [27] and MNIST-M [13].
Assumption: the MNIST-M [13] training data cannot be used.

In this example:
ZDDA uses the MNIST [27] training data and
the T-I gray-RGB pairs from the Fashion-MNIST [46] dataset and the Fashion-MNIST-M dataset to train digit classifiers for MNIST [27] and MNIST-M [13] images (a sketch of how such pairs might be constructed follows).
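Where Fashion-MNIST-M comes from is not spelled out in these notes; assuming it is derived from Fashion-MNIST the same way MNIST-M is derived from MNIST in [13] (blending each grayscale image with a random color patch, e.g. a BSDS500 crop), a pixel-aligned gray-RGB pair could be built roughly like this (a minimal sketch; `make_rgb_variant` and its inputs are illustrative names, not from the paper):

```python
import numpy as np

def make_rgb_variant(gray, patch):
    """Blend a grayscale image with a color patch, MNIST-M style [13]:
    out = |patch - gray| per channel.

    gray:  (H, W)    uint8 grayscale image (e.g. a Fashion-MNIST sample)
    patch: (H, W, 3) uint8 color background patch (e.g. a BSDS500 crop)
    """
    # widen to 3 channels and a signed dtype so the subtraction
    # below cannot wrap around
    gray3 = np.repeat(gray[..., None], 3, axis=2).astype(np.int16)
    return np.abs(patch.astype(np.int16) - gray3).astype(np.uint8)

# A T-I training pair is then (gray, make_rgb_variant(gray, patch));
# the two views are pixel-aligned, which is what ZDDA relies on.
```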

ZDDA achieves this by simulating the RGB representation with grayscale images and building a joint network with the supervision of the TOI in the grayscale domain. The details of ZDDA are presented in Sec. 3 of the paper.
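Sec. 3 gives the exact multi-step training procedure; the following is only a rough single-objective sketch of the two ingredients just described (simulating the RGB representation from grayscale images, plus TOI supervision in the grayscale domain). `src_cnn`, `tgt_cnn`, `clf`, and the weight `lam` are assumed placeholder names, and PyTorch is used for illustration:

```python
import torch
import torch.nn.functional as F

def zdda_step(src_cnn, tgt_cnn, clf, gray_ti, rgb_ti, gray_tr, labels_tr,
              opt, lam=1.0):
    """One hypothetical training step combining ZDDA's two losses.
    tgt_cnn is a frozen RGB network (pretrained on T-I RGB data);
    src_cnn and clf are being trained."""
    opt.zero_grad()
    # simulation loss on T-I gray-RGB pairs: push the grayscale
    # embedding toward the frozen RGB embedding of the same scene
    with torch.no_grad():
        z_rgb = tgt_cnn(rgb_ti)
    sim_loss = F.mse_loss(src_cnn(gray_ti), z_rgb)
    # TOI supervision (digit classification) in the grayscale domain only
    cls_loss = F.cross_entropy(clf(src_cnn(gray_tr)), labels_tr)
    loss = cls_loss + lam * sim_loss
    loss.backward()
    opt.step()
    return loss.item()
```

At test time in the target domain, an RGB image would go through `tgt_cnn` and the same classifier `clf`; because the two embeddings were pulled together, the classifier trained on grayscale embeddings should transfer.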
The paper makes the following two contributions:

  • ZDDA, the first deep-learning-based domain adaptation method that adapts from one image modality to another (not just between different datasets in the same modality, such as the Office dataset [32]) without using the task-relevant target-domain training data. ZDDA's efficacy is shown on the MNIST [27], Fashion-MNIST [46], NIST [18], EMNIST [9], and SUN RGB-D [36] datasets with cross validation.

  • Given no task-relevant target-domain training data, ZDDA can perform sensor fusion, and compared with a naive fusion approach it is more robust to noisy testing data in the source domain, the target domain, or both, in the scene classification task from the SUN RGB-D [36] dataset (see the fusion sketch after this list).
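These notes do not record what the naive fusion baseline is; one common reading (an assumption here, not necessarily the paper's exact baseline) is late fusion by averaging per-modality softmax scores, which makes its noise sensitivity easy to see:

```python
import torch

def naive_late_fusion(logits_a, logits_b):
    """Average the softmax scores of two single-modality classifiers.
    A single noisy modality corrupts the averaged score directly,
    which is one reason such a baseline degrades under noise."""
    p = 0.5 * (logits_a.softmax(dim=1) + logits_b.softmax(dim=1))
    return p.argmax(dim=1)
```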

Related work

Domain adaptation is widely applied in computer vision, e.g., image classification [1, 14–16, 25, 30, 35, 37, 39–41, 43, 44, 47, 50], as well as semantic segmentation [45, 51] and image captioning [8].

Combined with deep neural networks:
the state-of-the-art methods successfully perform DA with (fully or partially) labeled [8, 15, 25, 30, 39] or unlabeled [1, 14–16, 35, 37, 39–41, 43–45, 47, 50] T-R target-domain data.

Strategies used to improve performance on DA tasks (a minimal sketch of the adversarial variant follows the list):

  • domain adversarial loss [40]
  • domain confusion loss [39]
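For reference, the domain adversarial idea of [14, 40] is typically implemented with a gradient reversal layer. A minimal PyTorch sketch follows; `disc` (the domain discriminator) and `feats` (extracted features) are placeholder names:

```python
import torch
import torch.nn.functional as F

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; flips (and scales) the gradient
    in the backward pass, as in domain-adversarial training [14]."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_out):
        # one gradient per forward input: reversed for x, none for lam
        return -ctx.lam * grad_out, None

def domain_adversarial_loss(disc, feats, domain_labels, lam=1.0):
    # the discriminator learns to tell domains apart, while the reversed
    # gradient trains the feature extractor to make them indistinguishable
    logits = disc(GradReverse.apply(feats, lam))
    return F.cross_entropy(logits, domain_labels)
```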

Most existing methods require T-R target-domain training data, which is often unavailable in real-world settings.

ZDDA learns from T-I dual-domain pairs without using T-R target-domain training data.

ZDDA involves simulating the target-domain representation with source-domain data; a similar concept appears in [19, 21], but [19, 21] require access to T-R dual-domain training pairs.

Table 1 in the paper shows that the ZDDA problem setting is different from those of UDA (unsupervised domain adaptation), MVL (multi-view learning), and DG (domain generalization).

MVL and DG are given T-R training data in multiple domains.

In ZDDA, however, the T-R target-domain training data is unavailable, and the only available T-R training data is in a single source domain.

A comparison table in the paper shows that only ZDDA can work under all four of the conditions it lists.

On sensor fusion:

[…]

Our Proposed Method — ZDDA

ZDDA is designed to achieve two goals:

  • Domain adaptation
  • Sensor fusion

Domain adaptation

[Figures from the paper illustrating ZDDA's training procedure]

References

  1. Aljundi, R., Tuytelaars, T.: Lightweight unsupervised domain adaptation by convolutional filter reconstruction. In: Hua, G., Jégou, H. (eds.) ECCV 2016. LNCS, vol. 9915, pp. 508–515. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-49409-8_43

  2. Arbelaez, P., Maire, M., Fowlkes, C., Malik, J.: Contour detection and hierarchical
    image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 33, 898–916 (2011)

  3. BAIR/BVLC: BAIR/BVLC AlexNet model. http://dl.caffe.berkeleyvision.org/bvlc_alexnet.caffemodel. Accessed 02 March 2017

  4. BAIR/BVLC: BAIR/BVLC GoogLeNet model. http://dl.caffe.berkeleyvision.org/bvlc_googlenet.caffemodel. Accessed 02 March 2017

  5. BAIR/BVLC: Lenet architecture in the Caffe tutorial. https://github.com/BVLC/
    caffe/blob/master/examples/mnist/lenet.prototxt

  6. Blitzer, J., Foster, D.P., Kakade, S.M.: Zero-shot domain adaptation: a multi-view
    approach. In: Technical Report TTI-TR-2009-1. Technological institute Toyota
    (2009)

  7. Bousmalis, K., Silberman, N., Dohan, D., Erhan, D., Krishnan, D.: Unsupervised
    pixel-level domain adaptation with generative adversarial networks. In: The IEEE
    Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3722–3731.
    IEEE (2017)

  8. Chen, T.H., Liao, Y.H., Chuang, C.Y., Hsu, W.T., Fu, J., Sun, M.: Show, adapt
    and tell: adversarial training of cross-domain image captioner. In: The IEEE International Conference on Computer Vision (ICCV), pp. 521–530. IEEE (2017)

  9. Cohen, G., Afshar, S., Tapson, J., van Schaik, A.: EMNIST: An extension of MNIST to handwritten letters. arXiv preprint arXiv:1702.05373 (2017)

  10. Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Fei-Fei, L.: ImageNet: A large-scale
    hierarchical image database. In: The IEEE Conference on Computer Vision and
    Pattern Recognition (CVPR), pp. 248–255. IEEE (2009)

  11. Ding, Z., Shao, M., Fu, Y.: Missing modality transfer learning via latent low-rank constraint. IEEE Trans. Image Process. 24, 4322–4334 (2015)

  12. Fu, Z., Xiang, T., Kodirov, E., Gong, S.: Zero-shot object recognition by semantic
    manifold distance. In: The IEEE Conference on Computer Vision and Pattern
    Recognition (CVPR). IEEE (2015)

  13. Ganin, Y., Lempitsky, V.: Unsupervised domain adaptation by backpropagation.
    In: Bach, F., Blei, D. (eds.) Proceedings of the 32nd International Conference on
    Machine Learning (ICML-2015), vol. 37, pp. 1180–1189. PMLR (2015)

  14. Ganin, Y., Ustinova, E., Ajakan, H., Germain, P., Larochelle, H., Laviolette, F.,
    Marchand, M., Lempitsky, V.: Domain-adversarial training of neural networks. J.
    Mach. Learn. Res. (JMLR) 17(59), 1–35 (2016)

  15. Gebru, T., Hoffman, J., Li, F.F.: Fine-grained recognition in the wild: A multi-task
    domain adaptation approach. In: The IEEE International Conference on Computer
    Vision (ICCV), pp. 1349–1358. IEEE (2017)

  16. Ghifary, M., Kleijn, W.B., Zhang, M., Balduzzi, D., Li, W.: Deep reconstruction-classification networks for unsupervised domain adaptation. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9908, pp. 597–613. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46493-0_36

  17. Gretton, A., Smola, A.J., Huang, J., Schmittfull, M., Borgwardt, K.M., Schölkopf, B.: Covariate shift and local learning by distribution matching, pp. 131–160. MIT Press, Cambridge (2009)

  18. Grother, P., Hanaoka, K.: NIST special database 19 handprinted forms and characters database. National Institute of Standards and Technology (2016)

  19. Gupta, S., Hoffman, J., Malik, J.: Cross modal distillation for supervision transfer.
    In: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR),
    pp. 2827–2836. IEEE (2016)

  20. Haeusser, P., Frerix, T., Mordvintsev, A., Cremers, D.: Associative domain adaptation. In: The IEEE International Conference on Computer Vision (ICCV), pp.
    2765–2773. IEEE (2017)

  21. Hoffman, J., Gupta, S., Darrell, T.: Learning with side information through modality hallucination. In: The IEEE Conference on Computer Vision and Pattern
    Recognition (CVPR), pp. 826–834. IEEE (2016)

  22. Iandola, F.N., Han, S., Moskewicz, M.W., Ashraf, K., Dally, W.J., Keutzer, K.: SqueezeNet v1.1 model. https://github.com/DeepScale/SqueezeNet/blob/master/SqueezeNet_v1.1/squeezenet_v1.1.caffemodel. Accessed 11 Feb 2017

  23. Iandola, F.N., Han, S., Moskewicz, M.W., Ashraf, K., Dally, W.J., Keutzer, K.: SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size. arXiv preprint arXiv:1602.07360 (2016)

  24. Jia, Y., et al.: Caffe: Convolutional architecture for fast feature embedding. arXiv preprint arXiv:1408.5093 (2014)

  25. Koniusz, P., Tas, Y., Porikli, F.: Domain adaptation by mixture of alignments of
    second- or higher-order scatter tensors. In: The IEEE Conference on Computer
    Vision and Pattern Recognition (CVPR), pp. 4478–4487. IEEE (2017)

  26. Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. In: Pereira, F., Burges, C.J.C., Bottou, L., Weinberger,
    K.Q. (eds.) Advances in Neural Information Processing Systems (NIPS), vol. 25,
    pp. 1097–1105. Curran Associates, Inc. (2012)

  27. LeCun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to
    document recognition. Proc. IEEE 86(11), 2278–2324 (1998)

  28. Li, D., Yang, Y., Song, Y.Z., Hospedales, T.M.: Deeper, broader and artier
    domain generalization. In: The IEEE International Conference on Computer Vision
    (ICCV). IEEE (2017)

  29. Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. In: Burges, C.J.C.,
    Bottou, L., Welling, M., Ghahramani, Z., Weinberger, K.Q. (eds.) Advances in
    Neural Information Processing Systems, vol. 26, pp. 3111–3119. Curran Associates
    Inc. (2013)

  30. Motiian, S., Piccirilli, M., Adjeroh, D.A., Doretto, G.: Unified deep supervised
    domain adaptation and generalization. In: The IEEE International Conference on
    Computer Vision (ICCV), pp. 5715–5725. IEEE (2017)

  31. Ngiam, J., Khosla, A., Kim, M., Nam, J., Lee, H., Ng, A.Y.: Multimodal deep
    learning. In: Getoor, L., Scheffer, T. (eds.) Proceedings of the 28th International
    Conference on Machine Learning (ICML-2011), pp. 689–696. Omnipress (2011)

  32. Saenko, K., Kulis, B., Fritz, M., Darrell, T.: Adapting visual category models to new domains. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010. LNCS, vol. 6314, pp. 213–226. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-15561-1_16

  33. Saito, K., Ushiku, Y., Harada, T.: Asymmetric tri-training for unsupervised domain
    adaptation. In: Precup, D., Teh, Y.W. (eds.) Proceedings of the 34th International
    Conference on Machine Learning (ICML-2017), vol. 70, pp. 2988–2997. PMLR
    (2017)

  34. Sener, O., Song, H.O., Saxena, A., Savarese, S.: Learning transferrable representations for unsupervised domain adaptation. In: Lee, D.D., Sugiyama, M., Luxburg,
    U.V., Guyon, I., Garnett, R. (eds.) Advances in Neural Information Processing
    Systems (NIPS), vol. 29, pp. 2110–2118. Curran Associates, Inc. (2016)

  35. Sohn, K., Liu, S., Zhong, G., Yu, X., Yang, M.H., Chandraker, M.: Unsupervised
    domain adaptation for face recognition in unlabeled videos. In: The IEEE International Conference on Computer Vision (ICCV), pp. 3210–3218. IEEE (2017)

  36. Song, S., Lichtenberg, S., Xiao, J.: SUN RGB-D: a RGB-D scene understanding benchmark suite. In: The IEEE Conference on Computer Vision and Pattern
    Recognition (CVPR), pp. 567–576. IEEE (2015)

  37. Sun, B., Saenko, K.: Deep CORAL: correlation alignment for deep domain adaptation. In: Hua, G., Jégou, H. (eds.) ECCV 2016. LNCS, vol. 9915, pp. 443–450. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-49409-8_35

  38. Szegedy, C., et al.: Going deeper with convolutions. In: The IEEE Conference on
    Computer Vision and Pattern Recognition (CVPR), pp. 1–9. IEEE (2015)

  39. Tzeng, E., Hoffman, J., Darrell, T., Saenko, K.: Simultaneous deep transfer across
    domains and tasks. In: The IEEE International Conference on Computer Vision
    (ICCV), pp. 4068–4076. IEEE (2015)

  40. Tzeng, E., Hoffman, J., Saenko, K., Darrell, T.: Adversarial discriminative domain
    adaptation. In: The IEEE Conference on Computer Vision and Pattern Recognition
    (CVPR), pp. 7167–7176. IEEE (2017)

  41. Venkateswara, H., Eusebio, J., Chakraborty, S., Panchanathan, S.: Deep hashing
    network for unsupervised domain adaptation. In: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 5018–5027. IEEE (2017)

  42. Wang, W., Arora, R., Livescu, K., Bilmes, J.: On deep multi-view representation learning. In: Bach, F., Blei, D. (eds.) Proceedings of the 32nd International Conference on Machine Learning (ICML-2015), vol. 37, pp. 1083–1092. PMLR (2015)

  43. Wang, Y., Li, W., Dai, D., Gool, L.V.: Deep domain adaptation by geodesic distance minimization. In: The IEEE International Conference on Computer Vision
    (ICCV), pp. 2651–2657. IEEE (2017)

  44. Wu, C., Wen, W., Afzal, T., Zhang, Y., Chen, Y., Li, H.: A compact DNN:
    approaching GoogLeNet-level accuracy of classification and domain adaptation.
    In: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR),
    pp. 5668–5677. IEEE (2017)

  45. Wulfmeier, M., Bewley, A., Posner, I.: Addressing appearance change in outdoor
    robotics with adversarial domain adaptation. In: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 1551–1558. IEEE (2017)

  46. Xiao, H., Rasul, K., Vollgraf, R.: Fashion-MNIST: A novel image dataset for benchmarking machine learning algorithms. arXiv preprint arXiv:1708.07747 (2017)

  47. Yan, H., Ding, Y., Li, P., Wang, Q., Xu, Y., Zuo, W.: Mind the class weight bias:
    Weighted maximum mean discrepancy for unsupervised domain adaptation. In:
    The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.
    2272–2281. IEEE (2017)

  48. Yang, X., Ramesh, P., Chitta, R., Madhvanath, S., Bernal, E.A., Luo, J.: Deep multimodal representation learning from temporal data. In: The IEEE Conference on
    Computer Vision and Pattern Recognition (CVPR), pp. 5447–5455. IEEE (2017)

  49. Yang, Y., Hospedales, T.M.: Zero-shot domain adaptation via kernel regression on
    the grassmannian. In: Drira, H., Kurtek, S., Turaga, P. (eds.) BMVC Workshop
    on Differential Geometry in Computer Vision. BMVA Press (2015)

  50. Zhang, J., Li, W., Ogunbona, P.: Joint geometrical and statistical alignment for
    visual domain adaptation. In: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1859–1867. IEEE (2017)

  51. Zhang, Y., David, P., Gong, B.: Curriculum domain adaptation for semantic segmentation of urban scenes. In: The IEEE International Conference on Computer
    Vision (ICCV), pp. 2020–2030. IEEE (2017)
