Generalizing from a Few Examples: A Survey on Few-Shot Learning | Three Data Augmentation Methods

Original article: 小樣本學習與智能前沿


Previous post: A Survey on Few-Shot Learning | Introduction and Overview

The FSL methods in this section use prior knowledge to augment the data D_train, thereby enriching the supervisory information in E (Fig. 4).

Data augmentation via hand-crafted rules is usually used as pre-processing in FSL methods.
These rules can introduce different kinds of invariance for the model to capture. For example, on images, one can use translation [12, 76, 114, 119], flipping [103, 119], shearing [119], scaling [76, 160], reflection [34, 72], cropping [103, 160] and rotation [114, 138].
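As an illustration, these hand-crafted rules amount to a few lines of array operations. The sketch below (function and variable names are ours, not from the survey) generates flipped, rotated, translated and cropped variants of one image:

```python
import numpy as np

def augment(image):
    """Generate hand-crafted variants of a single H x W image.
    Each transform encodes an invariance the model should capture."""
    return {
        "flipped": np.fliplr(image),                    # horizontal flip
        "rotated": np.rot90(image),                     # 90-degree rotation
        "translated": np.roll(image, shift=2, axis=1),  # shift 2 px to the right
        "cropped": image[2:-2, 2:-2],                   # central crop
    }

img = np.random.default_rng(0).random((28, 28))
variants = augment(img)
for name, v in variants.items():
    print(name, v.shape)
```

Each variant keeps the original label, so a single labeled image yields several training pairs.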

Many augmentation rules are designed around a specific data set, which makes them hard to apply to other data sets.

Hence, manual data augmentation cannot fully solve the FSL problem.

Other data augmentation methods differ in which samples are transformed and added to the training set. We categorize them in Table 3.

Below, we introduce each of these three methods in turn.

01 Transforming Samples from D_train

This strategy augments D_train by transforming each original sample (x_i, y_i) in the training set into several samples. The transformation procedure is included in the experience E as prior knowledge so that additional samples can be generated.
An early FSL paper [90] learns a set of geometric transformations from a similar class by iteratively aligning each sample with the other samples. The learned transformations are applied to each (x_i, y_i) to form a large data set, which can then be learned by standard machine learning methods. Similarly, [116] learns a set of auto-encoders from similar classes, each representing one intra-class variability, and generates new samples by adding the learned variations to x_i. In [53], assuming that all classes share some transferable variability across samples, a single transformation function is learned to transfer the variation between sample pairs from other classes to (x_i, y_i). In [74], instead of enumerating pairwise variations, each x_i is transformed into several samples using a set of independent attribute strength regressors learned from a large set of scene images, and the label of the original x_i is assigned to these new samples. [82] improves on [74] by using a continuous attribute subspace to add attribute variations to x.
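The common pattern behind these methods can be sketched as follows: a set of transformation vectors (here, toy constants standing in for the learned sample-pair differences of [53]) is applied to every few-shot sample, and each generated sample keeps the original label. This is a minimal illustration, not the exact procedure of any cited paper:

```python
import numpy as np

def transform_augment(X_train, y_train, learned_deltas):
    """Expand a few-shot training set by applying transformation vectors
    learned from similar, data-rich classes: each delta is transferred
    to every (x_i, y_i), and the new sample keeps the old label."""
    X_aug, y_aug = [X_train], [y_train]
    for delta in learned_deltas:
        X_aug.append(X_train + delta)   # shifted copy of every sample
        y_aug.append(y_train)           # labels are unchanged
    return np.concatenate(X_aug), np.concatenate(y_aug)

# toy few-shot set: 2 samples, 4 features; 3 toy deltas -> 8 samples total
X = np.array([[0.0, 1.0, 2.0, 3.0], [1.0, 1.0, 1.0, 1.0]])
y = np.array([0, 1])
deltas = [np.full(4, 0.1), np.full(4, -0.1), np.zeros(4)]
X_big, y_big = transform_augment(X, y, deltas)
print(X_big.shape, y_big.shape)
```

In the real methods, the deltas come from a learned model (aligned transforms, auto-encoders, or attribute regressors) rather than fixed constants.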

02 Transforming Samples from a Weakly Labeled or Unlabeled Data Set

This strategy augments D_train by selecting samples with the target labels from a large, weakly labeled or unlabeled data set. For example, photos taken by a surveillance camera contain people, cars and roads, but none of them are labeled. Another example is a long presentation video: it contains a series of the speaker's gestures, but none of them are explicitly annotated. As such data sets contain large variations of samples, adding them to D_train helps depict a clearer p(x, y). Moreover, such data sets are easier to collect, since no human effort is needed for labeling. However, although the collection cost is low, the main issue is how to select samples with the target labels to add to D_train. In [102], an exemplar SVM is learned for each target label in D_train and is then used to predict labels for samples from the weakly labeled data set; samples with the target labels are then added to D_train. In [32], instead of learning a classifier, label propagation is used directly to label the unlabeled data set. In [148], a progressive strategy is adopted to select informative unlabeled samples; the selected samples are then assigned pseudo-labels and used to update the CNN.
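A minimal sketch of this selection step, using a nearest-centroid classifier with a distance threshold as a stand-in for the exemplar SVM of [102] (all names and thresholds here are illustrative):

```python
import numpy as np

def pseudo_label(X_train, y_train, X_unlabeled, threshold=1.0):
    """Augment D_train from an unlabeled pool: fit a simple nearest-centroid
    classifier on the labeled few-shot samples, then add unlabeled samples
    whose distance to their nearest class centroid is below a threshold."""
    classes = np.unique(y_train)
    centroids = np.stack([X_train[y_train == c].mean(axis=0) for c in classes])
    # distance from every unlabeled sample to every class centroid
    d = np.linalg.norm(X_unlabeled[:, None, :] - centroids[None], axis=-1)
    nearest = d.argmin(axis=1)
    confident = d.min(axis=1) < threshold   # keep only confident picks
    X_new = X_unlabeled[confident]
    y_new = classes[nearest[confident]]     # pseudo-labels for the picks
    return (np.concatenate([X_train, X_new]),
            np.concatenate([y_train, y_new]))

X = np.array([[0.0, 0.0], [4.0, 4.0]])
y = np.array([0, 1])
X_pool = np.array([[0.2, 0.1], [3.9, 4.2], [10.0, 10.0]])  # last one is far
X_big, y_big = pseudo_label(X, y, X_pool)
print(X_big.shape, y_big)
```

The far-away pool sample is rejected, so only the two confident samples receive pseudo-labels and enter the training set.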

03 Transforming Samples from Similar Data Sets

This strategy augments D_train by aggregating and adapting input-output pairs from similar but larger data sets. The aggregation weights are usually based on some similarity measure between samples; in [133], they are extracted from an auxiliary text corpus. As these samples may not come from the target FSL class, directly adding the aggregated samples to D_train can be misleading. Therefore, a generative adversarial network (GAN) [46] is designed to generate indistinguishable synthetic x from a data set of many samples [42]. It has two generators: one maps samples of the few-shot classes to the large-scale classes, and the other maps samples of the large-scale classes to the few-shot classes (to compensate for the shortage of samples in GAN training).
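The aggregation step can be sketched as a similarity-weighted average: samples from the similar data set that are close to the few-shot sample get large weights, while distant ones are suppressed. In this toy illustration, a softmax over negative distances plays the role of the aggregation weights (an assumption for the sketch; this is not the method of [133] or the GAN of [42]):

```python
import numpy as np

def aggregate_from_similar(x_query, X_similar, temperature=1.0):
    """Synthesize a new sample for a few-shot class by aggregating samples
    from a similar, larger data set, weighted by similarity to x_query."""
    d = np.linalg.norm(X_similar - x_query, axis=1)
    w = np.exp(-d / temperature)
    w /= w.sum()                 # normalized aggregation weights
    return w @ X_similar         # convex combination of similar samples

x = np.array([0.0, 0.0])                               # a few-shot sample
X_sim = np.array([[0.1, 0.0], [0.0, 0.1], [5.0, 5.0]]) # similar data set
x_new = aggregate_from_similar(x, X_sim)
print(x_new)
```

The outlier at (5, 5) receives a near-zero weight, so the synthesized sample stays close to the query.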

Discussion and Summary

The choice of augmentation strategy depends on the application.
Sometimes, a large number of weakly supervised or unlabeled samples exist for the target task (or class), but annotating them is costly in human effort and/or computation (this corresponds to the third scenario introduced earlier). In this case, augmentation can be performed by transforming samples from the weakly labeled or unlabeled data set. When a large-scale unlabeled data set is hard to collect but the few-shot classes have some similar classes, samples can be transformed from those similar classes. If only some learned transformers rather than raw samples are available, augmentation can be done by transforming the original samples in the training set.

Overall, solving the FSL problem by augmenting D_train is simple and straightforward: the data is expanded by exploiting prior information about the target task.
On the other hand, the weakness of solving the FSL problem via data augmentation is that the augmentation policy is often tailored to each data set and cannot easily be used on other data sets (especially data from other domains).

Recently, AutoAugment [27] was proposed to address this issue by automatically learning augmentation policies for deep network training. Besides, as generated images can easily be evaluated visually by humans, existing methods are mainly designed for images; text and audio, which involve syntax and structure, are harder to generate. A recent attempt to use data augmentation on text is reported in [144].

References

[1] N. Abdo, H. Kretzschmar, L. Spinello, and C. Stachniss. 2013. Learning manipulation actions from a few demonstrations. In International Conference on Robotics and Automation. 1268–1275.
[2] Z. Akata, F. Perronnin, Z. Harchaoui, and C. Schmid. 2013. Label-embedding for attribute-based classification. In Conference on Computer Vision and Pattern Recognition. 819–826.
[3] M. Al-Shedivat, T. Bansal, Y. Burda, I. Sutskever, I. Mordatch, and P. Abbeel. 2018. Continuous adaptation via meta- learning in nonstationary and competitive environments. In International Conference on Learning Representations.
[4] H. Altae-Tran, B. Ramsundar, A. S. Pappu, and V. Pande. 2017. Low data drug discovery with one-shot learning. ACS Central Science 3, 4 (2017), 283–293.
[5] M. Andrychowicz, M. Denil, S. Gomez, M. W. Hoffman, D. Pfau, T. Schaul, and N. de Freitas. 2016. Learning to learn by gradient descent by gradient descent. In Advances in Neural Information Processing Systems. 3981–3989.
[6] S. Arik, J. Chen, K. Peng, W. Ping, and Y. Zhou. 2018. Neural voice cloning with a few samples. In Advances in Neural Information Processing Systems. 10019–10029.
[7] S. Azadi, M. Fisher, V. G. Kim, Z. Wang, E. Shechtman, and T. Darrell. 2018. Multi-content GAN for few-shot font style transfer. In Conference on Computer Vision and Pattern Recognition. 7564–7573.

[8] P. Bachman, A. Sordoni, and A. Trischler. 2017. Learning algorithms for active learning. In International Conference on Machine Learning. 301–310.
[9] D. Bahdanau, K. Cho, and Y. Bengio. 2015. Neural machine translation by jointly learning to align and translate. In International Conference on Learning Representations.
[10] E. Bart and S. Ullman. 2005. Cross-generalization: Learning novel classes from a single example by feature replacement. In Conference on Computer Vision and Pattern Recognition, Vol. 1. 672–679.
[11] S. Ben-David, J. Blitzer, K. Crammer, and F. Pereira. 2007. Analysis of representations for domain adaptation. In Advances in Neural Information Processing Systems. 137–144.
[12] S. Benaim and L. Wolf. 2018. One-shot unsupervised cross domain translation. In Advances in Neural Information Processing Systems. 2104–2114.
[13] L. Bertinetto, J. F. Henriques, P. Torr, and A. Vedaldi. 2019. Meta-learning with differentiable closed-form solvers. In International Conference on Learning Representations.
[14] L. Bertinetto, J. F. Henriques, J. Valmadre, P. Torr, and A. Vedaldi. 2016. Learning feed-forward one-shot learners. In Advances in Neural Information Processing Systems. 523–531.
[15] C. M. Bishop. 2006. Pattern Recognition and Machine Learning. Springer.
[16] J. Blitzer, K. Crammer, A. Kulesza, F. Pereira, and J. Wortman. 2008. Learning bounds for domain adaptation. In
Advances in Neural Information Processing Systems. 129–136.
[17] L. Bottou and O. Bousquet. 2008. The tradeoffs of large scale learning. In Advances in Neural Information Processing
Systems. 161–168.
[18] L. Bottou, F. E. Curtis, and J. Nocedal. 2018. Optimization methods for large-scale machine learning. SIAM Rev. 60, 2
(2018), 223–311.
[19] A. Brock, T. Lim, J.M. Ritchie, and N. Weston. 2018. SMASH: One-shot model architecture search through hypernetworks. In International Conference on Learning Representations.
[20] J. Bromley, I. Guyon, Y. LeCun, E. Säckinger, and R. Shah. 1994. Signature verification using a “siamese” time delay
neural network. In Advances in Neural Information Processing Systems. 737–744.
[21] S. Caelles, K.-K. Maninis, J. Pont-Tuset, L. Leal-Taixé, D. Cremers, and L. Van Gool. 2017. One-shot video object
segmentation. In Conference on Computer Vision and Pattern Recognition. 221–230.
[22] Q. Cai, Y. Pan, T. Yao, C. Yan, and T. Mei. 2018. Memory matching networks for one-shot image recognition. In
Conference on Computer Vision and Pattern Recognition. 4080–4088.
[23] R. Caruana. 1997. Multitask learning. Machine learning 28, 1 (1997), 41–75.
[24] J. Choi, J. Krishnamurthy, A. Kembhavi, and A. Farhadi. 2018. Structured set matching networks for one-shot part
labeling. In Conference on Computer Vision and Pattern Recognition. 3627–3636.
[25] J. D. Co-Reyes, A. Gupta, S. Sanjeev, N. Altieri, J. DeNero, P. Abbeel, and S. Levine. 2019. Meta-learning language-guided policy learning. In International Conference on Learning Representations.
[26] J. J. Craig. 2009. Introduction to Robotics: Mechanics and Control. Pearson Education India.
[27] E. D. Cubuk, B. Zoph, D. Mane, V. Vasudevan, and Q. V. Le. 2019. AutoAugment: Learning augmentation policies
from data. In Conference on Computer Vision and Pattern Recognition. 113–123.
[28] T. Deleu and Y. Bengio. 2018. The effects of negative adaptation in Model-Agnostic Meta-Learning. arXiv preprint
arXiv:1812.02159 (2018).
[29] G. Denevi, C. Ciliberto, D. Stamos, and M. Pontil. 2018. Learning to learn around a common mean. In Advances in
Neural Information Processing Systems. 10190–10200.
[30] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei. 2009. ImageNet: A large-scale hierarchical image database.
In Conference on Computer Vision and Pattern Recognition. 248–255.
[31] X. Dong, L. Zhu, D. Zhang, Y. Yang, and F. Wu. 2018. Fast parameter adaptation for few-shot image captioning and
visual question answering. In ACM International Conference on Multimedia. 54–62.
[32] M. Douze, A. Szlam, B. Hariharan, and H. Jégou. 2018. Low-shot learning with large-scale diffusion. In Conference on
Computer Vision and Pattern Recognition. 3349–3358.
[33] Y. Duan, M. Andrychowicz, B. Stadie, J. Ho, J. Schneider, I. Sutskever, P. Abbeel, and W. Zaremba. 2017. One-shot
imitation learning. In Advances in Neural Information Processing Systems. 1087–1098.
[34] H. Edwards and A. Storkey. 2017. Towards a neural statistician. In International Conference on Learning Representations.
[35] L. Fei-Fei, R. Fergus, and P. Perona. 2006. One-shot learning of object categories. IEEE Transactions on Pattern Analysis
and Machine Intelligence 28, 4 (2006), 594–611.
[36] M. Fink. 2005. Object classification from a single example utilizing class relevance metrics. In Advances in Neural
Information Processing Systems. 449–456.
[37] C. Finn, P. Abbeel, and S. Levine. 2017. Model-agnostic meta-learning for fast adaptation of deep networks. In
International Conference on Machine Learning. 1126–1135.
[38] C. Finn and S. Levine. 2018. Meta-learning and universality: Deep representations and gradient descent can approximate any learning algorithm. In International Conference on Learning Representations.
[39] C. Finn, K. Xu, and S. Levine. 2018. Probabilistic model-agnostic meta-learning. In Advances in Neural Information
Processing Systems. 9537–9548.
[40] L. Franceschi, P. Frasconi, S. Salzo, R. Grazzi, and M. Pontil. 2018. Bilevel programming for hyperparameter optimization and meta-learning. In International Conference on Machine Learning. 1563–1572.
[41] J. Friedman, T. Hastie, and R. Tibshirani. 2001. The Elements of Statistical Learning. Vol. 1. Springer series in statistics
New York.
[42] H. Gao, Z. Shou, A. Zareian, H. Zhang, and S. Chang. 2018. Low-shot learning via covariance-preserving adversarial
augmentation networks. In Advances in Neural Information Processing Systems. 983–993.
[43] P. Germain, F. Bach, A. Lacoste, and S. Lacoste-Julien. 2016. PAC-Bayesian theory meets Bayesian inference. In
Advances in Neural Information Processing Systems. 1884–1892.
[44] S. Gidaris and N. Komodakis. 2018. Dynamic few-shot visual learning without forgetting. In Conference on Computer
Vision and Pattern Recognition. 4367–4375.
[45] I. Goodfellow, Y. Bengio, and A. Courville. 2016. Deep Learning. MIT Press.
[46] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio. 2014. Generative adversarial nets. In Advances in Neural Information Processing Systems. 2672–2680.
[47] J. Gordon, J. Bronskill, M. Bauer, S. Nowozin, and R. Turner. 2019. Meta-learning probabilistic inference for prediction.
In International Conference on Learning Representations.
[48] E. Grant, C. Finn, S. Levine, T. Darrell, and T. Griffiths. 2018. Recasting gradient-based meta-learning as hierarchical
Bayes. In International Conference on Learning Representations.
[49] A. Graves, G. Wayne, and I. Danihelka. 2014. Neural Turing machines. arXiv preprint arXiv:1410.5401 (2014).
[50] L.-Y. Gui, Y.-X. Wang, D. Ramanan, and J. Moura. 2018. Few-shot human motion prediction via meta-learning. In
European Conference on Computer Vision. 432–450.
[51] M. Hamaya, T. Matsubara, T. Noda, T. Teramae, and J. Morimoto. 2016. Learning assistive strategies from a few
user-robot interactions: Model-based reinforcement learning approach. In International Conference on Robotics and
Automation. 3346–3351.
[52] X. Han, H. Zhu, P. Yu, Z. Wang, Y. Yao, Z. Liu, and M. Sun. 2018. FewRel: A large-scale supervised few-shot relation
classification dataset with state-of-the-art evaluation. In Conference on Empirical Methods in Natural Language
Processing. 4803–4809.
[53] B. Hariharan and R. Girshick. 2017. Low-shot visual recognition by shrinking and hallucinating features. In International Conference on Computer Vision.
[54] H. He and E. A. Garcia. 2008. Learning from imbalanced data. IEEE Transactions on Knowledge and Data Engineering
9 (2008), 1263–1284.
[55] K. He, X. Zhang, S. Ren, and J. Sun. 2016. Deep residual learning for image recognition. In Conference on Computer Vision and Pattern Recognition. 770–778.
[56] A. Herbelot and M. Baroni. 2017. High-risk learning: Acquiring new word vectors from tiny data. In Conference on Empirical Methods in Natural Language Processing. 304–309.
[57] L. B. Hewitt, M. I. Nye, A. Gane, T. Jaakkola, and J. B. Tenenbaum. 2018. The variational homoencoder: Learning to learn high capacity generative models from few examples. In Uncertainty in Artificial Intelligence. 988–997.
[58] S. Hochreiter and J. Schmidhuber. 1997. Long short-term memory. Neural Computation 9, 8 (1997), 1735–1780.
[59] S. Hochreiter, A. S. Younger, and P. R. Conwell. 2001. Learning to learn using gradient descent. In International
Conference on Artificial Neural Networks. 87–94.
[60] J. Hoffman, E. Tzeng, J. Donahue, Y. Jia, K. Saenko, and T. Darrell. 2013. One-shot adaptation of supervised deep
convolutional models. In International Conference on Learning Representations.
[61] Z. Hu, X. Li, C. Tu, Z. Liu, and M. Sun. 2018. Few-shot charge prediction with discriminative legal attributes. In
International Conference on Computational Linguistics. 487–498.
[62] S. J. Hwang and L. Sigal. 2014. A unified semantic embedding: Relating taxonomies and attributes. In Advances in
Neural Information Processing Systems. 271–279.
[63] Y. Jia, E. Shelhamer, J. Donahue, S. Karayev, J. Long, R. Girshick, S. Guadarrama, and T. Darrell. 2014. Caffe:
Convolutional architecture for fast feature embedding. In ACM International Conference on Multimedia. 675–678.
[64] V. Joshi, M. Peters, and M. Hopkins. 2018. Extending a parser to distant domains using a few dozen partially annotated
examples. In Annual Meeting of the Association for Computational Linguistics. 1190–1199.
[65] Ł. Kaiser, O. Nachum, A. Roy, and S. Bengio. 2017. Learning to remember rare events. In International Conference on
Learning Representations.
[66] J. M. Kanter and K. Veeramachaneni. 2015. Deep feature synthesis: Towards automating data science endeavors. In
International Conference on Data Science and Advanced Analytics. 1–10.
[67] R. Keshari, M. Vatsa, R. Singh, and A. Noore. 2018. Learning structure and strength of CNN filters for small sample
size training. In Conference on Computer Vision and Pattern Recognition. 9349–9358.
[68] D. P. Kingma and M. Welling. 2014. Auto-encoding variational Bayes. In International Conference on Learning
Representations.
[69] J. Kirkpatrick, R. Pascanu, N. Rabinowitz, J. Veness, G. Desjardins, A. A. Rusu, K. Milan, J. Quan, T. Ramalho, A.
Grabska-Barwinska, et al. 2017. Overcoming catastrophic forgetting in neural networks. National Academy of Sciences
114, 13 (2017), 3521–3526.
[70] G. Koch. 2015. Siamese neural networks for one-shot image recognition. Ph.D. Dissertation. University of Toronto.
[71] L. Kotthoff, C. Thornton, H. H. Hoos, F. Hutter, and K. Leyton-Brown. 2017. Auto-WEKA 2.0: Automatic model
selection and hyperparameter optimization in WEKA. Journal of Machine Learning Research 18, 1 (2017), 826–830.
[72] J. Kozerawski and M. Turk. 2018. CLEAR: Cumulative learning for one-shot one-class image recognition. In Conference on Computer Vision and Pattern Recognition. 3446–3455.
[73] A. Krizhevsky, I. Sutskever, and G. E. Hinton. 2012. ImageNet classification with deep convolutional neural networks.
In Advances in Neural Information Processing Systems. 1097–1105.
[74] R. Kwitt, S. Hegenbart, and M. Niethammer. 2016. One-shot learning of scene locations via feature trajectory transfer.
In Conference on Computer Vision and Pattern Recognition. 78–86.
[75] B. Lake, C.-Y. Lee, J. Glass, and J. Tenenbaum. 2014. One-shot learning of generative speech concepts. In Annual
Meeting of the Cognitive Science Society, Vol. 36.
[76] B. M. Lake, R. Salakhutdinov, and J. B. Tenenbaum. 2015. Human-level concept learning through probabilistic program induction. Science 350, 6266 (2015), 1332–1338.
[77] B. M. Lake, T. D. Ullman, J. B. Tenenbaum, and S. J. Gershman. 2017. Building machines that learn and think like
people. Behavioral and Brain Sciences 40 (2017).
[78] C. H. Lampert, H. Nickisch, and S. Harmeling. 2009. Learning to detect unseen object classes by between-class
attribute transfer. In Conference on Computer Vision and Pattern Recognition. 951–958.
[79] Y. Lee and S. Choi. 2018. Gradient-based meta-learning with learned layerwise metric and subspace. In International
Conference on Machine Learning. 2933–2942.
[80] K. Li and J. Malik. 2017. Learning to optimize. In International Conference on Learning Representations.
[81] X.-L. Li, P. S. Yu, B. Liu, and S.-K. Ng. 2009. Positive unlabeled learning for data stream classification. In SIAM
International Conference on Data Mining. 259–270.
[82] B. Liu, X. Wang, M. Dixit, R. Kwitt, and N. Vasconcelos. 2018. Feature space transfer for data augmentation. In
Conference on Computer Vision and Pattern Recognition. 9090–9098.
[83] H. Liu, K. Simonyan, and Y. Yang. 2019. DARTS: Differentiable architecture search. In International Conference on
Learning Representations.
[84] Y. Liu, J. Lee, M. Park, S. Kim, E. Yang, S. Hwang, and Y. Yang. 2019. Learning to propagate labels: Transductive propagation network for few-shot learning. In International Conference on Learning Representations.
[85] Z. Luo, Y. Zou, J. Hoffman, and L. Fei-Fei. 2017. Label efficient learning of transferable representations across domains and tasks. In Advances in Neural Information Processing Systems. 165–177.
[86] S. Mahadevan and P. Tadepalli. 1994. Quantifying prior determination knowledge using the PAC learning model. Machine Learning 17, 1 (1994), 69–105.
[87] D. McNamara and M.-F. Balcan. 2017. Risk bounds for transferring representations with and without fine-tuning. In International Conference on Machine Learning. 2373–2381.
[88] T. Mensink, E. Gavves, and C. Snoek. 2014. Costa: Co-occurrence statistics for zero-shot classification. In Conference on Computer Vision and Pattern Recognition. 2441–2448.
[89] A. Miller, A. Fisch, J. Dodge, A.-H. Karimi, A. Bordes, and J. Weston. 2016. Key-value memory networks for directly reading documents. In Conference on Empirical Methods in Natural Language Processing. 1400–1409.
[90] E. G. Miller, N. E. Matsakis, and P. A. Viola. 2000. Learning from one example through shared densities on transforms. In Conference on Computer Vision and Pattern Recognition, Vol. 1. 464–471.
[91] N. Mishra, M. Rohaninejad, X. Chen, and P. Abbeel. 2018. A simple neural attentive meta-learner. In International Conference on Learning Representations.
[92] M. T. Mitchell. 1997. Machine Learning. McGraw-Hill.
[93] S. H. Mohammadi and T. Kim. 2018. Investigation of using disentangled and interpretable representations for one-shot cross-lingual voice conversion. In INTERSPEECH. 2833–2837.
[94] M. Mohri, A. Rostamizadeh, and A. Talwalkar. 2018. Foundations of machine learning. MIT Press.
[95] S. Motiian, Q. Jones, S. Iranmanesh, and G. Doretto. 2017. Few-shot adversarial domain adaptation. In Advances in
Neural Information Processing Systems. 6670–6680.
[96] T. Munkhdalai and H. Yu. 2017. Meta networks. In International Conference on Machine Learning. 2554–2563.
[97] T. Munkhdalai, X. Yuan, S. Mehri, and A. Trischler. 2018. Rapid adaptation with conditionally shifted neurons. In
International Conference on Machine Learning. 3661–3670.
[98] A. Nagabandi, C. Finn, and S. Levine. 2018. Deep online learning via meta-learning: Continual adaptation for
model-based RL. In International Conference on Learning Representations.
[99] H. Nguyen and L. Zakynthinou. 2018. Improved algorithms for collaborative PAC learning. In Advances in Neural
Information Processing Systems. 7631–7639.
[100] B. Oreshkin, P. R. López, and A. Lacoste. 2018. TADAM: Task dependent adaptive metric for improved few-shot
learning. In Advances in Neural Information Processing Systems. 719–729.
[101] S. J. Pan and Q. Yang. 2010. A survey on transfer learning. IEEE Transactions on Knowledge and Data Engineering 10,
22 (2010), 1345–1359.
[102] T. Pfister, J. Charles, and A. Zisserman. 2014. Domain-adaptive discriminative one-shot learning of gestures. In
European Conference on Computer Vision. 814–829.
[103] H. Qi, M. Brown, and D. G. Lowe. 2018. Low-shot learning with imprinted weights. In Conference on Computer Vision
and Pattern Recognition. 5822–5830.
[104] T. Ramalho and M. Garnelo. 2019. Adaptive posterior learning: Few-shot learning with a surprise-based memory
module. In International Conference on Learning Representations.
[105] S. Ravi and A. Beatson. 2019. Amortized Bayesian meta-learning. In International Conference on Learning Representations.
[106] S. Ravi and H. Larochelle. 2017. Optimization as a model for few-shot learning. In International Conference on Learning
Representations.
[107] S. Reed, Y. Chen, T. Paine, A. van den Oord, S. M. A. Eslami, D. Rezende, O. Vinyals, and N. de Freitas. 2018. Few-shot
autoregressive density estimation: Towards learning to learn distributions. In International Conference on Learning
Representations.
[108] M. Ren, S. Ravi, E. Triantafillou, J. Snell, K. Swersky, J. B. Tenenbaum, H. Larochelle, and R. S. Zemel. 2018. Meta-learning for semi-supervised few-shot classification. In International Conference on Learning Representations.
[109] D. Rezende, I. Danihelka, K. Gregor, and D. Wierstra. 2016. One-shot generalization in deep generative models. In
International Conference on Machine Learning. 1521–1529.
[110] A. Rios and R. Kavuluru. 2018. Few-shot and zero-shot multi-label learning for structured label spaces. In Conference
on Empirical Methods in Natural Language Processing. 3132.
[111] A. A. Rusu, D. Rao, J. Sygnowski, O. Vinyals, R. Pascanu, S. Osindero, and R. Hadsell. 2019. Meta-learning with latent
embedding optimization. In International Conference on Learning Representations.
[112] R. Salakhutdinov and G. Hinton. 2009. Deep Boltzmann machines. In International Conference on Artificial Intelligence and Statistics. 448–455.
