Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering

Original post

2018-09-04 09:48

This is a CVPR 2018 oral paper on image captioning and visual question answering. Paper: https://arxiv.org/abs/1707.07998. Author's homepage: http://www.panderson.me/. The code has been released at https://github.com/peteanderson80/bottom-up-attention.
What the paper sets out to do:
image captioning + visual question answering
Experimental results for image captioning and visual question answering, as shown in the paper.

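The experimental results are strong: the method took first place in the 2017 VQA Challenge, and on image captioning it is compared against many recent methods. The paper lists many tricks, but it does not explain the framework clearly. I did not fully understand it and will come back to it later.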
