Finding “It”: Weakly-Supervised Reference-Aware Visual Grounding in Instructional Videos

Original post

2018-09-04 09:48

This is a CVPR 2018 Oral paper on weakly-supervised video grounding. Paper link: http://ai.stanford.edu/~dahuang/papers/cvpr18-ramil.pdf. Author's homepage: http://ai.stanford.edu/~dahuang/. The code has not been released yet.
What the paper sets out to do:
Input: sentence + video
Output: bounding box (no bounding-box ground truth is available during training)
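To make the input/output contract concrete, here is a minimal Python sketch of what a sample might look like. This is not the authors' code (none has been released). The names `GroundingSample`, `is_weakly_supervised`, and `full_frame_baseline`, the `(x1, y1, x2, y2)` box convention, and the example sentence are all assumptions made for illustration. The full-frame "baseline" is just a placeholder that shows the output shape. It is not the method from the paper.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical box convention: (x1, y1, x2, y2) in pixels.
Box = Tuple[float, float, float, float]


@dataclass
class GroundingSample:
    """One (video, sentence) pair; all field names are illustrative."""
    frames: List[str]                  # hypothetical frame identifiers or paths
    sentence: str                      # the instruction that refers to an object
    boxes: Optional[List[Box]] = None  # None during training: no bbox ground truth


def is_weakly_supervised(sample: GroundingSample) -> bool:
    """A training sample is weakly supervised when it carries no boxes."""
    return sample.boxes is None


def full_frame_baseline(sample: GroundingSample, width: int, height: int) -> List[Box]:
    """Placeholder predictor: returns the whole frame as the box for every frame.

    This only demonstrates the expected output (one box per frame).
    """
    return [(0.0, 0.0, float(width), float(height)) for _ in sample.frames]


if __name__ == "__main__":
    sample = GroundingSample(frames=["f0", "f1"], sentence="add it to the bowl")
    print(is_weakly_supervised(sample))
    print(full_frame_baseline(sample, 640, 480))
```

The point of the sketch is only that the model sees sentences and videos at training time and must still produce a box at test time.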
The paper illustrates the task with an example (figure omitted).

The paper reports experimental results on two datasets (figure omitted).

