New Papers This Week | 2020, Week 13 | Natural Language Processing

"New Papers This Week" series, 2020 Week 13: Natural Language Processing

Highlights this week:

  • Google: [38], [40]
  • Microsoft: [13]
  • Facebook: [2]
  • Other: [1]

March 27, 2020

[1]. TLDR: Token Loss Dynamic Reweighting for Reducing Repetitive Utterance Generation
Link | https://arxiv.org/abs/2003.11963
Authors | Shaojie Jiang, Thomas Wolf, Christof Monz, Maarten de Rijke
Affiliation | University of Amsterdam; Hugging Face

[2]. Rat big, cat eaten! Ideas for a useful deep-agent protolanguage
Link | https://arxiv.org/abs/2003.11922
Authors | Marco Baroni
Affiliation | Facebook AI Research

[3]. Common-Knowledge Concept Recognition for SEVA
Link | https://arxiv.org/abs/2003.11687
Authors | Jitin Krishnan, Patrick Coronado, Hemant Purohit, Huzefa Rangwala

[4]. Word2Vec: Optimal Hyper-Parameters and Their Impact on NLP Downstream Tasks
Link | https://arxiv.org/abs/2003.11645
Authors | Tosin P. Adewumi, Foteini Liwicki, Marcus Liwicki

[5]. Multi-Label Text Classification using Attention-based Graph Neural Network
Link | https://arxiv.org/abs/2003.11644
Authors | Ankit Pal, Muru Selvakumar, Malaikannan Sankarasubbu

[6]. Sentiment Analysis in Drug Reviews using Supervised Machine Learning Algorithms
Link | https://arxiv.org/abs/2003.11643
Authors | Sairamvinay Vijayaraghavan, Debraj Basu

[7]. Author2Vec: A Framework for Generating User Embedding
Link | https://arxiv.org/abs/2003.11627
Authors | Xiaodong Wu, Weizhe Lin, Zhilin Wang, Elena Rastorgueva
Affiliation | University of Cambridge

[8]. Predicting Unplanned Readmissions with Highly Unstructured Data
Link | https://arxiv.org/abs/2003.11622
Authors | Constanza Fierro, Jorge Pérez, Javier Mora

[9]. Cost-Sensitive BERT for Generalisable Sentence Classification with Imbalanced Data
Link | https://arxiv.org/abs/2003.11563
Authors | Harish Tayyar Madabushi, Elena Kochkina, Michael Castelle
Affiliation | University of Birmingham; Alan Turing Institute
Note | NLP4IF 2019

[10]. Finnish Language Modeling with Deep Transformer Models
Link | https://arxiv.org/abs/2003.11562
Authors | Abhilash Jain

[11]. Predicting Legal Proceedings Status: an Approach Based on Sequential Text Data
Link | https://arxiv.org/abs/2003.11561
Authors | Felipe Maia Polo, Itamar Ciochetti, Emerson Bertolo

[12]. Forensic Authorship Analysis of Microblogging Texts Using N-Grams and Stylometric Features
Link | https://arxiv.org/abs/2003.11545
Authors | Nicole Mariah Sharon Belvisi, Naveed Muhammad, Fernando Alonso-Fernandez
Note | Accepted for publication at the 8th International Workshop on Biometrics and Forensics (IWBF 2020)

[13]. VIOLIN: A Large-Scale Dataset for Video-and-Language Inference
Link | https://arxiv.org/abs/2003.11618
Authors | Jingzhou Liu, Wenhu Chen, Yu Cheng, Zhe Gan, Licheng Yu, Yiming Yang, Jingjing Liu
Affiliation | Carnegie Mellon University; University of California, Santa Barbara; Microsoft
Note | Accepted to CVPR 2020

[14]. Heavy-tailed Representations, Text Polarity Classification & Data Augmentation
Link | https://arxiv.org/abs/2003.11593
Authors | Hamid Jalalzai, Pierre Colombo, Chloé Clavel, Eric Gaussier, Giovanna Varni, Emmanuel Vignon, Anne Sabourin

March 26, 2020

[15]. The Medical Scribe: Corpus Development and Model Performance Analyses
Link | https://arxiv.org/abs/2003.11531
Authors | Izhak Shafran, Nan Du, Linh Tran, Amanda Perry, Lauren Keyes, Mark Knichel, Ashley Domin, Lei Huang, Yuhui Chen, Gang Li, Mingqiu Wang, Laurent El Shafey, Hagen Soltau, Justin S. Paul
Affiliation | Google
Note | Extended version of the paper accepted at LREC 2020

[16]. Meta-CoTGAN: A Meta Cooperative Training Paradigm for Improving Adversarial Text Generation
Link | https://arxiv.org/abs/2003.11530
Authors | Haiyan Yin, Dingcheng Li, Xu Li, Ping Li
Affiliation | Baidu Research

[17]. Masakhane – Machine Translation For Africa
Link | https://arxiv.org/abs/2003.11529
Authors | Iroro Orife, Julia Kreutzer, Blessing Sibanda, Daniel Whitenack, Kathleen Siminyu, Laura Martinus, Jamiil Toure Ali, Jade Abbott, Vukosi Marivate, Salomon Kabongo, Musie Meressa, Espoir Murhabazi, Orevaoghene Ahia, Elan van Biljon, Arshath Ramkilowan, Adewale Akinfaderin, Alp Öktem, Wole Akin, Ghollah Kioko, Kevin Degila, Herman Kamper, Bonaventure Dossou, Chris Emezue, Kelechi Ogueji, Abdallah Bashir
Note | Accepted for the AfricaNLP Workshop, ICLR 2020

[18]. Generating Major Types of Chinese Classical Poetry in a Uniformed Framework
Link | https://arxiv.org/abs/2003.11528
Authors | Jinyi Hu, Maosong Sun
Affiliation | Tsinghua University

[19]. Tigrinya Neural Machine Translation with Transfer Learning for Humanitarian Response
Link | https://arxiv.org/abs/2003.11523
Authors | Alp Öktem, Mirko Plitt, Grace Tang
Affiliation | University of Oxford; DeepMind; University College London; The Alan Turing Institute
Note | Pre-print accepted to the AfricaNLP workshop held at the Eighth International Conference on Learning Representations (ICLR 2020)

[20]. Matching Text with Deep Mutual Information Estimation
Link | https://arxiv.org/abs/2003.11521
Authors | Xixi Zhou, Chengxi Li, Jiajun Bu, Chengwei Yao, Keyue Shi, Zhi Yu, Zhou Yu
Affiliation | Zhejiang University; University of California, Davis

[21]. Joint Multiclass Debiasing of Word Embeddings
Link | https://arxiv.org/abs/2003.11520
Authors | Radomir Popović, Florian Lemmerich, Markus Strohmaier

[22]. Vector logic and counterfactuals
Link | https://arxiv.org/abs/2003.11519
Authors | Eduardo Mizraji

[23]. Hybrid Attention-Based Transformer Block Model for Distant Supervision Relation Extraction
Link | https://arxiv.org/abs/2003.11518
Authors | Yan Xiao, Yaochu Jin, Ran Cheng, Kuangrong Hao

[24]. From Algebraic Word Problem to Program: A Formalized Approach
Link | https://arxiv.org/abs/2003.11517
Authors | Adam Wiemerslage, Shafiuddin Rehan Ahmed
Affiliation | University of Colorado, Boulder
Note | 9 pages, 6 figures; course project for Programming Languages

[25]. Keyword-Attentive Deep Semantic Matching
Link | https://arxiv.org/abs/2003.11516
Authors | Changyu Miao, Zhen Cao, Yik-Cheung Tam
Affiliation | WeChat AI

[26]. Hurtful Words: Quantifying Biases in Clinical Contextual Word Embeddings
Link | https://arxiv.org/abs/2003.11515
Authors | Haoran Zhang, Amy X. Lu, Mohamed Abdalla, Matthew McDermott, Marzyeh Ghassemi
Affiliation | University of Toronto; Vector Institute; MIT
Note | Accepted at ACM CHIL 2020 (Spotlight)

[27]. BaitWatcher: A lightweight web interface for the detection of incongruent news headlines
Link | https://arxiv.org/abs/2003.11459
Authors | Kunwoo Park, Taegyun Kim, Seunghyun Yoon, Meeyoung Cha, Kyomin Jung

[28]. Adversarial Multi-Binary Neural Network for Multi-class Classification
Link | https://arxiv.org/abs/2003.11184
Authors | Haiyang Xu, Junwen Chen, Kun Han, Xiangang Li
Affiliation | Didi

[29]. Learning Syntactic and Dynamic Selective Encoding for Document Summarization
Link | https://arxiv.org/abs/2003.11173
Authors | Haiyang Xu, Yahao He, Kun Han, Junwen Chen, Xiangang Li
Affiliation | Didi

[30]. Can Embeddings Adequately Represent Medical Terminology? New Large-Scale Medical Term Similarity Datasets Have the Answer!
Link | https://arxiv.org/abs/2003.11082
Authors | Claudia Schulz, Damir Juric

[31]. XTREME: A Massively Multilingual Multi-task Benchmark for Evaluating Cross-lingual Generalization
Link | https://arxiv.org/abs/2003.11080
Authors | Junjie Hu, Sebastian Ruder, Aditya Siddhant, Graham Neubig, Orhan Firat, Melvin Johnson
Affiliation | Carnegie Mellon University; DeepMind; Google Research

[32]. Utilizing Deep Learning to Identify Drug Use on Twitter Data
Link | https://arxiv.org/abs/2003.11522
Authors | Joseph Tassone, Peizhi Yan, Mackenzie Simpson, Chetan Mendhe, Vijay Mago, Salimur Choudhury

[33]. COVID-19 and Computer Audition: An Overview on What Speech & Sound Analysis Could Contribute in the SARS-CoV-2 Corona Crisis
Link | https://arxiv.org/abs/2003.11117
Authors | Björn W. Schuller, Dagmar M. Schuller, Kun Qian, Juan Liu, Huaiyuan Zheng, Xiao Li

[34]. EQL – an extremely easy to learn knowledge graph query language, achieving highspeed and precise search
Link | https://arxiv.org/abs/2003.11105
Authors | Han Liu, Shantao Liu

March 25, 2020

[35]. Cross-Lingual Adaptation Using Universal Dependencies
Link | https://arxiv.org/abs/2003.10816
Authors | Nasrin Taghizadeh, Heshaam Faili

[36]. Generating Chinese Poetry from Images via Concrete and Abstract Information
Link | https://arxiv.org/abs/2003.10773
Authors | Yusen Liu, Dayiheng Liu, Jiancheng Lv, Yongsheng Sang
Affiliation | Sichuan University
Note | Accepted by the 2020 International Joint Conference on Neural Networks (IJCNN 2020)

[37]. Towards Neural Machine Translation for Edoid Languages
Link | https://arxiv.org/abs/2003.10704
Authors | Iroro Orife

[38]. Felix: Flexible Text Editing Through Tagging and Insertion
Link | https://arxiv.org/abs/2003.10687
Authors | Jonathan Mallinson, Aliaksei Severyn, Eric Malmi, Guillermo Garrido
Affiliation | University of Edinburgh; Google Research

[39]. Improving Yorùbá Diacritic Restoration
Link | https://arxiv.org/abs/2003.10564
Authors | Iroro Orife, David I. Adelani, Timi Fasubaa, Victor Williamson, Wuraola Fisayo Oyewusi, Olamilekan Wahab, Kola Tubosun
Note | Accepted to ICLR 2020 AfricaNLP workshop

[40]. ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators
Link | https://arxiv.org/abs/2003.10555
Authors | Kevin Clark, Minh-Thang Luong, Quoc V. Le, Christopher D. Manning
Affiliation | Stanford University; Google Brain
Note | ICLR 2020

[41]. Learning Compact Reward for Image Captioning
Link | https://arxiv.org/abs/2003.10925
Authors | Nannan Li, Zhenzhong Chen
Affiliation | Wuhan University

[42]. Investigating Software Usage in the Social Sciences: A Knowledge Graph Approach
Link | https://arxiv.org/abs/2003.10715
Authors | David Schindler, Benjamin Zapilko, Frank Krüger
Note | 16 pages, 4 figures; preprint of a full paper at the Extended Semantic Web Conference (ESWC 2020)

[43]. Video Object Grounding using Semantic Roles in Language Description
Link | https://arxiv.org/abs/2003.10606
Authors | Arka Sadhu, Kan Chen, Ram Nevatia
Affiliation | University of Southern California; Facebook

[44]. ScrabbleGAN: Semi-Supervised Varying Length Handwritten Text Generation
Link | https://arxiv.org/abs/2003.10557
Authors | Sharon Fogel, Hadar Averbuch-Elor, Sarel Cohen, Shai Mazor, Roee Litman
Affiliation | Amazon; Cornell University
Note | In CVPR 2020

[45]. Data-driven models and computational tools for neurolinguistics: a language technology perspective
Link | https://arxiv.org/abs/2003.10540
Authors | Ekaterina Artemova, Amir Bakarov, Aleksey Artemov, Evgeny Burnaev, Maxim Sharaev

March 24, 2020

[46]. Generating Natural Language Adversarial Examples on a Large Scale with Generative Models
Link | https://arxiv.org/abs/2003.10388
Authors | Yankun Ren, Jianbin Lin, Siliang Tang, Jun Zhou, Shuang Yang, Yuan Qi, Xiang Ren
Affiliation | Ant Financial Services Group; Zhejiang University; University of Southern California

[47]. Adaptive Name Entity Recognition under Highly Unbalanced Data
Link | https://arxiv.org/abs/2003.10296
Authors | Thong Nguyen, Duy Nguyen, Pramod Rao

[48]. PathVQA: 30000+ Questions for Medical Visual Question Answering
Link | https://arxiv.org/abs/2003.10286
Authors | Xuehai He, Yichen Zhang, Luntian Mou, Eric Xing, Pengtao Xie
Affiliation | University of California San Diego; Beijing University of Technology; Carnegie Mellon University

[49]. Fast Cross-domain Data Augmentation through Neural Sentence Editing
Link | https://arxiv.org/abs/2003.10254
Authors | Guillaume Raille, Sandra Djambazovska, Claudiu Musat

[50]. Unsupervised Word Polysemy Quantification with Multiresolution Grids of Contextual Embeddings
Link | https://arxiv.org/abs/2003.10224
Authors | Christos Xypolopoulos, Antoine J.-P. Tixier, Michalis Vazirgiannis

[51]. E2EET: From Pipeline to End-to-end Entity Typing via Transformer-Based Embeddings
Link | https://arxiv.org/abs/2003.10097
Authors | Michael Stewart, Wei Liu

[52]. Caption Generation of Robot Behaviors based on Unsupervised Learning of Action Segments
Link | https://arxiv.org/abs/2003.10066
Authors | Koichiro Yoshino, Kohei Wakimoto, Yuta Nishimura, Satoshi Nakamura
Note | To appear in IWSDS 2020

[53]. SAC: Accelerating and Structuring Self-Attention via Sparse Adaptive Connection
Link | https://arxiv.org/abs/2003.09833
Authors | Xiaoya Li, Yuxian Meng, Qinghong Han, Fei Wu, Jiwei Li
Affiliation | Shannon.AI; Zhejiang University

[54]. Prior Knowledge Driven Label Embedding for Slot Filling in Natural Language Understanding
Link | https://arxiv.org/abs/2003.07962
Authors | Su Zhu, Zijian Zhao, Rao Ma, Kai Yu
Affiliation | Shanghai Jiao Tong University
Note | 11 pages, 6 figures; accepted for IEEE/ACM Transactions on Audio, Speech, and Language Processing

[55]. A Joint Approach to Compound Splitting and Idiomatic Compound Detection
Link | https://arxiv.org/abs/2003.09606
Authors | Irina Krotova, Sergey Aksenov, Ekaterina Artemova
Note | 8 pages, 5 tables, 1 figure; accepted at LREC 2020

[56]. Analyzing Word Translation of Transformer Layers
Link | https://arxiv.org/abs/2003.09586
Authors | Hongfei Xu, Josef van Genabith, Deyi Xiong, Qiuhui Liu

[57]. A Framework for Generating Explanations from Temporal Personal Health Data
Link | https://arxiv.org/abs/2003.09530
Authors | Jonathan J. Harris, Ching-Hua Chen, Mohammed J. Zaki
Affiliation | Rensselaer Polytechnic Institute; IBM Research

[58]. TArC: Incrementally and Semi-Automatically Collecting a Tunisian Arabish Corpus
Link | https://arxiv.org/abs/2003.08529
Authors | Elisa Gugliotta, Marco Dinarelli
Note | Paper accepted at the Language Resources and Evaluation Conference (LREC) 2020

[59]. A Better Variant of Self-Critical Sequence Training
Link | https://arxiv.org/abs/2003.09971
Authors | Ruotian Luo
Affiliation | TTI-Chicago

[60]. Pairwise Multi-Class Document Classification for Semantic Relations between Wikipedia Articles
Link | https://arxiv.org/abs/2003.09881
Authors | Malte Ostendorff, Terry Ruas, Moritz Schubotz, Georg Rehm, Bela Gipp
Note | Accepted at ACM/IEEE Joint Conference on Digital Libraries (JCDL 2020)

[61]. Invariant Rationalization
Link | https://arxiv.org/abs/2003.09772
Authors | Shiyu Chang, Yang Zhang, Mo Yu, Tommi S. Jaakkola
Affiliation | MIT; IBM

March 23, 2020

[62]. FedNER: Privacy-preserving Medical Named Entity Recognition with Federated Learning
Link | https://arxiv.org/abs/2003.09288
Authors | Suyu Ge, Fangzhao Wu, Chuhan Wu, Tao Qi, Yongfeng Huang, Xing Xie
Affiliation | Tsinghua University; Microsoft Research Asia

[63]. Language Technology Programme for Icelandic 2019-2023
Link | https://arxiv.org/abs/2003.08717
Authors | Anna Björk Nikulásdóttir, Jón Guðnason, Anton Karl Ingason, Hrafn Loftsson, Eiríkur Rögnvaldsson, Einar Freyr Sigurðsson, Steinþór Steingrímsson
Note | Accepted at LREC 2020

[64]. Parallel Intent and Slot Prediction using MLB Fusion
Link | https://arxiv.org/abs/2003.09211
Authors | Anmol Bhasin, Bharatram Natarajan, Gaurav Mathur, Himanshu Mangla

[65]. TNT-KID: Transformer-based Neural Tagger for Keyword Identification
Link | https://arxiv.org/abs/2003.09166
Authors | Matej Martinc, Blaž Škrlj, Senja Pollak
Note | Submitted to the Natural Language Engineering journal

[66]. NSURL-2019 Task 7: Named Entity Recognition (NER) in Farsi
Link | https://arxiv.org/abs/2003.09029
Authors | Nasrin Taghizadeh, Zeinab Borhanifard, Melika GolestaniPour, Heshaam Faili

[67]. Techniques for Vocabulary Expansion in Hybrid Speech Recognition Systems
Link | https://arxiv.org/abs/2003.09024
Authors | Nikolay Malkovsky, Vladimir Bataev, Dmitrii Sviridkin, Natalia Kizhaeva, Aleksandr Laptev, Ildar Valiev, Oleg Petrov
Note | Submitted to Interspeech 2020

[68]. Learning to Encode Position for Transformer with Continuous Dynamical Model
Link | https://arxiv.org/abs/2003.09229
Authors | Xuanqing Liu, Hsiang-Fu Yu, Inderjit Dhillon, Cho-Jui Hsieh
Affiliation | UCLA; UT Austin; Amazon

[69]. Detecting Mismatch between Text Script and Voice-over Using Utterance Verification Based on Phoneme Recognition Ranking
Link | https://arxiv.org/abs/2003.09180
Authors | Yoonjae Jeong, Hoon-Young Cho
Note | Accepted by ICASSP 2020

[70]. Automatic Identification of Types of Alterations in Historical Manuscripts
Link | https://arxiv.org/abs/2003.09136
Authors | David Lassner, Anne Baillot, Sergej Dogadov, Klaus-Robert Müller, Shinichi Nakajima

[71]. The value of text for small business default prediction: A deep learning approach
Link | https://arxiv.org/abs/2003.08964
Authors | Matthew Stevenson, Christophe Mues, Cristián Bravo


For more on the latest advances, practical techniques, and tutorials in natural language processing, follow the WeChat official account “語言智能技術筆記簿”.