[Paper Notes] Enhancing Pre-Trained Language Representations with Rich Knowledge for MRC

Original post

changreal

2020-02-21 05:55

KT-NET: Knowledge and Text fusion NET

KBs: WordNet + NELL; distributed representations of KBs (KB embeddings).

WordNet: records lexical relations, e.g. (organism, hypernym of, animal)

NELL: stores beliefs about entities, e.g. (Coca Cola, headquartered in, Atlanta)

Datasets: ReCoRD, SQuAD1.1

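Differences from other models that use extra knowledge (e.g. Kn-Reader)

KT-NET first learns embeddings of the KB concepts, then retrieves the relevant learned KB embeddings and integrates them into the MRC system (that is, the structured knowledge and the context are fused together). The relevant KB knowledge used this way is captured globally, which is more useful for an MRC system.

Earlier KB-based models first retrieve the relevant KB knowledge and only then encode it and integrate it into the MRC system, so the relevant KB knowledge is captured only locally.

Evaluation metrics: EM, F1, EM+F1 score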
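As a reference for how EM and F1 are commonly computed on extractive MRC datasets, here is a minimal sketch in the style of the SQuAD evaluation script. The function names (`normalize`, `exact_match`, `f1_score`) and the exact normalization steps are assumptions for illustration, not the paper's evaluation code.

```python
import re
import string
from collections import Counter


def normalize(text):
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, gold):
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(gold))


def f1_score(prediction, gold):
    """Token-level F1 between the normalized prediction and gold answer."""
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(gold).split()
    if not pred_tokens or not gold_tokens:
        # Two empty answers count as a match; one empty answer is a miss.
        return float(pred_tokens == gold_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


em = exact_match("The Atlanta.", "atlanta")      # identical after normalization
f1 = f1_score("Atlanta, Georgia", "Atlanta")     # precision 0.5, recall 1.0
```

In practice both scores are averaged over all questions, taking the maximum over the gold answers when a question has several.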

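The knowledge-based models and papers cited by this paper are all worth reading.

Contributions

1. Pre-trained LMs + knowledge: a potential direction for future research, enhancing advanced LMs with knowledge from KBs.

2. Designs KT-NET for MRC.

Effect of BERT with KBs

Example taken from ReCoRD (2018): after introducing knowledge from WordNet and NELL, CST accuracy improves.

Real-world entities, synsets, concepts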

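The KT-NET model

Model overview

① First, learn embeddings of the two KBs;

② Retrieve the potentially relevant KB embeddings;

③ Encode: fuse the selected embeddings with BERT's hidden states;

④ Make context- and knowledge-aware predictions.

To encode knowledge, knowledge graph embedding techniques are used to learn vector representations of the KB concepts.

Given P and Q, a set of relevant KB concepts C(w) is retrieved for every token w (w ∈ P ∪ Q), where each concept c ∈ C(w) comes with a learned vector embedding c. These pre-trained KB embeddings are then fed into the 4 major components.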

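Then, in turn:

BERT encoding layer: computes deep, context-aware representations of the question and the passage;
Knowledge integration layer: makes the representations not only context-aware but also knowledge-aware. An attention mechanism selects the most relevant KB embeddings from the KB memory, and these are then integrated with the BERT-encoded representations;
Self-matching layer: fuses the BERT and KB representations, enabling richer interactions between them;
Output layer: makes knowledge-aware predictions.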
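The knowledge integration step can be sketched for a single token as attention over its candidate KB embeddings, followed by concatenation with the BERT vector. This is a simplified sketch under stated assumptions: it uses plain dot-product attention and assumes the BERT vector and the KB embeddings have the same dimension. The paper's exact scoring function is not reproduced here, and all names and numbers are made up.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def integrate_knowledge(h, candidates):
    """Fuse one token's BERT vector h with its candidate KB embeddings.

    Returns (fused_vector, attention_weights), where fused_vector is the
    concatenation [h ; attention-weighted sum of the candidates].
    """
    weights = softmax([dot(h, c) for c in candidates])
    dim = len(candidates[0])
    knowledge = [
        sum(w * c[k] for w, c in zip(weights, candidates)) for k in range(dim)
    ]
    return list(h) + knowledge, weights  # list concatenation, not addition


# Toy example: one token with three candidate concepts.
h = [1.0, 0.0]
candidates = [[2.0, 0.0], [0.0, 2.0], [-2.0, 0.0]]
fused, weights = integrate_knowledge(h, candidates)
```

The first candidate points in the same direction as `h`, so it receives the largest attention weight and dominates the knowledge half of `fused`.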

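Details

In the two KBs used, knowledge is stored as triples: (subject, relation, object).

Knowledge embedding

Given a triple (s, r, o), learn vector embeddings of the subject s, relation r, and object o.

The BILINEAR model is then used to score the triple: $f(s, r, o) = \mathbf{s}^{\top} \mathrm{diag}(\mathbf{r})\, \mathbf{o}$.

Triples that are already in the KB should receive higher validity scores. A margin-based ranking loss is then used to learn the embeddings, which yields a vector representation for every entity in the two KBs.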
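The scoring function and the loss can be sketched in a few lines. Because diag(r) is diagonal, the bilinear form reduces to a sum of three-way products. The margin value and the choice of corrupting the object to build a negative triple are assumptions for illustration; the notes do not specify how negatives are sampled.

```python
def bilinear_score(s, r, o):
    """f(s, r, o) = s^T diag(r) o, i.e. sum_k s[k] * r[k] * o[k]."""
    return sum(sk * rk * ok for sk, rk, ok in zip(s, r, o))


def margin_ranking_loss(pos_score, neg_score, margin=1.0):
    """Hinge loss: zero once the true triple beats the corrupted one by `margin`."""
    return max(0.0, margin - pos_score + neg_score)


# Toy embeddings (made-up numbers).
s = [1.0, 2.0]
r = [0.5, 1.0]
o = [2.0, 1.0]          # object of a triple that is in the KB
o_corrupt = [1.0, 1.5]  # object of a corrupted (negative) triple

pos = bilinear_score(s, r, o)          # 1*0.5*2 + 2*1*1
neg = bilinear_score(s, r, o_corrupt)  # 1*0.5*1 + 2*1*1.5
loss = margin_ranking_loss(pos, neg)   # positive here, since neg > pos - margin
```

Training would minimize this loss over many (true, corrupted) pairs, which pushes true triples toward higher scores.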

Retrieval

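In WordNet, the synsets of a word are returned as candidates;

In NELL, the named entities in P and Q are first recognized, the recognized entities are linked to NELL entities by string matching, and the related NELL concepts are then collected as candidates. This gives a set of potentially relevant concepts.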
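The two retrieval paths can be illustrated with toy lookup tables. Everything below is hypothetical: the table entries are made up, the function names are not from the paper, and the named-entity recognition step is skipped, so every token span is tried against the NELL table directly.

```python
# Hypothetical toy lookup tables; real WordNet / NELL data is far larger.
TOY_SYNSETS = {
    "bank": ["bank.n.01", "bank.n.02"],
    "ban": ["ban.v.01"],
}
TOY_NELL = {
    "coca cola": ["concept:company"],
    "atlanta": ["concept:city"],
}


def retrieve_wordnet(token):
    """Return the candidate synsets of a single word (empty if unknown)."""
    return TOY_SYNSETS.get(token.lower(), [])


def retrieve_nell(tokens, max_len=3):
    """Link token spans to toy NELL entities by exact string matching.

    Returns a list of (start, end, concepts) with `end` exclusive, preferring
    the longest matching span at each position.
    """
    matches = []
    i = 0
    while i < len(tokens):
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            span = " ".join(tokens[i:i + n]).lower()
            if span in TOY_NELL:
                matches.append((i, i + n, TOY_NELL[span]))
                i += n
                break
        else:
            i += 1  # no span starting here matched
    return matches


tokens = "Coca Cola is headquartered in Atlanta".split()
nell_candidates = retrieve_nell(tokens)
wordnet_candidates = retrieve_wordnet("Bank")
```

Each retrieved concept would then be mapped to its pre-trained KB embedding before being passed on to the model.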

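As shown in the figure: for each passage/question token, the 3 most relevant KB concepts are listed (selected by attention).

4 components

Experiments

Preprocessing: text is tokenized with BERT's BasicTokenizer and NLTK is used to look up synonyms (synsets); the FullTokenizer built into BERT is also used to segment words into wordpieces.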
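For readers unfamiliar with wordpieces, the splitting step can be sketched as a greedy longest-match-first search over a vocabulary, which is the general idea behind WordPiece tokenization. The toy vocabulary and the function name `wordpiece` are assumptions; this is not BERT's actual tokenizer code.

```python
def wordpiece(word, vocab, unk="[UNK]"):
    """Split one word into wordpieces by greedy longest-match-first search."""
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        piece = None
        while start < end:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # marks a word-internal piece
            if candidate in vocab:
                piece = candidate
                break
            end -= 1
        if piece is None:
            return [unk]  # no piece matches, so the whole word is unknown
        pieces.append(piece)
        start = end
    return pieces


# Toy vocabulary (made up for this example).
vocab = {"un", "##aff", "##able", "play", "##ing"}
pieces_a = wordpiece("unaffable", vocab)
pieces_b = wordpiece("playing", vocab)
pieces_c = wordpiece("xyz", vocab)
```

Because KB concepts are retrieved per word while BERT operates on wordpieces, the two segmentations have to be aligned somewhere in the pipeline.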

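Consider the words in all sentences (nouns, verbs, adjectives, adverbs). For each word s_i, take its final hidden-layer representation, then compute the cosine similarity between each question word s_i and each passage word s_j.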
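The similarity computation itself is simple. Below is a minimal sketch that builds the question-by-passage cosine similarity matrix from word vectors; the vectors are made-up stand-ins for final-layer hidden states, and the function names are assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def similarity_matrix(question_vecs, passage_vecs):
    """Entry [i][j] is the cosine similarity of question word i and passage word j."""
    return [[cosine(q, p) for p in passage_vecs] for q in question_vecs]


# Toy vectors: 2 question words, 2 passage words.
question_vecs = [[1.0, 0.0], [1.0, 1.0]]
passage_vecs = [[2.0, 0.0], [0.0, 3.0]]
sim = similarity_matrix(question_vecs, passage_vecs)
```

Rows that look nearly identical mean the question words are hard to tell apart by their representations, which is the effect the case study examines.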

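After fine-tuning on the MRC task, BERT learns very similar representations for the question words. Once knowledge is integrated, however, different question words show different similarities to the words of the passage, and these similarities reflect well the relations encoded in the KBs.

KT-NET can therefore learn more accurate representations, and so achieves better question-passage matching.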

Techniques mentioned

Knowledge graph embedding techniques (Yang et al., 2015): used to encode knowledge and learn vector representations of KB concepts;

Element-wise multiplication;

Row-wise softmax;

BILINEAR model (Yang et al., 2015): measures the validity of a triple with a bilinear function f(s, r, o), and uses a margin-based ranking loss to learn the embeddings;
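For reference, the two elementary operations in this list can be written out as follows. This is a plain-Python sketch with made-up inputs, meant only to pin down what "element-wise" and "row-wise" mean here; the function names are assumptions.

```python
import math


def elementwise_multiply(u, v):
    """Multiply two equal-length vectors position by position."""
    return [a * b for a, b in zip(u, v)]


def row_softmax(matrix):
    """Apply a numerically stable softmax to each row independently."""
    result = []
    for row in matrix:
        m = max(row)
        exps = [math.exp(x - m) for x in row]
        total = sum(exps)
        result.append([e / total for e in exps])
    return result


product = elementwise_multiply([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
probs = row_softmax([[0.0, 0.0], [1.0, 3.0]])
```

After `row_softmax`, every row sums to 1, so each row can be read as attention weights of one item over all the others.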

Datasets that require external knowledge

ReCoRD: extractive MRC dataset

ARC, MCScript, OpenBookQA, CommonsenseQA: multi-choice MRC datasets

Structured knowledge from KBs: a series of papers (see the paper)

Some of the papers mentioned

(Bishan Yang and Tom Mitchell, 2017) Leveraging knowledge bases in LSTMs for improving machine reading;

(2018) Commonsense for generative multi-hop question answering tasks;

[Read] (2018) Knowledgeable reader: Enhancing cloze-style reading comprehension with external commonsense knowledge;

(2018, commonsense reasoning) Bridging the gap between human and machine commonsense reading comprehension
