update time: 11-5
Relation definition:
A relation is defined in the form of a tuple t = (e1, e2, ..., en), where the ei are entities in a predefined relation r within document D.
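This shows that the definition of a relation is very broad: the entities only need to appear in the document, and there is no specific requirement on their number or length. (This view comes from a single paper, A Review of Relation Extraction. If such a relation spans the whole document, other papers would likely call the task event extraction/event detection.)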
Relation extraction type:
number of entity:
- binary relation: two entities with one relation
- e.g. located-in(CMU, Pittsburgh)
- higher-order relation: more than two entities with one relation
- e.g.
Sentence: At codons 12, the occurrence of point mutations from G to T were observed
Relation: point mutation(codon, 12, G, T), 4 entities
Paper 2005: Simple algorithms for complex relation extraction with applications to biomedical IE
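The two examples above can be written down as a small data structure. This is only a minimal sketch for illustration: the class name `Relation` and its fields are hypothetical and do not come from any of the cited papers.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Relation:
    """A relation instance: a predefined relation name plus its entity tuple."""
    name: str                  # the predefined relation, e.g. "located-in"
    entities: Tuple[str, ...]  # the entity tuple (e1, ..., en)

    @property
    def arity(self) -> int:
        """Number of entities taking part in the relation."""
        return len(self.entities)

    @property
    def is_binary(self) -> bool:
        """True for binary relations, False for higher-order ones."""
        return self.arity == 2


# The two examples from the notes above.
located_in = Relation("located-in", ("CMU", "Pittsburgh"))
point_mutation = Relation("point mutation", ("codon", "12", "G", "T"))

print(located_in.is_binary, point_mutation.arity)
```

Storing the entities as a tuple keeps binary and higher-order relations in one type, so the distinction is just the arity.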
range of extraction:
- sentence-level
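- Most papers work at the sentence level, because it has the most datasets, the simplest problem setting, and the widest range of workable methods. The work mainly focuses on improving methods.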
- document-level (to be revised)
- Progress here mainly comes from improved datasets; this setting is a major challenge for methods designed for the sentence level.
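Sentence-level relation extraction mainstream methods
Before deep learning became prevalent, and judging from the related-work sections of other papers, methods can be broadly divided into feature-based and kernel-based. These notes currently focus mainly on deep learning methods, so for the feature-based and kernel-based methods no paper notes or brief descriptions are provided, only links.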
deep learning methods:
- RNN based method
- LSTM 1997
- LSTM-based Context Aware Encoder 2017
- CNN based methods
- CNN with position features & lexical features & sentence features. 2014
Paper: Relation Classification via Convolutional Deep Neural Network
Paper Notes
- Piece-wise CNN with Multi-instance Learning 2015
Paper: Distant Supervision for Relation Extraction via Piecewise Convolutional Neural Networks
Paper Notes
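Two ideas named in the CNN entries above, position features and piecewise pooling, can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the papers' implementation: the function names are hypothetical, the feature map is assumed to hold one value per token, and the exact segment boundaries and the 0.0 fallback for an empty segment are choices made here.

```python
from typing import List, Sequence


def relative_positions(num_tokens: int, entity_index: int) -> List[int]:
    """Signed distance of every token to one entity (a position feature)."""
    return [i - entity_index for i in range(num_tokens)]


def piecewise_max_pool(feature_map: Sequence[float], e1: int, e2: int) -> List[float]:
    """Max-pool one feature map over three segments split at the two entity positions."""
    first, second = sorted((e1, e2))
    segments = [
        feature_map[: first + 1],             # up to and including the first entity
        feature_map[first + 1 : second + 1],  # up to and including the second entity
        feature_map[second + 1 :],            # the rest of the sentence
    ]
    # Assumption: an empty segment (entity at the sentence end) pools to 0.0.
    return [max(seg) if seg else 0.0 for seg in segments]


tokens = "CMU is located in Pittsburgh".split()
dist_to_e1 = relative_positions(len(tokens), 0)  # distances to "CMU"
dist_to_e2 = relative_positions(len(tokens), 4)  # distances to "Pittsburgh"

# A made-up feature map with one activation per token, pooled around positions 1 and 3.
pooled = piecewise_max_pool([0.1, 0.7, 0.3, 0.9, 0.2], e1=1, e2=3)
```

Compared with a single max over the whole sentence, pooling per segment keeps some information about where a strong activation sits relative to the two entities.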
Feature based method:
Feature to extract
- (1) the entities themselves
- (2) the types of the two entities
- (3) word sequence between the entities
- (4) number of words between the entities
- (5) path in the parse tree containing the two entities.
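Features (1) to (4) can be computed directly from a tokenized sentence; a minimal sketch follows. The function name, the span convention (end-exclusive token indices), and the entity type labels "ORG" and "LOC" are assumptions made for this example. Feature (5) needs a syntactic parser and is left out.

```python
from typing import Dict, List, Tuple

Span = Tuple[int, int]  # (start, end) token indices, end exclusive


def extract_features(tokens: List[str], e1: Span, e2: Span,
                     type1: str, type2: str) -> Dict[str, object]:
    """Build features (1)-(4) for one candidate entity pair."""
    first, second = sorted((e1, e2))         # order the spans by position
    between = tokens[first[1]:second[0]]     # words strictly between the entities
    return {
        "entity1": " ".join(tokens[e1[0]:e1[1]]),  # (1) the entities themselves
        "entity2": " ".join(tokens[e2[0]:e2[1]]),
        "type1": type1,                            # (2) the entity types
        "type2": type2,
        "words_between": between,                  # (3) word sequence in between
        "num_words_between": len(between),         # (4) number of words in between
    }


tokens = "CMU is located in Pittsburgh".split()
features = extract_features(tokens, (0, 1), (4, 5), "ORG", "LOC")
print(features)
```

A dictionary like this can then be vectorized and passed to a classifier such as the log-linear models or SVMs listed below.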
typical methods
- log-linear model for entity classification:
paper 2004: Combining lexical, syntactic, and semantic features with maximum entropy models for extracting relations
- SVMs using polynomial and linear kernels
paper 2005: Exploring various knowledge in relation extraction
paper 2005: Extracting relations with integrated information using kernel methods
Kernel-based method:
- single kernel & string kernels
Paper 2002: Text classification using string kernels
- Bag of features kernel (3 subkernels defined)
Paper 2005: Subsequence kernels for relation extraction
- Tree Kernels
Paper 2003: Kernel methods for relation extraction
Paper 2004: Dependency tree kernels for relation extraction.
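To make the string-kernel idea concrete, here is a brute-force sketch of a gap-weighted subsequence kernel of the kind used in the string kernel paper: every common subsequence of length n contributes lam raised to the total span it covers in both sequences. The function name is hypothetical, and the exhaustive enumeration is only practical for very short inputs; the papers use dynamic programming instead.

```python
from itertools import combinations
from typing import Sequence


def subsequence_kernel(s: Sequence, t: Sequence, n: int, lam: float) -> float:
    """Gap-weighted count of common length-n subsequences (brute force, n >= 1)."""
    total = 0.0
    for i in combinations(range(len(s)), n):      # index tuples into s
        u = [s[k] for k in i]                     # the subsequence picked from s
        span_s = i[-1] - i[0] + 1                 # length it spans in s, gaps included
        for j in combinations(range(len(t)), n):  # index tuples into t
            if [t[k] for k in j] == u:
                span_t = j[-1] - j[0] + 1
                total += lam ** (span_s + span_t)  # longer spans are penalized
    return total


# Character-level example: "cat" and "car" share only the subsequence "ca".
k_cat_car = subsequence_kernel("cat", "car", n=2, lam=0.5)

# The same function works on word sequences, as in relation extraction.
k_words = subsequence_kernel("CMU is located in Pittsburgh".split(),
                             "MIT is located in Cambridge".split(), n=3, lam=0.5)
```

With a decay factor below 1, contiguous matches weigh more than matches with gaps.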
Relation Extraction Learning Paradigm
distant supervision
No manually annotated data is needed. An existing general-domain relation dataset supplies the supervision signal for training and testing.
Paper 2009: Distant supervision for relation extraction without labeled data
Paper Notes
Paper: Distant Supervision for Relation Extraction with Sentence-Level Attention and Entity Descriptions
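The labeling step of distant supervision can be sketched as follows: any sentence that mentions both entities of a known relation pair is labeled with that relation. The function name, the toy knowledge base, and the example sentences are hypothetical. Plain substring matching stands in for real entity linking.

```python
from typing import Dict, List, Tuple

KnowledgeBase = Dict[Tuple[str, str], str]  # (entity1, entity2) -> relation name


def distant_label(sentences: List[str],
                  kb: KnowledgeBase) -> List[Tuple[str, str, str, str]]:
    """Label each sentence that mentions both entities of a known pair."""
    labeled = []
    for sentence in sentences:
        for (e1, e2), relation in kb.items():
            # Simplification: substring match instead of entity linking.
            if e1 in sentence and e2 in sentence:
                labeled.append((sentence, e1, e2, relation))
    return labeled


kb = {("CMU", "Pittsburgh"): "located-in"}
sentences = [
    "CMU is located in Pittsburgh.",
    "CMU researchers visited museums in Pittsburgh.",  # noisy: not really located-in
    "Pittsburgh has many bridges.",
]
training_data = distant_label(sentences, kb)
```

The second sentence receives the label even though it does not express the relation. This kind of noise is what multi-instance learning and sentence-level attention, as in the papers above, try to handle.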
semi-supervision
- Bootstrap method
- DIPRE(Dual Iterative Pattern Relation Expansion)
Paper 1998: Extracting patterns and relations from the world wide web.
- Snowball
Paper 2000: Snowball: Extracting relations from large plain-text collections
- KnowItAll
Paper 2005: Unsupervised Named-Entity Extraction from the Web: An Experimental Study
- TextRunner
Paper 2007: Open information extraction from the web.
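The bootstrapping loop behind DIPRE-style methods alternates between learning patterns from known pairs and applying those patterns to find new pairs. The sketch below is a heavy simplification and not the algorithm from any of the papers: a pattern is only the text between the two entities, candidate entities are single capitalized words, and all function names and the toy corpus are hypothetical.

```python
import re
from typing import Iterable, List, Set, Tuple

Pair = Tuple[str, str]


def learn_patterns(corpus: List[str], pairs: Set[Pair]) -> Set[str]:
    """Collect the text found between the two entities of every known pair."""
    patterns = set()
    for sentence in corpus:
        for e1, e2 in pairs:
            match = re.search(re.escape(e1) + r"(.+?)" + re.escape(e2), sentence)
            if match:
                patterns.add(match.group(1))
    return patterns


def apply_patterns(corpus: List[str], patterns: Set[str]) -> Set[Pair]:
    """Find capitalized-word pairs joined by a learned middle context."""
    found = set()
    for sentence in corpus:
        for middle in patterns:
            regex = r"([A-Z]\w+)" + re.escape(middle) + r"([A-Z]\w+)"
            found.update(re.findall(regex, sentence))
    return found


def bootstrap(corpus: List[str], seeds: Iterable[Pair], iterations: int = 2) -> Set[Pair]:
    """Grow the set of pairs by alternating pattern learning and matching."""
    pairs = set(seeds)
    for _ in range(iterations):
        patterns = learn_patterns(corpus, pairs)
        pairs |= apply_patterns(corpus, patterns)
    return pairs


corpus = [
    "CMU is located in Pittsburgh.",
    "MIT is located in Cambridge.",
    "Stanford sits near Palo Alto.",
]
result = bootstrap(corpus, {("CMU", "Pittsburgh")})
```

Starting from one seed pair, the loop learns the context " is located in " and uses it to pick up the second pair. The third sentence matches no learned pattern and is ignored.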
supervision
The kernel-based and feature-based methods described above
- deep learning method papers whose titles do not mention distant supervision.
unsupervision (to be revised)
This is outside the current research scope, so it is not covered for now.