Sentiment Analysis

The definitions below are taken from Section 2.1 of Sentiment Analysis and Opinion Mining.
Definition (Opinion): An opinion is a quadruple,
(g, s, h, t),
where g is the opinion (or sentiment) target, which can be any entity or aspect of an entity; s is the sentiment about the target; h is the opinion holder (or opinion source); and t is the time when the opinion was expressed.
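
As a minimal sketch of how the quadruple could be represented in code, the following uses a Python dataclass. The class name `Opinion`, the field names, and the example values are assumptions made for illustration; they do not come from the book.

```python
from dataclasses import astuple, dataclass
from datetime import date


@dataclass(frozen=True)
class Opinion:
    """One opinion quadruple (g, s, h, t)."""
    target: str     # g: the entity or aspect the opinion is about
    sentiment: str  # s: the sentiment about the target
    holder: str     # h: the opinion holder or source
    time: date      # t: when the opinion was expressed


# Hypothetical example values, not taken from any dataset.
example = Opinion(
    target="laptop/battery",
    sentiment="positive",
    holder="reviewer_42",
    time=date(2015, 1, 1),
)

# The instance unpacks back into the (g, s, h, t) order of the definition.
g, s, h, t = astuple(example)
```

Making the dataclass frozen keeps each quadruple immutable and hashable, which is convenient when opinions are later grouped or deduplicated.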

Definition (entity): An entity e is a product, service, topic, issue, person, organization, or event. It is described with a pair, e: (T, W), where T is a hierarchy of parts, sub-parts, and so on, and W is a set of attributes of e. Each part or sub-part also has its own set of attributes.
We simplify the hierarchy to two levels and use the term aspects to denote both parts and attributes. In the simplified tree, the root node is still the entity itself, but the second-level (also the leaf-level) nodes are the different aspects of the entity.
Example (from http://alt.qcri.org/semeval2015/task12/):
(1) It fires up in the morning in less than 30 seconds and I have never had any issues with it freezing. → {LAPTOP#OPERATION_PERFORMANCE}
(2) Sometimes you will be moving your finger and the pointer will not even move. → {MOUSE#OPERATION_PERFORMANCE}
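
The labels in these examples have the form ENTITY#ATTRIBUTE, which maps directly onto the simplified two-level tree. A minimal sketch of grouping such labels by entity follows; the function names `parse_category` and `build_aspect_tree` are hypothetical and are not part of any SemEval tooling.

```python
from collections import defaultdict


def parse_category(label):
    """Split a label such as '{LAPTOP#OPERATION_PERFORMANCE}' into (entity, aspect)."""
    entity, aspect = label.strip("{}").split("#", 1)
    return entity, aspect


def build_aspect_tree(labels):
    """Build the two-level tree: each entity (root) maps to its set of aspects (leaves)."""
    tree = defaultdict(set)
    for label in labels:
        entity, aspect = parse_category(label)
        tree[entity].add(aspect)
    return dict(tree)


labels = ["{LAPTOP#OPERATION_PERFORMANCE}", "{MOUSE#OPERATION_PERFORMANCE}"]
tree = build_aspect_tree(labels)
```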

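The task includes problems such as entity extraction, clustering, and ranking.
Methods that can be used for extraction:
1. Rule-based extraction, which can rely on the relationship between sentiment words and entities
2. Sequence models
3. Topic models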
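
To make the rule-based option concrete, here is a deliberately simplified sketch. It assumes the sentence is already tokenized and POS-tagged (Penn Treebank tags) and uses a "nearest noun within a window" rule as a stand-in for real syntactic relations. The function name, the `window` parameter, and the toy sentence are all assumptions for illustration.

```python
def extract_aspect_candidates(tagged_tokens, sentiment_words, window=3):
    """Pair each sentiment word with the nearest noun within `window` tokens.

    tagged_tokens: list of (token, pos_tag) pairs, assumed to be pre-tagged.
    sentiment_words: set of lowercase sentiment words.
    Returns a list of (aspect_candidate, sentiment_word) pairs.
    """
    pairs = []
    for i, (token, _) in enumerate(tagged_tokens):
        if token.lower() not in sentiment_words:
            continue
        best = None  # index of the closest noun found so far
        for j, (_, pos) in enumerate(tagged_tokens):
            if j == i or not pos.startswith("NN") or abs(j - i) > window:
                continue
            if best is None or abs(j - i) < abs(best - i):
                best = j
        if best is not None:
            pairs.append((tagged_tokens[best][0], token))
    return pairs


# Toy sentence with hand-assigned tags (hypothetical input).
tagged = [
    ("The", "DT"), ("screen", "NN"), ("is", "VBZ"), ("great", "JJ"),
    ("but", "CC"), ("the", "DT"), ("battery", "NN"), ("is", "VBZ"),
    ("terrible", "JJ"),
]
pairs = extract_aspect_candidates(tagged, {"great", "terrible"})
```

A distance-based rule like this is easy to fool on longer sentences, which is one reason to work from dependency relations instead.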

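Use the Stanford Parser to analyze dependency relations, then design syntactic rules to extract the expressions that modify each aspect (the aspects are already annotated in the training set), and then train an SVM. The features used are:
1. POS: the part of speech of each word
2. The syntactic relations mentioned above
3. The polarity of sentiment words. A sentiment lexicon was built, including SentiWordNet, MPQA, and eBLR (because the polarity of some sentiment words is domain-dependent, a corpus-based method is used: if a word appears in the training set only as positive and its frequency exceeds a threshold, it is added to the positive list; the negative list is built the same way; for a word that appears as both positive and negative, it is treated as positive if its positive frequency is higher than its negative frequency)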
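
The corpus-based lexicon rule in item 3 can be sketched as follows. The input format (a list of `(word, label)` pairs), the function name, and the `min_freq` parameter are assumptions. The text only states the mixed case for positive words, so treating the negative-majority case symmetrically and skipping ties are also assumptions.

```python
from collections import Counter


def build_polarity_lexicon(labeled_words, min_freq=2):
    """Build positive and negative word lists from labeled training occurrences.

    labeled_words: iterable of (word, label), label is "positive" or "negative".
    min_freq: a single-polarity word needs a frequency strictly above this value.
    """
    pos_counts, neg_counts = Counter(), Counter()
    for word, label in labeled_words:
        if label == "positive":
            pos_counts[word] += 1
        elif label == "negative":
            neg_counts[word] += 1

    positive, negative = set(), set()
    for word in set(pos_counts) | set(neg_counts):
        p, n = pos_counts[word], neg_counts[word]
        if n == 0:
            # Seen only as positive: keep it if it is frequent enough.
            if p > min_freq:
                positive.add(word)
        elif p == 0:
            # Seen only as negative: same threshold.
            if n > min_freq:
                negative.add(word)
        elif p > n:
            positive.add(word)  # mixed, positive majority
        elif n > p:
            negative.add(word)  # mixed, negative majority (assumed symmetric)
        # Ties are left out of both lists (assumption).
    return positive, negative


# Hypothetical training occurrences.
samples = (
    [("fast", "positive")] * 3
    + [("slow", "negative")] * 3
    + [("small", "positive")] * 2
    + [("small", "negative")]
    + [("rare", "positive")]
)
positive, negative = build_polarity_lexicon(samples, min_freq=2)
```

Here "rare" is dropped because it appears only once, while "small" ends up positive because its positive count is higher than its negative count.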
