Stanford Parser Primer 1: Syntactic Parsing of a Single Chinese Sentence

Original post

2018-09-03 18:02

Development environment: Windows 10 + Java 8 (jdk-8u111) + stanford-parser-full-2015-12-09

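To run the official Stanford Parser Java demos in Eclipse, see the article "使用Stanford Parser進行句法分析" ("Syntactic parsing with Stanford Parser") in the references. Taking ParserDemo.java as an example: right-click ParserDemo.java in Eclipse and set the program Arguments of its run configuration to: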

edu/stanford/nlp/models/lexparser/chinesePCFG.ser.gz data/chinese-onesent-utf8.txt

With these arguments (the model path first, then the input file), the demo parses Chinese text.

What follows is a simple, self-contained example of Chinese syntactic parsing.

1. Download the latest parser package from the official Stanford site:

http://nlp.stanford.edu/software/lex-parser.html#Download

2. Unzip the downloaded archive, stanford-parser-full-2015-12-09.zip. It contains data, dependency jars, and demos, along with the source code and Javadoc.

3. Create an Eclipse project named stanfordparser and add stanford-parser-3.6.0-models.jar, stanford-parser.jar, slf4j-simple.jar, and slf4j-api.jar to its build path.
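Before writing any parsing code, it can help to confirm that the models jar really is visible on the build path. The following is a minimal sketch, not something shipped with the Stanford distribution: the class name `ModelOnClasspathCheck` and the helper `isOnClasspath` are made up for illustration, and the check relies only on the standard `ClassLoader` API. The resource path is the model path used later in this post.

```java
public class ModelOnClasspathCheck {

    // Hypothetical helper: true if the named resource can be found on the classpath.
    static boolean isOnClasspath(String resourcePath) {
        ClassLoader loader = ModelOnClasspathCheck.class.getClassLoader();
        return loader.getResource(resourcePath) != null;
    }

    public static void main(String[] args) {
        // Model path as used in this post; it should resolve inside the models jar.
        String model = "edu/stanford/nlp/models/lexparser/chinesePCFG.ser.gz";
        if (isOnClasspath(model)) {
            System.out.println("Found on classpath: " + model);
        } else {
            System.out.println("Not found, check the build path: " + model);
        }
    }
}
```

If the model is reported as not found, the models jar is probably missing from the build path, and loading the model by that path is unlikely to work either.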

4. Copy the data folder from the archive unzipped in step 2 into the Eclipse project, then create a new class, ParserTest1.java, with the following code:

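import java.io.IOException;

import edu.stanford.nlp.parser.lexparser.LexicalizedParser;
import edu.stanford.nlp.trees.Tree;

public class ParserTest1 {
    public static void main(String[] args) throws IOException {
        // Part 1: load the Chinese PCFG model and parse one sentence through the API.
        // The factored model can be used instead:
        // String grammar = "edu/stanford/nlp/models/lexparser/chineseFactored.ser.gz";
        String grammar = "edu/stanford/nlp/models/lexparser/chinesePCFG.ser.gz";
        String[] options = {};
        LexicalizedParser lp = LexicalizedParser.loadModel(grammar, options);

        // The input is already word-segmented: tokens are separated by spaces.
        String line = "這 是 一個 簡單 的 例子";
        Tree parse = lp.parse(line);
        parse.pennPrint(); // print the tree in Penn Treebank bracket format

        // Part 2: run the parser's command-line entry point on a file, printing
        // both the Penn tree and the collapsed typed dependencies.
        String[] arg2 = {
            "-encoding", "utf-8",
            "-outputFormat", "penn,typedDependenciesCollapsed",
            "edu/stanford/nlp/models/lexparser/chineseFactored.ser.gz",
            "data/chinese-onesent-utf8.txt"
        };
        LexicalizedParser.main(arg2);
    }
}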
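Note that the sentence passed to `lp.parse` above is already split into words, with one space between tokens; this appears to be the input form these models expect. The sketch below only illustrates that format using the standard library. The class `SegmentedInput` and its helpers `tokens` and `join` are hypothetical names, and no actual word segmentation is performed here.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class SegmentedInput {

    // Splits an already-segmented sentence into words on runs of whitespace.
    static List<String> tokens(String line) {
        String trimmed = line.trim();
        if (trimmed.isEmpty()) {
            return Collections.emptyList();
        }
        return Arrays.asList(trimmed.split("\\s+"));
    }

    // Joins words back into the space-separated form used as parser input.
    static String join(List<String> words) {
        return String.join(" ", words);
    }

    public static void main(String[] args) {
        List<String> words = tokens("這 是 一個 簡單 的 例子");
        System.out.println(words.size() + " words: " + words);
        System.out.println(join(words));
    }
}
```

Counting tokens this way is a quick sanity check that a sentence was segmented as intended before it is handed to the parser.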

5. Run it. The output is:

[main] INFO edu.stanford.nlp.parser.lexparser.LexicalizedParser - Loading parser from serialized file edu/stanford/nlp/models/lexparser/chinesePCFG.ser.gz ...
done [1.6 sec].
(ROOT
  (IP
    (NP (PN 這))
    (VP (VC 是)
      (NP
        (QP (CD 一個))
        (CP
          (IP
            (VP (VA 簡單)))
          (DEC 的))
        (NP (NN 例子))))))

[main] INFO edu.stanford.nlp.parser.lexparser.LexicalizedParser - Loading parser from serialized file edu/stanford/nlp/models/lexparser/chineseFactored.ser.gz ...
done [6.0 sec].
Parsing file: data/chinese-onesent-utf8.txt
Parsing [sent. 1 len. 8]: 俄國 希望 伊朗 沒有 製造 核武器 計劃 。
(ROOT
  (IP
    (NP (NR 俄國))
    (VP (VV 希望)
      (IP
        (NP (NR 伊朗))
        (VP
          (ADVP (AD 沒有))
          (VP (VV 製造)
            (NP (NN 核武器) (NN 計劃))))))
    (PU 。)))

nsubj(希望-2, 俄國-1)
root(ROOT-0, 希望-2)
nsubj(製造-5, 伊朗-3)
neg(製造-5, 沒有-4)
ccomp(希望-2, 製造-5)
nn(計劃-7, 核武器-6)
dobj(製造-5, 計劃-7)

Parsed file: data/chinese-onesent-utf8.txt [1 sentences].
Parsed 8 words in 1 sentences (22.66 wds/sec; 2.83 sents/sec).
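Each typed dependency line in the output has the shape relation(governor-index, dependent-index). To post-process this text output, one option is a small regular-expression parser like the sketch below. It is an illustration only and not part of the Stanford API: the class `DependencyLine`, its fields, and the pattern are assumptions that cover just the line shape shown in this post.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DependencyLine {
    // Matches relation(governor-index, dependent-index), with or without a space after the comma.
    private static final Pattern DEP =
            Pattern.compile("^([^\\s(]+)\\((.+)-(\\d+),\\s*(.+)-(\\d+)\\)$");

    final String relation;
    final String governor;
    final int governorIndex;
    final String dependent;
    final int dependentIndex;

    private DependencyLine(String relation, String governor, int governorIndex,
                           String dependent, int dependentIndex) {
        this.relation = relation;
        this.governor = governor;
        this.governorIndex = governorIndex;
        this.dependent = dependent;
        this.dependentIndex = dependentIndex;
    }

    // Parses one output line; returns null if the line is not a dependency line.
    static DependencyLine parse(String line) {
        Matcher m = DEP.matcher(line.trim());
        if (!m.matches()) {
            return null;
        }
        return new DependencyLine(m.group(1),
                m.group(2), Integer.parseInt(m.group(3)),
                m.group(4), Integer.parseInt(m.group(5)));
    }

    public static void main(String[] args) {
        String[] lines = {
            "nsubj(希望-2, 俄國-1)",
            "root(ROOT-0, 希望-2)",
            "dobj(製造-5, 計劃-7)"
        };
        for (String line : lines) {
            DependencyLine d = parse(line);
            System.out.println(d.dependent + " --" + d.relation + "--> " + d.governor);
        }
    }
}
```

Lines that are not dependencies, such as the tree brackets or the timing summary, do not match the pattern and yield null, so the whole console output can be fed through `parse` line by line.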

References:

stanford parser使用說明 (Stanford Parser usage notes)

http://blog.csdn.net/u010454729/article/details/46845403

使用Stanford Parser進行句法分析 (Syntactic parsing with Stanford Parser)

http://www.cnblogs.com/Denise-hzf/p/6612574.html
