- 1. Installation
pip install pattern
- 2. Features
Web crawling + natural language processing + graphs (if I have understood correctly)
- 3. Natural language processing
Six languages are covered: en | es | de | fr | it | nl
These notes focus on English
(1) Parser
TAG, CHUNK (chunking), ROLE (role labeling), POS (part-of-speech tagging)
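To make the tag types above concrete, here is a minimal standard-library sketch that reads slash-delimited tagged text. It assumes each token has the form word/POS/chunk with IOB-style chunk tags, which is a simplification. The sample string is hand-written, not real parser output, and the names `Token`, `split_tagged`, and `noun_phrases` are hypothetical helpers, not part of Pattern.

```python
from typing import List, NamedTuple


class Token(NamedTuple):
    word: str
    pos: str    # part-of-speech tag, e.g. NN, VBD
    chunk: str  # IOB-style chunk tag, e.g. B-NP, I-NP, O


def split_tagged(tagged: str) -> List[Token]:
    """Split 'word/POS/chunk' items (assumed format) into Token tuples."""
    tokens = []
    for item in tagged.split():
        # rsplit keeps any slash inside the word itself intact.
        word, pos, chunk = item.rsplit("/", 2)
        tokens.append(Token(word, pos, chunk))
    return tokens


def noun_phrases(tokens: List[Token]) -> List[str]:
    """Group each B-NP token with the I-NP tokens that follow it."""
    phrases, current = [], []
    for tok in tokens:
        if tok.chunk == "B-NP":
            if current:
                phrases.append(" ".join(current))
            current = [tok.word]
        elif tok.chunk == "I-NP" and current:
            current.append(tok.word)
        else:
            if current:
                phrases.append(" ".join(current))
            current = []
    if current:
        phrases.append(" ".join(current))
    return phrases


# Hand-written sample in the assumed format.
SAMPLE = ("The/DT/B-NP black/JJ/I-NP cat/NN/I-NP sat/VBD/B-VP "
          "on/IN/B-PP the/DT/B-NP mat/NN/I-NP")
print(noun_phrases(split_tagged(SAMPLE)))
```

On the sample this prints `['The black cat', 'the mat']`: the POS column carries the word class, while the chunk column marks where each phrase begins and continues.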
(2) Text classification
(polarity, subjectivity)
subjectivity: fact vs. opinion
polarity: positive vs. negative
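A rough sketch of how a (polarity, subjectivity) pair can be derived from a word lexicon by averaging. This is a toy illustration of the general idea, not necessarily how Pattern computes its scores. The lexicon values and the name `toy_sentiment` are made up for the example.

```python
import re
from typing import Tuple

# Hypothetical mini-lexicon: word -> (polarity, subjectivity).
# Polarity runs from -1.0 (negative) to +1.0 (positive);
# subjectivity runs from 0.0 (fact) to 1.0 (opinion).
TOY_LEXICON = {
    "good": (0.5, 0.5),
    "wonderful": (1.0, 1.0),
    "bad": (-0.5, 0.5),
    "terrible": (-1.0, 1.0),
}


def toy_sentiment(text: str) -> Tuple[float, float]:
    """Average the scores of all lexicon words found in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    scores = [TOY_LEXICON[w] for w in words if w in TOY_LEXICON]
    if not scores:
        # No opinion words found: treat the text as neutral and factual.
        return (0.0, 0.0)
    polarity = sum(p for p, _ in scores) / len(scores)
    subjectivity = sum(s for _, s in scores) / len(scores)
    return (polarity, subjectivity)


print(toy_sentiment("A wonderful cast, but a bad script."))
print(toy_sentiment("The train leaves at noon."))
```

The first call prints `(0.25, 0.75)`, a mildly positive and fairly opinionated result; the second prints `(0.0, 0.0)` because none of its words are in the lexicon.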
(3) wordnet
Word definitions, synonyms, and so on; word similarity
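One simple way to think about word similarity in a WordNet-like hierarchy is path-based similarity: the fewer hypernym steps between two words, the more similar they are. The sketch below uses a tiny hand-made hierarchy and the formula 1 / (1 + path length). Both are assumptions for illustration; a real WordNet library may use a different measure, and `HYPERNYM` and `path_similarity` are hypothetical names.

```python
from typing import Dict, List

# Hand-made toy hierarchy: word -> its parent (hypernym).
HYPERNYM: Dict[str, str] = {
    "cat": "feline",
    "dog": "canine",
    "feline": "carnivore",
    "canine": "carnivore",
    "carnivore": "animal",
    "sparrow": "bird",
    "bird": "animal",
}


def path_to_root(word: str) -> List[str]:
    """Return the chain word -> parent -> ... -> root."""
    path = [word]
    while path[-1] in HYPERNYM:
        path.append(HYPERNYM[path[-1]])
    return path


def path_similarity(a: str, b: str) -> float:
    """1 / (1 + number of edges between a and b via their nearest shared ancestor)."""
    steps_from_a = {node: i for i, node in enumerate(path_to_root(a))}
    for steps_from_b, node in enumerate(path_to_root(b)):
        if node in steps_from_a:
            return 1.0 / (1 + steps_from_a[node] + steps_from_b)
    return 0.0  # no shared ancestor in this toy hierarchy


print(path_similarity("cat", "dog"))      # meet at "carnivore"
print(path_similarity("cat", "sparrow"))  # meet only at "animal"
```

Here "cat" and "dog" score higher than "cat" and "sparrow", because the first pair meets at "carnivore" while the second pair only meets at the root "animal".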
(4) Common word lists
| Word list | Description | Words | Examples |
| --- | --- | --- | --- |
| ACADEMIC | English academic words | 500 | criterion, proportionally, research |
| BASIC | English basic words | 1,000 | chicken, pain, road |
| PROFANITY | English swear words | 350 | |
| TIME | English time & date words | 100 | Christmas, past, saturday |
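A typical use of word lists like these is to measure how much of a text falls into a given category. The sketch below is a standard-library illustration only. The sample sets contain just the example words from the table above, not the full lists, and `wordlist_coverage` is a hypothetical helper name.

```python
import re
from typing import Set

# Stand-in sets holding only the example words listed in the table.
ACADEMIC_SAMPLE: Set[str] = {"criterion", "proportionally", "research"}
BASIC_SAMPLE: Set[str] = {"chicken", "pain", "road"}
TIME_SAMPLE: Set[str] = {"christmas", "past", "saturday"}


def wordlist_coverage(text: str, wordlist: Set[str]) -> float:
    """Return the fraction of tokens in the text that appear in the word list."""
    tokens = re.findall(r"[a-z]+", text.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for t in tokens if t in wordlist)
    return hits / len(tokens)


text = "The research criterion was applied proportionally"
print(wordlist_coverage(text, ACADEMIC_SAMPLE))
print(wordlist_coverage(text, BASIC_SAMPLE))
```

For this sentence the academic coverage is `0.5` (three of six tokens) and the basic coverage is `0.0`. Matching is done on lowercased tokens, so "Christmas" in a text would match the lowercase entry in `TIME_SAMPLE`.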