Usage: nutch COMMAND
where COMMAND is one of:
inject         inject new urls into the database
hostinject     creates or updates an existing host table from a text file
generate       generate new batches to fetch from crawl db
fetch          fetch URLs marked during generate
updatedb       update web table after parsing
updatehostdb   update host table after parsing
readdb         read/dump records from page database
readhostdb     display entries from the hostDB
elasticindex   run the elasticsearch indexer
solrindex      run the solr indexer on parsed batches
solrdedup      remove duplicates from solr
parsechecker   check the parser for a given url
indexchecker   check the indexing filters for a given url
plugin         load a plugin and run one of its classes main()
nutchserver    run a (local) Nutch server on a user defined port
junit          runs the given JUnit test
or
CLASSNAME      run the class named CLASSNAME
Most commands print help when invoked w/o parameters.
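The usage text describes two behaviours: a known COMMAND is mapped to a tool, and anything else is treated as a CLASSNAME to run directly. The following is only a minimal sketch of how a launcher script could dispatch that way. It is not the actual nutch script: the functions resolve_class and dispatch, the example.* class names, and the arguments passed at the end are all hypothetical placeholders.

```shell
#!/bin/sh
# Sketch of a COMMAND dispatcher. The class names below are hypothetical
# placeholders, NOT the real Nutch class names.
resolve_class() {
  case "$1" in
    inject)   echo "example.InjectTool" ;;
    generate) echo "example.GenerateTool" ;;
    fetch)    echo "example.FetchTool" ;;
    *)        echo "$1" ;;   # unknown command: treat the argument as CLASSNAME
  esac
}

dispatch() {
  # With no arguments, print the usage header and signal failure.
  if [ "$#" -eq 0 ]; then
    echo "Usage: nutch COMMAND"
    return 1
  fi
  command_name=$1
  shift
  # A real launcher would start java here; this sketch only prints the call.
  echo "would run: $(resolve_class "$command_name") $*"
}

dispatch inject urls
dispatch org.example.MyTool arg1
```

The first call resolves through the case table, while the second falls through to the default branch and passes the name along unchanged, which mirrors the "or CLASSNAME" line of the usage text.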
cat nutch | wc -l: this command counts the number of lines in the nutch script; the result is 244.
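To try the same line count without a Nutch checkout, a small sketch along these lines should work. The temporary file stands in for the nutch script and its three-line content is made up for illustration; the variable names are likewise arbitrary.

```shell
# Create a stand-in file (hypothetical content) with exactly three lines.
sample=$(mktemp)
printf 'line one\nline two\nline three\n' > "$sample"

# Same pipeline shape as in the text: cat the file into wc -l.
count_piped=$(cat "$sample" | wc -l)

# Equivalent form that redirects the file into wc and skips cat.
count_direct=$(wc -l < "$sample")

# Both counts are 3 for this file.
echo "$count_piped" "$count_direct"

rm -f "$sample"
```

Note that wc -l counts newline characters, so a file whose last line lacks a trailing newline reports one line fewer than an editor would show.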