While setting up an Elasticsearch database, I first used the officially recommended Logstash tool to import data, but found it awkward to work with. Since Perl excels at regular expressions, I decided to use it to filter and classify the data before importing into Elasticsearch, and a search on CPAN turned up the Search::Elasticsearch module.
The module's documentation on CPAN is fairly terse, so I've summarized my experience with it below:
1. Writing data one document at a time:
use Search::Elasticsearch;

my $e = Search::Elasticsearch->new( nodes => ['localhost:9200'] );

$e->index(
    index => $index_name,
    type  => $type_name,
    id    => $id_name,
    body  => {
        title => $data_name,
        data  => $data,
    },
);
2. Writing data in bulk:
use Search::Elasticsearch;

my $e = Search::Elasticsearch->new( nodes => ['localhost:9200'] );

my $bulk = $e->bulk_helper(
    index => $index_name,
    type  => $type_name,
);

my $i = 0;
while (...) {
    # do something
    $bulk->add_action(
        index => {
            id     => $id_name,
            source => { title => $data_name, data => $data },
        }
    );
    if ( ++$i > 999 ) {    # flush every 1000 documents
        $bulk->flush;
        $i = 0;
    }
}
$bulk->flush;    # don't forget to flush whatever is still buffered
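The manual counter above can be dropped entirely: per the Search::Elasticsearch bulk-helper documentation, `bulk_helper()` accepts a `max_count` parameter (and a companion `max_size`) that makes it flush itself automatically. A minimal sketch, reusing the same placeholder variables as above:

```perl
use Search::Elasticsearch;

my $e = Search::Elasticsearch->new( nodes => ['localhost:9200'] );

# max_count makes the helper flush itself after every 1000 buffered actions,
# so no manual counter is needed.
my $bulk = $e->bulk_helper(
    index     => $index_name,
    type      => $type_name,
    max_count => 1000,
);

while (...) {
    # index() is a convenience wrapper around add_action()
    $bulk->index(
        { id => $id_name, source => { title => $data_name, data => $data } }
    );
}
$bulk->flush;    # final flush for the partially filled buffer
```

This behaves the same as the manual version but is harder to get wrong.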
3. Reading a single record:
use Search::Elasticsearch;

my $e = Search::Elasticsearch->new( nodes => ['localhost:9200'] );

my $doc = $e->get(
    index => $index_name,
    type  => $type_name,
    id    => $id_name,
);
my $data = $doc->{_source}{$data_name};
# do something
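One caveat: `get()` throws an exception when the document does not exist. To avoid dying on a missing ID, you can first check with the client's `exists()` call (or wrap `get()` in `eval`). A sketch, again assuming the placeholder variables from above:

```perl
use Search::Elasticsearch;

my $e = Search::Elasticsearch->new( nodes => ['localhost:9200'] );

# exists() returns true/false instead of throwing on a missing document
if ( $e->exists( index => $index_name, type => $type_name, id => $id_name ) ) {
    my $doc = $e->get( index => $index_name, type => $type_name, id => $id_name );
    # do something with $doc->{_source}
}
else {
    warn "document $id_name not found\n";
}
```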
4. Reading all records sequentially:
use Search::Elasticsearch;

my $e = Search::Elasticsearch->new( nodes => ['localhost:9200'] );

my $scroll = $e->scroll_helper(
    index => $index_name,
    type  => $type_name,
    body  => {
        query => { match_all => {} },
        size  => 5000,    # documents fetched per scroll request
    },
);

while ( my $doc = $scroll->next ) {
    my $id   = $doc->{_id};
    my $data = $doc->{_source}{$data_name};
    # do something
}
5. Jumping ahead to start reading from the $n-th record
When called in list context, next($n) returns the next $n results at once, so the records before the point you want can simply be fetched and discarded:

my @skipped = $scroll->next($n);    # fetch and throw away the first $n records
my $doc     = $scroll->next;        # continue reading from record $n + 1
6. Basic querying
use strict;
use warnings;
use Search::Elasticsearch;

my $e = Search::Elasticsearch->new( nodes => ['localhost:9200'] );

my $results = $e->search(
    index => $index_name,
    body  => {
        query => {
            query_string => { query => $search },
        },
    },
);

# print a field from the top-scoring hit
print $results->{hits}{hits}[0]{_source}{word};
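By default search() returns only the top 10 hits; the full list of returned hits and the total match count are also in the response. A sketch that continues from the $results variable above and loops over every returned hit (the `word` field is the same placeholder as before; note that hits.total is a plain integer through Elasticsearch 6.x but becomes a hash like `{ value => ..., relation => ... }` in 7.x):

```perl
# $results as returned by $e->search(...) above
my $total = $results->{hits}{total};    # integer up to ES 6.x; hashref in 7.x

for my $hit ( @{ $results->{hits}{hits} } ) {
    printf "%s (score %.2f): %s\n",
        $hit->{_id}, $hit->{_score}, $hit->{_source}{word};
}
```

If you need more than 10 results, pass a larger `size` inside `body`, or fall back to the scroll helper from section 4.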