Aggregation problems with the Logstash Aggregate filter

Problem: when aggregating with the Logstash Aggregate filter, the data collected into arrays gets scrambled and overwritten.

Cause: Logstash filters run on multiple worker threads by default, so events belonging to the same task can be handled by different workers and the aggregated data gets mixed up.

From the official documentation: https://www.elastic.co/guide/en/logstash/current/plugins-filters-aggregate.html#plugins-filters-aggregate-description

Description
The aim of this filter is to aggregate information available among several events (typically log 
lines) belonging to a same task, and finally push aggregated information into final task event.

You should be very careful to set Logstash filter workers to 1 (-w 1 flag) for this filter to work 
correctly otherwise events may be processed out of sequence and unexpected results will occur.

The documentation explicitly calls out that the filter worker count must be set to 1, otherwise this filter will not work correctly.

Solution:

Pass the -w 1 flag when starting Logstash.

Example:

logstash -w 1 -f logstash.conf
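
Equivalently, if you prefer not to pass the flag on every start, the worker count can be pinned in the Logstash settings file (this assumes the default single-pipeline setup):

# logstash.yml -- same effect as the -w 1 command-line flag
pipeline.workers: 1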

Aggregate filter example:

filter {
    # aggregate all rows that share the same id into one event
    aggregate {
        task_id => "%{id}"
        code => "
            map['id'] = event.get('id')
            # type field from the input, kept for later conditionals
            map['type'] = event.get('type')
            map['name'] = event.get('name')
            map['test_list'] ||= []
            map['tests'] ||= []
            # skip rows where the joined test_id is NULL
            if (event.get('test_id') != nil)
                # deduplicate; this could also be done in the SQL statement
                if !(map['test_list'].include? event.get('test_id'))
                    map['test_list'] << event.get('test_id')
                    map['tests'] << {
                        'test_id' => event.get('test_id'),
                        'test_name' => event.get('test_name')
                    }
                end
            end
            # drop the original per-row event; only the aggregated event is emitted
            event.cancel()
        "
        push_previous_map_as_event => true
        timeout => 5
    }
}
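
For context, here is a minimal sketch of the surrounding pipeline this kind of aggregation is typically paired with when syncing one-to-many MySQL data to Elasticsearch. The database, table, field, credential and index names below are placeholders, not from the original post. Note that push_previous_map_as_event flushes the map as soon as a row with a different task_id arrives, so the SQL statement should ORDER BY the task id to keep rows of the same task consecutive:

input {
  jdbc {
    jdbc_connection_string => "jdbc:mysql://localhost:3306/my_db"
    jdbc_user => "root"
    jdbc_password => "password"
    jdbc_driver_class => "com.mysql.cj.jdbc.Driver"
    # ORDER BY the task id so rows belonging to the same id arrive consecutively
    statement => "SELECT a.id, a.type, a.name, b.id AS test_id, b.name AS test_name
                  FROM main_table a LEFT JOIN test_table b ON b.main_id = a.id
                  ORDER BY a.id"
  }
}

output {
  elasticsearch {
    hosts => ["127.0.0.1:9200"]
    index => "my_index"
    # use the task id as the document id so re-runs update the same document
    document_id => "%{id}"
  }
}

Each resulting document then carries id, type, name, test_list and the nested tests array built in the aggregate block.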

Reference:
Logstash syncing one-to-many MySQL data to ES (pitfall diary series):

https://blog.csdn.net/menglinjie/article/details/102984845

 
