Problem: when using the Logstash Aggregate filter, data in the aggregated arrays gets scrambled and overwritten.
Cause: the filter stage runs multi-threaded by default, so aggregation state gets mixed up across workers.
Description
The aim of this filter is to aggregate information available among several events (typically log
lines) belonging to a same task, and finally push aggregated information into final task event.
You should be very careful to set Logstash filter workers to 1 (-w 1 flag) for this filter to work
correctly otherwise events may be processed out of sequence and unexpected results will occur.
The official documentation explicitly says the number of filter workers must be set to 1, otherwise the results will be wrong.
Solution:
Add -w 1 when starting Logstash.
Example:
logstash -w 1 -f logstash.conf
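Equivalently, the worker count can be pinned in the `logstash.yml` settings file instead of on the command line, so it survives however Logstash is launched:

```yaml
# logstash.yml -- same effect as the -w 1 flag
pipeline.workers: 1
```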
Aggregation example:
filter {
  # aggregation happens here
  aggregate {
    task_id => "%{id}"
    code => "
      map['id'] = event.get('id')
      # the 'type' field from the input, used for routing downstream
      map['type'] = event.get('type')
      map['name'] = event.get('name')
      map['test_list'] ||= []
      map['tests'] ||= []
      # skip rows where test_id is null
      if (event.get('test_id') != nil)
        # de-duplicate on test_id; this could also be done in the SQL query
        if !(map['test_list'].include? event.get('test_id'))
          map['test_list'] << event.get('test_id')
          map['tests'] << {
            'test_id' => event.get('test_id'),
            'test_name' => event.get('test_name')
          }
        end
      end
      event.cancel()
    "
    push_previous_map_as_event => true
    timeout => 5
  }
}
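The Ruby logic inside `code => "..."` can be sketched as a standalone script to see what it produces. This is a minimal simulation, assuming the input is one row per (parent, test) pair, as a JDBC one-to-many join would emit; the sample rows and field values are illustrative, not from the original post.

```ruby
# Simulated rows from a one-to-many SQL join: parent id 1 has two tests,
# and test_id 10 appears twice (the duplicate the filter must drop).
rows = [
  { 'id' => 1, 'type' => 'a', 'name' => 'parent', 'test_id' => 10, 'test_name' => 't10' },
  { 'id' => 1, 'type' => 'a', 'name' => 'parent', 'test_id' => 11, 'test_name' => 't11' },
  { 'id' => 1, 'type' => 'a', 'name' => 'parent', 'test_id' => 10, 'test_name' => 't10' }
]

# 'map' plays the role of the aggregate filter's per-task map.
map = {}
rows.each do |event|
  map['id']   = event['id']
  map['type'] = event['type']
  map['name'] = event['name']
  map['test_list'] ||= []
  map['tests']     ||= []
  # skip rows where test_id is null, then de-duplicate on test_id
  if event['test_id'] != nil
    unless map['test_list'].include?(event['test_id'])
      map['test_list'] << event['test_id']
      map['tests'] << {
        'test_id'   => event['test_id'],
        'test_name' => event['test_name']
      }
    end
  end
end

puts map['tests'].length
```

Run single-threaded like this, the three rows collapse into one map with two unique tests, which is exactly the event that `push_previous_map_as_event` would emit. With multiple filter workers, rows for the same `id` can land on different workers, which is why the arrays get scrambled without `-w 1`.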
Reference:
Syncing one-to-many MySQL data into ES with Logstash (pitfall diary series):
https://blog.csdn.net/menglinjie/article/details/102984845