1. fluentd

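Fluentd is a system for collecting, processing, and forwarding logs. Through its rich plugin system it can collect logs from many kinds of systems and applications, convert them into a user-specified format, and forward them to whichever log storage system the user chooses.

Fluentd is often compared with Logstash. In the ELK stack, the L is Logstash, which fills the same agent role. Fluentd is an agent that became popular alongside Docker, GCP, and Elasticsearch.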

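To summarize the comparison, the main differences are:

- Fluentd uses fewer resources than Logstash.
- The even lighter Fluent Bit corresponds to Filebeat: it is the log collector deployed on each node.
- Fluentd has a larger, more powerful, and more open plugin ecosystem and community. The plugin list deserves a special mention: there are a great many plugins, they are very flexible, and their rules are not complicated.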

Installation: macOS ships with gem, so running sudo gem install fluentd is all it takes.

2. fluentd hello world

Fluentd configuration file (saved as in_http.conf):

<source>
  @type http
  port 4888
  bind 0.0.0.0
</source>

<match test.cycle>
  @type stdout
</match>

Start fluentd: fluentd -c in_http.conf

Send an HTTP request: curl -i -X POST -d 'json={"action":"login","user":2}' http://localhost:4888/test.cycle
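
The same request can also be built from a script. The following is a minimal sketch, assuming the http source shown above listens on localhost:4888; the helper name build_fluentd_request is made up for illustration. It only constructs the request, so it runs without a Fluentd instance.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_fluentd_request(tag, record, host="localhost", port=4888):
    """Build (but do not send) a POST request for the http source above."""
    # The tag goes into the URL path; the record travels in the "json" form field,
    # exactly as in the curl command.
    payload = json.dumps(record, separators=(",", ":"))
    body = urlencode({"json": payload}).encode("utf-8")
    return Request(f"http://{host}:{port}/{tag}", data=body, method="POST")


request = build_fluentd_request("test.cycle", {"action": "login", "user": 2})
print(request.full_url)
print(request.data.decode("utf-8"))
# To send it to a running fluentd, pass `request` to urllib.request.urlopen.
```

The record is percent-encoded in the body, which is what curl -d sends as well once the shell quoting is resolved.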

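3. Fluentd events

tag: where the event comes from; it is the basis on which the event is routed.
time: the timestamp.
record: the message body, that is, the actual log content, in JSON format.

Events are produced by input plugins. For example, the in_tail plugin reads text lines from the end of a file. Given this line: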

192.168.0.1 - - [28/Feb/2013:12:00:00 +0900] "GET / HTTP/1.1" 200 777

the resulting event will contain:

tag: apache.access # set by configuration
time: 1362020400   # 28/Feb/2013:12:00:00 +0900
record: {"user":"-","method":"GET","code":200,"size":777,"host":"192.168.0.1","path":"/"}
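
To make the tag/time/record split concrete, here is a rough sketch that turns the sample line into the same three parts. The regular expression and the to_event helper are assumptions written for this one line; they are not Fluentd's own apache2 parser.

```python
import re
from datetime import datetime

# Hypothetical pattern for the sample line only.
LINE_PATTERN = re.compile(
    r'^(?P<host>\S+) \S+ (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<code>\d+) (?P<size>\d+)$'
)


def to_event(line, tag):
    """Turn one access-log line into a (tag, time, record) triple."""
    fields = LINE_PATTERN.match(line).groupdict()
    # The timestamp becomes the event time (Unix seconds) and leaves the record.
    when = datetime.strptime(fields.pop("time"), "%d/%b/%Y:%H:%M:%S %z")
    fields["code"] = int(fields["code"])
    fields["size"] = int(fields["size"])
    return tag, int(when.timestamp()), fields


line = '192.168.0.1 - - [28/Feb/2013:12:00:00 +0900] "GET / HTTP/1.1" 200 777'
tag, time, record = to_event(line, "apache.access")
print(tag, time, record)
```

Note that the tag does not come from the log line at all: it is passed in, just as it is set by configuration in Fluentd.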

4. Event processing

match

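In the example above, match selects the events coming from a source by their tag and then runs the configured output to dispatch the logs. The most common use is to store the logs collected by a source in a database. In that example test.cycle is the tag. Tags support several matching patterns:
- *: matches a single tag part (a.* matches a.b but not a or a.b.c);
- **: matches zero or more tag parts (a.** matches a, a.b, and a.b.c);
- a b: two patterns separated by whitespace match a or b;
- {X,Y,Z}: matches one of X, Y, or Z.

Match order:

match directives are evaluated from top to bottom. Once an event has been matched by one of them, the remaining match directives are not tried. So a catch-all such as <match **> must always go at the end of the configuration file.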
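
The pattern rules and the first-match-wins order can be sketched in a few lines. This is a simplified model written for illustration, not Fluentd's actual matcher: the function names are made up, and alternatives inside braces are treated as literal strings.

```python
import re


def glob_to_regex(glob):
    """Translate one tag pattern into a compiled regex (simplified)."""
    out, i = [], 0
    while i < len(glob):
        if glob.startswith(".**", i):      # zero or more further tag parts
            out.append(r"(?:\.[^.]+)*")
            i += 3
        elif glob.startswith("**", i):     # anything at all
            out.append(r".*")
            i += 2
        elif glob[i] == "*":               # exactly one tag part
            out.append(r"[^.]+")
            i += 1
        elif glob[i] == "{":               # {X,Y,Z}: one of the alternatives
            end = glob.index("}", i)
            alternatives = glob[i + 1:end].split(",")
            out.append("(?:" + "|".join(re.escape(a) for a in alternatives) + ")")
            i = end + 1
        else:
            out.append(re.escape(glob[i]))
            i += 1
    return re.compile("".join(out))


def tag_matches(pattern, tag):
    """A pattern may hold several globs separated by whitespace; any may match."""
    return any(glob_to_regex(g).fullmatch(tag) for g in pattern.split())


def route(match_patterns, tag):
    """Return the first pattern (top to bottom) that matches, or None."""
    for pattern in match_patterns:
        if tag_matches(pattern, tag):
            return pattern
    return None


rules = ["test.cycle", "app.{web,db}.*", "a b", "**"]
for tag in ["test.cycle", "app.web.error", "b", "other.tag"]:
    print(tag, "->", route(rules, tag))
```

Moving "**" to the front of rules makes every tag resolve to it, which is the reason a catch-all belongs last.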

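match is not only for output. It also has more advanced uses, such as processing an event and re-emitting it so that it goes through the whole routing flow again as a new event.

filter

filter uses almost the same syntax as match, but several filters can be chained into a pipeline that processes the data serially before finally handing it to match for output.

Let's modify the in_http.conf above by adding: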
```
<filter test.cycle>
  @type record_transformer
  <record>
    host "#{Socket.gethostname}"
  </record>
</filter>
<filter test.cycle>
  @type stdout
</filter>
<filter test.cycle>
  @type grep
  <exclude>
    key action
    pattern ^logout$
  </exclude>
</filter>
```
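Then restart fluentd.

Notes:
- In the test run, three requests were sent. The middle one (a logout action) was dropped when it reached the third filter, and every event had the hostname added. So the three requests produced five records on standard output: three from the stdout filter, plus two from the stdout match for the events that passed grep.
- Filters are applied in the same top-to-bottom order as match, so filters should be placed before the match that outputs the events.
- Input -> filter 1 -> … -> filter N -> Output (match tag)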
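
The count of five records can be checked with a small simulation of the pipeline. This is only a sketch: the three functions stand in for record_transformer, the stdout filter, and grep, and their names are invented here.

```python
import socket


def add_host(event):
    """record_transformer stand-in: add the local hostname to the record."""
    return {**event, "host": socket.gethostname()}


def make_stdout(sink):
    """stdout stand-in: remember what would be printed, then pass the event on."""
    def stdout(event):
        sink.append(event)
        return event
    return stdout


def drop_logout(event):
    """grep <exclude> stand-in: returning None drops the event."""
    return None if event.get("action") == "logout" else event


def run_pipeline(event, filters, output):
    for apply_filter in filters:
        event = apply_filter(event)
        if event is None:          # filtered out: never reaches the output
            return
    output(event)


printed = []
filters = [add_host, make_stdout(printed), drop_logout]
for action in ["login", "logout", "login"]:
    run_pipeline({"action": action, "user": 2}, filters, make_stdout(printed))

print(len(printed))
```

The logout event is still printed once, because the stdout filter sits before grep in the chain.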

label

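As the rules grow more complex, the configuration file becomes complicated and hard to read. label is a newer mechanism that helps with this. Let's modify the file above slightly.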

<source>
  @type http
  port 4888
  bind 0.0.0.0
</source>

<source>
  @type http
  port 4887
  bind 0.0.0.0
  @label @ADD
</source>

<filter test.cycle>
  @type stdout
</filter>
<filter test.cycle>
  @type grep
  <exclude>
    key action
    pattern ^logout$
  </exclude>
</filter>
<label @ADD>
  <filter test.cycle>
    @type record_transformer
    <record>
      host "#{Socket.gethostname}"
    </record>
  </filter>
    <match test.cycle>
     @type stdout
    </match>
</label>
<match test.cycle>
  @type stdout
</match>

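The difference shows up quickly:

A request sent to port 4887 arrives at the source that attaches the label @ADD. Events produced by that source are sent to the directives inside the <label @ADD> section and skip the other directives that follow the source.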
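
As a rough mental model, a label selects which list of directives an event flows through. The sketch below assumes nothing about Fluentd's internals; the dictionaries and the route_event helper are purely illustrative.

```python
def route_event(event, sections):
    """Pick the directive list an event flows through.

    `sections` maps a label name to its directives; the key None holds the
    top-level (unlabeled) directives.
    """
    return sections[event.get("label")]


# The strings only describe the directives in the configuration above.
sections = {
    None: ["filter stdout", "filter grep", "match stdout"],
    "@ADD": ["filter record_transformer", "match stdout"],
}

# Hypothetical table of which label each source port attaches.
source_labels = {4888: None, 4887: "@ADD"}

for port in (4888, 4887):
    event = {"tag": "test.cycle", "label": source_labels[port]}
    print(port, "->", route_event(event, sections))
```

Both events carry the same tag, yet they never share a directive: the label is checked before any tag matching happens.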

buffers

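The examples above used the non-buffered stdout output. In production, however, you would typically use outputs in buffered mode, such as Elasticsearch, forward, MongoDB, or S3. A buffered output plugin first stores the events it receives in a buffer ("memory" or "file") and writes the buffer to the destination only when the flush conditions are met. So with a buffered output, unlike the non-buffered stdout output, we do not see received events immediately.

This plugin is not easy to demonstrate, so I won't.

On second thought, let's demonstrate it anyway. Modify the <label @ADD> section: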
```
<label @ADD>
  <filter test.cycle>
    @type record_transformer
    <record>
      host "#{Socket.gethostname}"
    </record>
  </filter>
  <match test.cycle>
    @type file
    path /Users/xialingming/fluentd/myapp
    <buffer>
      timekey 100
      timekey_use_utc true
      timekey_wait 1m
    </buffer>
  </match>
</label>
```
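Here, logs sent to port 4887 pass through the filter pipeline and are finally written by the file output; path is required. If you configure nothing else, the buffer is flushed to the file at around 00:10 every day, because the defaults are a one-day timekey and a 10-minute timekey_wait. That is why no file appears right away after you configure the file output.
- timekey: 100 (in seconds) sets the time span covered by each chunk; the time axis is cut into 100-second slices. For example: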
```
timekey 60: ["12:00:00", ..., "12:00:59"], ["12:01:00", ..., "12:01:59"], ...
timekey 180: ["12:00:00", ..., "12:02:59"], ["12:03:00", ..., "12:05:59"], ...
timekey 3600: ["12:00:00", ..., "12:59:59"], ["13:00:00", ..., "13:59:59"], ...
```
```
With timekey = 1h, chunks are produced as follows.

11:59:30 web.access {"key1":"yay","key2":100}  ------> CHUNK_A

12:00:01 web.access {"key1":"foo","key2":200}  --|
                                                 |---> CHUNK_B
12:00:25 ssh.login  {"key1":"yay","key2":100}  --|
```
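- timekey_wait: 1m means that after the current chunk's time range ends, Fluentd waits another minute before flushing that chunk to the file. For example, with timekey 3600, when is the final file actually written? See the table below.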
```
timekey: 3600
 -------------------------------------------------------
 time range for chunk | timekey_wait | actual flush time
  12:00:00 - 12:59:59 |           0s |          13:00:00
  12:00:00 - 12:59:59 |     60s (1m) |          13:01:00
  12:00:00 - 12:59:59 |   600s (10m) |          13:10:00
```
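
The two settings can be expressed as simple arithmetic. The sketch below assumes UTC-aligned slices (as with timekey_use_utc true) and uses an arbitrary date; the function names are made up for illustration and are not part of Fluentd.

```python
from datetime import datetime, timedelta, timezone


def chunk_window(event_time, timekey):
    """Return (start, end) of the timekey slice that contains event_time."""
    epoch = int(event_time.timestamp())
    start = epoch - epoch % timekey          # align down to a timekey boundary
    begin = datetime.fromtimestamp(start, tz=timezone.utc)
    return begin, begin + timedelta(seconds=timekey)


def flush_time(event_time, timekey, timekey_wait):
    """Earliest flush: the end of the slice plus timekey_wait seconds."""
    _, end = chunk_window(event_time, timekey)
    return end + timedelta(seconds=timekey_wait)


def at(hour, minute, second):
    # Arbitrary date chosen for illustration; all times are UTC.
    return datetime(2024, 1, 1, hour, minute, second, tzinfo=timezone.utc)


for event_time in (at(11, 59, 30), at(12, 0, 1), at(12, 0, 25)):
    start, _ = chunk_window(event_time, 3600)
    print(event_time.time(), "-> chunk starting", start.time())

print(flush_time(at(12, 30, 0), 3600, 60).time())
```

The first event lands in the 11:00 slice and the other two share the 12:00 slice, mirroring CHUNK_A and CHUNK_B in the example.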

The buffer files end up stored in a layout like this:

$ tree /tmp/logs/
/tmp/logs/
├── ${tag}
│   ├── buffer.b57fb1dd96306dd0b308e094f7ec2228f.log
│   ├── buffer.b57fb1dd96306dd0b308e094f7ec2228f.log.meta
│   ├── buffer.b57fb1dd96339a870530991d4871cfe11.log
│   └── buffer.b57fb1dd96339a870530991d4871cfe11.log.meta
├── current-dummy1 -> /tmp/logs/${tag}/buffer.b57fb1dd96339a870530991d4871cfe11.log
└── current-dummy2 -> /tmp/logs/${tag}/buffer.b57fb1dd96306dd0b308e094f7ec2228f.log

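5. Summary

Besides the dazzling array of official plugins, we can also write our own flexible plugins and select them with @type.

Once an event is produced by a source in the Fluentd configuration, it is processed step by step, like a waterfall, either by the top-level directives or inside the label it references, and any event may be filtered out at any point.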
