The goal is to push data into Kafka, have Telegraf consume and transform it, and write it into InfluxDB, all by configuring these middleware components.
All steps below were performed on a Mac.
1. Prepare Kafka
Already running, on the default port 9092.
Kafka Tool is used as the client.
2. Prepare InfluxDB
Already running, on the default port 8086.
Useful commands to remember:
show databases
use [database_name]
show measurements
SELECT * FROM "telegraf"."autogen"."kafka_consumer"
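The fully qualified query above follows the "database"."retention_policy"."measurement" pattern. A tiny Python helper (hypothetical, just for illustration) makes the pattern explicit:

```python
# Hypothetical helper: builds the fully qualified InfluxQL SELECT used above,
# following the "database"."retention_policy"."measurement" naming pattern.
# "autogen" is InfluxDB 1.x's default retention policy name.
def influxql_select(database: str, measurement: str, rp: str = "autogen") -> str:
    return f'SELECT * FROM "{database}"."{rp}"."{measurement}"'

print(influxql_select("telegraf", "kafka_consumer"))
# SELECT * FROM "telegraf"."autogen"."kafka_consumer"
```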
3. Install Telegraf
brew update
brew install telegraf
In InfluxDB, create a user for Telegraf (note that InfluxQL requires single quotes around the password):
CREATE USER "telegraf" WITH PASSWORD 'telegraf'
Then edit the Telegraf config file to set up the InfluxDB output:
[[outputs.influxdb]]
urls = ["http://localhost:8086"] # required
database = "telegraf" # required
retention_policy = ""
precision = "s"
timeout = "5s"
username = "telegraf"
password = "telegraf" # must match the password created above
Start it:
telegraf -config /usr/local/etc/telegraf.conf
4. Configure the Kafka input in Telegraf
The official plugin README has configuration recommendations:
https://github.com/influxdata/telegraf/tree/master/plugins/inputs/kafka_consumer
Below is the relevant section taken from the config file.
# # Read metrics from Kafka topics
# The section below is commented out by default — make sure to uncomment it.
[[inputs.kafka_consumer]]
# ## Kafka brokers.
# Required: set this to your broker address(es)
brokers = ["192.168.1.119:9092"]
#
# ## Topics to consume.
# Required: the topic(s) to consume
topics = ["tstkafkainflux"]
#
# ## When set this tag will be added to all metrics with the topic as the value.
# Optional: when set, a tag with this name (and the topic as its value) is added to every metric
topic_tag = "tstkafkainflux"
#
# ## Optional Client id
# # client_id = "Telegraf"
#
# ## Set the minimal supported Kafka version. Setting this enables the use of new
# ## Kafka features and APIs. Must be 0.10.2.0 or greater.
# ## ex: version = "1.1.0"
# # version = "0.10.2.0"
#
# ## Optional TLS Config
# # enable_tls = true
# # tls_ca = "/etc/telegraf/ca.pem"
# # tls_cert = "/etc/telegraf/cert.pem"
# # tls_key = "/etc/telegraf/key.pem"
# ## Use TLS but skip chain & host verification
# # insecure_skip_verify = false
#
# ## SASL authentication credentials. These settings should typically be used
# ## with TLS encryption enabled using the "enable_tls" option.
# # sasl_username = "kafka"
# # sasl_password = "secret"
#
# ## SASL protocol version. When connecting to Azure EventHub set to 0.
# # sasl_version = 1
#
# ## Name of the consumer group.
# # consumer_group = "telegraf_metrics_consumers"
#
# ## Initial offset position; one of "oldest" or "newest".
offset = "oldest"
#
# ## Consumer group partition assignment strategy; one of "range", "roundrobin" or "sticky".
# # balance_strategy = "range"
#
# ## Maximum length of a message to consume, in bytes (default 0/unlimited);
# ## larger messages are dropped
# max_message_len = 1000000
#
# ## Maximum messages to read from the broker that have not been written by an
# ## output. For best throughput set based on the number of metrics within
# ## each message and the size of the output's metric_batch_size.
# ##
# ## For example, if each message from the queue contains 10 metrics and the
# ## output metric_batch_size is 1000, setting this to 100 will ensure that a
# ## full batch is collected and the write is triggered immediately without
# ## waiting until the next flush_interval.
# # max_undelivered_messages = 1000
#
# ## Data format to consume.
# ## Each data format has its own unique set of configuration options, read
# ## more about them here:
# ## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md
# This needs to be set; the settings below work for me. For other formats, see the link in the comment above.
data_format = "value"
data_type = "string"
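With data_format = "value" and data_type = "string", Telegraf treats each Kafka message as one metric whose single field, named value, holds the raw payload, and the measurement is named after the input plugin. A rough Python sketch of that mapping (this is an illustration, not Telegraf's actual parser):

```python
# Sketch (not Telegraf's real code) of how the "value" data format with
# data_type = "string" maps one raw Kafka message to one metric:
# the whole payload becomes a single field named "value".
def parse_value_string(payload: bytes, topic: str, topic_tag: str) -> dict:
    return {
        "measurement": "kafka_consumer",              # named after the input plugin
        "tags": {topic_tag: topic},                   # added because topic_tag is set
        "fields": {"value": payload.decode("utf8")},  # entire payload, one field
    }

metric = parse_value_string(b"hereTime2020-01-01 00:00:00",
                            "tstkafkainflux", "tstkafkainflux")
print(metric["fields"]["value"])  # hereTime2020-01-01 00:00:00
```

This is why the query in step 2 selects from the "kafka_consumer" measurement: each Kafka message shows up there as one row with a string "value" field.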
5. Demo
Python code that sends a message to Kafka:
from kafka import KafkaProducer
import time

if __name__ == '__main__':
    producer = KafkaProducer(bootstrap_servers='192.168.1.119:9092')
    value = "hereTime" + time.strftime("%Y-%m-%d %H:%M:%S", time.localtime())
    producer.send('tstkafkainflux', key=b"abcde", value=bytes(value, encoding="utf8"), partition=0)
    producer.flush()
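The payload format can also be sanity-checked offline, without a broker. This sketch rebuilds the same string the producer above sends ("hereTime" is just the prefix the demo uses):

```python
import time

def build_value(ts: float) -> bytes:
    # Same payload the producer above sends: "hereTime" + a local timestamp.
    text = "hereTime" + time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(ts))
    return text.encode("utf8")

payload = build_value(time.time())
print(payload.decode("utf8"))  # e.g. hereTime2020-05-01 12:00:00
```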
Check the topic in Kafka Tool; the result is shown in the screenshot.
Then query InfluxDB; the result is shown in the screenshot.
6. Output and input plugins supported by Telegraf
- influxdb
- amon
- amqp
- application_insights
- azure_monitor
- cloud_pubsub
- cloudwatch
- cratedb
- datadog
- discard
- elasticsearch
- exec
- file
- graphite
- graylog
- health
- influxdb_v2
- kafka
- nats
- nsq
- opentsdb
…and plenty more.