Collecting MySQL data into Kafka in real time with Flume

1. Install Flume (omitted)

2. Install Kafka (omitted)

3. Install the flume-ng-sql-source plugin

(1) Download flume-ng-sql-source
(2) Copy the downloaded jar into Flume's lib directory

Note: plugin and Flume versions must match. Tested and confirmed: flume-ng-sql-source-1.4.1.jar works with Flume 1.6.0.

4. Create mysql_conf.properties under ${FLUME_HOME}/conf

Example configuration (1), using the plugin's older (pre-Hibernate) property names:

a1.sources = sql-source
a1.channels = ch1
a1.sinks = kafka

a1.sources.sql-source.channels = ch1
a1.sources.sql-source.connection.url = jdbc:mysql://192.168.1.185:3306/dataassets
a1.sources.sql-source.type = org.keedio.flume.source.SQLSource
a1.sources.sql-source.user = root
a1.sources.sql-source.password = xxx@2020
a1.sources.sql-source.table = datadictionary
a1.sources.sql-source.columns.to.select = *
a1.sources.sql-source.incremental.column.name = id
a1.sources.sql-source.incremental.value = 0
a1.sources.sql-source.run.query.delay=5000
a1.sources.sql-source.status.file.path = /var/lib/flume
a1.sources.sql-source.status.file.name = sql-source.status

a1.channels.ch1.type = memory
a1.channels.ch1.capacity = 10000
a1.channels.ch1.transactionCapacity = 10000
a1.channels.ch1.byteCapacityBufferPercentage = 20
a1.channels.ch1.byteCapacity = 800000

a1.sinks.kafka.channel = ch1
a1.sinks.kafka.type = org.apache.flume.sink.kafka.KafkaSink
a1.sinks.kafka.topic = kafkatopic
a1.sinks.kafka.brokerList = 192.168.1.183:19092,192.168.1.184:19092,192.168.1.185:19092
a1.sinks.kafka.requiredAcks = 1
a1.sinks.kafka.batchSize = 20
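The incremental settings above (incremental.column.name = id, incremental.value = 0, run.query.delay = 5000) mean the source repeatedly queries for rows whose incremental column exceeds the last value it has seen, then persists that value to the status file so a restart resumes instead of re-reading the table. A minimal Python sketch of that polling cycle (the toy table and helper names are illustrative, not the plugin's actual code):

```python
import json
import tempfile
from pathlib import Path

# Toy table standing in for the MySQL `datadictionary` table; `id` is the incremental column.
TABLE = [
    {"id": 1, "name": "col_a"},
    {"id": 2, "name": "col_b"},
    {"id": 3, "name": "col_c"},
]

def poll_once(last_value, status_file):
    """One polling cycle: fetch rows with id > last_value, persist the new high-water mark."""
    new_rows = [r for r in TABLE if r["id"] > last_value]
    if new_rows:
        last_value = max(r["id"] for r in new_rows)
        # The real plugin writes the last value to status.file.name for restart recovery.
        status_file.write_text(json.dumps({"last_index": last_value}))
    return new_rows, last_value

status = Path(tempfile.mkdtemp()) / "sql-source.status"
rows, last = poll_once(0, status)      # first cycle: everything after id 0
rows2, last2 = poll_once(last, status) # second cycle: nothing new yet
print(len(rows), last, len(rows2))     # 3 3 0
```

Between cycles the real source sleeps run.query.delay milliseconds, which is why newly inserted rows appear in Kafka within a few seconds rather than immediately.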

Example configuration (2), using the Hibernate-based property names of newer plugin versions:

a1.channels = ch-1
a1.sources = src-1
a1.sinks = k1

#sql source
#For each one of the sources, the type is defined

a1.sources.src-1.type = org.keedio.flume.source.SQLSource
a1.sources.src-1.hibernate.connection.url = jdbc:mysql://192.168.1.185:3306/dataassets

#Hibernate Database connection properties
a1.sources.src-1.hibernate.connection.user = root
a1.sources.src-1.hibernate.connection.password = xxx@2020
a1.sources.src-1.hibernate.connection.autocommit = true
a1.sources.src-1.hibernate.dialect = org.hibernate.dialect.MySQL5Dialect
a1.sources.src-1.hibernate.connection.driver_class = com.mysql.jdbc.Driver
a1.sources.src-1.run.query.delay=5000
a1.sources.src-1.status.file.path = /opt/bigdata/flume-1.5.2-bin/status
a1.sources.src-1.status.file.name = src-1.status

#Custom query
a1.sources.src-1.start.from = 0
a1.sources.src-1.incremental.column.name = id
a1.sources.src-1.incremental.value = 0
a1.sources.src-1.custom.query = select * from datadictionary
a1.sources.src-1.batch.size = 1000
a1.sources.src-1.max.rows = 1000
a1.sources.src-1.hibernate.connection.provider_class = org.hibernate.connection.C3P0ConnectionProvider
a1.sources.src-1.hibernate.c3p0.min_size=1
a1.sources.src-1.hibernate.c3p0.max_size=10

################################################################
a1.channels.ch-1.type = memory
a1.channels.ch-1.capacity = 10000
a1.channels.ch-1.transactionCapacity = 10000
a1.channels.ch-1.byteCapacityBufferPercentage = 20
a1.channels.ch-1.byteCapacity = 800000

################################################################
a1.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
a1.sinks.k1.topic = kafkatopic
a1.sinks.k1.brokerList = cdh:19092,cdh02:19092,cdh03:19092
a1.sinks.k1.requiredAcks = 1
a1.sinks.k1.batchSize = 20

a1.sinks.k1.channel = ch-1
a1.sources.src-1.channels=ch-1
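Each row the SQL source hands to the channel becomes one Flume event whose body is the row's column values in CSV form, and batch.size bounds how many rows are taken per cycle. A rough Python sketch of that row-to-event step (the delimiter handling and function names are assumptions for illustration, not the plugin's code):

```python
import csv
import io

def rows_to_events(rows, batch_size=1000):
    """Serialize each row as a CSV line (one event body) and group into batches."""
    events = []
    for row in rows:
        buf = io.StringIO()
        csv.writer(buf).writerow(row)
        events.append(buf.getvalue().strip())
    # Split into batches the way batch.size bounds rows taken per polling cycle.
    return [events[i:i + batch_size] for i in range(0, len(events), batch_size)]

batches = rows_to_events([[1, "col_a"], [2, "col_b"], [3, "col_c"]], batch_size=2)
print(batches)  # [['1,col_a', '2,col_b'], ['3,col_c']]
```

This is why the messages read back from Kafka in step 8 look like comma-separated lines rather than the original row objects.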

5. Copy the MySQL JDBC driver mysql-connector-java-5.1.48.jar into Flume's lib directory

6. Create the Kafka topic

kafka-topics.sh --create --partitions 3 --replication-factor 2 --topic kafkatopic --zookeeper cdh01:2181,cdh02:2181,cdh03:2181

7. Start the Flume agent

./flume-ng agent -n a1 -c conf -f ../conf/mysql_conf.properties -Dflume.root.logger=INFO,console

8. Check the data in the topic

./kafka-console-consumer.sh --bootstrap-server cdh01:29092,cdh02:29092,cdh03:29092 --topic kafkatopic --from-beginning
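The --from-beginning flag makes the console consumer start at offset 0 of each partition instead of only reading messages produced after it connects. A toy in-memory model of that offset choice (purely illustrative; a real consumer talks to the brokers listed in --bootstrap-server):

```python
# Each Kafka partition is an append-only log; an offset is an index into it.
partition_log = ["row-1", "row-2", "row-3"]

def consume(log, from_beginning):
    """Return the messages visible to a newly attached consumer."""
    start = 0 if from_beginning else len(log)  # 'latest' skips existing data
    return log[start:]

print(consume(partition_log, from_beginning=True))   # all three existing messages
print(consume(partition_log, from_beginning=False))  # nothing until new data arrives
```

Without the flag, a consumer started after the Flume agent has already run would show nothing until the next polling cycle delivers fresh rows.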