Collecting MySQL data into Kafka in real time with Flume

1. Install Flume (omitted)

2. Install Kafka (omitted)

3. Install the flume-ng-sql-source plugin

(1) Download flume-ng-sql-source
(2) Copy the downloaded jar into Flume's lib directory

Note: plugin versions must match the Flume version (tested: flume-ng-sql-source-1.4.1.jar works with Flume 1.6.0).

4. Create mysql_conf.properties under ${FLUME_HOME}/conf

Example configuration (1):

a1.sources = sql-source
a1.channels = ch1
a1.sinks = kafka

a1.sources.sql-source.channels = ch1
a1.sources.sql-source.connection.url = jdbc:mysql://192.168.1.185:3306/dataassets
a1.sources.sql-source.type = org.keedio.flume.source.SQLSource
a1.sources.sql-source.user = root
a1.sources.sql-source.password = xxx@2020
a1.sources.sql-source.table = datadictionary
a1.sources.sql-source.columns.to.select = *
a1.sources.sql-source.incremental.column.name = id
a1.sources.sql-source.incremental.value = 0
a1.sources.sql-source.run.query.delay=5000
a1.sources.sql-source.status.file.path = /var/lib/flume
a1.sources.sql-source.status.file.name = sql-source.status

a1.channels.ch1.type = memory
a1.channels.ch1.capacity = 10000
a1.channels.ch1.transactionCapacity = 10000
a1.channels.ch1.byteCapacityBufferPercentage = 20
a1.channels.ch1.byteCapacity = 800000

a1.sinks.kafka.channel = ch1
a1.sinks.kafka.type = org.apache.flume.sink.kafka.KafkaSink
a1.sinks.kafka.topic = kafkatopic
a1.sinks.kafka.brokerList = 192.168.1.183:19092,192.168.1.184:19092,192.168.1.185:19092
a1.sinks.kafka.requiredAcks = 1
a1.sinks.kafka.batchSize = 20
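The incremental.column.name / incremental.value / status.file.* settings above drive a simple polling loop: every run.query.delay milliseconds the source reads only rows whose incremental column exceeds the last recorded value, then persists that value to the status file so a restart resumes where it left off. A minimal Python sketch of that mechanism (illustrative only: sqlite3 stands in for MySQL, and the table and column names are borrowed from the example config):

```python
# Sketch of the incremental-polling mechanism used by flume-ng-sql-source.
# The real plugin runs a JDBC query against MySQL; sqlite3 is a stand-in.
import json
import os
import sqlite3
import tempfile

STATUS_FILE = os.path.join(tempfile.gettempdir(), "sql-source.status")

def load_last_id():
    # Corresponds to status.file.path/status.file.name: the last processed
    # value of incremental.column.name is persisted here between cycles.
    try:
        with open(STATUS_FILE) as f:
            return json.load(f)["last_id"]
    except FileNotFoundError:
        return 0  # incremental.value = 0 is the starting point

def save_last_id(last_id):
    with open(STATUS_FILE, "w") as f:
        json.dump({"last_id": last_id}, f)

def poll(conn):
    # One polling cycle: select only rows newer than the recorded value.
    last_id = load_last_id()
    rows = conn.execute(
        "SELECT id, name FROM datadictionary WHERE id > ? ORDER BY id",
        (last_id,),
    ).fetchall()
    if rows:
        save_last_id(rows[-1][0])
    return rows

if os.path.exists(STATUS_FILE):
    os.remove(STATUS_FILE)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE datadictionary (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO datadictionary VALUES (?, ?)",
                 [(1, "a"), (2, "b")])

first = poll(conn)   # picks up both existing rows
second = poll(conn)  # nothing new yet
conn.execute("INSERT INTO datadictionary VALUES (3, 'c')")
third = poll(conn)   # only the newly inserted row
print(first, second, third)
```

Because the state lives in the status file, deleting it (or resetting incremental.value) makes the source re-read the table from the beginning.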

Example configuration (2):

a1.channels = ch-1
a1.sources = src-1
a1.sinks = k1

#sql source
#For each one of the sources, the type is defined

a1.sources.src-1.type = org.keedio.flume.source.SQLSource
a1.sources.src-1.hibernate.connection.url = jdbc:mysql://192.168.1.185:3306/dataassets

#Hibernate Database connection properties
a1.sources.src-1.hibernate.connection.user = root
a1.sources.src-1.hibernate.connection.password = xxx@2020
a1.sources.src-1.hibernate.connection.autocommit = true
a1.sources.src-1.hibernate.dialect = org.hibernate.dialect.MySQL5Dialect
a1.sources.src-1.hibernate.connection.driver_class = com.mysql.jdbc.Driver
a1.sources.src-1.run.query.delay=5000
a1.sources.src-1.status.file.path = /opt/bigdata/flume-1.5.2-bin/status
a1.sources.src-1.status.file.name = src-1.status

#Custom query (note: when custom.query is set, incremental reads require the
#$@$ placeholder, e.g. select * from datadictionary where id > $@$; as
#written below, the full table is re-read on every polling cycle)
a1.sources.src-1.start.from = 0
a1.sources.src-1.incremental.column.name = id
a1.sources.src-1.incremental.value = 0
a1.sources.src-1.custom.query = select * from datadictionary
a1.sources.src-1.batch.size = 1000
a1.sources.src-1.max.rows = 1000
a1.sources.src-1.hibernate.connection.provider_class = org.hibernate.connection.C3P0ConnectionProvider
a1.sources.src-1.hibernate.c3p0.min_size=1
a1.sources.src-1.hibernate.c3p0.max_size=10

################################################################
a1.channels.ch-1.type = memory
a1.channels.ch-1.capacity = 10000
a1.channels.ch-1.transactionCapacity = 10000
a1.channels.ch-1.byteCapacityBufferPercentage = 20
a1.channels.ch-1.byteCapacity = 800000

################################################################
a1.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
a1.sinks.k1.topic = kafkatopic
a1.sinks.k1.brokerList = cdh:19092,cdh02:19092,cdh03:19092
a1.sinks.k1.requiredAcks = 1
a1.sinks.k1.batchSize = 20

a1.sinks.k1.channel = ch-1
a1.sources.src-1.channels=ch-1
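By default, flume-ng-sql-source serializes each polled row as one CSV line, and the KafkaSink forwards that line as the Kafka message body, which is what the console consumer in step 8 prints. A rough Python sketch of that serialization (the sample rows and values are made up for illustration):

```python
# Sketch: each database row becomes one CSV-formatted event body.
import csv
import io

def rows_to_events(rows):
    events = []
    for row in rows:
        buf = io.StringIO()
        csv.writer(buf).writerow(row)
        events.append(buf.getvalue().rstrip("\r\n"))
    return events

events = rows_to_events([(1, "user_table", "2020-01-01"),
                         (2, "order,table", "2020-01-02")])
print(events)
```

Note that fields containing commas are quoted, so a downstream consumer should parse the message bodies with a real CSV parser rather than a plain split on ",".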

5. Copy the MySQL JDBC driver mysql-connector-java-5.1.48.jar into Flume's lib directory

6. Create the Kafka topic

kafka-topics.sh --create --partitions 3 --replication-factor 2 --topic kafkatopic --zookeeper cdh01:2181,cdh02:2181,cdh03:2181

7. Start the Flume agent

./flume-ng agent -n a1 -c conf -f ../conf/mysql_conf.properties -Dflume.root.logger=INFO,console

8. Inspect the topic data

./kafka-console-consumer.sh --bootstrap-server cdh01:29092,cdh02:29092,cdh03:29092 --topic kafkatopic --from-beginning