Our business system stores its data in MySQL, and big-data analysis on that business data requires collecting the MySQL data in near real time. Configuring Flume to collect MySQL data is fairly simple; the setup process is described below.
Plugin download
Required plugins:
- mysql-connector-java-5.1.46-bin.jar
- flume-ng-sql-source-1.4.1.jar
- Copy both jars into /usr/local/flume/lib
Flume configuration
agent.sources.s1.type=org.keedio.flume.source.SQLSource
agent.sources.s1.hibernate.connection.url=jdbc:mysql://localhost:3306/tickapi?useOldAliasMetadataBehavior=true
agent.sources.s1.hibernate.connection.user=root
agent.sources.s1.hibernate.connection.password=123456
agent.sources.s1.hibernate.connection.autocommit=true
agent.sources.s1.hibernate.connection.driver_class=com.mysql.jdbc.Driver
agent.sources.s1.hibernate.dialect=org.hibernate.dialect.MySQL5Dialect
agent.sources.s1.hibernate.provider_class=org.hibernate.connection.C3P0ConnectionProvider
agent.sources.s1.run.query.delay=5000
# Incremental-collection settings (commented out; a custom query is used instead)
# agent.sources.s1.table=lt_api_getallstops
# agent.sources.s1.columns.to.select=*
# agent.sources.s1.incremental.column.name=id
# agent.sources.s1.incremental.value=0
agent.sources.s1.custom.query=select * from lt_api_getallstops order by id desc
# Note: appending "where id > $@$ order by id" to the query raised an error: SQL statement exception.
# Initial value of the incremental column (a trailing comment on the same line
# would become part of the property value, so it is kept on its own line)
agent.sources.s1.start.from=0
agent.sources.s1.status.file.path=/home/flume
agent.sources.s1.status.file.name=sql-source.status
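The properties above define only the source. For rows to reach Kafka, as shown in the output below, the agent also needs a channel and a sink. A minimal sketch follows; the component names c1 and k1, the broker address, the topic name, and the channel sizing are assumptions, not taken from the original setup, and the Kafka sink property names follow Flume 1.7+:

```properties
# Declare the agent's components (only s1 appears in the original config)
agent.sources=s1
agent.channels=c1
agent.sinks=k1

# In-memory channel; capacity values are illustrative
agent.channels.c1.type=memory
agent.channels.c1.capacity=10000
agent.channels.c1.transactionCapacity=1000

# Kafka sink; broker list and topic are assumptions
agent.sinks.k1.type=org.apache.flume.sink.kafka.KafkaSink
agent.sinks.k1.kafka.bootstrap.servers=localhost:9092
agent.sinks.k1.kafka.topic=mysql-rows

# Wire the source and sink to the channel
agent.sources.s1.channels=c1
agent.sinks.k1.channel=c1
```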
Collected results
# Data as sent to Kafka
"1121","","99999999","CES","測試","","7214","盛汽車客運站","826f023cfa2a50feb92a958bcb16cdf0"
"1120","","75010101","BS","山","","7214","盛汽車客運站","826f023cfa2a50feb92a958bcb16cdf0"
"1119","","71010101","KM","明","","7213","汽車客運北站","826f023cfa2a50feb92a958bcb16cdf0"
"1118","西站","4659","","山","","71010101","西部客運站","826f023cfa2a50feb92a958bcb16cdf0"
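Each Kafka message above is one table row serialised as a fully quoted CSV line, so a downstream consumer can split it back into columns with an ordinary CSV parser. A minimal Python sketch (the Kafka consumer itself is omitted; only the field parsing is shown, using the first sample record):

```python
import csv
import io

# One record as emitted to Kafka (copied from the sample output above)
record = '"1121","","99999999","CES","測試","","7214","盛汽車客運站","826f023cfa2a50feb92a958bcb16cdf0"'

# The row arrives as a quoted CSV line, so the standard csv module
# can split it into its nine columns, preserving empty fields.
fields = next(csv.reader(io.StringIO(record)))
print(fields[0], fields[3])  # → 1121 CES
```

Column order matches the `select *` in the custom query, so any schema change to lt_api_getallstops shifts the field positions consumers see.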