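Synchronizing Oracle to Kafka with OGG
I. Introduction
Kafka is an efficient message-queue implementation. By subscribing to Kafka topics, downstream systems can obtain data changes from the online Oracle system in real time, for use by business systems.
OGG offers two ways to synchronize the full data set:
① Export with Data Pump at a given SCN and import into the target. This method is used in Oracle-to-Oracle OGG environments.
② Use OGG's own initial-load mechanism to load the full data set into the target. This method works in every environment but is comparatively slow.
II. Setting up the OGG environment
The OGG environment setup is largely unchanged and resembles an Oracle-to-Oracle setup.
1. On the source Oracle database, create the user and tablespace and grant privileges
2. On the source Oracle database, check that archiving and supplemental logging are enabled
3. Install the software on source and target (OGG for Big Data on the target), create the required directories, and enable table-level supplemental logging
4. On both source and target, create the mgr process and add the checkpoint table, using the same manager port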
GGSCI (db1) 1> edit params mgr
port 7809
DYNAMICPORTLIST 7810-8000
AUTOSTART EXTRACT *
AUTORESTART EXTRACT *
PURGEOLDEXTRACTS ./dirdat/*,usecheckpoints, minkeepdays 7
LAGREPORTHOURS 1
LAGINFOMINUTES 30
LAGCRITICALMINUTES 45
dblogin userid ogg, password ogg
add checkpointtable ogg.checkpoint
edit params ./GLOBALS
checkpointtable ogg.checkpoint
GGSCHEMA ogg
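Because the source-side processes later connect to the target manager on this port, a quick reachability check can save debugging time. The following is a minimal sketch, not part of OGG: `manager_port_open` is a hypothetical helper name, and a successful TCP connect only shows that something is listening on the port, not that it is an OGG manager. The ports in DYNAMICPORTLIST would need to be reachable in the same way.

```python
import socket


def manager_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False


# Example call, using the target address and manager port from this article:
# manager_port_open("192.168.1.111", 7809)
```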
III. Full synchronization
1. Create the initial-load process on the source
GGSCI (db1) 2> edit params initkfk1
EXTRACT initkfk1
setenv (NLS_LANG=AMERICAN_AMERICA.ZHS16GBK)
USERID ogg,PASSWORD ogg
rmthost 192.168.1.111, mgrport 7809
RMTFILE ./dirdat/ia,maxfiles 999, megabytes 500
table user_ogg.*
--add the process
GGSCI (db1) 3>add extract initkfk1,sourceistable
2. Configure Kafka on the target
--add the full-load replicat process
GGSCI (kafka) 2> ADD replicat initkfk1,specialrun
GGSCI (kafka) 3> edit params initkfk1
SPECIALRUN
end runtime
setenv(NLS_LANG="AMERICAN_AMERICA.AL32UTF8")
targetdb libfile libggjava.so set property=./dirprm/kafka.props
SOURCEDEFS ./dirdef/define_kfk1.txt
REPLACEBADCHAR SKIP
SOURCECHARSET OVERRIDE ISO-8859-1
EXTFILE ./dirdat/ia
reportcount every 1 minutes, rate
grouptransops 10000
map user.table, target user.table;
--edit the kafka props file
vi ./dirprm/kafka.props
gg.handlerlist=kafkahandler
gg.handler.kafkahandler.type=kafka
gg.handler.kafkahandler.format.includePrimaryKeys=true
gg.handler.kafkahandler.KafkaProducerConfigFile=custom_kafka_producer.properties
# Parameter for older versions:
gg.handler.kafkahandler.topicName=test_ogg
# Parameter for newer versions:
#gg.handler.kafkahandler.topicMappingTemplate=test_ogg
gg.handler.kafkahandler.format=json
gg.handler.kafkahandler.mode=op
# Classpath: dirprm, the Kafka installation libraries, and the OGG installation
gg.classpath=dirprm/:/kafka/libs/*:/ogg/:/ogg/lib/*
Copy ./dirprm/kafka.props to the /ogg/AdapterExamples/big-data/kafka directory.
--edit the producer properties file
vi ./dirprm/custom_kafka_producer.properties
# Kafka broker address
bootstrap.servers=192.168.1.111:9092
acks=-1
compression.type=gzip
reconnect.backoff.ms=1000
value.serializer=org.apache.kafka.common.serialization.ByteArraySerializer
key.serializer=org.apache.kafka.common.serialization.ByteArraySerializer
# A large batch size lets records accumulate before they are sent
batch.size=102400
# A long linger time delays delivery to Kafka
linger.ms=10000
Copy ./dirprm/custom_kafka_producer.properties to /ogg/AdapterExamples/big-data/kafka.
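Java properties files have no inline-comment syntax, so any note written after a value on the same line becomes part of that value. The sketch below is one way to catch this before starting the replicat. It assumes plain `key=value` lines without line continuations, and the function names are hypothetical.

```python
from typing import Dict, List


def parse_properties(text: str) -> Dict[str, str]:
    """Parse simple key=value lines; lines starting with # or ! are comments.

    Simplification: surrounding whitespace is stripped from values here,
    whereas java.util.Properties keeps trailing whitespace.
    """
    props: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def suspicious_values(props: Dict[str, str]) -> List[str]:
    """Return keys whose value looks like it carries a trailing note."""
    return sorted(k for k, v in props.items() if " --" in v or " #" in v)


# Example: check a file before starting the replicat.
# with open("./dirprm/kafka.props") as fh:
#     print(suspicious_values(parse_properties(fh.read())))
```

An empty list means no value contains a space followed by `--` or `#`. It does not prove that the settings themselves are correct.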
3. Start the full synchronization
--start the process (the ia trail file then appears on the target)
GGSCI (db1) 4>start initkfk1
4. Verify the fully synchronized data
--check the synchronized data
cd /kafka
bin/kafka-console-consumer.sh --bootstrap-server 192.168.1.111:9092 --topic test_ogg --from-beginning
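Reading raw console output is tedious for large tables, so a per-table count can help compare against the source row counts. The sketch below assumes each consumed line is one JSON record and that the record carries `table` and `op_type` fields, as the OGG JSON formatter is expected to emit. Treat those field names and the `summarize` helper as assumptions to verify against your own output.

```python
import json
from collections import Counter
from typing import Iterable


def summarize(lines: Iterable[str]) -> Counter:
    """Count records per (table, op_type); unparseable lines are counted separately."""
    counts: Counter = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            msg = json.loads(line)
        except json.JSONDecodeError:
            msg = None
        if not isinstance(msg, dict):
            counts[("<unparsed>", "?")] += 1
            continue
        counts[(msg.get("table", "<unknown>"), msg.get("op_type", "?"))] += 1
    return counts


# Example: pipe the console consumer output into a script that calls
# summarize(sys.stdin) and prints the resulting counts.
```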
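IV. Incremental synchronization
1. Create the incremental extract process. It only starts working after the full-load process has finished; while the full load is running, its status is STARTING.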
GGSCI (db1) 5> view params E_KFK1
extract E_KFK1
setenv (NLS_LANG=AMERICAN_AMERICA.ZHS16GBK)
SETENV (ORACLE_HOME = /oracle/home)
userid ogg, password ogg
exttrail ./dirdat/ka
DYNAMICRESOLUTION
REPORTCOUNT EVERY 1 MINUTES, RATE
DISCARDFILE ./dirrpt/E_KFK1.rpt,APPEND,MEGABYTES 1024
WARNLONGTRANS 5h,CHECKINTERVAL 30m
FETCHOPTIONS NOUSESNAPSHOT
tranlogoptions dblogreader
table user_ogg.*
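--add the extract process
GGSCI (db1) 6> add extract E_KFK1, tranlog, threads 2,begin now
GGSCI (db1) 7> add EXTTRAIL ./dirdat/ka, extract E_KFK1,MEGABYTES 100
--start the extract process
GGSCI (db1) 8> start E_KFK1
2. Create the delivery (data pump) process. Once started, it stays in RUNNING status as long as the network is healthy.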
GGSCI (db1) 9> view params P_KFK1
extract P_KFK1
userid ogg, password ogg
rmthost 192.168.1.111, mgrport 7809
rmttrail ./dirdat/ka
passthru
dynamicresolution
table user_ogg.*
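--add the delivery process
GGSCI (db1) 10> add extract P_KFK1,exttrailsource ./dirdat/ka ,begin now
GGSCI (db1) 11> add rmttrail ./dirdat/ka,EXTRACT P_KFK1,MEGABYTES 100
--start the delivery process
GGSCI (db1) 12> start P_KFK1
3. Generate the table definitions file and transfer it to the Kafka target.
--generate the definitions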
GGSCI (db1) 13> edit param define_kfk
defsfile /backup/define/define_kfk1.txt
userid ogg,password ogg
table user_ogg.*
$cd /ogg
$./defgen paramfile dirprm/define_kfk.prm
4. Operations on the Kafka target
GGSCI (kafka) 4>edit param repkfk1
REPLICAT repkfk1
SOURCEDEFS ./dirdef/define_kfk1.txt
targetdb libfile libggjava.so set property=./dirprm/kafka.props
REPORTCOUNT EVERY 1 MINUTES, RATE
GROUPTRANSOPS 10000
map user.table, target user.table;
--add the replicat process
GGSCI (kafka) 5> add replicat repkfk1, exttrail ./dirdat/ka, checkpointtable ogg.checkpoint
--start the replicat process
GGSCI (kafka) 6> start replicat repkfk1
5. Verify the incremental data on the Kafka target
cd /kafka
bin/kafka-console-consumer.sh --bootstrap-server 192.168.1.111:9092 --topic test_ogg
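To illustrate how a downstream system might consume these change records, the sketch below replays them into an in-memory copy of the tables. It assumes op mode with JSON records that carry `table`, `op_type` (I, U, D), `primary_keys`, `before`, and `after`. Those field names, the `apply_change` helper, and the sample table name in the comment are assumptions to check against real output, and updates that change a primary key are not handled.

```python
from typing import Any, Dict, Tuple

State = Dict[str, Dict[Tuple[Any, ...], Dict[str, Any]]]


def apply_change(state: State, msg: Dict[str, Any]) -> None:
    """Apply one change record to state, keyed by table and primary-key tuple."""
    op = msg.get("op_type")
    if op not in ("I", "U", "D"):
        raise ValueError(f"unsupported op_type: {op!r}")
    keys = msg.get("primary_keys") or []
    # Deletes are identified by the before image; inserts and updates by the after image.
    image = msg.get("before") if op == "D" else msg.get("after")
    if not keys or image is None:
        raise ValueError("record lacks primary_keys or a row image")
    pk = tuple(image[k] for k in keys)
    rows = state.setdefault(msg["table"], {})
    if op == "D":
        rows.pop(pk, None)
    else:
        # Merge so an update carrying only some columns keeps the others.
        merged = dict(rows.get(pk, {}))
        merged.update(image)
        rows[pk] = merged


# Example:
# state = {}
# apply_change(state, {"table": "USER_OGG.T1", "op_type": "I",
#                      "primary_keys": ["ID"], "after": {"ID": 1, "NAME": "a"}})
```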
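V. Summary
Synchronizing from OGG to Kafka is relatively simple, but a few points need attention:
① DDL is currently not supported for the Kafka target; after any change to a table's structure, the definitions file must be transferred to the Kafka target again.
② During synchronization OGG has to convert the table data into binary form, and some tables may contain data that fails this conversion. The problem encountered this time is described below. Changing data in production is not recommended; understand the business data thoroughly before making any change.
SYMPTOMS:OGG-00735 Error converting Oracle numeric value to ASCII for column TCFD.
CAUSE: An error occurred while converting the data of one column. MOS offered no applicable fix, and inspecting the data showed that the data itself was bad.
SOLUTION:
1. Create a new extract on a test table, then insert the data into that table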
GGSCI (db1) 1> view params test
extract test
setenv (NLS_LANG=AMERICAN_AMERICA.ZHS16GBK)
SETENV (ORACLE_HOME = /oracle/home)
userid ogg, password ogg
exttrail ./dirdat/cs
DYNAMICRESOLUTION
REPORTCOUNT EVERY 1 MINUTES, RATE
DISCARDFILE ./dirrpt/test.rpt,APPEND,MEGABYTES 1024
WARNLONGTRANS 5h,CHECKINTERVAL 30m
FETCHOPTIONS NOUSESNAPSHOT
tranlogoptions dblogreader
table user_ogg.TEXT_TABLE
insert into text_table select * from table;
2. When the process abends, check the error
2020-05-13 13:40:09 ERROR OGG-01028 Formatting error on: table name BJSX_OGG.TEXT_TABLE, rowid AAAiApABUAAAjlBAAJ, XID 10.5.162284,
position (Seqno 9996, RBA 138674432). Invalid numeric data detected. Error converting numeric from Oracle to ASCII on column TCFD,
raw length 4, raw data: D7C027AF: c2 07 e0 29 |...)|.
3. Locate the row by its rowid
SQL> select dbms_rowid.rowid_object('AAAiApABUAAAjlBAAJ') object_id#, dbms_rowid.rowid_relative_fno('AAAiApABUAAAjlBAAJ') file#,
dbms_rowid.rowid_block_number('AAAiApABUAAAjlBAAJ') block#, dbms_rowid.rowid_row_number('AAAiApABUAAAjlBAAJ') row# from dual;
OBJECT_ID# FILE# BLOCK# ROW#
---------- ---------- ---------- ----------
138831 84 3819582 21
SQL> select object_id,object_name ,object_type from user_objects where object_id=138831;
OBJECT_ID OBJECT_NAME OBJECT_TYPE
---------- ------------------------------ -------------------
138831 TEXT_TABLE TABLE
SQL> select dwjf, dbms_rowid.rowid_object(rowid) object_id#,dbms_rowid.rowid_relative_fno(rowid) file#,
dbms_rowid.rowid_block_number(rowid) block#, dbms_rowid.rowid_row_number(rowid) row# from bjsx_ogg.text_table
where dbms_rowid.rowid_relative_fno(rowid)=84 and dbms_rowid.rowid_block_number(rowid)=3819582
and dbms_rowid.rowid_row_number(rowid)=21;
4. Confirm that the column in this row is not NULL but holds no valid numeric value. After setting it to 0, restart the initialization process
SQL> update user_ogg.TEXT_TABLE set TCFD=0 where YWXH='61396130000051146' and JFNIAN='1997';
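The raw bytes in the OGG-01028 message (c2 07 e0 29) can be checked by hand against Oracle's internal NUMBER layout. The sketch below assumes the layout commonly seen in DUMP() output for positive values: the first byte is a base-100 exponent plus 193, and each following byte is a base-100 digit plus 1, so valid mantissa bytes lie in 1..100. The function name is hypothetical, and negative values are deliberately not handled.

```python
from decimal import Decimal


def decode_positive_number(raw: bytes) -> Decimal:
    """Decode a positive Oracle NUMBER from its raw bytes, or raise ValueError."""
    if not raw:
        raise ValueError("empty value")
    if raw == b"\x80":
        return Decimal(0)  # zero is stored as the single byte 0x80
    if raw[0] <= 0x80:
        raise ValueError("negative values are not handled in this sketch")
    exponent = raw[0] - 193  # power of 100 for the first mantissa digit
    value = Decimal(0)
    for i, b in enumerate(raw[1:]):
        if not 1 <= b <= 100:
            raise ValueError(f"invalid mantissa byte 0x{b:02x} at offset {i + 1}")
        value += Decimal(b - 1) * Decimal(100) ** (exponent - i)
    return value


# Example: decode_positive_number(bytes.fromhex("c20218")) gives Decimal 123,
# while bytes.fromhex("c207e029") raises ValueError because 0xe0 is out of range.
```

Under these assumptions, the third byte 0xe0 in the reported value is outside the valid range, which is consistent with the "Invalid numeric data detected" message.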