ORACLE OGG同步到KAFKA

ORACLE OGG同步到KAFKA

一、介绍


Kafka是一种高效的消息队列实现,通过订阅kafka的消息队列,下游系统可以实时获取在线Oracle系统的数据变更情况,实现业务系统

ogg同步全量数据方式:
①通过数据泵方式基于SCN导出并导入到目标端,此方式用于Oracle 到Oracle的ogg同步环境中。
②通过ogg本身的初始化方式,初始化全量数据到目标端,此方式通用于所有环境,但是速度相对较慢。

二、ogg环境搭建


ogg环境搭建并没太大变化,同oracle中相似


1.源端Oracle数据库中创建用户 表空间 赋权

2.源端Oracle数据库检查归档,附加日志开启

3.源端目标端安装软件(其中目标端安装ogg for bigdata软件),创建相关目录,开启表级附加日志

4.源端目标端创建mgr进程,添加检查点,管理端口相同

GGSCI (db1) 1>  edit params mgr

port 7809
DYNAMICPORTLIST 7810-8000
AUTOSTART EXTRACT * 
AUTORESTART EXTRACT *
PURGEOLDEXTRACTS ./dirdat/*,usecheckpoints, minkeepdays 7
LAGREPORTHOURS 1
LAGINFOMINUTES 30
LAGCRITICALMINUTES 45
dblogin userid ogg, password ogg
add checkpointtable ogg.checkpoint

edit params ./GLOBALS
checkpointtable ogg.checkpoint
GGSCHEMA ogg

三、全量同步

1.源端创建初始化进程

GGSCI (db1) 2> edit params initkfk1

EXTRACT initkfk1
setenv (NLS_LANG=AMERICAN_AMERICA.ZHS16GBK)
USERID ogg,PASSWORD ogg
rmthost 192.168.1.111, mgrport 7809
RMTFILE ./dirdat/ia,maxfiles 999, megabytes 500
table user_ogg.*

--添加进程

GGSCI (db1) 3>add extract initkfk1,sourceistable

2.目标端kafka配置

--添加全量应用进程

GGSCI (kafka) 2> ADD replicat initkfk1,specialrun   

GGSCI (kafka) 3> edit params initkfk1
SPECIALRUN
end runtime
setenv(NLS_LANG="AMERICAN_AMERICA.AL32UTF8")
targetdb libfile libggjava.so set property=./dirprm/kafka.props
SOURCEDEFS ./dirdef/define_kfk1.txt
REPLACEBADCHAR SKIP
SOURCECHARSET OVERRIDE ISO-8859-1
EXTFILE ./dirdat/ia
reportcount every 1 minutes, rate
grouptransops 10000
map user.table, target user.table;

--修改kafka props文件

vi ./dirprm/kafka.props

gg.handlerlist=kafkahandler
gg.handler.kafkahandler.type=kafka
gg.handler.kafkahandler.format.includePrimaryKeys=true
gg.handler.kafkahandler.KafkaProducerConfigFile=custom_kafka_producer.properties
gg.handler.kafkahandler.topicName=test_ogg  --旧版参数
#gg.handler.kafkahandler.topicMappingTemplate=test_ogg –-新版本参数
gg.handler.kafkahandler.format=json
gg.handler.kafkahandler.mode=op
gg.classpath=dirprm/:/kafka/libs/*:/ogg/:/ogg/lib/* --kafka 安装的位置ogg安装的位置


将./dirprm/kafka.props 文件复制到/ogg/AdapterExamples/big-data/kafka 目录下--修改properties文件

vi ./dirprm/custom_kafka_producer.properties

bootstrap.servers=192.168.1.111:9092  ---kafka地址
acks=-1
compression.type=gzip
reconnect.backoff.ms=1000
value.serializer=org.apache.kafka.common.serialization.ByteArraySerializer
key.serializer=org.apache.kafka.common.serialization.ByteArraySerializer
batch.size=102400  --数据有堆积
linger.ms=10000    --数据传输kafka有延时

将./dirprm/custom_kafka_producer.properties 文件复制到/ogg/AdapterExamples/big-data/kafka

3.开启全量同步

--启动进程(在目标端中可看到ia  trail文件)

GGSCI (db1) 4>start initkfk1

4.验证全量同步数据

--验证同步数据情况

cd /kafka
bin/kafka-console-consumer.sh --bootstrap-server 192.168.1.111:9092 --topic test_ogg --from-beginning

四、增量同步

1.创建增量抽取进程,此进程是在全量进程运行完毕之后才会开始工作的,在全量进程运行过程中它的状态为STARTING,

GGSCI (db1) 5> view params E_KFK1
extract E_KFK1
setenv (NLS_LANG=AMERICAN_AMERICA.ZHS16GBK)
SETENV (ORACLE_HOME = /oracle/home)
userid ogg, password ogg
exttrail ./dirdat/ka
DYNAMICRESOLUTION
REPORTCOUNT EVERY 1 MINUTES, RATE
DISCARDFILE ./dirrpt/E_KFK1.rpt,APPEND,MEGABYTES 1024
WARNLONGTRANS 5h,CHECKINTERVAL 30m
FETCHOPTIONS NOUSESNAPSHOT
tranlogoptions dblogreader
table user_ogg.*

--添加抽取进程

GGSCI (db1) 6> add extract E_KFK1, tranlog, threads  2,begin now 
GGSCI (db1) 7> add EXTTRAIL ./dirdat/ka, extract E_KFK1,MEGABYTES 100   

--启动抽取进程

GGSCI (db1) 8> start E_KFK1

2.创建投递进程,此进程启动之后只要网络正常就是running状态

GGSCI (db1) 9> view params P_KFK1

extract P_KFK1
userid ogg, password ogg
rmthost 192.168.1.111, mgrport 7809
rmttrail ./dirdat/ka
passthru
dynamicresolution
table user_ogg.*

--添加投递进程

GGSCI (db1) 10> add extract P_KFK1,exttrailsource  ./dirdat/ka ,begin now 
GGSCI (db1) 11> add exttrail ./dirdat/ka,EXTRACT P_KFK1,MEGABYTES 100 

--启动投递进程

GGSCI (db1) 12> start P_KFK1

3.生成表定义文件,传输到kafka目标端。

--抽取数据

GGSCI (db1) 13> edit param define_kfk

defsfile  /backup/define/define_kfk1.txt
userid ogg,password ogg
table user_ogg.*

$cd /ogg    
$./defgen paramfile dirprm/define_kfk.prm

4.kafka目标端操作

GGSCI (kafka) 4>edit param repkfk1
REPLICAT repkfk1
SOURCEDEFS ./dirdef/define_kfk1.txt
targetdb libfile libggjava.so set property=./dirprm/kafka.props
REPORTCOUNT EVERY 1 MINUTES, RATE
GROUPTRANSOPS 10000
map user.table, target user.table;

--添加repilicat进程

GGSCI (kafka) 5> add replicat repkfk1 exttrail ./dirdat/ka,checkpointtable ogg.checkpoint

--启动replicate进程

GGSCI (kafka) 6> start replicat repkfk

5.kafka目标端校验增量数据

cd /kafka
bin/kafka-console-consumer.sh --bootstrap-server 192.168.1.111:9092 --topic test_ogg

五、总结

ogg同步到kafka方式相对简单,但是也有需要注意的地方:
①kafka目前不支持ddl操作,所有关于表结构的变更都需要重新传输define文件到kafka目标端
②在ogg同步的时候需要将表中的数据转换成二进制,在一些表中可能存在一些问题,本次遇到的问题,生产中不建议更改数据,需了解清楚业务数据在做更改


SYMPTOMS:OGG-00735  Error converting Oracle numeric value to ASCII for column TCFD. 
CAUSE:在将一列数据转换成二进制过程中出现错误,MOS中并没有可解决的方式,检查数据发现数据问题
SOLUTION:

1.create new extract and insert new table

GGSCI (db1) 1> view params test

extract test
setenv (NLS_LANG=AMERICAN_AMERICA.ZHS16GBK)
SETENV (ORACLE_HOME = /oracle/home)
userid ogg, password ogg
exttrail ./dirdat/cs
DYNAMICRESOLUTION
REPORTCOUNT EVERY 1 MINUTES, RATE
DISCARDFILE ./dirrpt/test.rpt,APPEND,MEGABYTES 1024
WARNLONGTRANS 5h,CHECKINTERVAL 30m
FETCHOPTIONS NOUSESNAPSHOT
tranlogoptions dblogreader
table user_ogg.TEXT_TABLE

insert into text_table select * from table;

2.When the process abend check error

2020-05-13 13:40:09  ERROR   OGG-01028  Formatting error on: table name BJSX_OGG.TEXT_TABLE, rowid AAAiApABUAAAjlBAAJ, XID 10.5.162284, 
position (Seqno 9996, RBA 138674432). Invalid numeric data detected. Error converting numeric from Oracle to ASCII on column TCFD, 
raw length 4, raw data: D7C027AF: c2 07 e0 29 |...)|.

3.Position data by rowid

SQL> select dbms_rowid.rowid_object('AAAiApABUAAAjlBAAJ') object_id#, dbms_rowid.rowid_relative_fno('AAAiApABUAAAjlBAAJ') file#,
     dbms_rowid.rowid_block_number('AAAiApABUAAAjlBAAJ') block#, dbms_rowid.rowid_row_number('AAAiApABUAAAjlBAAJ') row# from dual;

OBJECT_ID#      FILE#     BLOCK#       ROW#
---------- ---------- ---------- ----------
    138831         84    3819582         21


SQL>  select object_id,object_name ,object_type from user_objects where object_id=138831;

 OBJECT_ID OBJECT_NAME                    OBJECT_TYPE
---------- ------------------------------ -------------------
    138831 TEXT_TABLE                     TABLE


SQL>  select   dwjf, dbms_rowid.rowid_object(rowid) object_id#,dbms_rowid.rowid_relative_fno(rowid) file#,
 dbms_rowid.rowid_block_number(rowid) block#, dbms_rowid.rowid_row_number(rowid) row# from bjsx_ogg.text_table 
 where dbms_rowid.rowid_relative_fno(rowid)=84 and  dbms_rowid.rowid_block_number(rowid)=3819582
 and dbms_rowid.rowid_row_number(rowid)=21;

4.Check that the data in this column is not empty and has no value,After changing to 0, restart the initialization process

SQL> update user_ogg.TEXT_TABLE  set TCFD=0 where YWXH='61396130000051146' and JFNIAN='1997';

 

發表評論
所有評論
還沒有人評論,想成為第一個評論的人麼? 請在上方評論欄輸入並且點擊發布.
相關文章