goldengate extract ERROR OGG-01973 The redo record indicates data loss on object

oracle goldengate extract抽取進程異常終止,報錯OGG-01973 The redo record indicates data loss on object

問題描述:

    最近一段時間發現goldengate經常出現異常,extract進程終止,無法啓動,檢查日誌報錯如下:
ERROR   OGG-01973  The redo record indicates data loss on object 94,785.
可以看到錯誤ERROR OGG-01973 The redo record indicates data loss on object 94,785.

問題處理

    通過查找oracle mos文檔,找到一遍關於這個錯誤的介紹
OGG v11.2 Extract abends with "OGG-01965 The redo record indicates data loss on object " or "OGG-01973 The redo record indicates data loss on object " (Doc ID 1483431.1)

APPLIES TO:
Oracle GoldenGate - Version 11.2.1.0.1 and later
Information in this document applies to any platform.
SYMPTOMS
2012-08-02 09:10:09  ERROR   OGG-01965  The redo record indicates data loss on object xxx,xxx.

or

2012-18-02 05:14:03  ERROR   OGG-01973  The redo record indicates data loss on object xxx,xxx

CHANGES
 

CAUSE
Redo log records indicate data missing for the object in question and thus the error

SOLUTION
Prior to the code change in bug 14158453 Extract would abend with errors similar to following as described in KM note 1465720.1 for
tables/partitions with NOLOGGING regardless of whether the tablespace/ database is in logging/force logging mode.

ERROR OGG-01960 Failed to validate table <OWNER>.<TABLENAME>. The table is created with the NOLOGGING option, which is not supported. 
Extract may not be able to capture data from it.

And DBOPTIONS ALLOWNOLOGGING can be added in the extract to work-around the error with known limitations as described in 1465720.1

However post the code change in bug 14158453 the parameter DBOPTIONS ALLOWNOLOGGING is no longer valid
Extract no longer relies on the table metadata (logging/ nologging) and instead uses special records in the redo log to detect
missing data and when detected it would error as below indicating the object id of the table in question (same type of error is signaled when table data has been loaded via SQL*Loader with UNRECOVERABLE parameter):

2012-08-02 09:10:09  ERROR   OGG-01965  The redo record indicates data loss on object xxx,xxx.

The table can be excluded from the extract, enable logging at the table/ partition/ tablespace level (or if table data is loaded via SQL*Loader do not use UNRECOVERABLE) and include it back in the extract
if it needs to be replicated across

If using DDL extraction following can be run to see the offending object name

SELECT metadata_text, fragmentNo, optime FROM <owner>.GGS_DDL_HIST WHERE seqno = (SELECT MIN(seqno) FROM <owner>.GGS_DDL_HIST 
WHERE objectid = (SELECT objectId FROM <owner>.GGS_DDL_HIST_ALT  WHERE altObjectId = <object id reported> AND rownum=1));

 eg.,

SELECT metadata_text, fragmentNo, optime FROM ggs_owner.GGS_DDL_HIST WHERE seqno = (SELECT MIN(seqno) FROM ggs_owner.GGS_DDL_HIST 
WHERE objectid = (SELECT objectId FROM ggs_owner.GGS_DDL_HIST_ALT  WHERE altObjectId = xxxxxx  AND rownum=1));

Also following note can  be referred to prevent these nologging operations

The Gains and Pains of Nologging Operations (Doc ID 290161.1) 

The error can also be seen for Compression Advisor related tables (DBMS_TABCOMP* in 11.2.0.3 & CMP<x>$<object_id> in 11.2.0.4) and can be safely excluded from replication


REFERENCES
BUG:14405239 - ALLOWNOLOGGING IS NOT WORKING IN LATEST OGG V11.2.1.0.1 MLR VERSIONS

這是一個ogg11.2版本的bug,我本機使用的ogg版本是12.2,應該不是這個原因造成的,根據報錯的object_id,找到是oracle2019用戶下面的zs_ac06表
在這裏插入圖片描述
查看錶是loging模式
在這裏插入圖片描述
檢查附加日誌沒有啓動,將附加日誌開啓,之後再啓動ogg進程,發現還是不行報錯。
在這裏插入圖片描述
實在是找不到原因了,網上有一些貼子說過濾掉這張表的同步,但是該表是業務需要使用的到沒法過濾,最後不死心在mos上看到另外一遍文章Classic Extract Fails With Error “The redo record indicates data loss on object nnnnn” (Doc ID 2316712.1)

APPLIES TO:
Oracle GoldenGate - Version 12.2.0.1.0 and later
Information in this document applies to any platform.
SYMPTOMS
 Classic extract on 12.2 fails with ERROR OGG-01973 ...The redo record indicates data loss on object nnnnn

Note that, Table with same object id exists on database

SQL> select object_name,owner, object_type from dba_objects where object_id=nnnn;

 

 

 

 

CHANGES
 none

CAUSE
 Redo corruption, Activity logs may show messages like

2017-10-11 13:56:54.136 INFO |er.redo.ora.thread |Redo_Thread_2| 5875 oracle/redoorardr.c | DETECT ZERO LENGTH RECORD1 at 93185656!
2017-10-11 13:56:54.136 INFO |er.redo.ora.thread |Redo_Thread_2| 6867 oracle/redoorardr.c | Completed pdata buffering in sequence 1288, with nb = 1, tb = 1, lb = 182003.
2017-10-11 13:56:54.136 INFO |er.redo.ora.thread |Redo_Thread_2| 262 oracle/redoorapl.c | PL_get: len=104, vld=4, subscn=1, SCN = 2501.2438258045 (10744151465341), seqno=1288, rba=93185552
2017-10-11 13:56:54.136 ERROR|er.redo.ora.thread |Redo_Thread_2| 6993 oracle/redoorardr.c | Detected an SCN base increase greater than 1, last_record_scn = 2501.2438257951 (10744151465247) current_rechdr.scn = 2501.2438258045 (10744151465341)
2017-10-11 13:56:54.136 INFO |gglog.std.application |Redo_Thread_4| 69 ggapp/ggPrintLn.c | [rpt]:
2017-10-11 13:56:54 ERROR OGG-01973 The redo record indicates data loss on object 437,723.
 

SOLUTION
 Tranlogoptions allowdataloss will work for 12.2 OGG versions. This is made internal parameter on 12.2 onwards

If customer is on OGG 12.2, use _allowdataloss and restart the extract.

Tranlogoptions _allowdataloss

    版本和問題描述和我現在的問題一致,根據文檔的提示,在抽取進程上面加上Tranlogoptions _allowdataloss參數,之後再進行啓動,等了大概半個小時,總算看到抽取日誌有變動沒有報錯了,問題解決大功告成。

在這裏插入圖片描述

發表評論
所有評論
還沒有人評論,想成為第一個評論的人麼? 請在上方評論欄輸入並且點擊發布.
相關文章