LGWR blocking log file sync

os: centos 7.4
db: oracle 11.2.0.4

[Figure: flow of a commit between the user, the server process, and LGWR; image not preserved]
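From the figure above we can see the whole flow clearly. It can be summarized as follows:

1. The user issues a commit.

2. The foreground process (that is, the server process) posts a message to the LGWR process, telling it to write out the redo buffer.

3. Once LGWR receives the request, it calls operating system functions to perform the physical write. While that write is in progress, the log file parallel write wait event appears. Some readers may wonder why the wait is called "parallel write" when there is only a single LGWR process before 12c. The explanation is that LGWR writes the contents of the redo buffer to the log files in batches (DBWn also works in batch mode), and this is controlled by hidden parameters. In addition, when a redo log group has several members, LGWR issues the writes to those members in parallel, and the wait ends only when the last I/O completes.

4. When LGWR completes the write, it sends a message back to the foreground process (server process), telling it that the write is done and the commit can complete.

5. The user's commit completes.

A further note: this flow follows from Oracle's write-ahead logging principle. If the relevant redo entries in the redo buffer were not written to the redo log file before the commit returned, a database crash would lose committed data. The time the foreground process spends waiting in steps 2 to 4 is what is recorded as the log file sync wait event.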

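For writing the log buffer out to the log files, 11gR2 keeps the post/wait method and adds a polling method.
Starting with 11.2.0.3, the default is to switch between the two methods automatically. This is controlled by the hidden parameter _use_adaptive_log_file_sync.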

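Post/wait mode
After LGWR writes the redo to disk, it immediately notifies the foreground processes, so the log file sync wait time is relatively short. However, Oracle releases before 12c have only one LGWR process, so a large number of concurrent commits keeps LGWR very busy and its CPU usage stays high. Compared with other processes, LGWR is more likely to become the performance bottleneck of the database.

Polling mode
After performing the write, LGWR no longer notifies each foreground process individually that the write has completed. Instead, the foreground processes check the write progress by polling at timed intervals. The advantage is that LGWR does not have to post the many processes waiting for their commits to complete, which relieves LGWR's high CPU usage. The drawback is that foreground processes can stay in the log file sync wait for a long time. For transactional (OLTP) systems this parameter should be set to false.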

Under high concurrency, it is very easy to observe LGWR blocking log file sync.
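One way to see whether commits are waiting on more than the physical write is to compare the average log file sync wait with the average log file parallel write wait. The sketch below assumes the two events were spooled from v$system_event into a text file as pipe-delimited lines of EVENT, TOTAL_WAITS, and TIME_WAITED_MICRO. The file layout and the function name avg_wait_ms are assumptions made for illustration, not part of the original post.

```shell
# Hypothetical helper: print the average wait in milliseconds per event.
# Input: pipe-delimited lines "EVENT|TOTAL_WAITS|TIME_WAITED_MICRO",
# e.g. spooled from v$system_event (this file layout is an assumption).
avg_wait_ms() {
    awk -F'|' '
        # Skip header lines and events with no waits.
        $2 + 0 > 0 {
            # TIME_WAITED_MICRO is in microseconds; divide by 1000 for ms.
            printf "%s %.2f\n", $1, ($3 / $2) / 1000
        }' "$1"
}
```

If the average log file sync wait is much larger than the average log file parallel write wait, that would suggest the extra time is spent outside the physical write, for example in the notification between LGWR and the foreground processes.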

Version

# cat /etc/centos-release
CentOS Linux release 7.4.1708 (Core) 
# 
# su - oracle
Last login: Tue Jan 21 03:40:05 CST 2020 on pts/0
$ sqlplus / as sysdba;

SQL*Plus: Release 11.2.0.4.0 Production on Mon Feb 3 10:29:09 2020

Copyright (c) 1982, 2013, Oracle.  All rights reserved.


Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options

SQL> set lines 300;
SQL> set pages 300;
SQL> 
SQL> select * from v$version;

BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
PL/SQL Release 11.2.0.4.0 - Production
CORE	11.2.0.4.0	Production
TNS for Linux: Version 11.2.0.4.0 - Production
NLSRTL Version 11.2.0.4.0 - Production

SQL> 

_use_adaptive_log_file_sync

SQL> set lines 200;
set pages 200;
col inst_id format 99;
col name for a35;
col description for a30;
col value for a10;

select
  x.inst_id,
  x.ksppinm  name,
  x.ksppdesc description,
  y.ksppstvl  value,
  y.ksppstdf  isdefault,
  decode(bitand(y.ksppstvf,7),1,'MODIFIED',4,'SYSTEM_MOD','FALSE')  ismod,
  decode(bitand(y.ksppstvf,2),2,'TRUE','FALSE')  isadj
from sys.x$ksppi x,
     sys.x$ksppcv y
where 1=1
  and x.inst_id = y.inst_id
  and x.indx = y.indx
  and x.ksppinm = '_use_adaptive_log_file_sync'
order by translate(x.ksppinm, ' _', ' '),x.inst_id
/

INST_ID NAME                                DESCRIPTION                    VALUE      ISDEFAULT ISMOD      ISADJ
------- ----------------------------------- ------------------------------ ---------- --------- ---------- -----
      1 _use_adaptive_log_file_sync         Adaptively switch between post TRUE       TRUE      FALSE      FALSE
                                            /wait and polling



Set the parameter to false. It can be changed online, without restarting the instance.

SQL> alter system set "_use_adaptive_log_file_sync"=false;
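For a scripted change, the statement can be generated rather than typed by hand. This is a minimal sketch: the function name adaptive_lfs_sql is hypothetical, and the scope=both and sid='*' clauses are assumptions (an spfile is in use, and the change should apply to every instance of a RAC database). The function only prints SQL text and does not touch the database.

```shell
# Hypothetical helper: emit the ALTER SYSTEM statement for a given value.
# It only prints SQL text; nothing is executed against the database.
adaptive_lfs_sql() {
    case "$1" in
        true|false) ;;
        *) echo "usage: adaptive_lfs_sql true|false" >&2; return 1 ;;
    esac
    # scope=both and sid='*' are assumptions: an spfile is in use and the
    # change should apply to all instances.
    printf "alter system set \"_use_adaptive_log_file_sync\"=%s scope=both sid='*';\n" "$1"
}
```

The output could then be reviewed and piped into a session, for example adaptive_lfs_sql false | sqlplus -s / as sysdba. Hidden parameters are normally changed only on the advice of Oracle Support.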

Below is the explanation of _use_adaptive_log_file_sync from MOS (3520420):

When "_use_adaptive_log_file_sync" is set to true, Oracle switches between two methods of communication between the LGWR and foreground processes to acknowledge that a commit has completed:

 
Post/wait - conventional method available in previous Oracle releases
LGWR explicitly posts all processes waiting for the commit to complete.

The advantage of the post/wait method is that sessions should find out almost immediately when the redo has been flushed to disk.

 

Polling
Foreground processes sleep and poll to see if the commit is complete.
The advantage of this new method is to free LGWR from having to inform many processes waiting on commit to complete thereby freeing high CPU usage by the LGWR.

Initially the LGWR uses post/wait and according to an internal algorithm evaluates whether polling is better.

 
Under high system load polling may perform better because the post/wait implementation typically does not scale well.

If the system load is low, then post/wait performs well and provides better response times than polling.

Oracle relies on internal statistics to determine which method should be used.

Because switching between post/wait and polling incurs an overhead, safeguards are in place in order to ensure that switches do not occur too frequently.

All switches are recorded in LGWR's trace file with a time stamp and the string "Log file sync switching to ...":

For more information on this feature see:

note 1541136.1 Waits for "log file sync" with Adaptive Polling vs Post/Wait Choice Enabled

Best Regards,

Hemant

Please refer to the following document for more information about _use_adaptive_log_file_sync:
Waits for "log file sync" with Adaptive Polling vs Post/Wait Choice Enabled (Doc ID 1541136.1)

Kindly note that in 11.2.0.1 and 11.2.0.2 , the default value for the parameter was false. In 11.2.0.3, the default value has been changed to true.
_use_adaptive_log_file_sync was turned on in 11.2.0.3, which uses both the post/wait and polling methods. This may cause some performance issues that you should be aware of. Most of these issues are solved in 11.2.0.4.

The above document will give you overview of "Adaptive Polling vs Post/Wait Choice" , also "Known Issues with "_use_adaptive_log_file_sync" set to TRUE" .

Hope this helps.

Regards,

Maha    
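
The note quoted above states that every switch is recorded in LGWR's trace file with the string "Log file sync switching to ...". A minimal sketch for counting those switches per mode follows. The function name count_lfs_switches is hypothetical, and the trace file path must be supplied by the caller because it depends on the diagnostic destination of the instance.

```shell
# Hypothetical helper: count adaptive log file sync mode switches.
# Argument: path to the LGWR trace file (location depends on the instance).
count_lfs_switches() {
    awk '/Log file sync switching to/ {
             # Keep only the text after the marker string (the target mode).
             sub(/.*Log file sync switching to[ ]*/, "")
             count[$0]++
         }
         END { for (mode in count) print mode, count[mode] }' "$1" | sort
}
```

A large number of switches in a short period would be one hint that the adaptive behaviour is worth investigating alongside the log file sync waits.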

References:
Document 1548261.1 High Waits for ‘Log File Sync’: Known Issue Checklist for 11.2

Document 1462942.1 Adaptive Switching Between Log Write Methods can Cause ‘log file sync’ Waits
Document 13707904.8 Bug 13707904 - LGWR sometimes uses polling, sometimes post/wait
Document 13074706.8 Bug 13074706 - Long “log file sync” waits in RAC not correlated with slow writes
Document 1541136.1 Adaptive Log File Sync Optimization
