Let's start with how the official Oracle documentation explains it:
Parallel execution enables the application of multiple CPU and I/O resources to the execution of a single database operation. It dramatically reduces response time for data-intensive operations on large databases typically associated with a decision support system (DSS) and data warehouses. You can also implement parallel execution on an online transaction processing (OLTP) system for batch processing or schema maintenance operations such as index creation. Parallel execution is sometimes called parallelism. Parallelism is the idea of breaking down a task so that, instead of one process doing all of the work in a query, many processes do part of the work at the same time. An example of this is when four processes combine to calculate the total sales for a year, each process handles one quarter of the year instead of a single process handling all four quarters by itself. The improvement in performance can be quite significant. Parallel execution improves processing for:
Queries requiring large table scans, joins, or partitioned index scans
Creation of large indexes
Creation of large tables (including materialized views)
Bulk insertions, updates, merges, and deletions
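In short, parallel execution uses multiple CPU and I/O resources to carry out a single database operation. The key sentence is the one that defines parallelism: a task is broken down so that many processes do the work that a single process would otherwise do alone. Parallel execution can reduce response time, but the benefit depends heavily on the available system resources. If the system is already short of resources, running in parallel makes performance worse and increases resource consumption.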
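The "four processes, one quarter each" example from the documentation can be sketched in a few lines of Python. This is only an illustration of the idea: the `SALES` rows, the function names, and the use of a thread pool are assumptions of the sketch and have nothing to do with how Oracle implements its parallel servers.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical sales rows for one year: (quarter, amount).
SALES = [(1, 120), (1, 80), (2, 200), (3, 50), (3, 150), (4, 300)]

def quarter_total(quarter):
    """Work done by one worker: total the sales of a single quarter."""
    return sum(amount for q, amount in SALES if q == quarter)

def yearly_total_parallel(workers=4):
    """Coordinator: give one quarter to each worker, then combine the partial sums."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(quarter_total, [1, 2, 3, 4]))
    return sum(partials), partials

total, partials = yearly_total_parallel()
print(partials, total)
```

The coordinator does very little work itself: it splits the task, waits for the partial results, and combines them, which is the same role the query coordinator plays in the sections that follow.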
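How Oracle parallel execution works
When the CBO decides that a statement in a session should run in parallel, Oracle turns the session's server process into a parallel execution coordinator. At instance startup, Oracle uses the parameter parallel_min_servers to decide how many slave processes to create in advance. If a statement needs more slave processes than were created at startup, the coordinator starts additional ones. The coordinator then splits the object to be processed into pieces and hands them to the slave processes. When they finish, the results are gathered back at the server process, which does the final processing and returns the data to the client.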
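Consider the example in Figure 1.
Oracle runs the SQL in the figure with a degree of parallelism of 2. Two slave processes, p1 and p2, scan the customer table. Oracle then starts two more processes, p3 and p4, and p1 and p2 send the rows they scanned to p3 and p4, which perform the GROUP BY. When that is done, p3 and p4 send their rows back to p1 and p2 (these are idle once the scan has finished, so Oracle does not start new processes), which perform the ORDER BY. Finally the data is sent to the coordinator and returned to the user.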
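The flow just described (scan, redistribute, group, redistribute again, sort, hand over to the coordinator) can be imitated with a small single-threaded Python sketch. Everything here is hypothetical: the `CUSTOMER` rows, the byte-sum hash, and the fixed split point `"n"` are assumptions chosen to keep the example short, whereas Oracle picks its own hash function and range boundaries.

```python
from collections import defaultdict

# Hypothetical rows of a customer table: (customer_id, region).
CUSTOMER = [(1, "east"), (2, "west"), (3, "east"), (4, "north"),
            (5, "west"), (6, "east"), (7, "south"), (8, "north")]
DOP = 2

def scan(rows, dop):
    """Set 1 (p1, p2): each scanner reads its own slice of the table."""
    return [rows[i::dop] for i in range(dop)]

def hash_redistribute(chunks, dop):
    """Scanners send each row to the group-by process chosen by a hash of the key."""
    inboxes = [[] for _ in range(dop)]
    for chunk in chunks:
        for _cust_id, region in chunk:
            inboxes[sum(region.encode()) % dop].append(region)
    return inboxes

def group_by(inbox):
    """Set 2 (p3, p4): count rows per region; a region only ever lands in one inbox."""
    counts = defaultdict(int)
    for region in inbox:
        counts[region] += 1
    return list(counts.items())

def range_redistribute(groups, split="n"):
    """Group-by processes send results back to p1/p2, split into two key ranges."""
    inboxes = [[], []]
    for group in groups:
        for region, count in group:
            inboxes[0 if region < split else 1].append((region, count))
    return inboxes

scanned = scan(CUSTOMER, DOP)
grouped = [group_by(inbox) for inbox in hash_redistribute(scanned, DOP)]
sorted_parts = [sorted(inbox) for inbox in range_redistribute(grouped)]
# Coordinator: the ranges are disjoint and ordered, so concatenating them is enough.
result = [row for part in sorted_parts for row in part]
print(result)
```

Note that only two sets of workers are ever needed: the set that scanned is reused for the sort, which mirrors why p1 and p2 are reused in the example above.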
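That was a typical example of parallel execution. But how do the parallel processes communicate with each other? The official Oracle documentation describes it as follows: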
To execute a query in parallel, Oracle Database generally creates a set of producer parallel execution servers and a set of consumer parallel execution servers. The producer server retrieves rows from tables and the consumer server performs operations such as join, sort, DML, and DDL on these rows. Each server in the producer set has a connection to each server in the consumer set. The number of virtual connections between parallel execution servers increases as the square of the degree of parallelism.
Each communication channel has at least one, and sometimes up to four memory buffers, which are allocated from the shared pool. Multiple memory buffers facilitate asynchronous communication among the parallel execution servers.
A single-instance environment uses at most three buffers for each communication channel. An Oracle Real Application Clusters environment uses at most four buffers for each channel. Figure 8-3 illustrates message buffers and how producer parallel execution servers connect to consumer parallel execution servers.
Figure 8-3 Parallel Execution Server Connections and Buffers
When a connection is between two processes on the same instance, the servers communicate by passing the buffers back and forth in memory (in the shared pool). When the connection is between processes in different instances, the messages are sent using external high-speed network protocols over the interconnect. In Figure 8-3, the DOP equals the number of parallel execution servers, which in this case is n. Figure 8-3 does not show the parallel execution coordinator. Each parallel execution server actually has an additional connection to the parallel execution coordinator. It is important to size the shared pool adequately when using parallel execution. If there is not enough free space in the shared pool to allocate the necessary memory buffers for a parallel server, it fails to start.
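To get a feel for why the shared pool matters here, the quoted numbers can be turned into a back-of-the-envelope calculation. This is a rough sketch built only from the sentences above, not Oracle's internal sizing formula: it assumes two server sets of DOP processes each, one coordinator connection per server, the single-instance maximum of three buffers per channel, and a message size of 16384 bytes. The function names are made up for the illustration.

```python
def px_channels(dop):
    """Producer-to-consumer connections: every producer talks to every consumer."""
    return dop * dop

def buffer_upper_bound(dop, message_size=16384, buffers_per_channel=3):
    """Rough upper bound, in bytes, on message-buffer memory for one statement.

    Assumes two server sets of `dop` processes each, one extra coordinator
    connection per server, and at most three buffers per channel.
    Illustrative only; this is not how Oracle itself sizes the buffers.
    """
    channels = px_channels(dop) + 2 * dop
    return channels * buffers_per_channel * message_size

for dop in (2, 4, 8):
    print(dop, px_channels(dop), buffer_upper_bound(dop))
```

The point of the sketch is the growth rate: doubling the DOP roughly quadruples the number of producer-to-consumer channels, and the buffer memory grows with it.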
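Reading a parallel execution plan
Look at the Operation column first: the PX entries show that Oracle is using parallel execution. Combining the usual way of reading an execution plan with the parallel execution process described above, the plan can be read as follows:
1. First, table t1 is scanned in full, but not by a single process. PX BLOCK ITERATOR means the slave processes iterate over ranges of data blocks.
2. Next comes PX SEND RANGE, which means Oracle pushes the scanned rows to the next set of processes.
3. The next set of processes receives the data (PX RECEIVE) and sorts it in parallel.
4. Finally, Oracle sends the sorted data to the server process (PX SEND QC (ORDER)), which returns it to the user.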
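Next, the meaning of the IN-OUT column:
P->S (Parallel to Serial): a parallel operation sends data to a serial operation, typically the parallel processes sending data to the query coordinator.
P->P (Parallel to Parallel): a parallel operation sends data to another parallel operation, for example data exchanged between two sets of slave processes.
PCWP (Parallel Combined with Parent): the same slave process performs an operation and its parent operation, with no communication involved.
PCWC (Parallel Combined with Child): the same slave process performs an operation and its child operation, with no communication involved.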
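PCWC and PCWP are the harder ones to grasp, so let's map them onto the execution plan:
PCWC: TABLE ACCESS FULL is the child operation of PX BLOCK ITERATOR, so these two operations are performed by the same process.
PCWP: SORT ORDER BY is the parent operation of PX RECEIVE, so these two operations are also performed by the same process.
This is my own understanding; corrections are welcome.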
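Next, the ways in which data is distributed between processes:
range: producers send rows in given key ranges to different consumers. Dynamic range partitioning decides which row goes to which consumer (for an ORDER BY, the ranges are built on the columns in the ORDER BY clause).
round-robin: rows are spread evenly across the consumers (the producer hands one row to each consumer in turn).
hash: producers use a hash function to send rows to consumers. Hash partitioning is applied dynamically to decide which row goes to which consumer (for a GROUP BY, the hash is computed on the columns in the GROUP BY clause).
QC (random): each producer sends all of its rows to the query coordinator in no particular order. This is the usual method.
QC (order): each producer sends all of its rows to the query coordinator and the order matters. A parallel ORDER BY uses this method to send data to the query coordinator (the server process).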
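The three producer-to-consumer methods differ only in how a consumer is chosen for each row. A minimal sketch, with assumptions stated up front: the keys, the byte-sum hash, and the fixed range boundaries are all made up for the illustration, and Oracle determines its range boundaries dynamically rather than taking them as input.

```python
import bisect

def round_robin(rows, consumers):
    """Each row goes to the next consumer in turn."""
    return [i % consumers for i, _ in enumerate(rows)]

def by_hash(rows, consumers):
    """Equal keys always reach the same consumer (what GROUP BY needs)."""
    return [sum(str(key).encode()) % consumers for key in rows]

def by_range(rows, boundaries):
    """Consumer i receives the keys falling into the i-th range (what ORDER BY needs)."""
    return [bisect.bisect_right(boundaries, key) for key in rows]

keys = [42, 7, 19, 42, 88, 7, 63]
print(round_robin(keys, 3))       # consumer index per row
print(by_hash(keys, 3))
print(by_range(keys, [20, 60]))
```

Round-robin balances the load best but ignores the key, hash keeps equal keys together, and range additionally keeps the consumers ordered relative to each other, so their sorted outputs can simply be concatenated.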
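Initialization parameters related to parallel execution in Oracle 11g
PARALLEL_ADAPTIVE_MULTI_USER: default TRUE. Oracle dynamically adjusts the degree of parallelism of SQL statements according to the load on the system.
PARALLEL_DEGREE_LIMIT: default CPU_COUNT x PARALLEL_THREADS_PER_CPU x number of instances available. When automatic degree of parallelism is in use, this is the maximum degree of parallelism Oracle may choose.
PARALLEL_DEGREE_POLICY: default MANUAL. Oracle uses this parameter to enable automatic degree of parallelism.
PARALLEL_EXECUTION_MESSAGE_SIZE: default 16 KB. Specifies the size of the buffers used by the parallel execution servers to communicate among themselves and with the query coordinator. These buffers are allocated out of the shared pool.
PARALLEL_MAX_SERVERS: 80 on my Oracle 11g instance (the default is derived from the CPU configuration). This parameter defines the maximum number of parallel processes Oracle can use. When the processes started with the instance are not enough, Oracle can start more, but never beyond this number.
PARALLEL_MIN_SERVERS: default 0. This parameter defines how many parallel processes are started when the instance starts.
There are other parameters as well; see the official Oracle documentation. Below is the default configuration on my local database: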
SQL> show parameter parallel
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
fast_start_parallel_rollback         string      LOW
parallel_adaptive_multi_user         boolean     TRUE
parallel_automatic_tuning            boolean     FALSE
parallel_degree_limit                string      CPU
parallel_degree_policy               string      MANUAL
parallel_execution_message_size      integer     16384
parallel_force_local                 boolean     FALSE
parallel_instance_group              string
parallel_io_cap_enabled              boolean     FALSE
parallel_max_servers                 integer     80
parallel_min_percent                 integer     0
parallel_min_servers                 integer     0
parallel_min_time_threshold          string      AUTO
parallel_server                      boolean     FALSE
parallel_server_instances            integer     1
parallel_servers_target              integer     32
parallel_threads_per_cpu             integer     2
recovery_parallelism                 integer     0
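The parameter to focus on here is PARALLEL_DEGREE_POLICY, which is new in Oracle 11g and lets Oracle adjust the degree of parallelism automatically based on system resources.
The parameter accepts three values: MANUAL, LIMITED, and AUTO.
Let's start with the following setup:
SQL> create table t1 as select * from dba_objects;
Table created.
SQL> create table t2 as select * from dba_objects;
Table created.
SQL> alter table t1 parallel 4;
Table altered.
SQL> alter table t2 parallel(degree default);
Table altered.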
SQL> select table_name,degree from user_tables where table_name in('T1','T2');
TABLE_NAME DEGREE
------------------------------ --------------------
T1 4
T2 DEFAULT
First, let's see how parallelism is handled when the parameter is left at its default value.
SQL> select count(*) from t1;
COUNT(*)
----------
75446
SQL> select * from v$pq_sesstat where statistic='Allocation Height';
STATISTIC LAST_QUERY SESSION_TOTAL
------------------------------ ---------- -------------
Allocation Height 4 0
SQL> select count(*) from t2;
COUNT(*)
----------
75447
SQL> select * from v$pq_sesstat where statistic='Allocation Height';
STATISTIC LAST_QUERY SESSION_TOTAL
------------------------------ ---------- -------------
Allocation Height 8 0
As shown, with the default value Oracle does not adjust the degree of parallelism automatically; it uses exactly the degree the user configured.
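The value 8 reported for T2 is consistent with the CPU_COUNT x PARALLEL_THREADS_PER_CPU x instances product listed earlier for PARALLEL_DEGREE_LIMIT. A quick check, with one loud assumption: parallel_threads_per_cpu = 2 and parallel_server_instances = 1 come from the parameter listing, but cpu_count = 4 is a guess, since that parameter does not appear in the listing. The function name is made up for the illustration.

```python
def default_dop(cpu_count, threads_per_cpu, instances=1):
    """Product of CPU count, threads per CPU and instance count."""
    return cpu_count * threads_per_cpu * instances

# cpu_count=4 is an assumption; the other two values are from "show parameter parallel".
print(default_dop(cpu_count=4, threads_per_cpu=2, instances=1))
```

On a real system, `show parameter cpu_count` would confirm or refute the assumed CPU count.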
So how does Oracle behave when the parameter is set to LIMITED?
SQL> alter session set parallel_degree_policy=limited;
Session altered.
SQL> select count(*) from t1;
COUNT(*)
----------
75446
SQL> select * from v$pq_sesstat where statistic='Allocation Height';
STATISTIC LAST_QUERY SESSION_TOTAL
------------------------------ ---------- -------------
Allocation Height 4 0
SQL> select count(*) from t2;
COUNT(*)
----------
75447
SQL> select * from v$pq_sesstat where statistic='Allocation Height';
STATISTIC LAST_QUERY SESSION_TOTAL
------------------------------ ---------- -------------
Allocation Height 0 0
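As shown, with LIMITED Oracle adjusts the degree of parallelism for the table whose degree is DEFAULT but leaves the explicitly set degree alone. From that we can guess that AUTO adjusts both, and the output below confirms it: Oracle adjusts all of them.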
SQL> alter session set parallel_degree_policy=auto;
Session altered.
SQL> select count(*) from t1;
COUNT(*)
----------
75446
SQL> select * from v$pq_sesstat where statistic='Allocation Height';
STATISTIC LAST_QUERY SESSION_TOTAL
------------------------------ ---------- -------------
Allocation Height 0 0
SQL> select count(*) from t2;
COUNT(*)
----------
75447
SQL> select * from v$pq_sesstat where statistic='Allocation Height';
STATISTIC LAST_QUERY SESSION_TOTAL
------------------------------ ---------- -------------
Allocation Height 0 0
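The behaviour observed in the three runs can be summarised as a small decision function. This is a sketch of what the experiments above showed on this one database, not a statement of Oracle's full rules; the function name and the returned labels are made up for the illustration.

```python
def degree_source(policy, table_degree):
    """Return where the DOP comes from, according to the runs above.

    policy: 'MANUAL', 'LIMITED' or 'AUTO'; table_degree: e.g. '4' or 'DEFAULT'.
    """
    policy = policy.upper()
    if policy == "MANUAL":
        return "table setting"
    if policy == "LIMITED":
        # Only objects declared with DEGREE DEFAULT are adjusted.
        return "computed by Oracle" if table_degree == "DEFAULT" else "table setting"
    if policy == "AUTO":
        return "computed by Oracle"
    raise ValueError(f"unknown policy: {policy}")

for policy in ("MANUAL", "LIMITED", "AUTO"):
    for degree in ("4", "DEFAULT"):
        print(policy, degree, "->", degree_source(policy, degree))
```

In the runs above, "computed by Oracle" ended up as an Allocation Height of 0 for these small tables, meaning the statements were not given any parallel servers.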
Operations that can run in parallel
Access methods: Some examples are table scans, index fast full scans, and partitioned index range scans.
Join methods: Some examples are nested loop, sort merge, hash, and star transformation.
DDL statements: Some examples are CREATE TABLE AS SELECT, CREATE INDEX, REBUILD INDEX, REBUILD INDEX PARTITION, and MOVE/SPLIT/COALESCE PARTITION.
You can typically use parallel DDL where you use regular DDL. There are, however, some additional details to consider when designing your database. One important restriction is that parallel DDL cannot be used on tables with object or LOB columns. (Note this point: parallel DDL is not available for tables with object or LOB columns.)
All of these DDL operations can be performed in NOLOGGING mode for either parallel or serial execution.
The CREATE TABLE statement for an index-organized table can be run with parallel execution either with or without an AS SELECT clause.
Different parallelism is used for different operations. Parallel CREATE (partitioned) TABLE AS SELECT and parallel CREATE (partitioned) INDEX statements run with a degree of parallelism (DOP) equal to the number of partitions. (We will look at this point separately.)
DML statements:
Parallel query:
Miscellaneous SQL operations: Some examples are GROUP BY, NOT IN, SELECT DISTINCT, UNION, UNION ALL, CUBE, and ROLLUP, plus aggregate and table functions.
SQL*Loader
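Parallel query:
A query can run in parallel when the following conditions are met:
The SQL statement contains a hint such as PARALLEL or PARALLEL_INDEX.
The objects referenced in the SQL statement have a parallel attribute set.
In a multi-table join, at least one table is accessed by a full table scan or by an INDEX RANGE SCAN that spans partitions.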
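Parallel DDL:
Parallel DDL relies on direct-path operations. That is, the data does not pass through the buffer cache to be written out later. Instead, an operation such as CREATE TABLE AS SELECT creates new extents and writes into them directly, so the data goes straight from the query to disk, into the newly allocated extents. This is why parallel operations tend to fragment space in a tablespace, and in the days of dictionary-managed tablespaces this wasted space. With locally managed tablespaces, the two extent allocation methods give different results, and for parallel operations Oracle favors automatic extent allocation.
The following DDL statements can run in parallel:
CREATE INDEX
CREATE TABLE ... AS SELECT
ALTER INDEX ... REBUILD
ALTER TABLE ... [MOVE|SPLIT|COALESCE] PARTITION
ALTER INDEX ... [REBUILD|SPLIT] PARTITION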
SQL> create index idx_tt1 on t1(object_id) parallel 4;
Index created.

SQL ID: aq91k6zr8au5q
Plan Hash: 1439620960
create index idx_tt1 on t1(object_id) parallel 4

call     count       cpu    elapsed       disk      query    current       rows
------- ------  -------- ---------- ---------- ---------- ---------- ----------
Parse        1      0.01       0.34          2         21          0          0
Execute      1      0.01       1.46         23          9       1044          0
Fetch        0      0.00       0.00          0          0          0          0
------- ------  -------- ---------- ---------- ---------- ---------- ----------
total        2      0.03       1.81         25         30       1044          0

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 84

Rows     Row Source Operation
-------  ---------------------------------------------------
      4  PX COORDINATOR (cr=5 pr=0 pw=0 time=51 us)
      0   PX SEND QC (ORDER) :TQ10001 (cr=0 pr=0 pw=0 time=0 us)
      0    INDEX BUILD NON UNIQUE IDX_TT1 (cr=0 pr=0 pw=0 time=0 us)(object id 0)
      0     SORT CREATE INDEX (cr=0 pr=0 pw=0 time=0 us)
      0      PX RECEIVE (cr=0 pr=0 pw=0 time=0 us cost=83 size=1165905 card=89685)
      0       PX SEND RANGE :TQ10000 (cr=0 pr=0 pw=0 time=0 us cost=83 size=1165905 card=89685)
      0        PX BLOCK ITERATOR (cr=0 pr=0 pw=0 time=0 us cost=83 size=1165905 card=89685)
      0         TABLE ACCESS FULL T1 (cr=0 pr=0 pw=0 time=0 us cost=83 size=1165905 card=89685)
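Parallel CTAS:
The example above comes from the official Oracle documentation: Oracle first scans the source table in parallel and then creates the target table in parallel.
Parallel DML
To use parallel DML, you must enable it explicitly: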
SQL> alter session enable parallel dml;
Session altered.
The official Oracle documentation explains the requirement as follows:
This mode is required because parallel DML and serial DML have different locking, transaction, and disk space requirements and parallel DML is disabled for a session by default.