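Overview
This post covers the wait events that slave processes and the query coordinator (QC) most often encounter during parallel execution in Oracle Database, along with some commonly used diagnostic scripts.
I. The PX Deq: Execution Msg and PX Deq: Execute Reply wait events
1. PX Deq: Execution Msg
Occurs when a parallel slave is waiting to be told what to do. This is normally considered an idle event, but can cause excessive CPU in some cases.
This is a common event in parallel query. It appears whenever a PQ slave process is waiting for the QC to tell it what to do next (e.g. parse, execute, or fetch).
The event's parameters in v$session_wait: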
P1 = sleeptime/senderid
P2 = passes
P3 = not used
The following script decodes sleeptime/senderid into the sender's instance and process name:
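set serveroutput on
undef p1
declare
  inst   varchar2(20);
  sender varchar2(20);
begin
  -- 268435456 (bit 28) flags a valid sender id. If it is not set, the
  -- SELECT ... INTO finds no row and raises NO_DATA_FOUND.
  -- Note: "- 65535" yields the instance number only for instance 1. Assuming
  -- the instance sits in bits 16-23, bitand(p1, 16711680) / 65536 generalizes.
  select bitand(&&p1, 16711680) - 65535 as SNDRINST,
         decode(bitand(&&p1, 65535),
                65535, 'QC',
                'P' || to_char(bitand(&&p1, 65535), 'fm000')) as SNDR
    into inst, sender
    from dual
   where bitand(&&p1, 268435456) = 268435456;
  dbms_output.put_line('Instance = ' || inst);
  dbms_output.put_line('Sender = ' || sender);
end;
/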
If P1 is empty, the slave is not waiting for any process.
For example, for a P1 value of 268501004 the script returns:
Instance = 1
Sender = P012
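The bit arithmetic behind that result can be checked offline. The sketch below is a minimal Python re-implementation of the script's masks, not Oracle code. The function name `decode_p1` is hypothetical, and it assumes the instance number occupies bits 16-23, so it shifts the field down instead of subtracting 65535 as the script does. Both approaches give 1 for the example value.

```python
# Bit masks taken from the SQL script in this section.
VALID_FLAG = 268435456      # 0x10000000: set when P1 carries a sender id
INSTANCE_MASK = 16711680    # 0x00FF0000: sender instance field
SENDER_MASK = 65535         # 0x0000FFFF: slave number; 65535 means the QC


def decode_p1(p1):
    """Return (instance, sender) for a sleeptime/senderid value, or None."""
    if p1 is None or (p1 & VALID_FLAG) != VALID_FLAG:
        return None  # no sender id encoded in this value
    # Assumption: the instance number sits in bits 16-23, so shift it down.
    instance = (p1 & INSTANCE_MASK) >> 16
    sender_id = p1 & SENDER_MASK
    sender = "QC" if sender_id == SENDER_MASK else "P%03d" % sender_id
    return instance, sender


print(decode_p1(268501004))  # prints (1, 'P012')
```

Here 268501004 is 0x1001000C: the flag bit is set, the instance field holds 1, and the low 16 bits hold 12, which is formatted as P012.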
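P2 (passes) is the number of times the process has looped while waiting for a message.
This is an idle wait event: while it is posted, the process keeps waiting, and the pass count keeps increasing, until a message arrives.
Resolution:
The coordinator process is too slow in consuming the data sent by the slave processes, so some slaves are forced to wait because their queue is full, which slows down the whole parallel execution.
This is usually caused by too few CPUs or too many processes running on the system. Consider lowering the degree of parallelism.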
2. PX Deq: Execute Reply
Occurs when the query coordinator is waiting for a response from a parallel slave. This is normally considered an idle event, but can cause excessive CPU in some cases.
Waiting Process: QC
The coordinator is waiting either for a slave to acknowledge a control message or for data from the slave set. In other words, the QC is waiting for the slaves to finish executing the SQL and send their result sets back.
The event's parameters in v$session_wait:
P1 = sleeptime/senderid
P2 = passes
P3 = not used
The following script decodes sleeptime/senderid into the sender's instance and process name:
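set serveroutput on
undef p1
declare
  inst   varchar2(20);
  sender varchar2(20);
begin
  -- 268435456 (bit 28) flags a valid sender id. If it is not set, the
  -- SELECT ... INTO finds no row and raises NO_DATA_FOUND.
  -- Note: "- 65535" yields the instance number only for instance 1. Assuming
  -- the instance sits in bits 16-23, bitand(p1, 16711680) / 65536 generalizes.
  select bitand(&&p1, 16711680) - 65535 as SNDRINST,
         decode(bitand(&&p1, 65535),
                65535, 'QC',
                'P' || to_char(bitand(&&p1, 65535), 'fm000')) as SNDR
    into inst, sender
    from dual
   where bitand(&&p1, 268435456) = 268435456;
  dbms_output.put_line('Instance = ' || inst);
  dbms_output.put_line('Sender = ' || sender);
end;
/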
If P1 is empty, the QC is not waiting for any process.
For example, for a P1 value of 268501004 the script returns:
Instance = 1
Sender = P012
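Wait time:
Although the event is normally classed as idle, time spent here should be treated as real wait time: the QC is waiting for a response or for query results from the slaves.
Resolution:
Unoptimized SQL may be the cause of this wait: the slaves take a long time to execute the statement while the QC waits for them to return data.
Tune the SQL: check which statement the slaves are executing and its execution plan, and optimize it as far as possible to reduce the time the slaves spend executing it.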
II. Related scripts
1. Gives an overview of all running parallel queries with all their slaves. It shows whether each slave is waiting and, if so, for which event.
select decode(px.qcinst_id,
              NULL, username,
              ' - ' || lower(substr(pp.SERVER_NAME, length(pp.SERVER_NAME) - 4, 4))) "Username",
       decode(px.qcinst_id, NULL, 'QC', '(Slave)') "QC/Slave",
       to_char(px.server_set) "SlaveSet",
       to_char(s.sid) "SID",
       to_char(px.inst_id) "Slave INST",
       decode(sw.state, 'WAITING', 'WAIT', 'NOT WAIT') as STATE,
       case sw.state
         when 'WAITING' then substr(sw.event, 1, 30)
         else NULL
       end as wait_event,
       decode(px.qcinst_id, NULL, to_char(s.sid), px.qcsid) "QC SID",
       to_char(px.qcinst_id) "QC INST",
       px.req_degree "Req. DOP",
       px.degree "Actual DOP"
  from gv$px_session px, gv$session s, gv$px_process pp, gv$session_wait sw
 where px.sid = s.sid(+)
   and px.serial# = s.serial#(+)
   and px.inst_id = s.inst_id(+)
   and px.sid = pp.sid(+)
   and px.serial# = pp.serial#(+)
   and px.inst_id = pp.inst_id(+)  -- added: match slaves only within their own instance on RAC
   and sw.sid = s.sid
   and sw.inst_id = s.inst_id
 order by decode(px.QCINST_ID, NULL, px.INST_ID, px.QCINST_ID),
          px.QCSID,
          decode(px.SERVER_GROUP, NULL, 0, px.SERVER_GROUP),
          px.SERVER_SET,
          px.INST_ID
/
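Once the overview query's rows have been fetched into a client program, they are easier to read when rolled up per coordinator. The Python sketch below counts, for each QC SID, how many slaves are waiting on each event. The sample rows and the function name `summarize_slave_waits` are hypothetical; real rows would come from whichever database driver you use, and only four of the query's columns are modeled here.

```python
from collections import Counter, defaultdict

# Hypothetical sample rows using four of the query's columns:
# ("QC SID", "QC/Slave", STATE, wait_event).
rows = [
    ("145", "QC", "WAIT", "PX Deq: Execute Reply"),
    ("145", "(Slave)", "WAIT", "PX Deq: Execution Msg"),
    ("145", "(Slave)", "NOT WAIT", None),
    ("145", "(Slave)", "WAIT", "PX Deq: Execution Msg"),
]


def summarize_slave_waits(rows):
    """Count, per QC SID, how many slaves are waiting on each event."""
    summary = defaultdict(Counter)
    for qc_sid, role, state, wait_event in rows:
        if role != "(Slave)":
            continue  # skip the coordinator's own row
        key = wait_event if state == "WAIT" else "NOT WAIT"
        summary[qc_sid][key] += 1
    return {qc: dict(counts) for qc, counts in summary.items()}


print(summarize_slave_waits(rows))
```

A query whose slaves are mostly on PX Deq: Execution Msg while the QC sits on PX Deq: Execute Reply matches the pattern described in section I.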
2. For the PX Deq events, shows the processes that are exchanging data.
select sw.SID as RCVSID,
       decode(pp.server_name, NULL, 'A QC', pp.server_name) as RCVR,
       sw.inst_id as RCVRINST,
       case sw.state
         when 'WAITING' then substr(sw.event, 1, 30)
         else NULL
       end as wait_event,
       decode(bitand(p1, 65535),
              65535, 'QC',
              'P' || to_char(bitand(p1, 65535), 'fm000')) as SNDR,
       bitand(p1, 16711680) - 65535 as SNDRINST,
       decode(bitand(p1, 65535),
              65535, ps.qcsid,
              (select sid
                 from gv$px_process
                where server_name = 'P' || to_char(bitand(sw.p1, 65535), 'fm000')
                  and inst_id = bitand(sw.p1, 16711680) - 65535)) as SNDRSID,
       decode(sw.state, 'WAITING', 'WAIT', 'NOT WAIT') as STATE
  from gv$session_wait sw, gv$px_process pp, gv$px_session ps
 where sw.sid = pp.sid(+)
   and sw.inst_id = pp.inst_id(+)
   and sw.sid = ps.sid(+)
   and sw.inst_id = ps.inst_id(+)
   and p1text = 'sleeptime/senderid'
   and bitand(p1, 268435456) = 268435456
 order by decode(ps.QCINST_ID, NULL, ps.INST_ID, ps.QCINST_ID),
          ps.QCSID,
          decode(ps.SERVER_GROUP, NULL, 0, ps.SERVER_GROUP),
          ps.SERVER_SET,
          ps.INST_ID
3. For long-running operations, shows what the slaves are doing.
select decode(px.qcinst_id,
              NULL, username,
              ' - ' || lower(substr(pp.SERVER_NAME, length(pp.SERVER_NAME) - 4, 4))) "Username",
       decode(px.qcinst_id, NULL, 'QC', '(Slave)') "QC/Slave",
       to_char(px.server_set) "SlaveSet",
       to_char(px.inst_id) "Slave INST",
       substr(opname, 1, 30) operation_name,
       substr(target, 1, 30) target,
       sofar,
       totalwork,
       units,
       start_time,
       timestamp,
       decode(px.qcinst_id, NULL, to_char(s.sid), px.qcsid) "QC SID",
       to_char(px.qcinst_id) "QC INST"
  from gv$px_session px, gv$px_process pp, gv$session_longops s
 where px.sid = s.sid
   and px.serial# = s.serial#
   and px.inst_id = s.inst_id
   and px.sid = pp.sid(+)
   and px.serial# = pp.serial#(+)
   and px.inst_id = pp.inst_id(+)  -- added: match slaves only within their own instance on RAC
 order by decode(px.QCINST_ID, NULL, px.INST_ID, px.QCINST_ID),
          px.QCSID,
          decode(px.SERVER_GROUP, NULL, 0, px.SERVER_GROUP),
          px.SERVER_SET,
          px.INST_ID
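The sofar and totalwork columns are easier to compare across slaves as a percentage. The Python sketch below shows that arithmetic on made-up rows. The helper name `percent_done` and the sample values are hypothetical, and a zero or missing totalwork is treated as "unknown" rather than as 0%.

```python
def percent_done(sofar, totalwork):
    """Return the completion percentage, or None when totalwork is missing or zero."""
    if not totalwork:
        return None  # avoid dividing by zero; progress is unknown
    return round(100.0 * sofar / totalwork, 1)


# Hypothetical (slave name, sofar, totalwork) values as the query might return them.
rows = [("p000", 1200, 4800), ("p001", 4800, 4800), ("p002", 0, 0)]

for name, sofar, totalwork in rows:
    print(name, percent_done(sofar, totalwork))
```

Slaves that stay at a low percentage while their siblings have finished are a hint that the work is unevenly distributed.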