oracle并行查詢一列的實現(xiàn)
在oracle數(shù)據(jù)庫中碰到SQL執(zhí)行時間過長。根本無法得到結(jié)果集的問題。服務(wù)器壓力也沒有很高,估計又是一個非常消耗磁盤的查詢。通過oracle并行查詢一列的方法,解決了這個問題。
果然,發(fā)現(xiàn)是一個200w的表和一個超過1100w表的HASH JOIN .
簡單的幫助優(yōu)化了一個SQL后,SQL如下:
- select count(ui.usin_uid_fk)
- from table1 av, table2 ui
- where av.av_usse_activatedate >= to_date('20090102', 'yyyymmdd')
- and av.av_usse_activatedate < to_date('20090401', 'yyyymmdd')
- and av.av_usse_uid_fk = ui.usin_uid_fk
- and ui.usin_mcnc_fk =XXX%'
不難想象執(zhí)行的不是很理想。近20分鐘的執(zhí)行時間,真是讓人崩潰。
- COUNT(UI.USIN_UID_FK)
- ---------------------
- 1918591
- Elapsed: 00:19:03.07
- Statistics
- ----------------------------------------------------------
- 0 recursive calls
- 0 db block gets
- 32921639 consistent gets
- 352073 physical reads
- 0 redo size
- 395 bytes sent via SQL*Net to client
- 503 bytes received via SQL*Net from client
- 2 SQL*Net roundtrips to/from client
- 0 sorts (memory)
- 0 sorts (disk)
- 1 rows processed
對于那張TABLE2的大表(符合條件的超過1100w),決定試圖通過并行來提高執(zhí)行速度。SQL如下:
- select /*+parallel (tbl_userinfo 4)*/ count(ui.usin_uid_fk)
- from table1 av, table2 ui
- where av.av_usse_activatedate >= to_date('20090101', 'yyyymmdd')
- and av.av_usse_activatedate < to_date('20090401', 'yyyymmdd')
- and av.av_usse_uid_fk = ui.usin_uid_fk
- and ui.usin_mcnc_fk like 'XXX%';
執(zhí)行效果還是非常明顯的。從19分鐘多到1分45秒!其中consistent gets更是減少了一個數(shù)量級。
- COUNT(UI.USIN_UID_FK)
- ---------------------
- 1918591
- Elapsed: 00:01:45.15
- Statistics
- ----------------------------------------------------------
- 0 recursive calls
- 0 db block gets
- 2571109 consistent gets
- 124523 physical reads
- 0 redo size
- 395 bytes sent via SQL*Net to client
- 504 bytes received via SQL*Net from client
- 2 SQL*Net roundtrips to/from client
- 0 sorts (memory)
- 0 sorts (disk)
- 1 rows processed
因為這個服務(wù)器為2×4核心的cpu,應(yīng)該可以算是8個CPU,所以應(yīng)該可以通過增加并行度來進(jìn)一步減少執(zhí)行時間。如下SQL:
- SQL> select /*+parallel (tbl_userinfo 8)*/ count(ui.usin_uid_fk)
- 2 from table1 av, table2 ui
- 3 where av.av_usse_activatedate >= to_date('20090101', 'yyyymmdd')
- 4 and av.av_usse_activatedate < to_date('20090401', 'yyyymmdd')
- 5 and av.av_usse_uid_fk = ui.usin_uid_fk
- 6 and ui.usin_mcnc_fk like '460%';
- COUNT(UI.USIN_UID_FK)
- ---------------------
- 1949033
- Elapsed: 00:00:20.60
- Statistics
- ----------------------------------------------------------
- 0 recursive calls
- 0 db block gets
- 2607524 consistent gets
- 55050 physical reads
- 0 redo size
- 395 bytes sent via SQL*Net to client
- 503 bytes received via SQL*Net from client
- 2 SQL*Net roundtrips to/from client
- 0 sorts (memory)
- 0 sorts (disk)
- 1 rows processed
可以說還是比較理想的。只有20S左右了。雖然最大并行度可以到CPU*2,但是效果未必會好。進(jìn)一步做一個16個并行度的SQL執(zhí)行測試。
- COUNT(UI.USIN_UID_FK)
- ---------------------
- 1949033
- Elapsed: 00:00:20.64
- Statistics
- ----------------------------------------------------------
- 0 recursive calls
- 0 db block gets
- 2607524 consistent gets
- 55299 physical reads
- 0 redo size
- 395 bytes sent via SQL*Net to client
- 504 bytes received via SQL*Net from client
- 2 SQL*Net roundtrips to/from client
- 0 sorts (memory)
- 0 sorts (disk)
- 1 rows processed
沒有任何提高,并且執(zhí)行時間還稍高于并行度為8的SQL。
通過以上測試我們不難發(fā)現(xiàn):
在處理大量數(shù)據(jù)查詢,例如出現(xiàn)HASH JOIN的情況下,oracle并行非常有效果的。也就是說并行查詢在數(shù)據(jù)倉庫這樣的應(yīng)用中會“大顯身手”。
但是oracle并行的使用還是有很多限制的。例如相對較小的數(shù)據(jù)查詢和連接是會適得其反的。盲目增加并行度也是大忌,相對來講,并行度和CPU數(shù)相同比較好。這里的CPU數(shù)應(yīng)該是指的核心數(shù)。例如服務(wù)器中有一個CPU是4核心的,并行度為4是好的。
技術(shù)很難有十全十美的,最重要的是對于特定技術(shù)的使用要恰到好處,保證揚長避短。
【編輯推薦】