原文:http://www.cnblogs.com/yumiko/p/6060485.html
十年的威信網(wǎng)站建設(shè)經(jīng)驗,針對設(shè)計、前端、開發(fā)、售后、文案、推廣等六對一服務(wù),響應(yīng)快,48小時及時工作處理。成都營銷網(wǎng)站建設(shè)的優(yōu)勢是能夠根據(jù)用戶設(shè)備顯示端的尺寸不同,自動調(diào)整威信建站的顯示方式,使網(wǎng)站能夠適用不同顯示終端,在瀏覽器中調(diào)整網(wǎng)站的寬度,無論在任何一種瀏覽器上瀏覽網(wǎng)站,都能展現(xiàn)優(yōu)雅布局與設(shè)計,從而大程度地提升瀏覽體驗。成都創(chuàng)新互聯(lián)從事“威信網(wǎng)站設(shè)計”,“威信網(wǎng)站推廣”以來,每個客戶項目都認真落實執(zhí)行。
對于ORACLE而言,CBO優(yōu)化器可以根據(jù)直方圖收集的列值分布信息,讓選擇性高(返回數(shù)據(jù)行比例少)的列值使用索引,而選擇性低(返回數(shù)據(jù)行比例多)的列值不使用索引。尤其對于存在數(shù)據(jù)傾斜嚴重的列而言,直方圖很重要。
注:數(shù)據(jù)傾斜,主要指某列上的一個數(shù)值,相較于該列其他數(shù)值,出現(xiàn)比例高,如:“性別”列,“男性”占到該列整體數(shù)值(男性、女性)的80%,存在明顯的數(shù)據(jù)傾斜現(xiàn)象。
一般而言,直方圖不受是否使用索引的限制,即可以用來統(tǒng)計索引列,也可以統(tǒng)計非索引列。但對于非索引列的統(tǒng)計,意義不大。
直方圖類型的視圖:DBA_TAB_COL_STATISTICS,USER_TAB_COL_STATISTICS,ALL_TAB_COL_STATISTICS
直方圖具體信息的視圖:DBA_TAB_HISTOGRAMS,USER_TAB_HISTOGRAMS,ALL_TAB_HISTOGRAMS
本示例中使用的數(shù)據(jù)庫版本為ORACLE 11.2.0.4。
Yumiko_sunny@OA01> select distinct owner ,count(*) as col_rows, (select count(*) from test) as tab_rows, to_char(round(count(*)/(select count(*) from test)*100,2),'90.99')||'%' as data_ratio from test group by owner; OWNER COL_ROWS TAB_ROWS DATA_RATIO -------------------- --------------- ---------------- ---------- HR 476 535164 0.09% OE 1988 535164 0.37% ORDDATA 3598 535164 0.67% SCOTT 98 535164 0.02% SYS 529004 535164 98.85% |
從上圖中可以看到,該列的SYS值分布占到了整體的98%,表明存在嚴重的傾斜。
為OWNER列創(chuàng)建索引,并使用ANALYZE TABLE的方法收集統(tǒng)計信息。
--收集統(tǒng)計信息 Yumiko_sunny@OA01> analyze table test compute statistics; Table analyzed. --驗證最后的統(tǒng)計收集的時間 Yumiko_sunny@OA01> select table_name, to_char(LAST_ANALYZED,'YYYY-MM-DD HH24:MI:SS') LAST_ANALYZED from dba_tables where TABLE_NAME='TEST'; TABLE_NAME LAST_ANALYZED ------------------------------ ------------------- TEST 2016-11-13 21:23:19 --查看直方圖的統(tǒng)計情況 Yumiko_sunny@OA01> select column_name,histogram from dba_tab_col_statistics where table_name='TEST'; COLUMN_NAME HISTOGRAM -------------------- --------------- OWNER NONE OBJECT_NAME NONE SUBOBJECT_NAME NONE OBJECT_ID NONE DATA_OBJECT_ID NONE OBJECT_TYPE NONE CREATED NONE LAST_DDL_TIME NONE TIMESTAMP NONE STATUS NONE TEMPORARY NONE COLUMN_NAME HISTOGRAM -------------------- --------------- GENERATED NONE SECONDARY NONE NAMESPACE NONE EDITION_NAME NONE |
這里先忽略,后面可以對比for all columns子句的情況再看下。
查看此時索引列執(zhí)行計劃的選擇情況,這里以傾斜數(shù)據(jù)SYS為條件進行檢索。
Yumiko_sunny@OA01> select * from test where owner='SYS'; rows selected. Execution Plan ---------------------------------------------------------- Plan hash value: 3856466897 ---------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 107K| 10M| 1799 (1)| 00:00:22 | | 1 | TABLE ACCESS BY INDEX ROWID| TEST | 107K| 10M| 1799 (1)| 00:00:22 | |* 2 | INDEX RANGE SCAN | IND_TEST | 107K| | 228 (1)| 00:00:03 | ---------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- - access("OWNER"='SYS') |
從上面返回的結(jié)果看,529K行的SYS數(shù)據(jù),僅僅返回107K行,顯然存在很大的誤差。
此外,對于數(shù)據(jù)傾斜達到98%的SYS而言,顯然全表掃描的效率應(yīng)該更高,這里應(yīng)該與錯誤的統(tǒng)計信息有關(guān)。
使用dbms_stats.gather_table_stats的方式再次收集表的統(tǒng)計信息。
Yumiko_sunny@OA01> exec dbms_stats.gather_table_stats('SCOTT','TEST',cascade=>true); PL/SQL procedure successfully completed. Yumiko_sunny@OA01> select table_name, to_char(LAST_ANALYZED,'YYYY-MM-DD HH24:MI:SS') LAST_ANALYZED from dba_tables where TABLE_NAME='TEST'; TABLE_NAME LAST_ANALYZED ------------------------------ ------------------- TEST 2016-11-13 21:50:01 Yumiko_sunny@OA01> select column_name,histogram from dba_tab_col_statistics where table_name='TEST'; COLUMN_NAME HISTOGRAM -------------------- --------------- OWNER FREQUENCY OBJECT_NAME NONE SUBOBJECT_NAME NONE OBJECT_ID NONE DATA_OBJECT_ID NONE OBJECT_TYPE NONE CREATED NONE LAST_DDL_TIME NONE TIMESTAMP NONE STATUS NONE TEMPORARY NONE COLUMN_NAME HISTOGRAM -------------------- --------------- GENERATED NONE SECONDARY NONE NAMESPACE NONE EDITION_NAME NONE |
從上圖可以看到,此時完成了對表的最新統(tǒng)計,同時收集了索引列的直方圖信息,且該直方圖為”等頻直方圖“。
再次查看此時索引列的執(zhí)行計劃選擇情況,這里分別以選擇性差的傾斜數(shù)據(jù)SYS為條件,以及以選擇性好的SCOTT為條件分別進行檢索。
--以SYS為條件進行查詢 Yumiko_sunny@OA01> select * from test where owner='SYS'; rows selected. Execution Plan ---------------------------------------------------------- Plan hash value: 1357081020 -------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 530K| 49M| 2098 (1)| 00:00:26 | |* 1 | TABLE ACCESS FULL| TEST | 530K| 49M| 2098 (1)| 00:00:26 | -------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- - filter("OWNER"='SYS') --以SCOTT為條件進行查詢 Yumiko_sunny@OA01> select * from test where owner='SCOTT'; rows selected. Execution Plan ---------------------------------------------------------- Plan hash value: 3856466897 ---------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 98 | 9506 | 5 (0)| 00:00:01 | | 1 | TABLE ACCESS BY INDEX ROWID| TEST | 98 | 9506 | 5 (0)| 00:00:01 | |* 2 | INDEX RANGE SCAN | IND_TEST | 98 | | 3 (0)| 00:00:01 | ---------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- - access("OWNER"='SCOTT') |
對于選擇性差的SYS值,CBO優(yōu)化器采用了全表掃描的方式進行數(shù)據(jù)的訪問
對于選擇性好的SCOTT值,CBO優(yōu)化器則采用了索引掃描的方式進行數(shù)據(jù)的訪問
如果采用索引的方式訪問SYS相關(guān)的數(shù)據(jù)行,真實的代價會是怎樣呢,這里,通過hint的方式進行一次索引掃描的訪問
Yumiko_sunny@OA01> select /*+index(test,ind_test) */* from test where owner='SYS'; rows selected. Execution Plan ---------------------------------------------------------- Plan hash value: 3856466897 ---------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 529K| 49M| 8885 (1)| 00:01:47 | | 1 | TABLE ACCESS BY INDEX ROWID| TEST | 529K| 49M| 8885 (1)| 00:01:47 | |* 2 | INDEX RANGE SCAN | IND_TEST | 529K| | 1115 (1)| 00:00:14 | ---------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- - access("OWNER"='SYS') |
從上圖中可以看到,在信息收集無誤的情況下,若采用索引掃描,其真實開銷是全表掃描的4倍。
通過上面這個執(zhí)行計劃,也說明了,對于CBO優(yōu)化器,準確無誤的統(tǒng)計信息對于執(zhí)行計劃選擇的重要性。
通過DBA_TAB_HISTOGRAMS視圖,查看此時直方圖的詳細信息
Yumiko_sunny@OA01> select TABLE_NAME,COLUMN_NAME,ENDPOINT_NUMBER, to_char(ENDPOINT_VALUE,'999999999999999999999999999999999999') as ENDPOINT_VALUE,ENDPOINT_ACTUAL_VALUE from DBA_TAB_HISTOGRAMS where TABLE_NAME='TEST' and COLUMN_NAME='OWNER'; TABLE_NAME COLUMN_NAME ENDPOINT_NUMBER ENDPOINT_VALUE ENDPOINT_ACTUAL_VALU ---------------------------------------------------------------------------------------- TEST OWNER 19 41159093808690300000000000 TEST OWNER 61 41186001805076000000000000 TEST OWNER 62 43232584582496500000000000 TEST OWNER 5 37550853140200700000000000 TEST OWNER 5518 43277234965060400000000000 |
可以看到,雖然ENDPOINT_VALUE收集到了唯OWNER列唯一值的hash值,但真實列ENDPOINT_ACTUAL_VALUE顯示為空。
后面對比analyze table for all columns操作后再看。
刪除SYS值相關(guān)的數(shù)據(jù)行,觀察直方圖統(tǒng)計的變化
Yumiko_sunny@OA01> delete from test where owner='SYS'; rows deleted. Yumiko_sunny@OA01> commit; Commit complete. Yumiko_sunny@OA01> select TABLE_NAME,COLUMN_NAME,ENDPOINT_NUMBER, to_char(ENDPOINT_VALUE,'999999999999999999999999999999999999') as ENDPOINT_VALUE,ENDPOINT_ACTUAL_VALUE from DBA_TAB_HISTOGRAMS where TABLE_NAME='TEST' and COLUMN_NAME='OWNER'; TABLE_NAME COLUMN_NAME ENDPOINT_NUMBER ENDPOINT_VALUE ENDPOINT_ACTUAL_VALU ---------------------------------------------------------------------------------------- TEST OWNER 19 41159093808690300000000000 TEST OWNER 61 41186001805076000000000000 TEST OWNER 62 43232584582496500000000000 TEST OWNER 5 37550853140200700000000000 TEST OWNER 5518 43277234965060400000000000 |
可以看到,對表數(shù)據(jù)的DML操作,直方圖信息并未自動更改。
再次使用dbms_stats.gather_table_stats收集統(tǒng)計信息,此時直方圖得到了更新,如下圖:
Yumiko_sunny@OA01> exec dbms_stats.gather_table_stats('SCOTT','TEST',cascade=>true); PL/SQL procedure successfully completed. Yumiko_sunny@OA01> select TABLE_NAME,COLUMN_NAME,ENDPOINT_NUMBER, to_char(ENDPOINT_VALUE,'999999999999999999999999999999999999') as ENDPOINT_VALUE,ENDPOINT_ACTUAL_VALUE from DBA_TAB_HISTOGRAMS where TABLE_NAME='TEST' and COLUMN_NAME='OWNER'; TABLE_NAME COLUMN_NAME ENDPOINT_NUMBER ENDPOINT_VALUE ENDPOINT_ACTUAL_VALU -------------------- ---------------------------- --------------------------------------- TEST OWNER 476 37550853140200700000000000 TEST OWNER 2464 41159093808690300000000000 TEST OWNER 6062 41186001805076000000000000 TEST OWNER 6160 43232584582496500000000000 |
上述說明了,對于直方圖的信息,需要定期進行收集工作。
本示例承接上面示例內(nèi)容,數(shù)據(jù)庫版本一致。
在上面內(nèi)容中,已經(jīng)演示了analyze table table_name compute statistics無法針對表進行直方圖的信息收集。
那么,對于已存在直方圖的表,該操作又會有何影響呢。
首先,承接上面內(nèi)容,再次執(zhí)行analyze table table_name compute statistics的操作,觀察直方圖信息的變化。
Yumiko_sunny@OA01> analyze table test compute statistics; Table analyzed. Yumiko_sunny@OA01> select column_name,histogram from dba_tab_col_statistics where table_name='TEST'; COLUMN_NAME HISTOGRAM -------------------- --------------- OWNER NONE OBJECT_NAME NONE SUBOBJECT_NAME NONE OBJECT_ID NONE DATA_OBJECT_ID NONE OBJECT_TYPE NONE CREATED NONE LAST_DDL_TIME NONE TIMESTAMP NONE STATUS NONE TEMPORARY NONE COLUMN_NAME HISTOGRAM -------------------- --------------- GENERATED NONE SECONDARY NONE NAMESPACE NONE EDITION_NAME NONE |
可以明顯的發(fā)現(xiàn),OWNER列的直方圖信息消失了,說明該操作會刪除已存在的直方圖信息。
如果這是一個生產(chǎn)環(huán)境,對于這樣一張存在數(shù)據(jù)傾斜列的表,可能會帶來不可預(yù)估的影響。
對于analyze table table_name compute statistics for all indexes的操作,這里不再演示,會在下面的總結(jié)中,直接給出對直方圖影響的結(jié)論。
下面看一下analyze table table_name compute statistics for all indexes for all columns操作的影響。
Yumiko_sunny@OA01> analyze table test compute statistics for all indexes for all columns; Table analyzed. Yumiko_sunny@OA01> select column_name,histogram from dba_tab_col_statistics where table_name='TEST'; COLUMN_NAME HISTOGRAM -------------------- --------------- OWNER FREQUENCY OBJECT_NAME HEIGHT BALANCED SUBOBJECT_NAME NONE OBJECT_ID HEIGHT BALANCED DATA_OBJECT_ID HEIGHT BALANCED OBJECT_TYPE FREQUENCY CREATED FREQUENCY LAST_DDL_TIME FREQUENCY TIMESTAMP FREQUENCY STATUS FREQUENCY TEMPORARY FREQUENCY COLUMN_NAME HISTOGRAM -------------------- --------------- GENERATED FREQUENCY SECONDARY FREQUENCY NAMESPACE FREQUENCY EDITION_NAME NONE |
從上面可以看到,當(dāng)執(zhí)行for all columns子句的時候,不但收集了索引列的直方圖信息,還收集了非索引列的直方圖信息。
再看下此時DBA_TAB_HISTOGRAMS視圖的詳細信息
Yumiko_sunny@OA01> select TABLE_NAME,COLUMN_NAME,ENDPOINT_NUMBER, to_char(ENDPOINT_VALUE,'999999999999999999999999999999999999') as ENDPOINT_VALUE,ENDPOINT_ACTUAL_VALUE from DBA_TAB_HISTOGRAMS where TABLE_NAME='TEST' and COLUMN_NAME='OWNER'; TABLE_NAME COLUMN_NAME ENDPOINT_NUMBER ENDPOINT_VALUE ENDPOINT_ACTUAL_VALU -------------------- ---------------------------- --------------------------------------- TEST OWNER 476 37550853140200700000000000 HR TEST OWNER 2464 41159093808690300000000000 OE TEST OWNER 6062 41186001805076000000000000 ORDDATA TEST OWNER 6160 43232584582496500000000000 SCOTT |
可以看到,此時可以看見DBA_TAB_HISTOGRAMS視圖上,ENDPOINT_ACTUAL_VALUE列真實值的信息。
最后一點,起碼從直方圖的收集情況看:analyze table table_name compute statistics并不等價于analyze table table_name compute statistics for all indexes for all columns
謹慎使用analyze table table_name compute statistics這個操作。
重要的事情說三遍!??!
新聞標題:直方圖使用技巧及analyzetable操作對直方圖統(tǒng)計的影響
文章路徑:http://jinyejixie.com/article18/ijccdp.html
成都網(wǎng)站建設(shè)公司_創(chuàng)新互聯(lián),為您提供建站公司、App開發(fā)、面包屑導(dǎo)航、App設(shè)計、網(wǎng)站排名、微信公眾號
聲明:本網(wǎng)站發(fā)布的內(nèi)容(圖片、視頻和文字)以用戶投稿、用戶轉(zhuǎn)載內(nèi)容為主,如果涉及侵權(quán)請盡快告知,我們將會在第一時間刪除。文章觀點不代表本網(wǎng)站立場,如需處理請聯(lián)系客服。電話:028-86922220;郵箱:631063699@qq.com。內(nèi)容未經(jīng)允許不得轉(zhuǎn)載,或轉(zhuǎn)載時需注明來源: 創(chuàng)新互聯(lián)