Histograms are a neat little gem in the world of Oracle. They are a great tool that the Optimizer uses with Oracle statistics.
Exert from PSOUG:
Oracle uses two types of histograms for column statistics: height-balanced histograms and frequency histograms.There are a handful of views to take a look at histograms as specified above. Here are some queries I found useful:
The type of histogram is stored in the HISTOGRAM column of the *TAB_COL_STATISTICS views (USER and DBA). This column can have values of HEIGHT BALANCED, FREQUENCY, or NONE.In a height-balanced histogram, the column values are divided into bands so that each band contains approximately the same number of rows. The useful information that the histogram provides is where in the range of values the endpoints fall. In a frequency histogram, each value of the column corresponds to a single bucket of the histogram. Each bucket contains the number of occurrences of that single value. Frequency histograms are automatically created instead of height-balanced histograms when the number of distinct values is less than or equal to the number of histogram buckets specified. Frequency histograms can be viewed using the *TAB_HISTOGRAMS views.
Checking to see if a column has a histogram:
SELECT OWNER, TABLE_NAME, COLUMN_NAME, HISTOGRAM, LAST_ANALYZED FROM all_tab_col_statistics WHERE OWNER LIKE upper('%&owner%') AND TABLE_NAME LIKE upper('%&table_name%') AND COLUMN_NAME LIKE UPPER('%&column_name%') ORDER BY 1,2,3
Viewing the frequency histogram data:
BEGIN DBMS_STATS.GATHER_table_STATS (OWNNAME => 'OE', TABNAME => 'INVENTORIES', METHOD_OPT => 'FOR COLUMNS SIZE 20 warehouse_id'); END; / SELECT column_name, num_distinct, num_buckets, histogram FROM USER_TAB_COL_STATISTICS WHERE table_name = 'INVENTORIES' AND column_name = 'WAREHOUSE_ID'; COLUMN_NAME NUM_DISTINCT NUM_BUCKETS HISTOGRAM ------------------------------ ------------ ----------- --------------- WAREHOUSE_ID 9 9 FREQUENCY SELECT endpoint_number, endpoint_value FROM USER_HISTOGRAMS WHERE table_name = 'INVENTORIES' and column_name = 'WAREHOUSE_ID' ORDER BY endpoint_number; ENDPOINT_NUMBER ENDPOINT_VALUE --------------- -------------- 36 1 213 2 261 3 370 4 484 5 692 6 798 7 984 8 BEGIN DBMS_STATS.GATHER_table_STATS (OWNNAME => 'OE', TABNAME => 'INVENTORIES', METHOD_OPT => 'FOR COLUMNS SIZE 10 quantity_on_hand'); END; SELECT column_name, num_distinct, num_buckets, histogram FROM USER_TAB_COL_STATISTICS WHERE table_name = 'INVENTORIES' AND column_name = 'QUANTITY_ON_HAND'; COLUMN_NAME NUM_DISTINCT NUM_BUCKETS HISTOGRAM ------------------------------ ------------ ----------- --------------- QUANTITY_ON_HAND 237 10 HEIGHT BALANCED SELECT endpoint_number, endpoint_value FROM USER_HISTOGRAMS WHERE table_name = 'INVENTORIES' and column_name = 'QUANTITY_ON_HAND' ORDER BY endpoint_number; ENDPOINT_NUMBER ENDPOINT_VALUE --------------- -------------- 0 0 1 27 2 42 3 57 4 74 5 98 6 123 7 149 8 175 9 202
Basically what the histogram tells you is how frequent a given value is. Based on its findings it will figure out if it should use the index that you have on the column or if it would be better to do the FTS.
How a Histogram could mess the optimizer up:
Histograms work best with literal variables (select * from test_table where some_ind = 'Y';). If you try using bind variables (select * from test_table where some_ind = :bindx;) Oracle should bind peak to check what they are, pick the plan based on those binds. What could be bad is if you have a table with a indicator column that 99% is Y and 1% N. If the bind peak happens on the N its more likely to use an index because that makes sense. However the next time that is ran and is not hard parsed it will use the same plan even if the bind is Y, where a full table scan makes more sense there.
The default setting for anything 10g is setting histogram collection to auto. This is getting better with each release but they are still working on it. You can always collect a histogram on specific columns by hand if you want.
I could go on forever about histograms, but I think this is a good summary.
:::ROLE THE CREDITS:::
Sources:
- Ask Tom Article
- PSOUG Article
- Random Blog that has good facts
- Oracle Documentation
- Rwave solution presentation
No comments:
Post a Comment