This post answers the following specific question:
Why “distinct query” is not using the index (in some cases)?
Let us try the following test case:
Create sample table “t_distinct” and add an index to the “ID” column and collect statistics for table and index:
Let us run the following basic query:
It is clear that optimizer selected a Full Table Scan to retrieve the results, although there is an index on the ID column.
If we repeat the same query with the “Index” hint:
Again, the optimizer is still using full table scan to access the table and ignored the Hint!
What is the reason behind the optimizer choice of Full Table Scan?
The reason is that the B*Tree Index by nature doesn’t store the NULL values, so, Oracle Optimizer assumes that not all distinct values could be retrieved from scanning the index only.
So, what if the “ID” column is NOT NULL, will the optimized change his mind? Let us try:
Now, the Oracle Optimizer is able to use the index.
If you omit the Index hint:
Note that “Index FAST FULL SCAN” is using multi-block read instead of single-block sequential read, which is much faster.
Note the big difference in the cost (and consistent gets) between the three cases:
|Access Method||Cost||Consistent Gets|
|TABLE FULL ACCESS||1736||481|
|INDEX FULL SCAN||242||323|
|INDEX FAST FULL SCAN||63||230|
If you cannot add a not null constraint to the table, you may achieve the same result by adding a not null predicate to the query, as follows:
alter table t_distinct modify(id null);
Now if you tried to select from the table, you will get the old full table scan:
But, when you add a not null query predicate:
Note that for all the following queries, the optimizer will choose FULL TABLE SCAN, although there is an index and the id column is not null:
This is ok, as the index doesn’t contain all columns, and the query references more than 5% of the table rows, so full table scan will be more efficient.