1

I have a large fact table with 300M rows and 50 columns in it. There are multiple reports over this table and each report uses only couple out of 50 columns from the table.

Each column in the fact table is indexed with BITMAP INDEX. The idea is to use these indexes as a one-column version of the original table assuming that oracle could merge BITMAP INDEXes easily.

If I use several columns from the table in WHERE statement, I can see that oracle is able to merge these indexes effectively. There is BITMAP AND operation in execution plan as expected.

If I use several columns from the table in SELECT statement, I can see that depending on columns selectivity, oracle is either performing unneeded TABLE ACCESS or BITMAP CONVERSION [to rowids] and then HASH JOIN of these conversions.

Is there any way to eliminate the HASH JOIN in case of joining several BITMAP INDEXes? Is there any hint in oracle to force BITMAP MERGE when columns appear in SELECT statement rather than WHERE?

Intuitively it seems like the HASH JOIN for BITMAP INDEXes is unneeded operation in SELECT statement taking into account it is indeed unneeded in WHERE statement. But I couldn't find any evidence that oracle could avoid it.

Here are some examples:

SELECT a, b, c /* 3 BITMAP CONVERSIONs [to rowids] and then 2 unneeded HASH JOINS */
  FROM fact;

SELECT a, b, c, d, e /* TABLE ACCESS [full] instead of reading all the data from indexes */
  FROM fact;

SELECT a /* BITMAP INDEX [fast full scan] as expected*/
  FROM fact
  WHERE b = 1 and c = 2; /* BITMAP AND over two BITMAP INDEX [single value] as expected */

Are there any hints to optimize examples #1 and #2?

In production I use oracle11g but I tried similar queries on oracle12c and it look like in both versions of oracle behave the same.

Volodymyr Frolov
  • 1,256
  • 5
  • 16
  • 25
  • Can you post the full query? I'm not sure what mean by getting a unneeded HASH JOINS if you are only selecting from the FACT? – BobC Feb 17 '17 at 21:31
  • @BobC, the full query is not needed here, as I can illustrate the problem on a smaller example. I have the following query: `SELECT a, b FROM fact;`. There are 2 `BITMAP INDEX`es for `a` and `b`, and oracle reads the values from these indexes instead of `fact` table (as the table has lots of other columns). Then, I assume that oracle should able to perform `BITMAP MERGE` to match values from `a` with values in `b`. But oracle performs `BITMAP CONVERSION [to rowids]` and then `HASH JOIN` instead; which is inefficient for `BITMAP INDEX`es. – Volodymyr Frolov Feb 17 '17 at 22:18

2 Answers2

1

After some research it looks like oracle12c is incapable of joining BITMAP INDEXes if they are used in SELECT clause efficiently.

There is no dedicated access path to join BITMAP INDEXes in SELECT clause and so HASH JOIN is used in this case.

Oracle cannot use BITMAP MERGE access path in this case as it performs OR operation between two bitmaps:

How Bitmap Merge Works A merge uses an OR operation between two bitmaps. The resulting bitmap selects all rows from the first bitmap, plus all rows from every subsequent bitmap.

Detailed analysis showed that only HASH JOIN was considered by cost optimizer in my case. I wasn't able to find any evidence that BITMAP INDEXes could be used efficiently in SELECT statement. Oracle documentation suggests using BITMAP INDEXes only in WHERE clause or joining fact to dimensions.

And either of the following are true:

  • The indexed column will be restricted in queries (referenced in the WHERE clause).

or

  • The indexed column is a foreign key for a dimension table. In this case, such an index will make star transformation more likely.

In my case it is neither of the two.

Volodymyr Frolov
  • 1,256
  • 5
  • 16
  • 25
  • Perhaps I'm being a bit dense, but I'm still not sure I understand what problem you are trying to solve? If you have SELECT A, B FROM FACT, (ie no WHERE clause), what index do you expect Oracle to use? – BobC Feb 21 '17 at 01:06
  • @BobC, Oracle is able to read values from indexes on `A` and `B`, instead of table `FACT` if the table is big. In this case 2 index fast full scans are much quicker than full table scan. But then oracle uses ineffective `HASH JOIN` to join values from 2 indexes, where in reality no join is needed, bitmap indexes have the same order for ROWIDs. – Volodymyr Frolov Feb 22 '17 at 14:00
  • @BobC, so for example, if you are doing `SELECT A FROM FACT` (no `WHERE` clause) and column `A` is indexed, oracle performs `INDEX [fast full scan]` instead of `TABLE ACCESS [full scan]`. But then, if you select two columns instead of just one and if both indexes are bitmap indexes, oracle performs unneeded `HASH JOIN` between the two indexes. – Volodymyr Frolov Feb 22 '17 at 14:15
  • Ok, I think I see what you are saying. Can you show me the execution plan for the case where you select the two index values. – BobC Feb 22 '17 at 14:26
  • Is this a common use case? – BobC Feb 22 '17 at 14:27
  • @BobC, Test Case: create table test_bindex ( a varchar2(8), b varchar2(4), unused varchar2(4000) ); create bitmap index test_bindex_a_bidx on test_bindex(a); create bitmap index test_bindex_b_bidx on test_bindex(b); declare i number; begin for i in 1..1000 loop insert into test_bindex(a, b, unused) values(Dbms_Random.String('X', 8), Dbms_Random.String('X', 4), Dbms_Random.String('X', 4000)); end loop; commit; end; / explain plan for select a, b from test_bindex v; select * from table( dbms_xplan.display('plan_table',null,'typical outline') ); – Volodymyr Frolov Feb 23 '17 at 16:21
  • @BobC, Execution Plan for the Test Case: | 0 | SELECT STATEMENT | | 1000 | 10000 | 26 (0)| 00:00:01 | | 1 | VIEW | index$_join$_001 | 1000 | 10000 | 26 (0)| 00:00:01 | |* 2 | HASH JOIN | | | | | | | 3 | BITMAP CONVERSION TO ROWIDS| | 1000 | 10000 | 13 (0)| 00:00:01 | | 4 | BITMAP INDEX FULL SCAN | TEST_BINDEX_A_BIDX | | | | | | 5 | BITMAP CONVERSION TO ROWIDS| | 1000 | 10000 | 13 (0)| 00:00:01 | | 6 | BITMAP INDEX FULL SCAN | TEST_BINDEX_B_BIDX | – Volodymyr Frolov Feb 23 '17 at 16:22
  • Thanks. I will take a look. – BobC Feb 23 '17 at 22:41
  • Repeating my earlier comment; is this a common use case? – BobC Feb 23 '17 at 22:42
  • @BobC, What do you mean? Yes, we have a lot of reports reading just several fields out of 50+, and all the reports have the same performance problem. Also, I created Support Request for oracle, but didn't get any answer so far. – Volodymyr Frolov Feb 25 '17 at 01:09
  • How big is the performance impact? How long do the queries take? Do you really have no WHERE clause, and return all 300M rows each time? What platform are you running on; ie how many CPU cores, storage bandwidth etc. – BobC Feb 25 '17 at 02:13
0

I think what you are seeing is essentially the "index join access path" in action :) Oracle needs to join the data from both scans on ROWID, to stitch the rows together. The hash join is the only method open to Oracle. The fact that you are using bitmap indexes is actually irrelevant; you see the same behaviour with b-tree indexes

-------------------------------------------------------------------------------------------
| Id  | Operation              | Name             | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |                  |  1973K|    43M|   137K (30)| 00:00:06 |
|   1 |  VIEW                  | index$_join$_001 |  1973K|    43M|   137K (30)| 00:00:06 |
|*  2 |   HASH JOIN            |                  |       |       |            |          |
|*  3 |    INDEX FAST FULL SCAN| IO               |  1973K|    43M| 17201  (78)| 00:00:01 |
|*  4 |    INDEX FAST FULL SCAN| IT               |  1973K|    43M| 17201  (78)| 00:00:01 |
-------------------------------------------------------------------------------------------
BobC
  • 4,208
  • 1
  • 12
  • 15
  • As I've said `HASH JOIN` for `BITMAP INDEX`es seems to be ineffective to me, because bits in each bitmap card in each `BITMAP INDEX` of the same table must already have the same order, they are sorted by `ROWID`. There is no point in converting `ROWID`s to hashes and then join it as bits in the same position belong to the same row in all indexes. But as I described in in my answer, oracle doesn't have a dedicated join path for this case and uses more general and thus ineffective `HASH JOIN`. – Volodymyr Frolov Feb 25 '17 at 15:07
  • I'm not sure if you will get a "better" answer from support; they are essentially "break-fix". Since it's working "as designed", I doubt you will get much traction, unless you can demonstrate a big performance impact. I posed another set of questions for you in my comments below your own answer. – BobC Feb 27 '17 at 15:27