Tuesday, February 24, 2009

Oracle :Outer Join pitfalls

Starting Oracle 10G, Optimizer processes outer joins in much efficient way. Execution plan now has "HASH JOIN RIGHT OUTER", which means that the table with scant records will be scanned first. Table which it is outer joined to will be scanned in the end. This is especially use full for Datawarehouse queries where Fact table is outer joined to lot of dimension tables. So, DB now does not have to temp read and write the fact table records to join them later to the dimension tables.

But, before one jumps into concluding that this new method is almost sure way to win, there are few things one should keep in mind.
1. Ensure NOPARALLEL query execution. This new execution plan for outer joins is severely affected by parallel query i.e. if dimension tables are scanned in parallel and then joined to the fact tables, query exhibits worst performance. To avoid parallel execution, check that tables or any participating indexes have degree set to "1". Alternatively, one an also disable parallel query execution at instance level by setting init parameter "max_parallel_servers" to "0". But this is no advisable as there could be other kind of queries which might want to leverage the advantages of parallel query execution.

2. Keep huge dimension tables out of the outer joins. If Dimension tables (the one which is outer joined) is huge, lets say...1 Million records (for pga 4 gig), outer join using new execution plan may not help. This is because now hash tables for all the joining dimension could not be kept in memory and DB anyways have to do multipass hash join. "HASH JOIN RIGHT OUTER" based execution plan is effective only if the outer joined tables/dimensions are smaller in size.

3. Ensure that all the participating tables has the statistics collected. Missing statistics on tables would cause optimizer to pick up execution plan as it existed before 10G and you may not see the new execution plan for outer joined query.