I have created a prototype to demonstrate this.
SILOS.TEST_BULK_SIL mapping is created as a copy of SILOS.SIL_PayrollFact (loads W_PAYROLL_F from W_PAYROLL_FS).
Original mapping had a mapplet mplt_SIL_PayrollFact. This mapplet has 12 lookups over various dimensions. It takes input (datasource_num_id, integration_id etc) from the parent sql, looks up the ROW_WID and loads into the fact table.
I removed this mapplet completely and incorporated all the 12 dimensions in the parent sql itself, Outer Joining them to W_PAYROLL_FS. All the expressions which were built in the mapplet were taken care in the main mapping itself (some of them may require more finishing).
Following are the results:
|Mapping||Records Loaded (million)||Time Taken (hr.)||RPS (Reader) |
(uses Outer Join)
Results show that Outer join based mapping ran approx 2.7 times faster than the one based on lookups.
Again, lookups which involve some complex calculations may not be replaced by outer join.