This is very general practice in warehouse project: Sometimes we will get the transactional data without dimensional data. It is called as “early arriving facts” or “Late Arriving Dimension”. There are a few of good solutions though, all of which are handled during ETL:
- Hold onto the fact until all dimensions arrive
- Create a dimension called ‘unknown’ or ‘not available’ with a primary key of -1
- Insert a dummy row in the dimension and populate it with whatever you can
Usually, the first approach is not applied, as business want to get result even some dimensions not available. The exception mainly is handled during the populate fact table when there is NO match found in dimension table. An inferred member is used in third approach. This document is intended to provide overview to handle Inferred member, along with ETL process.
- Add new table as Admin_Exception to save all exception records
Column name Data type Description Log_Id Bigint Auto ID SrcTableName varchar(80) Staging table name SrcColumnName varchar(80) Staging column name Type Int 1: Null or Blank; 2: Inferred Dimension RecordID Int Unique record ID from each staging table Batch_Number Int Reference to Admin_Batch_Status.Batch_Number Message Text Datetime Datetime Default as getdate()
- Add new SP to insert data into Admin_Exception
Output: NONELogic: add Batch_Number
- [@RecordID = ] RecordID
- [@SrcTableName = ]’ SrcTableName’
- [@SrcColumnName = ]’ SrcColumnName’
- [@Type = ]’ Type’
- [@Message = ] ‘Message’
- While there will not be any changing for current existing package at all, a few adjustments will be added, When populating fact tables, all records that are not matched to individual dimension, will be insert into Admin_Exception using usp_InsertExecption. Please see sample below:
- Use conditional split to split not matched dimension into NULL or Blank and inferred Member
- Use Ole Command to add Exception into table Admin_Exception using usp_InsertExecption, the difference between two exceptions are the description.
- Use derived column to set Surrogate key as 0
- Use Ole Command to insert Inferred member using usp_InsertInferred<Dimtable> to Insert new dimension into correspondent dimension table, with IsInferred = 1. Each dimension has one stored procedure accordingly. In addition, Logic is already built in to avoid duplicated insertion.
- Use lookup again to get Surrogate key, while the lookup condition is with NO cache, and only apply for inferred member
- Use Union all to combine all records from three channels
NoteUse the same stored procedure to insert the inferred dimension and return the newly inserted dimension’s key. Sample code of member dimension:
CREATE PROCEDURE [dbo].[usp_TestDimMember]
( @MemberAlternateKey Varchar(16),@outval int OUTPUT)
IF not exists (select MemberAlternateKey from DimMember where MemberAlternateKey = @MemberAlternateKey and toDate is NULL)
insert into DimMember ( MemberAlternateKey,isInferred ) values (@MemberAlternateKey,1)
SET @outval = SCOPE_IDENTITY()
SELECT @outval = MAX(MemberKey) from DimMember where MemberAlternateKey = @MemberAlternateKey and toDate is NULL
DECLARE @errorString nvarchar(4000), @errorSeverity int, @errorState int
-- Common handler for exception message formatting and transaction rollback.
EXEC usp_Process_Exception @errorString OUT, @errorSeverity OUT, @errorState OUT
-- Pass the collected error information up the stack.
RAISERROR (@errorString,@errorSeverity,@errorState);END CATCH
- Create index for the business key column. For example, the MemberAlterKey column of DimMember table.
- Use Multicast component and OLEDB component to log the exception instead of using a stored procedure.