Tuesday, January 1, 2019

How to model Cognos FM to join two fact tables at different granularity with not shared dimension(s)

First of all, joining two fact tables at different granularity is a big no-no. In addition, we can't use a determinant on a fact table because it will aggregate your measure. However, this is not a valid case as data with different granularities should be defined as different tables from        data warehousing design point of view. This document is intended to share a solution to provide an effective Cognos FM model.

Problem
In order to explain this issue in detail, we create a sample below.
C3 is only used for FACT1. The M2 in FACT2, or denominator  is based on C1 and C2 only. The challenge is how to make Cognos FM model be flexible enough to allow report authors to generate report by simply drag and drop, using relational model or DMR model.
 
Analysis
It looks like there are a few options to try
Option1: Using UNION to bring both datasets together
 

Data looks like below

This is a problem, as data cannot be aligned in report.
 
 Option2: Using JOIN to bring both dataset together

Report

Even we apply semi aggregate function, we still cannot generate the current report.
 
Option3: Using STICTCH QUARY to bring both dataset together
Cognos has two core concepts, stitched query and determinant. However, this sample is not a typical sample, as C3 is not shared dimension. To resolve this problem, we can “FAKE” an extra column for FACT2 with value “total”. We can also “FAKE” an extra column for C3 with value “total”. Doing this way will bring all three dimensions as confirmed and shared dimensions, where the relationship between C3 and FACT2 is connect at “Total” level. This relationship can be further adjusted by determinant to avoid the double count.

Solution
Below is the FM design screenshot, where we get 3 dimensions are designed and turn the design as multiple fact tables with different granularities.



Create a faked column as C3_Total   in FACT2
[Data Source Layer].[ FACT2]: select C1, C2, M2, 'Total' as C3_Total  from [DATASOURCE1].FACT2

Create a view as V_C3  with a fake column Total, the reason we create this view is because only table or view can be supported for Cognos determinant.  
[Data Source Layer].[ C3]: select *  from [DATASOURCE1].V_C3
CREATE OR REPLACE VIEW SCHEMEX.V_C3 AS SELECT 'Total' AS TOTAL, SCHEMAX.C3.ID FROM C3



Result analysis
Case 1:
Report:

 SQL:
 select
 D2.ID ID ,
 D2.M1 M1 ,
 D3.M2 M2
from
(
 select
  C3.ID ID ,
  sum(FACT1.M1) M1
 from SCHEMAX.C3 C3, SCHEMAX.FACT1 FACT1
 where FACT1.C3 = C3.ID group by C3.ID
) D2,
(
 select
  sum(M2) M2
 from SCHEMAX.FACT2
  having count(*) > 0
) D3


 Data:


 Case 2:Report:

 SQL:
select
 D2.ID ID ,
 D2.M1 M1 ,
 D3.M2 M2 ,
 (D2.rc * 1.0e0) / nullif(D3.rc, 0) C____M1__M2_ ,
 min(D2.Summary_M1_) over () Summary_M1_ ,
 min(D3.Summary_M2_) over () Summary_M2_ ,
 (min(D2.rc5) over () * 1.0e0) / nullif(min(D3.rc) over (), 0) Summary____M1__M2__
from (
  select
   T0.C0 ID ,
   T0.C1 M1 ,
   T0.C1 rc
   , sum(T0.C1) over () Summary_M1_ ,
   sum(T0.C1) over () rc5
   from (
    select C3.ID C0 , sum(FACT1.M1) C1
    from SCHEMAX.C3 C3, SCHEMAX.FACT1 FACT1
    where FACT1.C3 = C3.ID group by C3.ID
   ) T0
 ) D2,
 (
  select
   distinct T0.C0 M2 ,
   T0.C0 rc ,
   T0.C0 Summary_M2_
   from
  (
   select sum(M2) C0 from SCHEMAX.FACT2
  ) T0
 ) D3


Data:

 Case 3:Report:
 SQL:
select
 (coalesce(D2.ID1, D3.ID1)) ID1 ,
 (coalesce(D2.ID2, D3.ID2)) ID2 ,
 D2.ID ID ,
 D2.M1 M1 ,
 D3.M2 M2
from
 (
  select
   C1.ID ID1 ,
   C2.ID ID2 ,
   C3.ID ID ,
   sum(FACT1.M1) M1
  from SCHEMAX
.C1 C1,
   SCHEMAX.C2 C2,
   SCHEMAX.V_C3 C3,
   SCHEMAX.FACT1 FACT1
  where FACT1.C1 = C1.ID
   and FACT1.C2 = C2.ID
   and FACT1.C3 = C3.ID
  group by C3.ID, C1.ID, C2.ID
 ) D2
 FULL OUTER JOIN
 (
  select C1.ID ID1 ,
   C2.ID ID2 ,
   sum(FACT2.M2) M2
  from SCHEMAX.C1 C1,
   SCHEMAX.C2 C2,
   SCHEMAX.FACT2 FACT2,
   (
    select C3.TOTAL TOTAL
    from SCHEMAX.V_C3 C3
    group by C3.TOTAL
   ) C3
  where C1.ID = FACT2.C1
   and C2.ID = FACT2.C2
   and C3.TOTAL = 'Total'
  group by C1.ID, C2.ID
 ) D3
 on D2.ID1 = D3.ID1 and D2.ID2 = D3.ID2;


Data:

 
  Case 4:Report:












SQL:

select
  (coalesce(D2.ID1, D3.ID1)) ID1 ,
  (coalesce(D2.ID2, D3.ID2)) ID2 ,
  D2.ID ID ,
  D2.M1 M1 ,
  D3.M2 M2
from
 (
   select
    C1.ID ID1 ,
    C2.ID ID2 ,
    C3.ID ID ,
    sum(FACT1.M1) M1
   from SCHEMAX.C1 C1,
    SCHEMAX.C2 C2,
    SCHEMAX.V_C3 C3,
    SCHEMAX.FACT1 FACT1
   where FACT1.C1 = C1.ID
    and FACT1.C2 = C2.ID
    and FACT1.C3 = C3.ID
   group by C3.ID, C1.ID, C2.ID
 ) D2
 FULL OUTER JOIN
 (
   select C1.ID ID1 ,
    C2.ID ID2 ,
    sum(FACT2.M2) M2
   from SCHEMAX.C1 C1,
    SCHEMAX.C2 C2,
    SCHEMAX.FACT2 FACT2,
    (
     select C3.TOTAL TOTAL
     from SCHEMAX.V_C3 C3
     group by C3.TOTAL
    ) C3
   where C1.ID = FACT2.C1
    and C2.ID = FACT2.C2
    and C3.TOTAL = 'Total'
   group by C1.ID, C2.ID
 ) D3
 on D2.ID1 = D3.ID1 and D2.ID2 = D3.ID2;

Data:

Saturday, December 1, 2018

How to build a practical planning and forecasting data model for Cognos reports

Requirement
Given that Actual data is provided monthly,  forecast data is provided quarterly for months up to next October, Plan data is provided annually for current October to next October. Sample below

 It is requested to report for current quarter, current fiscal year and current fiscal year to date




Approach       


Assume that Historical reports is not needed to reproduce.  
Report authors don't need to create measure on the fly, instead, the logic will be embedded in data mart.       
Report data should be designed as same granularity and addable.       

Solution
The idea is to wrap the logic into ETL as below
Combined actual: move forecast data from future periods into actual to report actual data for aggregated periods such as quarter, FY.    
Combine forecast: move forecast data from previous quarters to align to current quarter to report forecast data for aggregated periods such as quarter, FY and FYTD
Combined plan: move plan data from previous year to align to current year to report plan data for aggregated periods such as quarter, FY and FYTD.






Dataflow       
Step 1: Load combined plan data
Step 2: Normalize forecast data
Step 3: Load actual data
Step 4: load combined forecast data
Step 6: load combined actual data

Thursday, November 1, 2018

How to Transpose / Pivot millions of records efficiently in SQL/Netezza

SAS has the transpose process, SQL server and Oracle has pivot function, while both  transpose and pivot functions are not supported in Netezza version. This short document is intended to introduce a very efficient way to query data to retrieve the pivot result. Please see sample below

CREATE TABLE STUDENTS (SID VARCHAR(11)) ;
INSERT INTO STUDENTS VALUES ('ADAM') ;
INSERT INTO STUDENTS VALUES ('BRIAN') ;
INSERT INTO STUDENTS VALUES ('CHARLES') ;
INSERT INTO STUDENTS VALUES ('DAVE') ;
SELECT * FROM STUDENTS;
 

SID
DAVE
CHARLES
BRIAN
ADAM

 

CREATE TABLE MARKS (SID VARCHAR(11), SUBJECT VARCHAR(11), MARK INT2) ;
INSERT INTO MARKS VALUES ('ADAM', 'ENGLISH', 20) ;
INSERT INTO MARKS VALUES ('ADAM', 'HISTORY', 40) ;
INSERT INTO MARKS VALUES ('ADAM', 'SCIENCE', 60) ;
INSERT INTO MARKS VALUES ('BRIAN', 'HISTORY', 60) ;
INSERT INTO MARKS VALUES ('BRIAN', 'SCIENCE', 80) ;
INSERT INTO MARKS VALUES ('CHARLES', 'ENGLISH', 60) ;
INSERT INTO MARKS VALUES ('CHARLES', 'SCIENCE', 20) ;
INSERT INTO MARKS VALUES ('DAVE', 'ENGLISH', 80) ;
INSERT INTO MARKS VALUES ('DAVE', 'HISTORY', 20) ;
INSERT INTO MARKS VALUES ('DAVE', 'SCIENCE', 40) ;
SELECT * FROM MARKS;

 
SID
SUBJECT
MARK
ADAM
ENGLISH
20
ADAM
HISTORY
40
ADAM
SCIENCE
60
BRIAN
HISTORY
60
BRIAN
SCIENCE
80
CHARLES
ENGLISH
60
CHARLES
SCIENCE
20
DAVE
ENGLISH
80
DAVE
HISTORY
20
DAVE
SCIENCE
40

 

SELECT S.SID, E.MARK AS ENGLISH, H.MARK AS HISTORY, C.MARK AS SCIENCE
FROM STUDENTS AS S
LEFT OUTER JOIN MARKS AS E ON (S.SID = E.SID AND E.SUBJECT = 'ENGLISH')
LEFT OUTER JOIN MARKS AS H ON (S.SID = H.SID AND H.SUBJECT = 'HISTORY')
LEFT OUTER JOIN MARKS AS C ON (S.SID = C.SID AND C.SUBJECT = 'SCIENCE')
ORDER BY 1 ;

 
SID
ENGLISH
HISTORY
SCIENCE
ADAM
20
40
60
BRIAN
NULL
60
80
CHARLES
60
NULL
20
DAVE
80
20
40

  

Actually, this sample above is the excerpt from Netezza support. However,the most efficient way is use the query below with two advantages:
1.    Only one table is used
2.    Not any joins are needed
3.    The performance is much better than approach above

 
SELECT
SID
,MAX(CASE WHEN SUBJECT = 'ENGLISH' THEN MARK ELSE NULL END) AS ENGLISH
,MAX(CASE WHEN SUBJECT = 'HISTORY' THEN MARK ELSE NULL END) AS HISTORY
,MAX(CASE WHEN SUBJECT = 'SCIENCE' THEN MARK ELSE NULL END) AS SCIENCE
FROM MARKS
GROUP BY SID
ORDER BY 1 ;

SID
ENGLISH
HISTORY
SCIENCE
ADAM
20
40
60
BRIAN
NULL
60
80
CHARLES
60
NULL
20
DAVE
80
20
40

 

Monday, October 1, 2018

How to automatically create summary table based on existing FM Model

As Known, summary tables provides a big performance advantage. This document will share a solution to automatically generate summary table with the help of Excel spreadsheet and Cognos FM, assuming that all detail tables are ready and modeled from Cognos Framework Manager.

Assumption:
The existing ETL and reports based on detail tables represent a significant investment.
All these detail tables can be consolidated.

Goal:
We will make summary tables with analytical capacity with much better performance;
We can also use summary tables to speed up any new report development and dashboard; and
We can make summary tables to support both Cognos and Tableau.

Approach:
To make current existing ETL code and report not touched or with minimal adjustment, we will reuse conformed measures and conformed dimensions.
Overall the effort to create summary tables is assessed to be considerably smaller than expected.
Identify all conformed dimensions
Identify all conformed measures
create Summary tables
We can list all measures and then identify most used measures for summary tables.

Implementation:

Step 1: create a Data Warehouse Bus Matrix as below

Step 2: create a excel template to generate SQL

The basic logic is to define new table, and insert data from detail tables into new summary table. All column names are defined as column names and types, including both conformed dimensions and measures.

DROP TABLE FACT_AGGREGATE_PERFORMANCE ;
CREATE TABLE FACT_AGGREGATE_PERFORMANCE ....
INSERT INTO FACT_AGGREGATE_PERFORMANCE
SELECT cast(to_char(FILEDATE,'YYYYMM') as integer) AS YYYYMM..........
UNION
SELECT YEAR_MONTH AS YYYYMM, .........
 Create table for example: CREATE TABLE MPL_BI."&B14&" ("&C6&" "&C7&","&D6&" "&D7&","&E6&" "&E7&","&F6&" "&F7&","&G6&" "&G7&","&H6&" "&H7&","&I6&" "&I7&","&J6&" "&J7&","&K6&" "&K7&","&L6&" "&L7&","&M6&" "&M7&","&N6&" "&N7&","&O6&" "&O7&","&P6&" "&P7&","&Q6&" "&Q7&","&R6&" "&R7&","&S6&" "&S7&","&T6&" "&T7&","&C14&" "&C15&","&D14&" "&D15&","&E14&" "&E15&","&F14&" "&F15&","&G14&" "&G15&","&H14&" "&H15&","&I14&" "&I15&","&J14&" "&J15&","&K14&" "&K15&","&L14&" "&L15&","&M14&" "&M15&","&N14&" "&N15&");
Insert data for example: SELECT "&C4& " AS "&$C$6&","&D4& " AS "&$D$6&","&E4& " AS "&$E$6&","&F4& " AS "&$F$6&","&G4& " AS "&$G$6&","&H4& " AS "&$H$6&","&I4& " AS "&$I$6&","&J4& " AS "&$J$6&","&K4& " AS "&$K$6&","&L4& " AS "&$L$6&","&M4& " AS "&$M$6&","&N4& " AS "&$N$6&","&O4& " AS "&$O$6&","&P4& " AS "&$P$6&","&Q4& " AS "&$Q$6&","&R4& " AS "&$R$6&","&S4& " AS "&$S$6&","&T4& " AS "&$T$6&","&"SUM("&C12&")"&" AS " &$C$14& ","&"SUM("&D12&")"&" AS " &$D$14& ","&"SUM("&E12&")"&" AS " &$E$14& ","&"SUM("&F12&")"&" AS " &$F$14& ","&"SUM("&G12&")"&" AS " &$G$14& ","&"SUM("&H12&")"&" AS " &$H$14 &","&"SUM("&I12&")"&" AS " &$I$14 &","&"SUM("&J12&")"&" AS " &$J$14 &","&"SUM("&K12&")"&" AS " &$K$14 &","&"SUM("&L12&")"&" AS " &$L$14 &","&"SUM("&M12&")"&" AS " &$M$14 &","&"SUM("&N12&")"&" AS " &$N$14 & " FROM  MPL_BI."&B12 &C18 & " GROUP BY " &$C4& "," &$D4& "," &$E4& "," &$F4& "," &$G4& "," &$H4& "," &$I4& "," &$J4& "," &$K4& "," &$L4& "," &$M4& "," &$N4& "," &$O4& "," &$P4& "," &$Q4& "," &$R4& "," &$S4& "," &$T4 & " UNION "
Step 3:  Fill in logic for columns and measures
The logic is actually built into Cognos Framework Manager, we need to simply find the column and test result, and find the native SQL, and then fill into column.

Step 4:  handle special cases such as to combine detail fact tables with different granularity Combine detail fact tables with different granularity

Sometimes, the detail tables have different granularities and different columns. One detail table includes all data, but not all columns, while other fact table includes only partial data only but with much more related columns. The challenge is to combine both data together to present the single interface which should not make end users aware of two different data sets.
The solution is to separate measures into two different groups:
1.Make sure all conformed dimensions are  assigned to both  detail tables.  This way will make both detail fact tables share the same conformed dimensions. (This is very much act as the stitched query in Cognos with multiple fact table.) if column(s) is not available, then assign -1, or “-1” for the column.
2.Assume that there are two detail fact tables, or Fact 1 and Fact2. Fact 1 handles measure A, B and C, while Fact 2 handles measure D, E and F.  In this case, then assign Measure A, B and C, and assign 0, or NULL for measure D, E and F for  Fact 1; Assign Measure D, E and F, and assign 0, or NULL for measure A, B and C for  Fact 2 able.