Thursday, November 1, 2018

How to Transpose / Pivot millions of records efficiently in SQL/Netezza

SAS has the transpose process, SQL server and Oracle has pivot function, while both  transpose and pivot functions are not supported in Netezza version. This short document is intended to introduce a very efficient way to query data to retrieve the pivot result. Please see sample below

CREATE TABLE STUDENTS (SID VARCHAR(11)) ;
INSERT INTO STUDENTS VALUES ('ADAM') ;
INSERT INTO STUDENTS VALUES ('BRIAN') ;
INSERT INTO STUDENTS VALUES ('CHARLES') ;
INSERT INTO STUDENTS VALUES ('DAVE') ;
SELECT * FROM STUDENTS;
 

SID
DAVE
CHARLES
BRIAN
ADAM

 

CREATE TABLE MARKS (SID VARCHAR(11), SUBJECT VARCHAR(11), MARK INT2) ;
INSERT INTO MARKS VALUES ('ADAM', 'ENGLISH', 20) ;
INSERT INTO MARKS VALUES ('ADAM', 'HISTORY', 40) ;
INSERT INTO MARKS VALUES ('ADAM', 'SCIENCE', 60) ;
INSERT INTO MARKS VALUES ('BRIAN', 'HISTORY', 60) ;
INSERT INTO MARKS VALUES ('BRIAN', 'SCIENCE', 80) ;
INSERT INTO MARKS VALUES ('CHARLES', 'ENGLISH', 60) ;
INSERT INTO MARKS VALUES ('CHARLES', 'SCIENCE', 20) ;
INSERT INTO MARKS VALUES ('DAVE', 'ENGLISH', 80) ;
INSERT INTO MARKS VALUES ('DAVE', 'HISTORY', 20) ;
INSERT INTO MARKS VALUES ('DAVE', 'SCIENCE', 40) ;
SELECT * FROM MARKS;

 
SID
SUBJECT
MARK
ADAM
ENGLISH
20
ADAM
HISTORY
40
ADAM
SCIENCE
60
BRIAN
HISTORY
60
BRIAN
SCIENCE
80
CHARLES
ENGLISH
60
CHARLES
SCIENCE
20
DAVE
ENGLISH
80
DAVE
HISTORY
20
DAVE
SCIENCE
40

 

SELECT S.SID, E.MARK AS ENGLISH, H.MARK AS HISTORY, C.MARK AS SCIENCE
FROM STUDENTS AS S
LEFT OUTER JOIN MARKS AS E ON (S.SID = E.SID AND E.SUBJECT = 'ENGLISH')
LEFT OUTER JOIN MARKS AS H ON (S.SID = H.SID AND H.SUBJECT = 'HISTORY')
LEFT OUTER JOIN MARKS AS C ON (S.SID = C.SID AND C.SUBJECT = 'SCIENCE')
ORDER BY 1 ;

 
SID
ENGLISH
HISTORY
SCIENCE
ADAM
20
40
60
BRIAN
NULL
60
80
CHARLES
60
NULL
20
DAVE
80
20
40

  

Actually, this sample above is the excerpt from Netezza support. However,the most efficient way is use the query below with two advantages:
1.    Only one table is used
2.    Not any joins are needed
3.    The performance is much better than approach above

 
SELECT
SID
,MAX(CASE WHEN SUBJECT = 'ENGLISH' THEN MARK ELSE NULL END) AS ENGLISH
,MAX(CASE WHEN SUBJECT = 'HISTORY' THEN MARK ELSE NULL END) AS HISTORY
,MAX(CASE WHEN SUBJECT = 'SCIENCE' THEN MARK ELSE NULL END) AS SCIENCE
FROM MARKS
GROUP BY SID
ORDER BY 1 ;

SID
ENGLISH
HISTORY
SCIENCE
ADAM
20
40
60
BRIAN
NULL
60
80
CHARLES
60
NULL
20
DAVE
80
20
40

 

Monday, October 1, 2018

How to automatically create summary table based on existing FM Model

As Known, summary tables provides a big performance advantage. This document will share a solution to automatically generate summary table with the help of Excel spreadsheet and Cognos FM, assuming that all detail tables are ready and modeled from Cognos Framework Manager.

Assumption:
The existing ETL and reports based on detail tables represent a significant investment.
All these detail tables can be consolidated.

Goal:
We will make summary tables with analytical capacity with much better performance;
We can also use summary tables to speed up any new report development and dashboard; and
We can make summary tables to support both Cognos and Tableau.

Approach:
To make current existing ETL code and report not touched or with minimal adjustment, we will reuse conformed measures and conformed dimensions.
Overall the effort to create summary tables is assessed to be considerably smaller than expected.
Identify all conformed dimensions
Identify all conformed measures
create Summary tables
We can list all measures and then identify most used measures for summary tables.

Implementation:

Step 1: create a Data Warehouse Bus Matrix as below

Step 2: create a excel template to generate SQL

The basic logic is to define new table, and insert data from detail tables into new summary table. All column names are defined as column names and types, including both conformed dimensions and measures.

DROP TABLE FACT_AGGREGATE_PERFORMANCE ;
CREATE TABLE FACT_AGGREGATE_PERFORMANCE ....
INSERT INTO FACT_AGGREGATE_PERFORMANCE
SELECT cast(to_char(FILEDATE,'YYYYMM') as integer) AS YYYYMM..........
UNION
SELECT YEAR_MONTH AS YYYYMM, .........
 Create table for example: CREATE TABLE MPL_BI."&B14&" ("&C6&" "&C7&","&D6&" "&D7&","&E6&" "&E7&","&F6&" "&F7&","&G6&" "&G7&","&H6&" "&H7&","&I6&" "&I7&","&J6&" "&J7&","&K6&" "&K7&","&L6&" "&L7&","&M6&" "&M7&","&N6&" "&N7&","&O6&" "&O7&","&P6&" "&P7&","&Q6&" "&Q7&","&R6&" "&R7&","&S6&" "&S7&","&T6&" "&T7&","&C14&" "&C15&","&D14&" "&D15&","&E14&" "&E15&","&F14&" "&F15&","&G14&" "&G15&","&H14&" "&H15&","&I14&" "&I15&","&J14&" "&J15&","&K14&" "&K15&","&L14&" "&L15&","&M14&" "&M15&","&N14&" "&N15&");
Insert data for example: SELECT "&C4& " AS "&$C$6&","&D4& " AS "&$D$6&","&E4& " AS "&$E$6&","&F4& " AS "&$F$6&","&G4& " AS "&$G$6&","&H4& " AS "&$H$6&","&I4& " AS "&$I$6&","&J4& " AS "&$J$6&","&K4& " AS "&$K$6&","&L4& " AS "&$L$6&","&M4& " AS "&$M$6&","&N4& " AS "&$N$6&","&O4& " AS "&$O$6&","&P4& " AS "&$P$6&","&Q4& " AS "&$Q$6&","&R4& " AS "&$R$6&","&S4& " AS "&$S$6&","&T4& " AS "&$T$6&","&"SUM("&C12&")"&" AS " &$C$14& ","&"SUM("&D12&")"&" AS " &$D$14& ","&"SUM("&E12&")"&" AS " &$E$14& ","&"SUM("&F12&")"&" AS " &$F$14& ","&"SUM("&G12&")"&" AS " &$G$14& ","&"SUM("&H12&")"&" AS " &$H$14 &","&"SUM("&I12&")"&" AS " &$I$14 &","&"SUM("&J12&")"&" AS " &$J$14 &","&"SUM("&K12&")"&" AS " &$K$14 &","&"SUM("&L12&")"&" AS " &$L$14 &","&"SUM("&M12&")"&" AS " &$M$14 &","&"SUM("&N12&")"&" AS " &$N$14 & " FROM  MPL_BI."&B12 &C18 & " GROUP BY " &$C4& "," &$D4& "," &$E4& "," &$F4& "," &$G4& "," &$H4& "," &$I4& "," &$J4& "," &$K4& "," &$L4& "," &$M4& "," &$N4& "," &$O4& "," &$P4& "," &$Q4& "," &$R4& "," &$S4& "," &$T4 & " UNION "
Step 3:  Fill in logic for columns and measures
The logic is actually built into Cognos Framework Manager, we need to simply find the column and test result, and find the native SQL, and then fill into column.

Step 4:  handle special cases such as to combine detail fact tables with different granularity Combine detail fact tables with different granularity

Sometimes, the detail tables have different granularities and different columns. One detail table includes all data, but not all columns, while other fact table includes only partial data only but with much more related columns. The challenge is to combine both data together to present the single interface which should not make end users aware of two different data sets.
The solution is to separate measures into two different groups:
1.Make sure all conformed dimensions are  assigned to both  detail tables.  This way will make both detail fact tables share the same conformed dimensions. (This is very much act as the stitched query in Cognos with multiple fact table.) if column(s) is not available, then assign -1, or “-1” for the column.
2.Assume that there are two detail fact tables, or Fact 1 and Fact2. Fact 1 handles measure A, B and C, while Fact 2 handles measure D, E and F.  In this case, then assign Measure A, B and C, and assign 0, or NULL for measure D, E and F for  Fact 1; Assign Measure D, E and F, and assign 0, or NULL for measure A, B and C for  Fact 2 able.

Saturday, September 1, 2018

How to make KPI measure clickable in Cognos active reports

Requirement

When clicking Revenue
 


















 


When click Product cost





















Analysis
Use crosstab: As only active report variable can only specified by data item, it is difficult to find function to retrieve measure caption. In other word, when clicking value 914M, active variable should be specified as ‘Revenue’; when clicking value 547M, active variable should be specified as ‘Product cost’.  The only solution is to split it as multiple crosstab, each crosstab will specify one data item, such as  ‘Revenue’ or  ‘Product cost’.  Even with multiple crosstabs, the column title still cannot be share the same highlighted area, as column title can is not specified as click region.
Use List: It will have the same problem as crosstab. This approach won’t be very flexible, as crosstab give data more flexibility to display.
Use static repeat table: This solution provides the maximum display flexibility, but it still like crosstab and list to have their limitations.
Use static repeat table with singleton to achieve the needed functionality. Each static repeat table has only one cell. Each cell will contain normal table, and each table will have singleton. All query can be shared.

Implementation
- Create a table with two rows, where the first one contains table with two static repeat table. Each static table has only one cell. the second row will contain a deck with two cards – page1 and Page2


- To make the title with measure together, we need to put both value and title together. Please see page structure below.











































- Variable vPage will be specified by both active reportable table, and then pass it to deck to make it take in effect .

Wednesday, August 1, 2018

How to make Cognos report work with historical data snapshot

1 Context
There are many business cases where you need to run report based on historical data and produce the same report generated before. If the data amount is very big, then this requirement is very challenge. There are difficult ways to archive historical data, such as save the whole database, save the whole table. However, these solutions may not be very efficient in term of capacity and maintenance. This document is intended to share one solution based on assumptions below:
1) Only portion of data are changed
2) Cognos framework manager is used

2 Overall solutionThe diagram below illustrates the architecture about how the data flow to report based on the selection.


3 Host historical data in separated table with snapshot column
It is proposed to host two tables below:

Current FactTable
Static Fields
Dynamic Fields
SF1
SFn
DF1
DFn
XX
XX
XX
XX
XX
XX
XX
XX
XX
XX
XX
XX


Historical  FactTable  
Snapshot 1 
Snapshot n
Snapshot
DF1
DFn
Snapshot
DF1
DFn
Snapshot 1
XX
XX
XX
Snapshot n
XX
XX
XX
Snapshot 1
XX
XX
XX
Snapshot n
XX
XX
XX

Please note that Current FactTable holds all current data with all columns containing two sets of data:
Static data – data is not changed
Dynamic data – data is changed month by month
Historical  FactTable  holds all historical snapshots, for example, each month has one snapshot. The snapshot only saves columns that data are changed.

4 Make report generation based on snapshot
A new parameter is introduced to allow report users to choose snapshot. This parameter is built into historical FactTable with filter Snapshot = #Prompt()# with default value. If it is default, then there won’t be any data for report to use. But if users choose a snapshot, then these dynamic columns will be used for report.

5 Introduce a generic reportable Fact Table
Reportable fact table is dynamic generated based on snapshot. If there is snapshot specified, then these columns will be replaced from historical data based on selected snapshot

Reportable FactTable
Static Fields
Dynamic Fields
SF1
SFn
DF1
DFn
XX
XX
XX
XX
XX
XX
XX
XX
XX
XX
XX
XX

Basically, if new parameter is introduced to allow report users to choose snapshot. This parameter is built into historical FactTable with filter Snapshot = #Prompt()# with default value. If it is default, then there won’t be any data for report to use. But if users choose a snapshot, then these dynamic columns will be used for report.

6 Make left join relationship to accomplish replace current data with a historical snapshot
The generic reportable Fact Table is the join result from both current Fact table and historical Fact table. There are two cases:
Case 1: when no snapshot selected, then the left join simply get no data from historical snapshot. The dynamic columns in reportable fact table is actually from current fact table.
Case 2: when a snapshot is selected, then the left join get data from historical snapshot. The dynamic columns in reportable fact table will take data from snapshot over the original data from current fact table.
All the logic can be built in Cognos framework manager for business tier. The advantage is that all existing reports will not be needed to change at all. we can simply change column definition to achieve this goal, such as NVL ([Historical FactTable].[DFi], ([Currental FactTable].[DFi],))