These conventions are based upon Oracle’s BI Applications 7.9.6.3 standards.
The Analytic Data Warehouse (ADW) is an arbitrary name for the performance layer of your data-warehouse. The diagram below shows a typical data warehouse reference architecture with the Performance Layer circled in red.
Table Names
The ADW schema will predominantly hold tables built by ETL and queried by OBIEE
- Fact tables contain columns which can be summed, averaged, etc.
- Dimensional tables contain descriptive columns
- ADW tables to be loosely named in the format business name/grain/type
- Business name is a short business description of the table which is not tied to Northgate or e-Business Suite names
- Grain is optional and describes table granularity (useful for aggregate tables)
- For the time dimension, the level names DAY, WEEK, MONTH, YEAR should be used
- Type is one of:
- _A = aggregated fact table
- _D = dimension table
- _DH = dimension hierarchy table
- _F = fact table
- Staging tables (used during ETL but not by OBIEE) will have S appended to the
- _DS = dimensional staging table
- _FS = fact staging table
- Materialized views should have MV appended to them
- _DMV = dimensional materialized view
- _FMV = fact materialized view
- By default use singular names (facts maybe plural?)
- Some examples:
- TENANCY_D = Tenancy dimension
- DEBT_DETAIL_F = Debt fact (lowest level detail)
- DEBT_REGION_A = Debt fact aggregated to a regional level
- DAY_D = Calendar dimension at day level (lowest level)
- WEEK_D = Calendar dimension at week level
- MONTH_D = Calendar dimension at month level
- PROPERTY_UTILISATION_FMV = Property utilisation fact materialized view
Columns
- Column names should be business relevant, non-technical, and not tied to Northgate or e-Business Suite names
- Column suffixes:
- WID - Identifier generated by ETL to link dimension and fact tables (except ROW_ID). For example TENANCY_WID would reference TENANCY_D.ROW_ID.
- _ID - Source system identifier
- _FLG - Flags Y/N
- _IND - Indicator e.g. D,W,M
- _DT - Date
- _TS – Timestamp
- _AMT - Amount
- _CNT - Count
- _PCT - Percentage
- _CODE - A coded alphanumeric value that defines a possible value for a data element
- _NAME - A short name that describes the code
- _DESC - A long description for the code
- Other conventions:
- _NO - For descriptive number columns, i.e. TELEPHONE_NO
- _TXT - For text fields which contain concatenated values etc.
- System columns on each table:
- ROW_WID - surrogate key to uniquely identify record (generated during ETL), first column in table
- DATASOURCE_NUM_ID - Unique identifier of source system from which data was extracted. TBD numbers to use, 2nd column in table
- INTEGRATION_ID - Unique identifier of fact or dimension in source system, 3rd column in table
- W_INSERT_DT - Date inserted to Warehouse, at the end of table column definitions
- W_UPDATE_DT - Date updated to Warehouse, at the end of table column definitions
- ETL_RUN_ID - Identifier of ETL process that loaded, integrated, checked the row
- Optional on each table:
- CREATED_ON_DT - Date created in source system
- CHANGED_ON_DT - Date changed in source system
- DELETED_FLG - For logical deletes of warehouse records
- Type 2 Slowly Changing Dimension columns:
- EFFECTIVE_FROM_DT – Date that the dimensional record is effective from
- EFFECTIVE_TO_DT – Date that the dimensional record is effective to
- CURRENT_FLG – Is record current? Y/N
- All ADW surrogate keys should be integers and generated from a sequence
- Foreign keys should always use warehouse surrogate keys
- The W_ prefix denotes a translated to warehouse code/name/description combination. (the concept of domain values)
Notes on Time Dimensions
- Reporting, financial, and other calendars are all associated (de-normalized) onto each day of the year
- Dimensional aggregates are needed by OBIEE for aggregate navigation. So WEEK_D, MONTH_D, and YEAR_D should also be created.
- ETL updates dimension with reporting flags as necessary
- The surrogate key ROW_WID is an integer representation of the period, of the format YYYYMMDD or YYYYMM or YYYY
Example Physical Table Column Conventions
Domain | Datatype | Notes |
AMT | Number (12,2) | Amount |
CNT | Integer | Count |
CODE | Varchar2 (20) | Code e.g. TUS, TUD |
DD | Number (2) | Day of month e.g. 1-31 |
DDD | Number (3) | Day of year e.g. 1-366 |
DESC | Varchar2 (200) | Code description |
DS | Number (2) | Datasource e.g. 1=EBS |
DT | Date | Date and time |
FLG | Char (1) | Y/N flag |
ID | Varchar2 (12) | Identifier |
IND | Varchar2 (3) | Indicator e.g. D,W,M,Q,A |
MM | Number (2) | Month of year e.g. 1-12 |
NAME | Varchar2 (250) | Name |
NO | Number (12) | Number |
PCT | Number (3,3) | Percentage |
| Number (2) | Quarter of year e.g. 1-4 |
TS | Timestamp | Timestamp |
TXT | Varchar2 (200) | Text string e.g. +64 4 1234567 |
WW | Number (2) | Week of year e.g. 1-53 |
WID | Number (12) | Warehouse Identifier i.e. rowid |
YYYY | Number (4) | Year e.g. 2012 |
Refresher on Slowly Changing Dimensions Types
(from Wikipedia)