ERWin Dimentional Modeling

Data structure and relationships among entities.

Data Model (conceptual level, logical level, physical level)

Diff b/t Dimensional Model and ER Model

Dimensions, Facts, Granularity and various types of dimensions.

 

Conceptual Data Model

(collating requirements, focus on entities and relationships)

 

Logical Data Model

(attributes, key groups, relationships, business rules)

 

Physical Data Model

(performance of database, strategy of indexing, physical storage, denormalization are considered in this phase)

 

Logical Data Model

Physical Data Model

Represents business info and defines business rules

Represents the physical implementation of the model in a database.

Entity

table

Attribute

column

Primary key

primary key constraint

alternate key

unique constraint or unique index

Inversion key entry

non unique index

rule

check constraint, default value

relationship

foreign key

definition

comment

 

 

 

Basics of RDBMS

1.       RDBMS

2.       Data modeling approaches(Conceptual, Logical, Physical)

3.       ER Modeling

4.       Keys(CK, PK, FK)

5.       Normalization Levels

6.       OLTP

7.       Data mart

8.       OLAP

9.       Datawarehouse(DWH)

 

Identifying Relationships

(It exists when the child table cannot be uniquely identified in the absence of the parent)

Ex : HolderAccount (p),  Account, Holder

Non-Identifying Relationships

(the child can uniquely be identified without the parent)

Ex : Account, AccountType

 

Types of Entities :

Independent entity, dependent entity

Ex: unit, unit head

 

Referential Integrity

(Every non-null foreign key value must be matched by an existing primary key value)

 

Types of Attributes :

Key and Non-key attributes

Required or optional attributes

Simple and composite attributes

Single-valued and multi-valued attributes

Stored, coded, or derived attributes

 

ER Model only supports two dimensional view of data.

Dimensional database is optimized to retrieve data and analyze it. It tends to be subject oriented.

 

Dimensional Modeling(DM) is a logical design technique used in DWH.

Mostly denormalized relational model

Made up of entities with attributes

Relationships defined by primary keys and foreign keys

Used by most contemporary BI solutions “Right” mix of normalization and de normalization often called Dimentional Normalization.

Consists of two types of tables : Dimension tables and Fact tables.

 

Dimension Keys

Dimension Business Key

(Column or columns that uniquely identify an instance of the business record)

(Used in the process of  ETL to tie fact records with dimension members)

Dimension Record Surrogate Key

(Defines the dimension’s primary key)

(Relates to the fact table foreign key field)

(Numeric data type, typically integer)

 

Dimension Surrogate Key

a.       Surrogate Key Usage

Consolidates multi-value business keys

Allows tracking of dimension history

Standardizes dimension tables

Limits fact table width for optimization

 

b.      Surrogate key Design Practices

Avoid smart keys

Avoid production keys (may changes!)

A company may acquire a competitor and thereby change the essential building rules changed-record, but intentionally not change the key.

Narrow as possible

 

Dimension Types

1.       Slowly Changing Dimension

It is a dimension whose attributes or attributes for a record(row) change slowly over time.

Type-1(Overwriting the history)

Type-2(Preserving the history)

Type-3(Preserving one or many versions of history)

2.       Junk(garbage) Dimension

A garbage Dimension is a dimension that includes low-cardinality columns such as codes, indicators and status flags.

The attributes in a Junk Dimension are not related to any hierarchy.

3.       Role-playing Dimension

A single dimension which gets differently expressed in a fact table using views is called a role-playing dimension.

Ex : Create two views for OrderDate and ReceivedData based on the same date dimension table.

 

Facts (or measures) can be categorized by

a)      Numerical data type

b)      Aggregation type

c)       Additive nature

d)      Granularity

Useful facts normally are numeric and additive.

Fact Types :

1)      Transactional

2)      Additive

Additive (Facts that can be summed up/aggregated across all dimensions within the fact table)

 ex : dollars sold

Semi-Additive (Facts that can be added up for some of dimensions in a fact table, but not for others)

 ex : account balance, inventory level, distinct counts

Non-Additive (Facts cannot be summed up for any of the dimensions which are present in the fact table)

Ex: measurement of room temperature

3)      Snapshot or inventory

4)      Factless

5)      Derived

6)      Aggregate

When to use snowflake

1)      The dimension table has two or more sets of attributes which normally define information at different grains.

2)      The attribute sets of the same dimension table are getting populated by different source systems.

It is not possible to implement a many to many relationship in a physical model as two tables can’t be children of one another

It needs to use a bridge table to solve this issue.

Introduce a junction table Customer_PaymentModes, which has many to one relationship with both Customer and PaymentModes.

發表評論
所有評論
還沒有人評論,想成為第一個評論的人麼? 請在上方評論欄輸入並且點擊發布.
相關文章