Data Warehousing and Mining
Description
Revision mind map for Data Warehousing and Mining.
data management
data warehousing
data mining
Mind Map by
i7752068
, updated more than 1 year ago
Created by
i7752068
over 10 years ago
Resource summary
Data Warehousing and Mining
Data Warehousing
Increased corporate productivity.
Competitive advantage.
Potential for high ROI.
Extremely high initial costs (£50k+)
Long development time (around 3 years).
High demand for memory.
High maintenance costs.
Problems with source data (extraction, cleaning, loading).
Building a Data Warehouse Database (Dimensionality Modelling)
Fact Tables
Contains facts generated by events in the past.
Data in tables should be regarded as read only.
Tables are often very large.
Dimension Tables
Contains descriptive textual data.
Simple primary keys.
Gives a characteristic star schema or star join.
Star Schema
De-normalising reference data can speed up query performance.
Main aim is to avoid data redundancy.
This is achieved in part via the process of normalisation.
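The fact and dimension tables described above can be sketched as a tiny star schema. This is only an illustration using Python's built-in sqlite3 module: the table names (fact_sale, dim_branch, dim_property), the columns and the rows are all hypothetical and do not come from the mind map.

```python
import sqlite3

# In-memory database; all table names and rows are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_branch (branch_id INTEGER PRIMARY KEY, city TEXT);
CREATE TABLE dim_property (property_id INTEGER PRIMARY KEY, property_type TEXT);
-- The fact table holds one row per past event and points at each dimension.
CREATE TABLE fact_sale (
    branch_id INTEGER REFERENCES dim_branch(branch_id),
    property_id INTEGER REFERENCES dim_property(property_id),
    revenue REAL
);
INSERT INTO dim_branch VALUES (1, 'Glasgow'), (2, 'London');
INSERT INTO dim_property VALUES (10, 'Flat'), (11, 'House');
INSERT INTO fact_sale VALUES (1, 10, 500.0), (1, 11, 750.0), (2, 10, 900.0);
""")

# Star join: the central fact table joined directly to every dimension table.
star_join = """
SELECT b.city, p.property_type, SUM(f.revenue)
FROM fact_sale AS f
JOIN dim_branch AS b ON b.branch_id = f.branch_id
JOIN dim_property AS p ON p.property_id = f.property_id
GROUP BY b.city, p.property_type
ORDER BY b.city, p.property_type
"""
rows = conn.execute(star_join).fetchall()
for row in rows:
    print(row)
```

Each dimension table has a simple single-column primary key, and the query only reads from the fact table, in keeping with treating it as read only.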
OLTP System
Automating business saves money.
Data could be useful in the organisation's future operations.
Information too detailed.
May require information from more than one OLTP system.
Difficult to extract information.
Snowflake Schema
Variant of Star Schema where dimension tables do not contain de-normalised data.
Dimension tables have other dimension tables linked to them via foreign keys.
More than one dimension table can share these "dimension of a dimension" tables.
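A snowflake layout might look like the sketch below, where a hypothetical dim_city table acts as a "dimension of a dimension" shared by two dimension tables. All names and rows are assumptions made for illustration, again using sqlite3.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Shared "dimension of a dimension": city details are stored once.
CREATE TABLE dim_city (city_id INTEGER PRIMARY KEY, city TEXT, country TEXT);
-- Two normalised dimension tables link to dim_city via foreign keys.
CREATE TABLE dim_branch (
    branch_id INTEGER PRIMARY KEY,
    city_id INTEGER REFERENCES dim_city(city_id)
);
CREATE TABLE dim_owner (
    owner_id INTEGER PRIMARY KEY,
    city_id INTEGER REFERENCES dim_city(city_id)
);
CREATE TABLE fact_sale (branch_id INTEGER, owner_id INTEGER, revenue REAL);
INSERT INTO dim_city VALUES (1, 'Glasgow', 'UK'), (2, 'Paris', 'France');
INSERT INTO dim_branch VALUES (100, 1), (101, 2);
INSERT INTO dim_owner VALUES (7, 1), (8, 2);
INSERT INTO fact_sale VALUES (100, 8, 400.0), (101, 7, 600.0), (101, 8, 50.0);
""")

# Reaching the country needs an extra join compared with a star schema.
revenue_by_branch_country = conn.execute("""
SELECT c.country, SUM(f.revenue)
FROM fact_sale AS f
JOIN dim_branch AS b ON b.branch_id = f.branch_id
JOIN dim_city AS c ON c.city_id = b.city_id
GROUP BY c.country
ORDER BY c.country
""").fetchall()
print(revenue_by_branch_country)
```

The trade-off shown here is the usual one: less duplicated reference data, at the cost of one more join per query.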
Starflake Schema
Hybrid structure that contains a mixture of star and snowflake schemas.
Contains both normalised and de-normalised data.
Some dimension tables may be present in both normalised and de-normalised forms.
OLAP Analytical Operations
Consolidation
Involves the aggregation of data, such as "roll ups" e.g. branches can be rolled up to cities, cities to countries etc.
Drill-down
Reverse of consolidation.
Involves displaying the detailed data that comprises the consolidated data.
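Roll-up and drill-down can be pictured as aggregating the same records at coarser or finer levels of a hierarchy. The sketch below assumes a country > city > branch hierarchy and invented revenue figures; the roll_up helper is hypothetical, not part of any OLAP product.

```python
from collections import defaultdict

# Hypothetical records: (country, city, branch, revenue).
sales = [
    ("UK", "Glasgow", "B003", 120.0),
    ("UK", "Glasgow", "B007", 80.0),
    ("UK", "London", "B005", 200.0),
    ("France", "Paris", "B011", 150.0),
]

def roll_up(records, level):
    """Sum revenue grouped by the first `level` columns of the hierarchy."""
    totals = defaultdict(float)
    for record in records:
        totals[record[:level]] += record[-1]
    return dict(totals)

by_country = roll_up(sales, 1)  # most consolidated view
by_city = roll_up(sales, 2)     # drill down one level
by_branch = roll_up(sales, 3)   # drill down to the detailed data

print(by_country)
print(by_city)
```

Moving from by_branch to by_country is consolidation; moving the other way is drill-down.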
Slicing and Dicing (aka pivoting)
Ability to view data from different viewpoints.
One slice may display revenue by type of property within cities.
Another slice may display revenue by branch office within city.
Often performed along a time axis to find patterns and trends.
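The two slices mentioned above (revenue by property type within city, and revenue by branch within city) can be sketched as two groupings over the same records. The records and the view helper are hypothetical and only illustrate the idea of changing viewpoint.

```python
from collections import defaultdict

# Hypothetical revenue records.
records = [
    {"city": "Glasgow", "branch": "B003", "property_type": "Flat", "revenue": 120.0},
    {"city": "Glasgow", "branch": "B007", "property_type": "House", "revenue": 80.0},
    {"city": "Glasgow", "branch": "B003", "property_type": "House", "revenue": 60.0},
    {"city": "London", "branch": "B005", "property_type": "Flat", "revenue": 200.0},
]

def view(rows, *dimensions):
    """Total revenue grouped by the chosen dimensions, in the given order."""
    totals = defaultdict(float)
    for row in rows:
        key = tuple(row[d] for d in dimensions)
        totals[key] += row["revenue"]
    return dict(totals)

# Same data, two different viewpoints.
by_type_within_city = view(records, "city", "property_type")
by_branch_within_city = view(records, "city", "branch")

print(by_type_within_city)
print(by_branch_within_city)
```

Adding a date field to each record and passing it as a dimension would give the time-axis view used to look for trends.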
Data Mining Operations and Techniques
Predictive Modelling
Similar to human learning: uses observations to form a model of the important characteristics of some phenomenon.
Model developed using a two-phase supervised learning approach.
The training phase uses a large sample of historical data called a training set to build a model of the important characteristics.
The testing phase tests the accuracy and performance of the model on new data.
Used in credit approval, customer retention management, direct marketing.
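The two phases can be sketched with a deliberately simple model for credit approval. Everything here is an assumption for illustration: the labelled history is invented, and the "model" is just an income threshold placed midway between the class means, not a technique named in the mind map.

```python
# Hypothetical labelled history: (annual income in thousands, approved?).
history = [(12, False), (18, False), (25, False), (31, True), (40, True),
           (55, True), (15, False), (28, False), (35, True), (60, True)]

# Hold back some records so the model is tested on data it has not seen.
training_set, test_set = history[:6], history[6:]

def train(samples):
    """Training phase: threshold halfway between the two class means."""
    approved = [x for x, label in samples if label]
    rejected = [x for x, label in samples if not label]
    return (sum(approved) / len(approved) + sum(rejected) / len(rejected)) / 2

def predict(threshold, income):
    """Predict approval for one applicant."""
    return income >= threshold

threshold = train(training_set)

# Testing phase: measure accuracy on the held-back records.
correct = sum(predict(threshold, x) == label for x, label in test_set)
accuracy = correct / len(test_set)
print(threshold, accuracy)
```

A real training set would be far larger, as the notes say, and the accuracy on the test set is what indicates whether the model generalises.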
Database Segmentation
Partition database into an unknown number of segments or clusters of similar records.
Results can be displayed on a scatterplot.
Used in customer profiling and direct marketing.
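One common way to form such clusters is k-means, sketched below. Note the assumption: k-means needs the number of clusters to be chosen up front, whereas segmentation in general does not know it in advance, so this is only one possible approach. The customer points and starting centroids are invented.

```python
import math

def k_means(points, centroids, iterations=10):
    """Basic k-means: assign each point to its nearest centroid, then re-average."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster (keep it if the cluster is empty).
        centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster)) if cluster else centroid
            for cluster, centroid in zip(clusters, centroids)
        ]
    return clusters, centroids

# Hypothetical customers: (visits per month, average spend in tens of pounds).
customers = [(1, 1), (1.5, 2), (1, 0.5), (8, 8), (9, 9.5), (8.5, 9)]

# Starting centroids are picked by hand here; real implementations usually randomise.
clusters, centroids = k_means(customers, centroids=[customers[0], customers[3]])
print(clusters)
print(centroids)
```

Plotting the points coloured by cluster would give the scatterplot view mentioned above.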
Link Analysis
Aims to discover links (called associations) between individual records or groups of records in a database.
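Associations are often measured with support (how often items occur together) and confidence (how often one item implies another). The sketch below counts both over a few invented shopping baskets; the data and the confidence helper are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical records: each is the set of items bought together.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

item_counts = Counter()
pair_counts = Counter()
for basket in baskets:
    item_counts.update(basket)
    # Sort so each pair is always counted under the same key.
    pair_counts.update(combinations(sorted(basket), 2))

def confidence(antecedent, consequent):
    """Fraction of records containing `antecedent` that also contain `consequent`."""
    pair = tuple(sorted((antecedent, consequent)))
    return pair_counts[pair] / item_counts[antecedent]

# Support: fraction of all records containing both items.
support_bread_milk = pair_counts[("bread", "milk")] / len(baskets)

print(support_bread_milk)
print(confidence("butter", "bread"))
```

Here every basket with butter also contains bread, so the rule "butter implies bread" has the highest possible confidence in this toy data.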
Anomaly Detection
Identifies outliers (expressions of deviation from previously known expectations and norms).
Used in detection of credit card and insurance fraud, quality control and defects tracing.
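A minimal sketch of the idea: build the "known norm" from past observations, then flag new values that deviate too far from it. The z-score rule, the threshold of 3 and the transaction amounts are all assumptions chosen for illustration.

```python
import statistics

# Hypothetical history of normal transaction amounts (the known norm).
baseline = [42.0, 38.5, 45.0, 40.0, 39.5, 41.0, 43.5]
mean = statistics.mean(baseline)
spread = statistics.stdev(baseline)

def is_anomalous(value, threshold=3.0):
    """Flag values more than `threshold` standard deviations from the baseline mean."""
    return abs(value - mean) / spread > threshold

# New observations are compared against the baseline, not against each other.
new_transactions = [41.5, 950.0, 44.0]
flagged = [v for v in new_transactions if is_anomalous(v)]
print(flagged)
```

Keeping the baseline separate from the values being checked stops a large outlier from inflating the spread and hiding itself.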
Media attachments
Star_Schema.gif (image/gif)
Snowflake-schema (image/png)
105005.gif (image/gif)