Data Warehousing and Mining
Description
Revision mind map for Data Warehousing and Mining.
Tags: data management, data warehousing, data mining
Mind map by i7752068, updated more than 1 year ago
Created by i7752068 more than 10 years ago
Resource summary
Data Warehousing and Mining
Data Warehousing
Increased corporate productivity.
Competitive advantage.
Potential for high ROI.
Extremely high initial costs (£50k+)
Long development time (around 3 years).
High demand for memory.
High maintenance costs.
Problems with source data (extraction, cleaning, loading).
Building a Data Warehouse Database (Dimensionality Modelling)
Fact Tables
Contains facts generated by events in the past.
Data in tables should be regarded as read only.
Tables are often very large.
Dimension Tables
Contains descriptive textual data.
Simple primary keys.
Gives the characteristic star schema, or star join, structure.
Star Schema
De-normalising reference data can speed up query performance.
By contrast, the main aim of a normalised design is to avoid data redundancy.
This is achieved in part via the process of normalisation.
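The fact table, dimension tables and star join described above can be sketched with Python's built-in sqlite3 module. This is only a minimal illustration: the table names (fact_sale, dim_branch, dim_property), the columns and the sample rows are all hypothetical and do not come from the mind map.

```python
import sqlite3

# In-memory database; all table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_branch (
    branch_id   INTEGER PRIMARY KEY,   -- simple primary key
    branch_name TEXT,
    city        TEXT                   -- de-normalised descriptive data
);
CREATE TABLE dim_property (
    property_id   INTEGER PRIMARY KEY,
    property_type TEXT
);
CREATE TABLE fact_sale (
    branch_id   INTEGER REFERENCES dim_branch(branch_id),
    property_id INTEGER REFERENCES dim_property(property_id),
    revenue     REAL                   -- the measured fact
);
INSERT INTO dim_branch VALUES (1, 'B001', 'Glasgow'), (2, 'B002', 'London');
INSERT INTO dim_property VALUES (1, 'Flat'), (2, 'House');
INSERT INTO fact_sale VALUES (1, 1, 100.0), (1, 2, 250.0), (2, 1, 300.0);
""")

# Star join: the fact table joined directly to each dimension table.
star_join = conn.execute("""
SELECT b.city, p.property_type, SUM(f.revenue)
FROM fact_sale f
JOIN dim_branch b   ON b.branch_id = f.branch_id
JOIN dim_property p ON p.property_id = f.property_id
GROUP BY b.city, p.property_type
ORDER BY b.city, p.property_type
""").fetchall()

print(star_join)
```

Every dimension is one join away from the fact table, which is what keeps queries against a star schema simple.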
OLTP System
Automating business saves money.
Data could be useful in the organisation's future operations.
Information too detailed.
May require information from more than one OLTP system.
Difficult to extract information.
Snowflake Schema
Variant of Star Schema where dimension tables do not contain de-normalised data.
Dimension tables have other dimension tables linked to them via foreign keys.
More than one dimension table can share these "dimension of a dimension" tables.
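A snowflake variant can be sketched in the same way. In this hypothetical example the city details are moved out of the branch dimension into their own dim_city table, so the branch dimension links to a "dimension of a dimension" via a foreign key. All names and rows are assumptions made for illustration.

```python
import sqlite3

# In-memory database; all table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_city (
    city_id   INTEGER PRIMARY KEY,
    city_name TEXT,
    country   TEXT
);
CREATE TABLE dim_branch (
    branch_id   INTEGER PRIMARY KEY,
    branch_name TEXT,
    city_id     INTEGER REFERENCES dim_city(city_id)  -- normalised link
);
CREATE TABLE fact_sale (
    branch_id INTEGER REFERENCES dim_branch(branch_id),
    revenue   REAL
);
INSERT INTO dim_city VALUES (1, 'Glasgow', 'UK'), (2, 'Paris', 'France');
INSERT INTO dim_branch VALUES (1, 'B001', 1), (2, 'B002', 1), (3, 'B003', 2);
INSERT INTO fact_sale VALUES (1, 100.0), (2, 50.0), (3, 70.0);
""")

# Reaching the country now takes two joins: fact -> branch -> city.
revenue_by_country = conn.execute("""
SELECT c.country, SUM(f.revenue)
FROM fact_sale f
JOIN dim_branch b ON b.branch_id = f.branch_id
JOIN dim_city c   ON c.city_id = b.city_id
GROUP BY c.country
ORDER BY c.country
""").fetchall()

print(revenue_by_country)
```

The extra join is the cost of removing the repeated city and country values from the branch dimension.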
Starflake Schema
Hybrid structure that contains a mixture of star and snowflake schemas.
Contains both normalised and de-normalised data.
Some dimension tables may be present in both normalised and de-normalised forms.
OLAP Analytical Operations
Consolidation
Involves the aggregation of data, such as "roll ups" e.g. branches can be rolled up to cities, cities to countries etc.
Drill-down
Reverse of consolidation.
Involves displaying the detailed data that comprises the consolidated data.
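Consolidation and drill-down can be illustrated with a few lines of plain Python. The helper names roll_up and drill_down, and the sample sales rows, are hypothetical; a real OLAP server would do this over far larger, pre-aggregated data.

```python
from collections import defaultdict

# Hypothetical detailed rows: one row per branch.
sales = [
    {"country": "UK", "city": "Glasgow", "branch": "B001", "revenue": 100},
    {"country": "UK", "city": "Glasgow", "branch": "B002", "revenue": 50},
    {"country": "UK", "city": "London", "branch": "B003", "revenue": 200},
    {"country": "France", "city": "Paris", "branch": "B004", "revenue": 70},
]

def roll_up(rows, level):
    """Consolidation: aggregate revenue up to the given level."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[level]] += row["revenue"]
    return dict(totals)

def drill_down(rows, level, value):
    """Drill-down: return the detailed rows behind one aggregate."""
    return [row for row in rows if row[level] == value]

by_city = roll_up(sales, "city")          # branches rolled up to cities
by_country = roll_up(sales, "country")    # cities rolled up to countries
glasgow_detail = drill_down(sales, "city", "Glasgow")

print(by_city)
print(by_country)
print([row["branch"] for row in glasgow_detail])
```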
Slicing and Dicing (aka pivoting)
Ability to view data from different viewpoints.
One slice may display revenue by type of property within cities.
Another slice may display revenue by branch office within city.
Often performed along a time axis to find patterns and trends.
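The two slices mentioned above (revenue by property type within city, and revenue by branch within city) can be sketched as two pivots over the same records. The pivot helper and the sample data are hypothetical.

```python
from collections import defaultdict

# Hypothetical revenue records with several dimensions each.
records = [
    {"city": "Glasgow", "branch": "B001", "property_type": "Flat", "revenue": 100},
    {"city": "Glasgow", "branch": "B001", "property_type": "House", "revenue": 250},
    {"city": "Glasgow", "branch": "B002", "property_type": "Flat", "revenue": 50},
    {"city": "London", "branch": "B003", "property_type": "Flat", "revenue": 300},
]

def pivot(rows, row_dim, col_dim, measure="revenue"):
    """Sum the measure for every (row_dim, col_dim) combination."""
    table = defaultdict(lambda: defaultdict(int))
    for rec in rows:
        table[rec[row_dim]][rec[col_dim]] += rec[measure]
    return {key: dict(cols) for key, cols in table.items()}

# Same data, two different viewpoints.
by_type_within_city = pivot(records, "city", "property_type")
by_branch_within_city = pivot(records, "city", "branch")

print(by_type_within_city)
print(by_branch_within_city)
```

Adding a date field to each record and pivoting on it would give the time-axis view used to look for patterns and trends.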
Data Mining Operations and Techniques
Predictive Modelling
Mirrors the human learning experience: observations are used to form a model of the important characteristics of some phenomenon.
Model developed using a two-phase supervised learning approach.
The training phase uses a large sample of historical data called a training set to build a model of the important characteristics.
The testing phase tests the accuracy and performance of the model on new data.
Used in credit approval, customer retention management, direct marketing.
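The two-phase approach can be sketched with a toy credit-approval example. The mind map does not name a learning algorithm, so a 1-nearest-neighbour classifier is assumed here purely for brevity. The features (income, debt), the labels and all data values are invented, and a real training set would be far larger.

```python
# Each example: ((income, debt), approved) with hypothetical values.
training_set = [
    ((20, 15), 0), ((25, 20), 0), ((30, 18), 0),
    ((60, 5), 1), ((70, 10), 1), ((80, 8), 1),
]
test_set = [
    ((22, 17), 0), ((75, 9), 1), ((65, 6), 1), ((28, 19), 0),
]

def train(examples):
    """Training phase: a 1-NN 'model' simply stores the labelled examples."""
    return list(examples)

def predict(model, features):
    """Return the label of the stored example closest to the features."""
    def squared_distance(example):
        return sum((a - b) ** 2 for a, b in zip(example[0], features))
    return min(model, key=squared_distance)[1]

def accuracy(model, examples):
    """Testing phase: fraction of unseen examples predicted correctly."""
    correct = sum(1 for feats, label in examples if predict(model, feats) == label)
    return correct / len(examples)

model = train(training_set)
test_accuracy = accuracy(model, test_set)
print(test_accuracy)
```

The key point is the separation: the model only ever sees training_set while being built, and is judged only on test_set.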
Database Segmentation
Partitions a database into an unknown number of segments, or clusters, of similar records.
Results can be displayed on a scatterplot.
Used in customer profiling and direct marketing.
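Because the number of segments is not known in advance, one simple way to illustrate segmentation is a single-pass "leader" clustering: a record joins the first cluster whose leader is within a chosen radius, otherwise it starts a new cluster. This technique, the radius value and the (age, annual spend) customer records are all assumptions made for illustration, not taken from the mind map.

```python
import math

def segment(records, radius):
    """Leader clustering: the first record of each cluster is its leader."""
    clusters = []
    for rec in records:
        for cluster in clusters:
            if math.dist(cluster[0], rec) <= radius:
                cluster.append(rec)
                break
        else:
            # No existing leader is close enough: start a new segment.
            clusters.append([rec])
    return clusters

# Hypothetical customers as (age, annual spend).
customers = [(25, 200), (27, 210), (60, 900), (62, 880), (26, 190), (40, 500)]
segments = segment(customers, radius=50)

for number, members in enumerate(segments, start=1):
    print(number, members)
```

The number of segments falls out of the data and the radius rather than being fixed up front. Plotting the two fields with one colour per segment would give the scatterplot view.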
Link Analysis
Aims to discover links (called associations) between individual records or groups of records in a database.
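One common way to express such associations is through support and confidence over co-occurring items. The sketch below assumes hypothetical shopping baskets and counts item pairs. It illustrates the idea only and is not a full association-rule miner.

```python
from collections import Counter
from itertools import combinations

# Hypothetical records: each basket is the set of items in one purchase.
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "butter"},
]

item_counts = Counter(item for basket in baskets for item in basket)
pair_counts = Counter(
    pair for basket in baskets for pair in combinations(sorted(basket), 2)
)

def support(a, b):
    """Fraction of all baskets that contain both a and b."""
    return pair_counts[tuple(sorted((a, b)))] / len(baskets)

def confidence(a, b):
    """Of the baskets containing a, the fraction that also contain b."""
    return pair_counts[tuple(sorted((a, b)))] / item_counts[a]

print(support("bread", "butter"))
print(confidence("milk", "butter"))
```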
Anomaly Detection
Identifies outliers (expressions of deviation from previously known expectations and norms).
Used in detection of credit card and insurance fraud, quality control and defects tracing.
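A minimal sketch of outlier detection, assuming a simple z-score rule: a value is flagged when it lies more than a chosen number of standard deviations from the mean. The threshold of 2.0 and the transaction amounts are hypothetical, and real fraud detection uses much richer models.

```python
from statistics import mean, pstdev

def find_outliers(values, threshold=2.0):
    """Return values whose z-score magnitude exceeds the threshold."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # all values identical, so nothing deviates
    return [v for v in values if abs(v - mu) / sigma > threshold]

# Hypothetical card transaction amounts with one unusual payment.
amounts = [20, 22, 19, 21, 23, 20, 500]
outliers = find_outliers(amounts)
print(outliers)
```

Here the "previously known expectations and norms" are reduced to the mean and standard deviation of the same sample, which is the simplest possible baseline.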
Media attachments
Star_Schema.gif (image/gif)
Snowflake-schema (image/png)
105005.gif (image/gif)