DWH

Description

Business informatics (Wirtschaftsinformatik) flashcards on data warehousing (DWH), created by Amelie Stute on 25/01/2020.

Resource summary

Question Answer
CAP-Theorem For any system that shares data, it is impossible to guarantee all three of the following properties simultaneously: - Consistency - Availability - Partition tolerance
Availability (CAP-Theorem) the system can keep running even if parts of it have failed
Partition tolerance (CAP-Theorem) the network can break into two or more parts, each with active components that cannot communicate with the other parts; the overall system tolerates such situations
Consistency (CAP-Theorem) all copies have the same value
Is eventual consistency OK? During an ongoing partition, if you want to stay available, eventual consistency is a good compromise. When there is no partition, the CAP theorem does not justify eventual consistency.
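The trade-off on the last card can be illustrated with a minimal sketch. The `Replica` class and the last-write-wins merge are assumptions made for illustration only; the cards do not prescribe a reconciliation strategy.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Hypothetical key-value replica storing (timestamp, value) per key."""
    name: str
    store: dict = field(default_factory=dict)

    def write(self, key, value, timestamp):
        # Accept the write locally, so the replica stays available.
        self.store[key] = (timestamp, value)

    def read(self, key):
        return self.store[key][1]

    def merge(self, other):
        # Last-write-wins reconciliation (an assumed, deliberately simple rule).
        for key, (ts, value) in other.store.items():
            if key not in self.store or ts > self.store[key][0]:
                self.store[key] = (ts, value)


a, b = Replica("A"), Replica("B")
a.write("price", 100, timestamp=1)
b.merge(a)  # no partition: both copies agree

# Partition: A and B cannot communicate, but both keep accepting writes.
a.write("price", 110, timestamp=2)
b.write("price", 120, timestamp=3)
diverged = a.read("price") != b.read("price")  # consistency is given up

# Partition heals: the replicas exchange state and converge.
a.merge(b)
b.merge(a)
converged = a.read("price") == b.read("price")
```

While the partition lasts, the two copies disagree; once they can talk again, both settle on the most recent write. Without a partition there is nothing that forces the system to accept the diverged state in the first place.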
New data-related trends - New data sources - (Near) real-time ETL - Self-service BI & data discovery
New data sources - Physical objects merge with IT (IoT) - Cyber-physical systems (CPS) integrate smart objects and information systems to: o record data using sensors o interact actively and reactively with both the physical and the digital world
(Near) real-time ETL The value of data-driven information decreases over time. Motives: - Higher user expectations - Globalization (different time zones) - New types of data sources (e.g. stock market prices, where real-time information is accessible) - Affordable technical implementations
Self-service BI & data discovery enable BI users to become more self-reliant and less dependent on the IT organization
Information Needs Analysis (Components) - Requested information - Information supply - Subjective information need - Objective information need
Requested information explicit management request
Information supply all information that is available and can be provided
Objective information need information actually required to fulfill an organization’s objectives; deduced from the strategy, the decisions that need to be made, and external knowledge
Subjective information need information perceived as required by the decision maker; implicit in their mind
Operational Objectives & Business Strategy Operational objectives enable management to do the right things right. “Right things”: defined by the business strategy. “Right”: derived from the operational objectives.
DWH Definition Inmon A collection of - subject-oriented (focus on analytical requirements) - integrated (complex effort of joining data together) - nonvolatile (durability of data is ensured; data modification is disallowed) - and time-variant (different values for the same information over time, restoring the historical truth of data) data to support management decisions
Data mart - A specialized DWH targeted toward a particular functional area or user group in an organization - Can be either derived from an enterprise DWH or collected directly from data sources - Easier to build than an enterprise DWH
Disadvantages of data marts - No reconcilability of data (disregards the “single point of truth” (SPOT) principle) - Extract proliferation (ever more extracts from the DWH are needed) - Change propagation (changes in one data mart may echo through all data marts; chances of error grow quickly) - Non-extensibility (danger of having to start from scratch after major organizational changes)
Design principles
Design principle Examples for each step 1. Avoid variations in color that do not encode any meaning 2. Grid lines in graphs often represent distracting non-data pixels 3. Present only information that is relevant for the user 4. Use different degrees of visual emphasis
Designing the multidimensional model (Process) 1. Choose the Business process 2. Declare the grain 3. Identify the dimensions 4. Identify the facts
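The four steps can be sketched with Python's built-in `sqlite3` module. The retail-sales process, the table and column names, and the sample rows are all hypothetical; they only illustrate how each step shows up in a schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Step 1: business process = retail sales (hypothetical example)
-- Step 2: grain = one row per product, per store, per day
-- Step 3: dimensions that describe the grain
CREATE TABLE dim_date    (date_key INTEGER PRIMARY KEY, day TEXT, month TEXT);
CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, name TEXT, category TEXT);
CREATE TABLE dim_store   (store_key INTEGER PRIMARY KEY, city TEXT);
-- Step 4: facts = numeric measures recorded at that grain
CREATE TABLE fact_sales (
    date_key    INTEGER REFERENCES dim_date(date_key),
    product_key INTEGER REFERENCES dim_product(product_key),
    store_key   INTEGER REFERENCES dim_store(store_key),
    quantity    INTEGER,
    revenue     REAL,
    PRIMARY KEY (date_key, product_key, store_key)  -- enforces the grain
);
""")

conn.executemany("INSERT INTO dim_date VALUES (?, ?, ?)",
                 [(1, "2020-01-24", "2020-01"), (2, "2020-01-25", "2020-01")])
conn.executemany("INSERT INTO dim_product VALUES (?, ?, ?)",
                 [(1, "Espresso", "Coffee"), (2, "Green tea", "Tea")])
conn.executemany("INSERT INTO dim_store VALUES (?, ?)", [(1, "Berlin")])
conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?, ?)",
                 [(1, 1, 1, 3, 7.5), (1, 2, 1, 1, 2.0), (2, 1, 1, 2, 5.0)])

# Analytical query: aggregate a fact along a dimension attribute.
revenue_by_category = dict(conn.execute("""
    SELECT p.category, SUM(f.revenue)
    FROM fact_sales f
    JOIN dim_product p ON p.product_key = f.product_key
    GROUP BY p.category
""").fetchall())
```

Declaring the grain before choosing dimensions and facts matters because the composite primary key of the fact table follows directly from it: a second row for the same product, store, and day would violate the declared grain.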
Logical data modeling schema / generic DWH schema - RO, RO comb., RO hier. - Dim, Dim comb., Dim hier. - Fact, Ratio, Ratio comb.
Advantages snowflake schema + better storage and querying with sparse dimensions + reflects the way users think about data
Disadvantages snowflake schema - performance suffers because more joins are needed when queries run along hierarchy paths - benefit of normalization is insignificant - more complex structure than the star schema
Advantages star schema + fewer joins need to be performed + benefit of denormalization is quite significant
Disadvantages star schema - less efficient storage and querying with large, sparse dimensions - does not reflect the way users think about data
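The join trade-off between the two schemas can be made concrete with a small sketch using Python's built-in `sqlite3` module. The product hierarchy (product, category, department), all table names, and the sample data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Star: one denormalized dimension table holds the whole hierarchy.
CREATE TABLE star_product (product_key INTEGER PRIMARY KEY, name TEXT,
                           category TEXT, department TEXT);
-- Snowflake: the same hierarchy normalized into one table per level.
CREATE TABLE sf_department (department_key INTEGER PRIMARY KEY, department TEXT);
CREATE TABLE sf_category   (category_key INTEGER PRIMARY KEY, category TEXT,
                            department_key INTEGER REFERENCES sf_department(department_key));
CREATE TABLE sf_product    (product_key INTEGER PRIMARY KEY, name TEXT,
                            category_key INTEGER REFERENCES sf_category(category_key));
CREATE TABLE fact_sales    (product_key INTEGER, revenue REAL);

INSERT INTO star_product VALUES (1, 'Espresso', 'Coffee', 'Beverages'),
                                (2, 'Green tea', 'Tea', 'Beverages'),
                                (3, 'Bagel', 'Bakery', 'Food');
INSERT INTO sf_department VALUES (1, 'Beverages'), (2, 'Food');
INSERT INTO sf_category VALUES (1, 'Coffee', 1), (2, 'Tea', 1), (3, 'Bakery', 2);
INSERT INTO sf_product VALUES (1, 'Espresso', 1), (2, 'Green tea', 2), (3, 'Bagel', 3);
INSERT INTO fact_sales VALUES (1, 7.5), (2, 2.0), (3, 4.0), (1, 5.0);
""")

# Revenue per department: one join in the star schema ...
STAR_QUERY = """
    SELECT p.department, SUM(f.revenue)
    FROM fact_sales f
    JOIN star_product p ON p.product_key = f.product_key
    GROUP BY p.department
"""
# ... but one join per hierarchy level in the snowflake schema.
SNOWFLAKE_QUERY = """
    SELECT d.department, SUM(f.revenue)
    FROM fact_sales f
    JOIN sf_product p    ON p.product_key = f.product_key
    JOIN sf_category c   ON c.category_key = p.category_key
    JOIN sf_department d ON d.department_key = c.department_key
    GROUP BY d.department
"""

star_result = dict(conn.execute(STAR_QUERY).fetchall())
snowflake_result = dict(conn.execute(SNOWFLAKE_QUERY).fetchall())
```

Both queries return the same totals. The snowflake version stores each department name once instead of once per product, at the price of two extra joins along the hierarchy path.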
Ways to handle slowly changing dimensions - Overwrite (+ easy; - not suitable for analytical attributes) - Add new dimension row (+ "reliable workhorse"; - big tables get bigger) - Add new dimension attribute (+ supports "soft changes" where new and old values are both relevant; - only for a limited number of changes)
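The three techniques (commonly called Type 1, Type 2, and Type 3 in Kimball's terminology) can be sketched on a single dimension row. The customer dimension, its column names, and the helper functions below are hypothetical and only illustrate the idea.

```python
from datetime import date


def overwrite(row, attr, new_value):
    """Overwrite: replace the value in place; the old value is lost."""
    updated = dict(row)
    updated[attr] = new_value
    return updated


def add_row(rows, customer_id, attr, new_value, change_date):
    """Add new dimension row: close the current row, append a new version."""
    current = next(r for r in rows
                   if r["customer_id"] == customer_id and r["valid_to"] is None)
    current["valid_to"] = change_date
    new_row = dict(current)
    new_row[attr] = new_value
    new_row["key"] = max(r["key"] for r in rows) + 1  # new surrogate key
    new_row["valid_from"] = change_date
    new_row["valid_to"] = None
    rows.append(new_row)
    return rows


def add_attribute(row, attr, new_value):
    """Add new dimension attribute: keep the old value in an extra column."""
    updated = dict(row)
    updated[f"previous_{attr}"] = row[attr]
    updated[attr] = new_value
    return updated


base = {"key": 1, "customer_id": "C42", "city": "Hamburg",
        "valid_from": date(2019, 1, 1), "valid_to": None}

type1 = overwrite(base, "city", "Berlin")
type2 = add_row([dict(base)], "C42", "city", "Berlin", date(2020, 1, 25))
type3 = add_attribute(base, "city", "Berlin")
```

Only the second variant preserves the full history, which is why the table grows with every change. The third variant keeps exactly one previous value per attribute, so it only suits a limited number of changes.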