Resource summary
Data Mining from Big Data: the 4 Vs
- Volume - the simplest
- High dimensions
- Large number of records
- New sources
- Velocity - harder
- Interaction with a customer
- Capture data, learn and act
- Enhance customer's journey
- Iteratively improve user experience
- Variety - the hardest
- Number of data owners exploded
- Value
- KYC (know your customer)
- Difficult for internet companies - they never see their customers
- Being able to exploit all the data available
- Age of analytics
- Access to Data
- SQL
- Look up a few records
- Populate a standard report
- OLAP, mining
- Create new report
- Data Mining
- Locate a problem
- Optimize business process
- Answer a tough question
- Understand something new
- Finding interesting structure in data
- Interesting patterns
- Segmentation, data clustering
- Predictive models
- Classification, regression
- Hidden relations
- Affinity (summarization) - relation between fields, associations
- Work for a data scientist
- Understands business needs
- Able to close those gaps
- Algorithms
- Knows more about statistics than a programmer
- Data logic
- Knows more about programming than a statistician
- Technologies
- Summarization
- Variable correlation
- Frequent itemsets
- Association rules
- Clustering
- Distance
- Partition
- Sequence analysis
- Classification / prediction
- Decision trees
- Neural networks
- Bayes nets
- Regression
- Support vector machines
- But
- End user is not a statistician
- Lack of data warehousing expertise
- IT's focus is to keep systems running
- Building data warehouse is too expensive
- Proliferating analytics throughout the organization
- Make every part of business smarter
- Embedding analytics into every area
- Significant business value
- Acquire and enhance actions
- Marketing and sales
- Identify potential customer
- Establish campaign effectiveness
- Manufacturing process
- Causes of manufacturing problems
- Customer behaviour
- Affinities, propensities
- Fraudulent transaction detection
- Loan approval
- Establish credit worthiness of customer
- Web analytics and metrics
- Model user preferences
- Recommendation, targeting
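
Example: the "Frequent itemsets" and "Association rules" items under Summarization can be illustrated with a minimal sketch. It is not a production algorithm such as Apriori with pruning, just a brute-force count over small itemsets. The transactions, function names, and thresholds below are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy transactions (assumption, not from the notes).
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def frequent_itemsets(transactions, min_support, max_size=2):
    """Return {itemset: support} for itemsets of up to max_size items.

    Support is the fraction of transactions that contain the itemset.
    """
    n = len(transactions)
    counts = Counter()
    for t in transactions:
        for size in range(1, max_size + 1):
            for combo in combinations(sorted(t), size):
                counts[frozenset(combo)] += 1
    return {s: c / n for s, c in counts.items() if c / n >= min_support}


def association_rules(itemsets, min_confidence):
    """Derive rules a -> b from frequent pairs.

    Confidence of a -> b is support({a, b}) / support({a}).
    """
    rules = []
    for itemset, support in itemsets.items():
        if len(itemset) != 2:
            continue
        for a in itemset:
            antecedent = frozenset([a])
            (b,) = itemset - antecedent
            # A frequent pair implies each single item is frequent too,
            # so the antecedent is always present in `itemsets`.
            confidence = support / itemsets[antecedent]
            if confidence >= min_confidence:
                rules.append((a, b, support, confidence))
    return rules


sets = frequent_itemsets(transactions, min_support=0.5)
for a, b, support, confidence in association_rules(sets, min_confidence=0.8):
    print(f"{a} -> {b}: support={support:.2f}, confidence={confidence:.2f}")
```

With this toy data, only the rule from butter to bread clears both thresholds, because every transaction containing butter also contains bread. On real data the brute-force enumeration would be replaced by a pruning algorithm or a library implementation.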