PROCESS OF DISCOVERING INTERESTING PATTERNS AND KNOWLEDGE FROM LARGE AMOUNTS OF DATA
DESCRIPTIVE
PREDICTIVE
DOMAINS
STATISTICS
Notes:
Statistics studies the collection, analysis, interpretation or explanation, and presentation of data.
MACHINE LEARNING
Notes:
Machine learning investigates how computers can learn (or improve their performance) based on data.
PATTERN RECOGNITION
DATABASE
DATA WAREHOUSE
INFORMATION RETRIEVAL
VISUALIZATION
ALGORITHMS
HIGH PERFORMANCE COMPUTING
PATTERNS CAN BE MINED
DATA MINING
FUNCTIONALITIES
DISCRIMINATION
Notes:
DISCRIMINATION: COMPARISON OF THE GENERAL FEATURES OF THE TARGET CLASS DATA OBJECTS AGAINST THE GENERAL FEATURES OF OBJECTS FROM ONE OR MULTIPLE CONTRASTING CLASSES
CHARACTERIZATION:
summarizing the data of the class under study (often called the target class) in general terms
FREQUENT PATTERNS
Notes:
There are many kinds of frequent patterns, including frequent itemsets, frequent subsequences (also known as sequential patterns), and frequent substructures.
SUPPORT
CONFIDENCE
accuracy and
coverage
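Notes:
Support and confidence can be illustrated with a small sketch. It assumes the usual definitions for an association rule A => B: support is the fraction of transactions containing both A and B, and confidence is the fraction of transactions containing A that also contain B. The function names and the toy transactions are hypothetical, chosen only for illustration.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimate of P(consequent | antecedent) over the transactions."""
    both = support(transactions, set(antecedent) | set(consequent))
    ante = support(transactions, antecedent)
    return both / ante if ante else 0.0


# Toy market-basket data (made up for this example).
transactions = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]

rule_support = support(transactions, {"bread", "milk"})          # 2 of 4 transactions
rule_confidence = confidence(transactions, {"bread"}, {"milk"})  # 2 of the 3 bread transactions
```

A rule is usually kept only if both values exceed user-chosen minimum thresholds.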
ASSOCIATIONS
CORRELATIONS
CLASSIFICATION AND
REGRESSION
Notes:
Classification is the process of finding a model (or function) that describes and distinguishes data classes or concepts.
Regression analysis is a statistical methodology that is most often used for numeric prediction.
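The difference between the two tasks can be sketched as follows: classification predicts a categorical label, regression predicts a number. The nearest-neighbor classifier and the least-squares line used here are illustrative choices only, not methods named in these notes, and the data is made up.

```python
def nearest_neighbor_label(train, x):
    """Classification: return the label of the training point closest to x.

    `train` is a list of (feature, label) pairs with a single numeric feature.
    """
    return min(train, key=lambda pair: abs(pair[0] - x))[1]


def fit_line(xs, ys):
    """Regression: least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Class-labeled training data: the target is a category.
labeled = [(1.0, "low"), (2.0, "low"), (8.0, "high"), (9.0, "high")]
predicted_class = nearest_neighbor_label(labeled, 7.5)

# Numeric training data: the target is a continuous value.
slope, intercept = fit_line([1, 2, 3, 4], [2, 4, 6, 8])
predicted_value = slope * 5 + intercept
```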
CLUSTERING ANALYSIS AND
OUTLIER ANALYSIS
Notes:
Unlike classification and regression, which analyze class-labeled (training) data sets, clustering analyzes data objects without consulting class labels.
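A minimal sketch of clustering without class labels, assuming one-dimensional points and a k-means style procedure (an illustrative choice, not a method named in these notes). The initial centers are fixed by hand so the run is deterministic; the function name and data are hypothetical.

```python
def kmeans_1d(points, initial_centers, iterations=10):
    """Group 1-D points around centers; no class labels are consulted."""
    centers = list(initial_centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: each point joins its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Update step: each center moves to the mean of its cluster
        # (an empty cluster keeps its previous center).
        centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers, clusters


points = [1.0, 1.2, 0.8, 9.0, 9.5, 10.0]
centers, clusters = kmeans_1d(points, initial_centers=[0.0, 5.0])
```

The groups emerge only from distances between the points, which is what separates clustering from the labeled training data used in classification and regression.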
INTERESTING PATTERNS
NOVEL
CERTAINTY
POTENTIALLY USEFUL
EASILY UNDERSTOOD
PATTERN INTERESTINGNESS
SUBJECTIVE
OBJECTIVE
DATA CAN BE MINED
DATABASES
DATA WAREHOUSES
TRANSACTIONAL DATA
MANY OTHERS
ISSUES OF DATA MINING RESEARCH
MINING METHODOLOGIES
USER INTERACTION
EFFICIENCY AND SCALABILITY
DIVERSITY OF DATA TYPES
DATA MINING AND SOCIETY
VIEWS
APPLICATION
TECHNOLOGIES
DATA
KNOWLEDGE
PATTERN EVALUATION
Notes:
Interesting?:
A pattern is interesting if it is (1) easily understood by humans, (2) valid on new or test data with some degree of certainty, (3) potentially useful, and (4) novel. A pattern is also interesting if it validates a hypothesis that the user sought to confirm.