Machine Learning

Description

Machine Learning Mind Map based on the book "Hands-On Machine Learning with Scikit-Learn and TensorFlow".
Mind map by Luan Pessoa Rocha, updated more than 1 year ago
Created by Luan Pessoa Rocha more than 5 years ago

Resource Summary

Machine Learning
  1. Types
    1. Classification
      1. Prediction
      2. Learning Task
        1. Supervised

          Notes:

          • The training data includes the desired solutions (labels).
          1. Common Tasks
            1. Classification
              1. Prediction
              2. Common Algorithms
                1. k-Nearest Neighbors
                  1. Linear Regression
                    1. Logistic Regression
                      1. Support Vector Machines(SVMs)
                        1. Decision Tree
                          1. Random Forests
                            1. Neural Networks
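
As an illustration of supervised classification, here is a minimal pure-Python sketch of k-Nearest Neighbors. The function name `knn_predict` and the toy data are assumptions made for this example, not taken from the book; in practice a library implementation would normally be used.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Return the majority label among the k training points nearest to query.

    train is a list of (features, label) pairs; this is the labeled data.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical toy training set: each instance carries its label.
training_data = [
    ((1.0, 1.0), "small"),
    ((1.5, 2.0), "small"),
    ((2.0, 1.5), "small"),
    ((8.0, 8.0), "large"),
    ((8.5, 9.0), "large"),
    ((9.0, 8.5), "large"),
]

prediction = knn_predict(training_data, (2.0, 2.0), k=3)
```

The prediction for a new point comes from the labels of its nearest neighbors, which is why labeled training data is required.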
                          2. Unsupervised

                            Notes:

                            • The training data is unlabeled; the system tries to learn without a teacher.
                            1. Common Tasks
                              1. Clustering
                                1. K-Means
                                  1. Hierarchical Cluster Anaysis(HCA)
                                    1. Expectation Maximization
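
To make the clustering task concrete, the following is a minimal sketch of K-Means on one-dimensional data. It assumes the starting centroids are supplied by the caller; the function name `k_means` and the sample points are illustrative only.

```python
def k_means(points, centroids, iterations=10):
    """Alternate between assigning points to the nearest centroid
    and moving each centroid to the mean of its assigned points."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters

# Unlabeled data: no desired output is provided.
points = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
centroids, clusters = k_means(points, centroids=[0.0, 5.0])
```

No labels are involved at any step; the groups emerge from the distances between points alone.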
                                    2. Visualization and Dimensionality Reduction
                                      1. Principal Component Analysis (PCA)
                                        1. Kernel PCA
                                          1. Locally-Linear Embedding (LLE)
                                            1. t-Distributed Stochastic Neighbor Embedding (t-SNE)
                                            2. Association Rule Learning
                                              1. Apriori
                                                1. Eclat
                                                2. Anomaly Detection
                                              2. Semisupervised

                                                Notes:

                                                • A small amount of labeled data and a lot of unlabeled data.
                                                1. Reinforcement

                                                  Notes:

                                                  • The learning system receives rewards or penalties for its actions and learns which strategy earns the most reward. Often used to make robots learn how to walk.
                                                2. Learning Process
                                                  1. Batch

                                                    Notes:

                                                    • Also known as offline learning: the system is trained first and then used without further learning.
                                                    1. Online or Incremental Learning

                                                      Notes:

                                                      • Learns while it runs. The system can be pretrained and then kept up to date as new data arrives. It is also called incremental learning because the training can be done offline, as long as it proceeds incrementally.
                                                      1. Out-of-core

                                                        Notes:

                                                        • Used to train on huge datasets that do not fit in memory. The data is chopped into smaller batches and the model takes a training step on each batch until all the data has been used.
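
The incremental, batch-by-batch idea behind online and out-of-core learning can be sketched as follows. This is only an illustration under simple assumptions: a one-parameter model y ≈ w * x, a fixed learning rate, and hypothetical helpers named `batches` and `partial_fit`.

```python
def batches(stream, size):
    """Yield successive chunks so the full dataset never sits in memory at once."""
    batch = []
    for item in stream:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch

def partial_fit(w, batch, learning_rate=0.01):
    """Take one gradient step on the mean squared error of y ≈ w * x."""
    gradient = sum(2 * (w * x - y) * x for x, y in batch) / len(batch)
    return w - learning_rate * gradient

w = 0.0
for _ in range(50):
    # The generator stands in for data read lazily from disk or a network.
    stream = ((x, 2.0 * x) for x in (1.0, 2.0, 3.0, 4.0))
    for batch in batches(stream, size=2):
        w = partial_fit(w, batch)
```

Each call to `partial_fit` updates the model from a small batch only, so the same loop works whether the batches come from a file too large for memory or from a live data stream.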
                                                    2. Generalization Model
                                                      1. Instance Based

                                                        Notes:

                                                        • Learns the examples by heart, then uses a similarity measure to generalize to new cases.
                                                        1. Model Based

                                                          Notes:

                                                          • Create a model from the examples and use that model to make predictions.
                                                          1. Test Functions
                                                            1. Utility Function
                                                              1. Cost Function
                                                          2. Challenges
                                                            1. Bad Data
                                                              1. Insufficient Training Data
                                                                1. Non Representative Training Data

                                                                  Notes:

                                                                  • The training data must be representative of the cases you want to generalize to.
                                                                  1. Sampling Noise

                                                                    Notes:

                                                                    • The sample is too small, so the data is nonrepresentative as a result of chance.
                                                                    1. Sampling Bias

                                                                      Notes:

                                                                      • The sample is large, but the sampling method is flawed.
                                                                      1. Nonresponse Bias
                                                                    2. Poor Quality Data

                                                                      Notes:

                                                                      • Data that is full of errors, outliers, and noise (e.g., due to poor-quality measurements).
                                                                      1. Clean up the Outliers
                                                                        1. Treat the Instances that are Missing Features
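
A minimal sketch of both cleanup steps, assuming that missing values are marked with `None`, that filling them with the median is acceptable, and that plausible bounds for the feature are known. The function names, the sample values, and the bounds are hypothetical.

```python
from statistics import median

def fill_missing(values):
    """Replace None entries with the median of the known values."""
    known = [v for v in values if v is not None]
    fallback = median(known)
    return [fallback if v is None else v for v in values]

def clip_outliers(values, lower, upper):
    """Limit every value to the range [lower, upper]."""
    return [min(max(v, lower), upper) for v in values]

ages = [25, 31, None, 40, 29, 430]  # 430 is an obvious measurement error
cleaned = clip_outliers(fill_missing(ages), lower=0, upper=120)
```

Filling in a value is only one option; dropping the affected instances or ignoring the attribute altogether are alternatives, and the right choice depends on the data.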
                                                                        2. Irrelevant Features

                                                                          Notes:

                                                                          • Feature engineering is the process used to prevent this problem.
                                                                          1. Feature Engineering
                                                                            1. Feature Selection
                                                                              1. Feature Extraction

                                                                                Notes:

                                                                                • Combine existing features to produce a more useful one. See the dimensionality reduction algorithms.
                                                                                1. New Features
                                                                              2. Overfitting the Training Data
                                                                                1. Overgeneralizing
                                                                                  1. Noise Attributes

                                                                                    Notes:

                                                                                    • Uninformative attributes add noise and can lead the model to detect unwanted patterns.
                                                                                2. Bad Algorithm
                                                                                  1. Overfitting

                                                                                    Notes:

                                                                                    • Performs well on the training data but poorly on the test data.
                                                                                    1. Possible Solutions
                                                                                      1. Simplify the Model

                                                                                        Notes:

                                                                                        • Use a model with fewer parameters, or reduce the number of attributes in the training data.
                                                                                        1. Regularization

                                                                                          Notes:

                                                                                          • Constrain the model to make it simpler.
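
One way to see what "constraining the model" means is ridge (L2) regularization on a one-parameter linear model y ≈ w * x, where the penalty strength `alpha` pulls the weight toward zero. This is a sketch under that assumption; the function name `ridge_weight` and the data are illustrative.

```python
def ridge_weight(xs, ys, alpha):
    """Closed-form minimizer of sum((y - w*x)**2) + alpha * w**2."""
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    return sum_xy / (sum_xx + alpha)

xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

unregularized = ridge_weight(xs, ys, alpha=0.0)   # plain least squares
regularized = ridge_weight(xs, ys, alpha=14.0)    # weight shrunk toward zero
```

Here `alpha` is a hyperparameter: it is chosen before training, and larger values give a simpler, more constrained model.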
                                                                                        2. Gather More Training Data
                                                                                          1. Reduce noise in Training Data
                                                                                        3. Underfitting

                                                                                          Notes:

                                                                                          • Performs poorly on both the training data and the test data.
                                                                                          1. Reasons
                                                                                            1. Model is Too Simple
                                                                                            2. Possible Solutions
                                                                                              1. More Powerful Model with More Parameters
                                                                                                1. Feeding better Features
                                                                                                  1. Reducing Regularization
                                                                                                2. Testing and Validating
                                                                                                  1. Test on Production

                                                                                                    Notes:

                                                                                                    • Not a good option.
                                                                                                    1. Use Test Set

                                                                                                      Notes:

                                                                                                      • A common rule is to use 80% of the data for training and hold out 20% for testing.
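
The 80/20 split can be sketched in a few lines. The function name `split_train_test` and the fixed seed are assumptions for this example; the seed only serves to make the shuffle reproducible.

```python
import random

def split_train_test(data, test_ratio=0.2, seed=42):
    """Shuffle a copy of the data and hold out test_ratio of it for testing."""
    shuffled = list(data)
    random.Random(seed).shuffle(shuffled)
    test_size = round(len(shuffled) * test_ratio)
    return shuffled[test_size:], shuffled[:test_size]

train_set, test_set = split_train_test(range(100))
```

Shuffling before splitting matters when the data is stored in some order, since otherwise the test set could contain only one kind of instance.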
                                                                                                      1. Generalization Error

                                                                                                        Notes:

                                                                                                        • The error rate on new cases. If the training error is low but the generalization error is high, the model is overfitting the training data.
                                                                                                      2. Validation Set

                                                                                                        Notes:

                                                                                                        • Used to tune the hyperparameters; the final evaluation is then done on the test set.
                                                                                                        1. Cross Validation

                                                                                                          Notes:

                                                                                                          • Split the training set into complementary subsets; each model is trained on a different combination of these subsets and validated on the remaining part. Once the model type and hyperparameters have been selected, a final model is trained on the full training set, and the generalization error is measured on the test set. This avoids "wasting" too much training data on validation sets.
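
The splitting step of cross-validation can be sketched as a generator of index lists. This assumes contiguous, unshuffled folds, with the last fold absorbing any remainder; the name `k_fold_splits` is hypothetical.

```python
def k_fold_splits(n_samples, k):
    """Yield (training_indices, validation_indices) for each of the k folds."""
    indices = list(range(n_samples))
    fold_size = n_samples // k
    for fold in range(k):
        start = fold * fold_size
        # The last fold takes whatever is left over.
        stop = start + fold_size if fold < k - 1 else n_samples
        validation = indices[start:stop]
        training = indices[:start] + indices[stop:]
        yield training, validation

splits = list(k_fold_splits(n_samples=10, k=5))
```

Every instance is used for validation exactly once and for training in the other folds, so no separate validation set has to be carved out of the training data.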
                                                                                                      3. Algorithm
                                                                                                        1. Hyperparameter

                                                                                                          Notes:

                                                                                                          • A parameter of the learning algorithm, not of the model, whose value is set before the learning process begins.
                                                                                                          1. Regularization
                                                                                                        2. Model
                                                                                                          1. Model Parameter

                                                                                                            Notes:

                                                                                                            • Parameters of the model that are learned during training, e.g., the parameters Theta0 and Theta1 of a linear model.
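
As a small worked example of model parameters, the sketch below fits the linear model y = theta0 + theta1 * x by ordinary least squares. The function name `fit_linear` and the data points are assumptions chosen so the result is easy to check by hand.

```python
def fit_linear(xs, ys):
    """Return (theta0, theta1) minimizing the squared error of theta0 + theta1 * x."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    theta1 = covariance / variance
    theta0 = mean_y - theta1 * mean_x
    return theta0, theta1

# Points that lie exactly on y = 1 + 2x.
theta0, theta1 = fit_linear([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0])
```

Both values come out of the training step itself, which is what separates model parameters from hyperparameters set beforehand.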
