Machine learning is the ability of a machine to learn automatically.
Supervised
Note:
Supervised learning is where the data is labeled and the program learns to predict the output from the input data.
Regression
Note:
In regression problems, we are trying to predict a continuous-valued output. Examples are:
What is the housing price in New York?
What is the value of cryptocurrencies?
Loss:
We can think of loss as the squared distance from a point to the line. We square the distance (rather than using the distance itself) so that points above and below the line both contribute to the total loss in the same way.
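As a minimal sketch of this idea, the function below sums the squared vertical distances from a set of points to a line y = m*x + b. The function name, the parameters `m` and `b`, and the sample points are assumptions made for illustration only.

```python
def total_squared_loss(xs, ys, m, b):
    """Sum of squared vertical distances from each point to the line y = m*x + b."""
    return sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys))

# Hypothetical points: one sits 1 unit above the line y = x, the other 1 unit below.
xs = [1, 2]
ys = [2, 1]

loss = total_squared_loss(xs, ys, m=1, b=0)
print(loss)  # each point contributes 1 after squaring, so the total is 2
```

Without squaring, the two distances (+1 and -1) would cancel out and the total would misleadingly be 0.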
Classification
Note:
In classification problems, we are trying to predict one of a discrete set of values. Examples are:
Is this a picture of a human or a picture of an AI?
Is this email spam?
Classification is used to predict a discrete label. The outputs fall under a finite set of possible outcomes. Many situations have only two possible outcomes. This is called binary classification (True / False, 1 or 0).
Normalization
Note:
The scale of the data can affect the algorithm when features are measured on different scales.
Min Max Normalization
Note:
Min-max normalization is one of the most common ways to normalize data. For every feature, the minimum value of that feature gets transformed into a 0, the maximum value gets transformed into a 1, and every other value gets transformed into a decimal between 0 and 1.
For example, if the minimum value of a feature was 20, and the maximum value was 40, then 30 would be transformed to 0.5 since it is halfway between 20 and 40. The formula is as follows:
(value - min) / (max - min)
Min-max normalization has one fairly significant downside: it does not handle outliers very well. For example, if you have 99 values between 0 and 40, and one value is 100, then the 99 values will all be transformed to a value between 0 and 0.4.
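The formula above can be sketched in a few lines of plain Python. The function name is hypothetical, and returning all zeros for a constant feature is an assumption made here to avoid dividing by zero.

```python
def min_max_normalize(values):
    """Rescale values to the range [0, 1] using (value - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant feature carries no information, so map it to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

print(min_max_normalize([20, 30, 40]))       # [0.0, 0.5, 1.0]

# A single outlier (100) squeezes the other values into the range 0 to 0.4.
print(min_max_normalize([0, 10, 40, 100]))   # [0.0, 0.1, 0.4, 1.0]
```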
Z-Score Normalization
Note:
Z-score normalization is a strategy of normalizing data that avoids this outlier issue. The formula for Z-score normalization is below:
(value - μ) / σ
μ: mean value of the feature
σ: standard deviation of the feature
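A minimal sketch of this formula using Python's standard `statistics` module is shown below. The function name and sample data are hypothetical, and the use of the population standard deviation (`pstdev`) rather than the sample standard deviation is an assumption; the notes do not say which one is intended.

```python
import statistics

def z_score_normalize(values):
    """Rescale values so they have mean 0 and standard deviation 1."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)  # assumption: population standard deviation
    if std == 0:
        # Assumption: a constant feature is mapped to all zeros.
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]

# Hypothetical feature with mean 5 and population standard deviation 2.
data = [2, 4, 4, 4, 5, 5, 7, 9]
print(z_score_normalize(data))  # [-1.5, -0.5, -0.5, -0.5, 0.0, 0.0, 1.0, 2.0]
```

Unlike min-max normalization, the output is not confined to a fixed range, so an outlier shows up as a large z-score instead of compressing every other value.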
K-Nearest Neighbors
Multi-Label Classification
Note:
Multi-label classification is when there are multiple possible outcomes. It is useful for customer segmentation, image categorization, and sentiment analysis for understanding text. To perform these classifications, we use models like Naive Bayes, K-Nearest Neighbors, and SVMs.
Unsupervised
Note:
Unsupervised Learning is a type of machine learning where the program learns the inherent structure of the data based on unlabeled examples.
Clustering
The ML Process
Note:
1. Formulating a question
2. Finding and understanding the data
3. Cleaning the data and feature engineering
4. Choosing a model
5. Tuning and evaluating
6. Using the model and presenting the results
Testing our Models
Note:
To test the effectiveness of our algorithm, we'll split the data into:
training set
validation set
test set
Training Set
Note:
The training set is the data that the algorithm will learn from. Learning looks different depending on which algorithm you are using. For example, when using Linear Regression, the points in the training set are used to draw the line of best fit. In K-Nearest Neighbors, the points in the training set are the points that could be the neighbors.
Validation Set
Note:
After training using the training set, the points in the validation set are used to compute the accuracy or error of the classifier. The key insight here is that we know the true labels of every point in the validation set, but we’re temporarily going to pretend like we don’t. We can use every point in the validation set as input to our classifier. We’ll then receive a classification for that point. We can now peek at the true label of the validation point and see whether we got it right or not. If we do this for every point in the validation set, we can compute the validation error!
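The procedure described above can be sketched as follows. The `classify` function here is a hypothetical stand-in for a trained classifier (a simple threshold rule), and the validation points and labels are made up for illustration.

```python
def validation_accuracy(classify, points, true_labels):
    """Classify every validation point, then compare against the known labels."""
    correct = sum(
        1 for point, label in zip(points, true_labels) if classify(point) == label
    )
    return correct / len(true_labels)

# Hypothetical "trained" classifier: predicts True when the value is at least 5.
def classify(value):
    return value >= 5

validation_points = [1, 4, 6, 9]
validation_labels = [False, True, True, True]  # true labels we pretend not to know

accuracy = validation_accuracy(classify, validation_points, validation_labels)
print(accuracy)      # 0.75 (the point 4 is misclassified)
print(1 - accuracy)  # 0.25, the validation error
```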
How to Split
Note:
In general, putting 80% of your data in the training set, and 20% of your data in the validation set is a good place to start.
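One way to sketch an 80/20 split using only the standard library is shown below. The function name, the `train_fraction` parameter, and the fixed `seed` are assumptions for illustration; shuffling before splitting is a common precaution so that any ordering in the data does not leak into the split.

```python
import random

def train_validation_split(data, train_fraction=0.8, seed=0):
    """Shuffle a copy of the data and split it into training and validation sets."""
    shuffled = list(data)
    random.Random(seed).shuffle(shuffled)  # fixed seed makes the split repeatable
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Hypothetical dataset of 10 items.
training_set, validation_set = train_validation_split(range(10))
print(len(training_set), len(validation_set))  # 8 2
```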
Coefficients
Note:
Coefficients are most helpful in determining which independent variable carries more weight.
For example, a coefficient of -1.345 will impact the rent more than a coefficient of 0.238 because its absolute value is larger, with the former impacting prices negatively and the latter positively.
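As a small sketch of this comparison, the snippet below ranks coefficients by absolute value and reports the direction of each effect. The feature names are hypothetical placeholders, and the comparison assumes the features are on comparable scales (for example, after normalization); otherwise coefficient sizes are not directly comparable.

```python
def rank_by_magnitude(coefficients):
    """Sort (feature, coefficient) pairs from largest to smallest absolute value."""
    return sorted(coefficients.items(), key=lambda item: abs(item[1]), reverse=True)

# Hypothetical feature names paired with the coefficients from the example.
coefficients = {"feature_a": 0.238, "feature_b": -1.345}

for name, coef in rank_by_magnitude(coefficients):
    direction = "negatively" if coef < 0 else "positively"
    print(f"{name}: {coef} (impacts rent {direction})")
```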
Correlations
Note:
A negative linear relationship means that as X values increase, Y values will decrease. Similarly, a positive linear relationship means that as X values increase, Y values will also increase.
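To make the sign of a linear relationship concrete, here is a minimal sketch that computes the Pearson correlation coefficient by hand. The function name and the sample data are assumptions for illustration; the notes do not specify which correlation measure is meant.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation: positive when Y rises with X, negative when Y falls."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)

xs = [1, 2, 3, 4]
print(pearson_r(xs, [2, 4, 6, 8]))  # close to +1: Y increases as X increases
print(pearson_r(xs, [8, 6, 4, 2]))  # close to -1: Y decreases as X increases
```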
Evaluating the Model's Accuracy
Note:
Now let's say we add another x variable, the building's age, to our model. With this third relevant x variable, the R² is expected to go up. Let's say the new R² is 0.95. This means that square feet, number of bedrooms, and age of the building together explain 95% of the variation in the rent.
The best possible R² is 1.00. It can also be negative, because a model can be arbitrarily worse than simply predicting the mean. As a rough rule of thumb, an R² of 0.70 or higher is often considered good.
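The behavior of R² at these extremes can be checked with a short sketch. It uses the standard definition, 1 minus the ratio of the residual sum of squares to the total sum of squares; the function name and the sample values are made up for illustration.

```python
def r_squared(y_true, y_pred):
    """R² = 1 - (residual sum of squares) / (total sum of squares)."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    return 1 - ss_res / ss_tot

y_true = [1, 2, 3]
print(r_squared(y_true, [1, 2, 3]))  # 1.0: perfect predictions
print(r_squared(y_true, [2, 2, 2]))  # 0.0: no better than predicting the mean
print(r_squared(y_true, [3, 2, 1]))  # -3.0: worse than predicting the mean
```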