Created by Sarah Sandison
more than 5 years ago
Question | Answer |
Clustering: What is clustering? | Clustering is a data mining technique used to organise objects into groups that have similar characteristics. |
Clustering: What is the least number of variables needed to perform Clustering? | One |
Clustering: What is the goal of clustering a set of vectors? | To divide them into groups of vectors that are near each other |
Clustering: Name as many different types of clustering as you can | - K-Means - DBSCAN - Graph Community Detection |
Clustering: The most commonly used measure of similarity is _____ distance or its square | Euclidean |
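As a small illustration of the card above, here is a minimal sketch of Euclidean distance and its square for two numeric vectors. The function names `euclidean` and `squared_euclidean` are illustrative choices, not taken from the deck.

```python
import math


def squared_euclidean(p, q):
    """Sum of squared coordinate differences between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def euclidean(p, q):
    """Euclidean distance: the square root of the squared distance."""
    return math.sqrt(squared_euclidean(p, q))


# A 3-4-5 right triangle: the distance from (0, 0) to (3, 4) is 5.0.
distance = euclidean((0, 0), (3, 4))
```

The squared form ranks neighbours in the same order as the plain distance, which is one reason it is often used when only comparisons matter.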
Clustering: What is K-Means? | K-Means is an algorithm that partitions objects into k clusters by assigning each object to the cluster with the nearest centroid (mean) |
Clustering (K-Means): K-Means only works for data in what format? | Numerical |
Clustering (K-Means): What 3 data attributes cause inaccurate results in K-Means? | 1. Data that has outliers 2. Data that has a non-convex shape 3. Data whose clusters are not round (for example elongated, or of very different sizes or densities) |
Clustering (K-Means): K-means is an iterative algorithm. Which two steps are repeatedly carried out in its inner loop? | 1. Move the cluster centroids: each centroid μk is updated to the mean of the points currently assigned to it 2. Assign clusters: the parameters c are updated by assigning each point to its closest centroid |
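The two inner-loop steps from the card above can be sketched in plain Python as follows. This is a minimal illustration, not a production implementation. It assumes points are tuples of numbers, picks the initial centroids at random from the data, and leaves a centroid unchanged when its cluster is empty. The names `k_means` and `squared_distance` are hypothetical.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def k_means(points, k, iterations=100, seed=0):
    """Return (centroids, assignments) after iterating until assignments stop changing."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # assumption: initialise from k random data points
    assignments = None
    for _ in range(iterations):
        # Assignment step: c[i] becomes the index of the centroid closest to point i.
        new_assignments = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # converged: no point changed cluster
        assignments = new_assignments
        # Update step: move each centroid to the mean of the points assigned to it.
        for j in range(k):
            members = [p for p, c in zip(points, assignments) if c == j]
            if members:  # assumption: keep the old centroid if the cluster is empty
                centroids[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return centroids, assignments


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, assignments = k_means(points, k=2)
```

On data like this, with two well-separated groups, the three points near the origin end up in one cluster and the three points near (10, 10) in the other. Which cluster gets index 0 depends on the random initialisation.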
Clustering (K-Means): K-Means is also known as... | Non-hierarchical |
Clustering (K-Means): What is the Within-Cluster Sum of Squares? | For each cluster, it is the sum of the squared distances of points in that cluster to their center, summed over the clusters |
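The definition on the card above translates into a few lines of Python. This sketch assumes each point is a tuple, `assignments[i]` is the cluster index of point `i`, and `centroids[j]` is the center of cluster `j`. The function name is an illustrative choice.

```python
def within_cluster_sum_of_squares(points, assignments, centroids):
    """Sum, over all points, of the squared distance to the point's own cluster center."""
    total = 0.0
    for point, cluster in zip(points, assignments):
        center = centroids[cluster]
        total += sum((a - b) ** 2 for a, b in zip(point, center))
    return total


# Two clusters on a line; every point is exactly 1 unit from its center.
points = [(0, 0), (2, 0), (10, 0), (12, 0)]
assignments = [0, 0, 1, 1]
centroids = [(1, 0), (11, 0)]
wcss = within_cluster_sum_of_squares(points, assignments, centroids)  # 4.0
```

A lower value means tighter clusters, but it always decreases as k grows, so it is usually compared across several values of k rather than minimised outright.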
Clustering: ? | Groups or clusters are suggested by the data and are not defined a priori. |
Clustering: ? | Objects in each cluster are similar to each other and unlike objects in other clusters. |
Association Rules | Association rule mining is a data mining technique used to discover frequent patterns and associations among items in an (often transactional) dataset. It is sometimes called market-basket analysis. |
Association Rules: Itemset | A group of items that occur together. |
Association Rules: Support | Each itemset has a support level, which is the proportion of transactions in the dataset that contain it. |
Association Rules: Association Rule | A statement that itemsets tend to occur together, for example {Hotdog} => {Bun, Ketchup} |
Association Rules: Confidence | Confidence is an indication of how often the rule has been found to be true. Confidence = Support(LHS and RHS together) / Support(LHS) |
Association Rules: Lift | Lift is a measure of how many times more often X and Y occur together than expected if they were statistically independent of each other. Lift = Support(LHS and RHS together) / (Support(LHS) * Support(RHS)) |
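The three measures on the cards above can be computed directly from a list of transactions. The sketch below assumes each transaction is a Python set of item names. The sample baskets are made up for illustration, and the function names are hypothetical.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in the itemset."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= set(t)) / len(transactions)


def confidence(lhs, rhs, transactions):
    """Support of LHS and RHS together, divided by the support of LHS.

    Assumes LHS appears in at least one transaction (otherwise this divides by zero).
    """
    return support(set(lhs) | set(rhs), transactions) / support(lhs, transactions)


def lift(lhs, rhs, transactions):
    """Joint support divided by the product of the individual supports."""
    joint = support(set(lhs) | set(rhs), transactions)
    return joint / (support(lhs, transactions) * support(rhs, transactions))


# Illustrative data only: four made-up shopping baskets.
transactions = [
    {"Hotdog", "Bun", "Ketchup"},
    {"Hotdog", "Bun"},
    {"Hotdog", "Bun", "Ketchup"},
    {"Cola"},
]

rule_support = support({"Hotdog", "Bun", "Ketchup"}, transactions)
rule_confidence = confidence({"Hotdog"}, {"Bun", "Ketchup"}, transactions)
rule_lift = lift({"Hotdog"}, {"Bun", "Ketchup"}, transactions)
```

For these baskets the rule {Hotdog} => {Bun, Ketchup} has support 2/4, confidence 2/3 and lift 4/3. A lift above 1 suggests the two sides occur together more often than independence would predict.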
Association Rules: List the different Association Rule algorithms. | - Apriori - ECLAT - FP-Growth - AprioriTID - AprioriHybrid |
Association Rules: ? | For a given number of items n, there is a larger number of possible rules than possible itemsets. |
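The statement on the card above can be checked by counting. The sketch below assumes a rule is a pair X => Y where X and Y are non-empty and share no items, and that only non-empty itemsets are counted. Under those assumptions the rules outnumber the itemsets once n is at least 3. The function names are illustrative.

```python
from math import comb


def count_itemsets(n):
    """Number of non-empty itemsets that can be formed from n items."""
    return 2 ** n - 1


def count_rules(n):
    """Number of rules X => Y with X and Y non-empty and disjoint.

    Every itemset of size s >= 2 can be split into a left and right side
    in 2**s - 2 ways (each non-empty proper subset can serve as X).
    """
    return sum(comb(n, size) * (2 ** size - 2) for size in range(2, n + 1))


for n in (2, 3, 6):
    print(n, count_itemsets(n), count_rules(n))
```

The sum matches the closed form 3^n - 2^(n+1) + 1, which gives 602 possible rules for just 6 items against 63 itemsets.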
Association Rules: Which measure of association needs to not be too small in order to ensure that the rule has not just occurred by chance? | Support |
Association Rules: The definition of a frequent itemset is... | An itemset whose support is greater than or equal to the minimum support threshold |
Association Rules: What is a sparse matrix? | A matrix with mostly 0 entries |
Association Rules: Is the support of the rule A → B the same as or different from the support of the rule B → A? | Same |
Association Rules: In general, an association rule A → B tells us that... | If A occurs, then B is likely to occur too |
Association Rules: ? | If an itemset is infrequent, then all of its supersets are infrequent |
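The property on the card above is what lets level-wise algorithms such as Apriori skip most candidates. Below is a minimal sketch of that idea, not a faithful reproduction of any particular library. It assumes transactions are Python sets and treats an itemset as frequent when its support is at least `min_support`. The function name `frequent_itemsets` is hypothetical.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for every frequent itemset, built level by level."""
    n = len(transactions)

    def support(itemset):
        return sum(1 for t in transactions if itemset <= t) / n

    items = sorted({item for t in transactions for item in t})
    current = [frozenset([i]) for i in items if support(frozenset([i])) >= min_support]
    frequent = {}
    k = 1
    while current:
        for itemset in current:
            frequent[itemset] = support(itemset)
        k += 1
        known = set(current)
        # Join frequent (k-1)-itemsets into candidate k-itemsets.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune: a candidate with any infrequent (k-1)-subset cannot be frequent.
        current = [
            c for c in candidates
            if all(frozenset(sub) in known for sub in combinations(c, k - 1))
            and support(c) >= min_support
        ]
    return frequent


# Illustrative data only.
transactions = [{"A", "B", "C"}, {"A", "B"}, {"A", "C"}, {"B"}]
result = frequent_itemsets(transactions, min_support=0.5)
```

Here {B, C} appears in only one of four transactions, so it is infrequent, and the candidate {A, B, C} is discarded without its support ever being counted.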
Visualisation: What are the 3 main Python libraries for visualisations? | 1. Matplotlib 2. Seaborn 3. Bokeh |
Visualisation: What Charts can you use for Comparison of Data? | - Bar - Column - Pie (although DF don't like it) - Scatter Plot - Line |
Visualisation: What Charts can you use to show Composition of Data? | - Pie - Stacked Bar - Stacked Column - Area - Waterfall |
Visualisation: What Charts can you use to show Distribution of Data? | - Scatter Plot - Line - Column - Bar |
Visualisation: What Charts can you use to show Trends within Data? | - Line - Dual-Axis Line - Column |
Visualisation: What Charts can you use to show the relationship between value sets? | - Scatter Plot - Bubble - Line |
Regression: What are the limits for statistically significant correlation? |