Inferential Statistics for Data Science

Question 1

Question

Select the correct answer. With the help of inferential statistics, we can:

Answer

Draw conclusions about the population from a sample.
Conclude whether a selected sample is statistically significant with respect to the whole population.
Compare two models to find which one is more statistically significant.
Perform feature selection: check whether adding or removing a variable improves the model.
Perform hypothesis testing.
All of the above.

Question 2

Question

The standard error is the amount of variation in the _________ data. It is related to the standard deviation as σ/√n, where n is the _________ size.

Answer

Sample
Population
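
The σ/√n relationship can be sketched in a few lines of Python. This is a minimal illustration, assuming the population σ is unknown and is estimated by the sample standard deviation s; the function name `standard_error` and the data values are hypothetical.

```python
import math
import statistics


def standard_error(sample):
    """Estimated standard error of the mean: s / sqrt(n)."""
    n = len(sample)
    s = statistics.stdev(sample)  # sample standard deviation (n - 1 denominator)
    return s / math.sqrt(n)


# Hypothetical measurements, chosen only for illustration.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
print(f"n = {len(sample)}, standard error = {standard_error(sample):.4f}")
```

Because n sits under a square root, quadrupling the sample size only halves the standard error.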

Question 3

Question

A sampling distribution is a probability distribution of a statistic (mean, median, or mode) obtained from a large number of samples drawn from a specific population.

Answer

True
False

Question 4

Question

A sampling distribution behaves much like a normal curve and has some interesting properties, such as:

Answer

The shape of the Sampling Distribution does not reveal anything about the shape of the population.
Sampling Distribution helps to estimate the population statistic.
Both.

Question 5

Question

The Central Limit Theorem states that when plotting a sampling distribution of means, the mean of the sample means will be equal to the population mean, and the sampling distribution will approach a normal distribution with standard deviation equal to σ/√n, where σ is the standard deviation of the population and n is the sample size.

Answer

False
True
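
A small simulation can make the behaviour of sample means concrete. The sketch below assumes a right-skewed Exponential(1) population (mean 1, standard deviation 1), a sample size of 40, and 5000 repeated samples; these choices and the helper name `sample_means` are illustrative only.

```python
import math
import random
import statistics


def sample_means(n, num_samples, rng):
    """Means of num_samples samples of size n drawn from an Exponential(1) population."""
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(num_samples)
    ]


rng = random.Random(42)  # fixed seed so the run is repeatable
n = 40
means = sample_means(n, 5000, rng)

# Exponential(1) has population mean 1 and standard deviation 1.
print("mean of sample means:", round(statistics.fmean(means), 3))
print("std of sample means: ", round(statistics.stdev(means), 3))
print("sigma / sqrt(n):     ", round(1 / math.sqrt(n), 3))
```

The first printed value should land close to the population mean, and the second close to σ/√n, even though the population itself is far from normal.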

Question 6

Question

The greater the sample size, the lower the standard error and the greater the accuracy in determining the population mean from the sample mean.

Answer

False
True

Question 7

Question

No matter the shape of the population distribution, be it bimodal, right-skewed, etc., the shape of the sampling distribution will remain the same (a normal curve).

Answer

True
False

Question 8

Question

For a sampling distribution, the number of samples has to be sufficient (generally more than 50) to satisfactorily achieve a normal curve. We also have to keep the sample size fixed, since any change in sample size will change the shape of the sampling distribution and it will no longer be bell-shaped.

Answer

False
True

Question 9

Question

As we increase the sample size, the sampling distribution narrows from both sides, giving a better estimate of the population statistic, since that statistic generally lies somewhere in the middle of the sampling distribution.

Answer

False
True

Question 10

Question

The confidence interval is a type of interval estimate from the ___________ distribution that gives a range of values in which the population statistic may lie.

Answer

Sampling
Population

Question 11

Question

The margin of error is a statistic expressing the amount of random sampling error in the results of a survey.

Answer

True
False

Question 12

Question

The margin of error is ________ the width of the confidence interval.

Answer

1/2
1/4

Question 13

Question

Which of the following points are true for confidence intervals?

Answer

Confidence intervals can be built with different degrees of confidence to suit a user's needs, such as 70%, 90%, etc.
The greater the sample size, the narrower the confidence interval.
There are different confidence intervals for different sample means. For example, a sample mean of 40 will have a different confidence interval from a sample mean of 45.
A 95% confidence interval does not mean that the probability of the population mean lying in a given interval is 95%. Instead, it means that 95% of the interval estimates will contain the population statistic.
All of the above.
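
A confidence interval for a mean can be computed as follows. This is a sketch under stated assumptions: the population standard deviation σ is treated as known, so a z-based interval applies; the function name `z_confidence_interval` and the numbers (mean 40, σ = 10, n = 100) are hypothetical.

```python
import math
from statistics import NormalDist


def z_confidence_interval(sample_mean, sigma, n, confidence=0.95):
    """Two-sided z-based confidence interval for a population mean."""
    # Critical value: for 95% confidence this is the 97.5th percentile of N(0, 1).
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * sigma / math.sqrt(n)  # margin of error
    return sample_mean - margin, sample_mean + margin


low, high = z_confidence_interval(sample_mean=40, sigma=10, n=100)
print(f"95% CI: ({low:.2f}, {high:.2f}), margin of error = {(high - low) / 2:.2f}")
```

Raising `confidence` or lowering `n` widens the interval; changing `sample_mean` shifts it without changing its width.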

Question 14

Question

Hypothesis testing lets us identify a ________ statistic to be checked against a _________ statistic, or against the statistic of another sample, to study an intervention, etc.

Answer

Sample, Population
Population, Sample

Question 15

Question

The null hypothesis is a type of hypothesis in which we assume that the sample observations are not by chance; they are affected by some non-random situation. It is denoted by H1 or Ha.

Answer

True
False

Question 16

Question

The alternative hypothesis is a type of hypothesis in which we assume that the sample observations are purely by chance. It is denoted by H0.

Answer

True
False

Question 17

Question

Hypothesis testing is done at different levels of confidence and makes use of the z-score to calculate the probability.

Answer

False
True

Question 18

Question

For a 95% confidence level, anything beyond the z-threshold for 95% would reject the null hypothesis.

Answer

False
True

Question 19

Question

Write down the steps of hypothesis testing.

Answer

Write your answer down.
Check it after the quiz.

Question 20

Question

The significance level, denoted alpha (α), is the probability of rejecting the null hypothesis when it is ________.

Answer

True
False

Question 21

Question

The p-value is the probability of obtaining results at least as extreme as the observed results of a statistical hypothesis test, assuming that the null hypothesis is correct.

Answer

True
False

Question 22

Question

A low enough p-value is grounds for rejecting the null hypothesis. We reject the null hypothesis if the p-value is less than the significance level.

Answer

False
True
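
The comparison of a p-value with the significance level can be sketched with a one-sample z-test. The example assumes a known population standard deviation and a two-sided alternative; the function name `one_sample_z_test` and the input numbers are hypothetical.

```python
import math
from statistics import NormalDist


def one_sample_z_test(sample_mean, pop_mean, sigma, n, alpha=0.05):
    """Two-sided one-sample z-test; returns (z, p_value, reject_null)."""
    z = (sample_mean - pop_mean) / (sigma / math.sqrt(n))
    # Probability of a result at least this extreme in either tail, under H0.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value, p_value < alpha


z, p, reject = one_sample_z_test(sample_mean=52, pop_mean=50, sigma=10, n=100)
print(f"z = {z:.2f}, p-value = {p:.4f}, reject H0 at alpha = 0.05: {reject}")
```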

Question 23

Question

Type 1 error: a Type 1 error is the case when we fail to reject the null hypothesis although it is actually false. The probability of a Type 1 error is called beta (β).

Answer

False
True

Question 24

Question

Type 2 error: a Type 2 error is the case when we reject the null hypothesis although it was actually true. The probability of a Type 2 error is called the significance level, alpha (α).

Answer

True
False

Question 25

Question

For Type 1 and Type 2 errors:
α = P(null hypothesis rejected | null hypothesis is true)
β = P(null hypothesis accepted | null hypothesis is false)

Answer

True
False

Question 26

Question

The power of a test is defined as P = 1 - P(Type 2 error) = 1 - β. The lower the probability of a Type 2 error, the greater the power of the hypothesis test.

Answer

True
False
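
The relationship between β and power can be computed directly in a simple setting. The sketch below assumes an upper-tailed one-sample z-test with known σ and a specific true mean; the function name `power_upper_tailed_z` and all numbers are hypothetical.

```python
import math
from statistics import NormalDist


def power_upper_tailed_z(true_mean, null_mean, sigma, n, alpha=0.05):
    """Power (1 - beta) of an upper-tailed z-test when the true mean is true_mean."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)  # rejection threshold under H0
    shift = (true_mean - null_mean) / (sigma / math.sqrt(n))
    beta = std_normal.cdf(z_crit - shift)  # P(fail to reject H0 | H0 is false)
    return 1 - beta


for n in (25, 100):
    print(f"n = {n}: power = {power_upper_tailed_z(52, 50, 10, n):.3f}")
```

With the effect size held fixed, the larger sample gives a smaller β and therefore more power.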

Question 27

Question

For a Z-test:
1. A Z-test is mainly used when the data is normally distributed.
2. We find the Z-statistic of the sample means and calculate the z-score.
3. A Z-test is mainly used when the population mean and standard deviation are given.

Answer

True
False

Question 28

Question

T-tests are similar to Z-tests; the only difference is that instead of the population standard deviation, we use the sample standard deviation.

Answer

True
False

Question 29

Question

Z-tests are statistical calculations that can be used to compare a population mean to a sample mean. T-tests are calculations used to test a hypothesis, but they are most useful when we need to determine whether there is a statistically significant difference between two independent sample groups.

Answer

True
False

Question 30

Question

The degrees of freedom is the number of __________ that are free to take more than one arbitrary value.

Answer

Variable
Sample

Question 31

Question

Select the true statement.

Answer

1. The greater the difference between the sample mean and the population mean, the greater the chance of rejecting the null hypothesis.
2. The greater the sample size, the greater the chance of rejecting the null hypothesis.
Both

Question 32

Question

A one-sample t-test compares the mean of the _________ data to a known value.

Answer

Sample
Population

Question 33

Question

Which of the following points are true for a one-sample t-test?

Answer

Determine whether the mean of a group differs from the specified value.
Calculate a range of values that are likely to include the population mean.
We can run a one-sample T-test when we do not have the population S.D. or we have a sample of size less than 30.
All of them.
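
The one-sample t statistic can be sketched with the standard library alone. This illustration computes only the statistic and its degrees of freedom; turning it into a p-value needs a t-distribution table or a statistics package, which is assumed to be available separately. The function name `one_sample_t` and the data are hypothetical.

```python
import math
import statistics


def one_sample_t(sample, known_value):
    """Return (t statistic, degrees of freedom) for a one-sample t-test."""
    n = len(sample)
    mean = statistics.fmean(sample)
    s = statistics.stdev(sample)  # sample standard deviation replaces sigma
    t = (mean - known_value) / (s / math.sqrt(n))
    return t, n - 1


# Hypothetical small sample (n < 30) with unknown population standard deviation.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
t, df = one_sample_t(sample, known_value=4)
print(f"t = {t:.3f} with {df} degrees of freedom")
```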

Question 34

Question

We use a two-sample t-test when we want to evaluate whether the means of two independent samples are different.

Answer

False
True

Question 35

Question

A two-sample t-test is used to:

Answer

Determine whether the means of two independent groups differ.
Calculate a range of values that is likely to include the difference between the population means.
Both

Question 36

Question

Points to note for a two-sample t-test:
1. The groups to be tested should be __________.
2. The groups' distributions should not be highly _________.

Answer

Independent, Skewed
Dependent, Normal

Question 37

Question

An independent-samples t-test compares the means of ______ different groups. The samples are __________ of each other.

Answer

Two, Independent
Same, Dependent

Question 38

Question

A paired-sample t-test compares means from the ______ group at different times. The samples are _________ on each other.

Answer

Same, Dependent
Two, Independent
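
The difference between the two designs shows up in how the t statistic is built. The sketch below assumes Welch's unequal-variance form for the independent case and a t-test on per-subject differences for the paired case; the function names and the before/after scores are hypothetical.

```python
import math
import statistics


def welch_t(a, b):
    """t statistic for two independent samples (Welch's unequal-variance form)."""
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    return (statistics.fmean(a) - statistics.fmean(b)) / se


def paired_t(before, after):
    """t statistic for paired samples: a one-sample t-test on the differences."""
    diffs = [y - x for x, y in zip(before, after)]
    return statistics.fmean(diffs) / (statistics.stdev(diffs) / math.sqrt(len(diffs)))


# Hypothetical scores for the same four subjects at two points in time.
before = [10, 12, 14, 16]
after = [11, 14, 15, 18]
print(f"treated as independent: t = {welch_t(after, before):.3f}")
print(f"treated as paired:      t = {paired_t(before, after):.3f}")
```

On the same numbers the paired statistic is much larger, because pairing removes the between-subject variation from the denominator.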

Question 39

Question

ANOVA is used to determine whether there are any statistically significant differences between the means of ________ independent (unrelated) groups.

Answer

One
Two
Three or more
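
A one-way ANOVA F statistic can be computed by hand from the between-group and within-group sums of squares. This is a minimal sketch; the function name `one_way_anova_f` and the three groups are hypothetical, and converting F into a p-value requires an F-distribution table or a statistics package.

```python
import statistics


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [x for g in groups for x in g]
    grand_mean = statistics.fmean(all_values)
    group_means = [statistics.fmean(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Variation of the observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat, df_b, df_w = one_way_anova_f(groups)
print(f"F = {f_stat:.2f} with ({df_b}, {df_w}) degrees of freedom")
```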

Question 40

Question

A one-way ANOVA has ______ independent variable, while a two-way ANOVA has ______.

Answer

One, Two
Two, One

Question 41

Question

Write down the steps to perform ANOVA.

Answer

Write your answer down.
Check it later.

Question 42

Question

Practical applications of ANOVA in modeling are:

Answer

Identifying whether a categorical variable is relevant to a continuous variable.
Identifying whether a treatment was effective for the model.
Both.

Question 43

Question

The chi-square test determines whether there is an association between _______ variables (i.e., whether the variables are independent or related).

Answer

Categorical
Continuous
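
The chi-square statistic for a contingency table compares observed counts with the counts expected if the two variables were unrelated. The sketch below is illustrative only; the function name `chi_square_statistic` and the 2x2 table of counts are hypothetical.

```python
def chi_square_statistic(table):
    """Return (chi-square statistic, degrees of freedom) for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if rows and columns were unrelated.
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected

    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df


# Hypothetical counts: rows are one variable's levels, columns the other's.
table = [[10, 20], [20, 10]]
chi2, df = chi_square_statistic(table)
print(f"chi-square = {chi2:.3f} with {df} degree(s) of freedom")
```

For 1 degree of freedom, the critical value at α = 0.05 is about 3.841, so a statistic above that would lead to rejecting the null hypothesis at that level.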

Question 44

Question

Goodness of fit: it compares two categorical variables to find whether they are related to each other.

Answer

True
False

Question 45

Question

Test of independence: it determines whether sample data of categorical variables matches the population.

Answer

True
False

Question 46

Question

Regression analysis is a form of predictive modelling technique that investigates the relationship between a ___________ (target) variable and __________ (predictor) variable(s).

Answer

Dependent, Independent
Independent, Dependent

Question 47

Question

The regression sum of squares describes how well a regression model represents the modeled data. A higher regression sum of squares indicates that the model does not fit the data well.

Answer

True
False

Question 48

Question

The residual sum of squares (RSS) is a statistical measure of the amount of _________ in a data set that is not explained by a regression model.

Answer

Mean
Variance

Question 49

Question

Coefficient of determination (R-squared): it represents the strength of correlation between two variables.

Answer

True
False

Question 50

Question

Correlation coefficients are used to measure how strong a relationship is between two variables.

Answer

True
False
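
The sums of squares and R-squared from the regression questions can be computed for a simple least-squares line. This is a sketch assuming one predictor and an ordinary least-squares fit; the function names and the five data points are hypothetical.

```python
import statistics


def fit_line(x, y):
    """Ordinary least-squares slope and intercept for one predictor."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return slope, my - slope * mx


def sums_of_squares(x, y):
    """Return (regression SS, residual SS, total SS, R-squared)."""
    slope, intercept = fit_line(x, y)
    my = statistics.fmean(y)
    predictions = [intercept + slope * a for a in x]
    ss_regression = sum((p - my) ** 2 for p in predictions)  # explained variation
    ss_residual = sum((b - p) ** 2 for b, p in zip(y, predictions))  # unexplained
    ss_total = sum((b - my) ** 2 for b in y)
    return ss_regression, ss_residual, ss_total, 1 - ss_residual / ss_total


x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
ss_reg, ss_res, ss_tot, r2 = sums_of_squares(x, y)
print(f"SSR = {ss_reg:.2f}, RSS = {ss_res:.2f}, SST = {ss_tot:.2f}, R^2 = {r2:.2f}")
```

For a least-squares line with an intercept, the regression and residual sums of squares add up to the total, so R-squared is the explained share of the variation.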

Created by Vishakha Achmare about 4 years ago
