Inferential Statistics for Data Science

Question 1 of 50

Select the right answer. With the help of inferential statistics, we can:

Select one of the following possible answers:

Draw conclusions about the population from a sample.
Conclude whether a selected sample is statistically significant to the whole population or not.
Compare two models to find which one is more statistically significant than the other.
Do feature selection, i.e. check whether adding or removing a variable helps improve the model or not.
Perform hypothesis testing.
All of the above.

Question 2 of 50

Standard Error is the amount of variation in the _________ data. It is related to the Standard Deviation as σ/√n, where n is the _________ size.

Select one of the following possible answers:

Sample
Population
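
The σ/√n relationship in this question can be sketched in a few lines. This is only an illustration: the function name `standard_error` and the sample values are hypothetical, and the sample standard deviation is used as a stand-in for σ, which is a common substitution when σ is unknown.

```python
import math
import statistics


def standard_error(std_dev: float, n: int) -> float:
    """Standard error of the mean: std_dev / sqrt(n)."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    return std_dev / math.sqrt(n)


# Hypothetical sample; statistics.stdev is the sample standard deviation,
# used here in place of an unknown sigma.
sample = [12.0, 15.0, 11.0, 14.0, 13.0, 16.0, 10.0, 13.0]
se = standard_error(statistics.stdev(sample), len(sample))
print(f"standard error = {se:.3f}")
```

Quadrupling n halves the result, which is why larger samples give tighter estimates of the mean.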

Question 3 of 50

A Sampling Distribution is a probability distribution of a statistic (Mean/Median/Mode) obtained through a large number of samples drawn from a specific population.

Select one of the following possible answers:

True
False

Question 4 of 50

A Sampling Distribution behaves much like a normal curve and has some interesting properties, such as:

Select one of the following possible answers:

The shape of the Sampling Distribution does not reveal anything about the shape of the population.
Sampling Distribution helps to estimate the population statistic.
Both.

Question 5 of 50

Central Limit Theorem states that:

When plotting a sampling distribution of means, the mean of the sample means will be equal to the population mean, and the sampling distribution will approach a normal distribution with variance equal to σ/√n, where σ is the standard deviation of the population and n is the sample size.

Select one of the following possible answers:

False
True
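
A small simulation can make the behaviour of sample means concrete. The sketch below assumes an exponential population (mean 1, standard deviation 1), a fixed seed, and hypothetical helper names. It checks that the mean of the sample means is close to the population mean and that their standard deviation is close to σ/√n. Note that σ/√n is the standard deviation of the sampling distribution (the standard error); the corresponding variance is σ²/n.

```python
import math
import random
import statistics


def sample_means(draw, sample_size, num_samples, rng):
    """Means of num_samples samples, each made of sample_size draws."""
    return [
        statistics.fmean(draw(rng) for _ in range(sample_size))
        for _ in range(num_samples)
    ]


def draw_exponential(rng):
    # Right-skewed population with mean 1 and standard deviation 1.
    return rng.expovariate(1.0)


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 50
means = sample_means(draw_exponential, n, 2000, rng)

mean_of_means = statistics.fmean(means)
sd_of_means = statistics.stdev(means)
expected_se = 1.0 / math.sqrt(n)  # sigma / sqrt(n) with sigma = 1

print(f"mean of sample means: {mean_of_means:.3f}")
print(f"sd of sample means:   {sd_of_means:.3f} (sigma/sqrt(n) = {expected_se:.3f})")
```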

Question 6 of 50

The greater the sample size, the lower the standard error and the greater the accuracy in determining the population mean from the sample mean.

Select one of the following possible answers:

False
True

Question 7 of 50

No matter the shape of the population distribution, be it bimodal, right-skewed, etc., the shape of the Sampling Distribution will remain the same (a normal curve).

Select one of the following possible answers:

True
False

Question 8 of 50

For a sampling distribution:

The number of samples has to be sufficient (generally more than 50) to satisfactorily achieve a normal curve distribution. We also have to keep the sample size fixed, since any change in sample size will change the shape of the sampling distribution and it will no longer be bell-shaped.

Select one of the following possible answers:

False
True

Question 9 of 50

As we increase the sample size, the sampling distribution squeezes in from both sides, giving a better estimate of the population statistic, since it generally lies somewhere in the middle of the sampling distribution.

Select one of the following possible answers:

False
True

Question 10 of 50

The confidence interval is a type of interval estimate from the ___________ distribution which gives a range of values in which the population statistic may lie.

Select one of the following possible answers:

Sampling
Population

Question 11 of 50

The margin of error is a statistic expressing the amount of random sampling error in the results of a survey.

Select one of the following possible answers:

True
False

Question 12 of 50

The margin of error is ________ the width of the confidence interval.

Select one of the following possible answers:

1/2
1/4th

Question 13 of 50

Which of the following points are true for confidence intervals?

Select one of the following possible answers:

Confidence intervals can be built with different degrees of confidence to suit a user's needs, such as 70%, 90%, etc.
The greater the sample size, the narrower the confidence interval.
There are different confidence intervals for different sample means. For example, a sample mean of 40 will have a different confidence interval from a sample mean of 45.
A 95% confidence interval does not mean that the probability of the population mean lying in a given interval is 95%. Instead, it means that 95% of the interval estimates will contain the population statistic.
All of the above.
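
For readers who want to see how a confidence interval and its margin of error relate, here is a minimal sketch. It assumes a known population standard deviation and a normal sampling distribution (a z-interval). The function name and the numbers are hypothetical.

```python
import math
from statistics import NormalDist


def z_confidence_interval(sample_mean, sigma, n, confidence=0.95):
    """Return (low, high, margin) for a z-based interval around sample_mean."""
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * sigma / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin, margin


low, high, margin = z_confidence_interval(40.0, 10.0, 100)
print(f"95% CI: ({low:.2f}, {high:.2f}), margin of error = {margin:.2f}")
```

The interval is centred on the sample mean, so a different sample mean shifts the whole interval, and a larger n shrinks the margin.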

Question 14 of 50

Hypothesis testing lets us identify a ________ statistic to be checked against a _________ statistic, or the statistic of another sample, to study any intervention, etc.

Select one of the following possible answers:

Sample, Population
Population, Sample

Question 15 of 50

The null hypothesis is a type of hypothesis in which we assume that the sample observations are not by chance; they are affected by some non-random situation. It is denoted by H1 or Ha.

Select one of the following possible answers:

True
False

Question 16 of 50

The alternate hypothesis is a type of hypothesis in which we assume that the sample observations are purely by chance. It is denoted by H0.

Select one of the following possible answers:

True
False

Question 17 of 50

Hypothesis testing is done at different levels of confidence and makes use of the z-score to calculate the probability.

Select one of the following possible answers:

False
True

Question 18 of 50

For a 95% confidence interval, anything above the z-threshold for 95% would reject the null hypothesis.

Select one of the following possible answers:

False
True

Question 19 of 50

Write down the steps of hypothesis testing.

Select one of the following possible answers:

Write your answer down.
Check it after the quiz.

Question 20 of 50

The significance level, also denoted as alpha or α, is the probability of rejecting the null hypothesis when it is ________.

Select one of the following possible answers:

True
False

Question 21 of 50

The p-value is the probability of obtaining results at least as extreme as the observed results of a statistical hypothesis test, assuming that the null hypothesis is correct.

Select one of the following possible answers:

True
False

Question 22 of 50

A low enough p-value is grounds for rejecting the null hypothesis. We reject the null hypothesis if the p-value is less than the significance level.

Select one of the following possible answers:

False
True
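
The comparison between a p-value and the significance level can be sketched as follows. This assumes a two-sided z-test whose statistic has already been computed. Both helper names are hypothetical.

```python
from statistics import NormalDist


def two_sided_p_value(z: float) -> float:
    """Probability of a z-score at least as extreme as z under the null."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def reject_null(p_value: float, alpha: float = 0.05) -> bool:
    """Decision rule: reject the null hypothesis when p_value < alpha."""
    return p_value < alpha


for z in (1.0, 1.96, 2.5):
    p = two_sided_p_value(z)
    print(f"z = {z:.2f}  p = {p:.4f}  reject at alpha=0.05: {reject_null(p)}")
```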

Question 23 of 50

Type-1 error: A Type-1 error is the case when we fail to reject the null hypothesis but it is actually false. The probability of a Type-1 error is called beta (β).

Select one of the following possible answers:

False
True

Question 24 of 50

Type-2 error: A Type-2 error is the case when we reject the null hypothesis but it was actually true. The probability of a Type-2 error is called the significance level, alpha (α).

Select one of the following possible answers:

True
False

Question 25 of 50

For Type-1 and Type-2 errors:

α = P(Null hypothesis rejected | Null hypothesis is true)

β = P(Null hypothesis accepted | Null hypothesis is false)

Select one of the following possible answers:

True
False

Question 26 of 50

The power of a test is defined as

Power = 1 - P(Type-2 error) = 1 - β

The smaller the Type-2 error, the greater the power of the hypothesis test.

Select one of the following possible answers:

True
False
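
To see how power, α and β fit together numerically, the sketch below computes the power of a one-sided (upper-tail) z-test. It assumes a known σ and a specific true mean `mu1`. The function name and all parameter values are hypothetical.

```python
import math
from statistics import NormalDist


def z_test_power(mu0, mu1, sigma, n, alpha=0.05):
    """Power of an upper-tail z-test of H0: mean = mu0 when the true mean is mu1."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1.0 - alpha)           # rejection threshold under H0
    shift = (mu1 - mu0) / (sigma / math.sqrt(n))       # true effect in standard errors
    beta = std_normal.cdf(z_crit - shift)              # P(fail to reject | H0 false)
    return 1.0 - beta


for n in (25, 100):
    print(f"n = {n:3d}  power = {z_test_power(100.0, 105.0, 15.0, n):.3f}")
```

When `mu1` equals `mu0` the function returns α, since rejecting is then exactly a Type-1 error.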

Question 27 of 50

For a Z-test:

1. A Z-test is mainly used when the data is normally distributed.
2. We find the Z-statistic of the sample means and calculate the z-score.
3. A Z-test is mainly used when the population mean and standard deviation are given.

Select one of the following possible answers:

True
False

Question 28 of 50

T-tests are similar to Z-tests, the only difference being that instead of the population standard deviation, we use the sample standard deviation.

Select one of the following possible answers:

True
False
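
A one-sample t statistic has the same form as a z-score, but with the sample standard deviation in the denominator. The sketch below uses a hypothetical function name and made-up data. It returns only the statistic and the degrees of freedom, leaving the p-value lookup to a t table or a statistics library.

```python
import math
import statistics


def one_sample_t(sample, mu0):
    """Return (t statistic, degrees of freedom) for H0: mean = mu0."""
    n = len(sample)
    s = statistics.stdev(sample)  # sample standard deviation (n - 1 denominator)
    t = (statistics.fmean(sample) - mu0) / (s / math.sqrt(n))
    return t, n - 1


t_stat, dof = one_sample_t([2, 4, 4, 4, 5, 5, 7, 9], mu0=4)
print(f"t = {t_stat:.3f} with {dof} degrees of freedom")
```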

Question 29 of 50

Z-tests are statistical calculations that can be used to compare population means to a sample's.

T-tests are calculations used to test a hypothesis, but they are most useful when we need to determine if there is a statistically significant difference between two independent sample groups.

Select one of the following possible answers:

True
False

Question 30 of 50

The degrees of freedom is the number of __________ that have the choice of having more than one arbitrary value.

Select one of the following possible answers:

Variable
Sample

Question 31 of 50

Select the true statement.

Select one of the following possible answers:

1. The greater the difference between the sample mean and the population mean, the greater the chance of rejecting the null hypothesis.
2. The greater the sample size, the greater the chance of rejecting the null hypothesis.
Both

Question 32 of 50

A one-sample t-test compares the mean of _________ data to a known value.

Select one of the following possible answers:

Sample
Population

Question 33 of 50

Which of the following points are true for a one-sample t-test?

Select one of the following possible answers:

Determine whether the mean of a group differs from the specified value.
Calculate a range of values that are likely to include the population mean.
We can run a one-sample T-test when we do not have the population S.D. or we have a sample of size less than 30.
All of them.

Question 34 of 50

We use a two-sample t-test when we want to evaluate whether the means of two independent samples are different or not.

Select one of the following possible answers:

False
True

Question 35 of 50

A two-sample t-test is used to:

Select one of the following possible answers:

Determine whether the means of two independent groups differ.
Calculate a range of values that is likely to include the difference between the population means.
Both
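
As an illustration of comparing two independent groups, the sketch below computes a two-sample t statistic. It uses Welch's form, which does not assume equal variances. This is a choice made for this example, not something the quiz specifies. The function name and the data are hypothetical, and the p-value step is omitted.

```python
import math
import statistics


def welch_t(a, b):
    """Welch's t statistic for the difference in means of two independent samples."""
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    # Standard error of the difference between the two sample means.
    se_diff = math.sqrt(var_a / len(a) + var_b / len(b))
    return (statistics.fmean(a) - statistics.fmean(b)) / se_diff


group_a = [5, 6, 7]
group_b = [1, 2, 3]
print(f"t = {welch_t(group_a, group_b):.3f}")
```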

Question 36 of 50

Points to be noted for a two-sample t-test are:

1. The groups to be tested should be ________.
2. The groups' distributions should not be highly ________.

Select one of the following possible answers:

Independent, Skewed
Dependent, Normal

Question 37 of 50

An independent-samples t-test compares the means for ____ different groups.
The samples are ____ of each other.

Select one of the following possible answers:

Two, Independent
Same, Dependent

Question 38 of 50

A paired-samples t-test compares means from the ____ group at different times.
The samples are ____ on each other.

Select one of the following possible answers:

Same, Dependent
Two, Independent

Question 39 of 50

ANOVA is used to determine whether there are any statistically significant differences between the means of ________ independent (unrelated) groups.

Select one of the following possible answers:

One
Two
Three or more
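
The F statistic behind a one-way ANOVA can be computed by hand as the ratio of between-group to within-group mean squares. The sketch below uses a hypothetical function name and toy data. It stops at the F value and does not look up the p-value.

```python
import statistics


def one_way_anova_f(groups):
    """F statistic: mean square between groups / mean square within groups."""
    all_values = [x for g in groups for x in g]
    grand_mean = statistics.fmean(all_values)

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (statistics.fmean(g) - grand_mean) ** 2 for g in groups)
    # Variation of each observation around its own group mean.
    ss_within = sum((x - statistics.fmean(g)) ** 2 for g in groups for x in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


f_value = one_way_anova_f([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
print(f"F = {f_value:.2f}")
```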

Question 40 of 50

A one-way ANOVA has ____ independent variable, while a two-way ANOVA has ____.

Select one of the following possible answers:

One, Two
Two, One

Question 41 of 50

Write down the steps to perform ANOVA.

Select one of the following possible answers:

Write down the answers
Check them later

Question 42 of 50

Practical applications of ANOVA in modeling are:

Select one of the following possible answers:

Identifying whether a categorical variable is relevant to a continuous variable.
Identifying whether a treatment was effective to the model or not.
Both.

Question 43 of 50

The Chi-Square Test determines whether there is an association between _______ variables (i.e., whether the variables are independent or related).

Select one of the following possible answers:

Categorical
Continuous
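
The chi-square statistic for a test of association compares observed counts in a contingency table with the counts expected if the two variables were unrelated. A minimal sketch, with a hypothetical function name and made-up tables:

```python
def chi_square_statistic(table):
    """Return (chi-square statistic, degrees of freedom) for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected

    dof = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, dof


chi2, dof = chi_square_statistic([[10, 20], [20, 10]])
print(f"chi-square = {chi2:.3f}, degrees of freedom = {dof}")
```

A table whose rows are proportional to each other, such as [[10, 20], [20, 40]], gives a statistic of zero because the observed counts match the expected ones.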

Question 44 of 50

Goodness of fit: It compares two categorical variables to find whether they are related to each other or not.

Select one of the following possible answers:

True
False

Question 45 of 50

Test of independence: It determines whether sample data of categorical variables match the population or not.

Select one of the following possible answers:

True
False

Question 46 of 50

Regression analysis is a form of predictive modelling technique which investigates the relationship between a ________ (target) variable and ________ variable(s) (predictors).

Select one of the following possible answers:

Dependent, Independent
Independent, Dependent

Question 47 of 50

The regression sum of squares describes how well a regression model represents the modeled data.
A higher regression sum of squares indicates that the model does not fit the data well.

Select one of the following possible answers:

True
False

Question 48 of 50

The residual sum of squares (RSS) is a statistical measure of the amount of _________ in a data set that is not explained by a regression model.

Select one of the following possible answers:

Mean
Variance

Question 49 of 50

Coefficient of Determination (R-Square): It represents the strength of correlation between two variables.

Select one of the following possible answers:

True
False
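
The sums of squares from the last few questions can be tied together in one small example. The sketch below fits a simple least-squares line to made-up data, then computes the residual sum of squares, the regression sum of squares and R-Square as 1 - RSS/TSS. All function and variable names are hypothetical.

```python
import statistics


def simple_linear_fit(x, y):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def sums_of_squares(x, y):
    """Return (residual SS, regression SS, total SS) for the fitted line."""
    slope, intercept = simple_linear_fit(x, y)
    mean_y = statistics.fmean(y)
    predicted = [intercept + slope * xi for xi in x]
    rss = sum((yi - pi) ** 2 for yi, pi in zip(y, predicted))  # unexplained
    ssr = sum((pi - mean_y) ** 2 for pi in predicted)          # explained
    tss = sum((yi - mean_y) ** 2 for yi in y)                  # total
    return rss, ssr, tss


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
rss, ssr, tss = sums_of_squares(x, y)
r_squared = 1.0 - rss / tss
print(f"RSS = {rss:.3f}, SSR = {ssr:.3f}, TSS = {tss:.3f}, R^2 = {r_squared:.4f}")
```

For a least-squares line with an intercept, the explained and unexplained parts add up to the total, so R-Square is the proportion of the variance in y that the model accounts for.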

Question 50 of 50

Correlation coefficients are used to measure how strong a relationship is between two variables.

Select one of the following possible answers:

True
False

Created by Vishakha Achmare almost 4 years ago

A basic quiz on Inferential Statistics.
