3rd Year Stats Exam

Description

University Statistics quiz on 3rd Year Stats Exam, created by Jessica Whittick on 04/01/2017.
Quiz by Jessica Whittick, updated more than 1 year ago

Resource summary

Question 1

Question
What is the purpose of performing a linear regression analysis?
Answers
  • To identify potential outliers in the data
  • To fit the data to a model that defines y as a function of 2 or more variables
  • To determine the dependence of a dependent variable on a predictor/independent variable
  • To perform multiple comparisons whilst controlling overall type 1 error rate
  • To derive robust confidence intervals

Question 2

Question
Which axis does the dependent variable go on?
Answers
  • y
  • x

Question 3

Question
What does the mean of the x and y values give you in a linear regression analysis?
Answers
  • The size of the force which the points exert on the line of best fit
  • The leverage of those data points
  • The fit and slope of the model
  • The centre of gravity and pivot point of the data

Question 4

Question
What does the R-squared value represent?
Answers
  • How well the model fits the data (0 - 1)
  • The slope coefficient
  • The distribution of the residuals
  • The level of multicollinearity in the model

Question 5

Question
What does an R-squared value of 0.068 and a slope coefficient (b1) value of 0.12 mean?
Answers
  • The model can explain 68% of the data and for every unit of independent variable, the dependent variable goes up 12 units
  • The fit of the model to the data is 0.12% and the influence that the data points have on the model is 0.68%
  • The data points have an influence of 68% on the model and 12% on the outcome
  • The model can explain 6.8% of the data and for every unit of independent variable, the dependent variable goes up 0.12 units

Question 6

Question
In order to identify potential outliers:
Answers
  • Standardised residual >2 is worth checking; if more than 5% of the residuals are >2, this may indicate that the model is a poor fit
  • Standardised residual >3 is worth checking; if more than 5% of the residuals are >2, this may indicate that the model is a poor fit
  • Standardised residual >2.5 is worth checking; if more than 5% of the residuals are >2, this may indicate that the model is a poor fit
  • Standardised residual >3 is worth checking; if more than 10% of the residuals are >2, this may indicate that the model is a poor fit

Question 7

Question
What does Cook's distance tell us when performing model diagnostics to see if the regression model is stable or biased by a few cases?
Answers
  • influence of data point on predicted values (0 = no influence, 1 = complete influence)
  • standardised measures of how much each element of the model would change if data point was removed (values >1 = substantial influence)
  • how susceptible the mean is to being biased by the outliers present in the data
  • measure of overall influence of each individual data point on the overall model (>1 = concern)

Question 8

Question
What does the leverage value tell us when performing model diagnostics to see if the regression model is stable or biased by a few cases?
Answers
  • influence of data point on predicted values (0 = no influence, 1 = complete influence)
  • standardised measures of how much each element of the model would change if data point was removed (values > 1 = substantial influence)
  • measure of overall influence of each individual data point on the overall model (> 1 = concern)
  • precisely how large the standardised residuals are

Question 9

Question
With regard to model diagnostics, what do the DFFit and DFBeta values tell us about the data model?
Answers
  • they are standardised measures of how much each element of the model would change if that data point was removed (values > 1 = substantial influence)
  • they indicate the influence of that data point on predicted values (0 = no influence, 1 = complete influence)
  • whether or not the standardised residuals are worth checking and if they indicate that the model is a poor fit
  • they summarise the equation: 2(k+1)/n where k = number of predictors and n = number of data points

Question 10

Question
With regard to the model diagnostic called the leverage value, what defines whether or not the data point is worth investigating?
Answers
  • if >2(k+1)/n where k = number of predictors (2 for simple linear regression) and n = number of data points
  • if >2(k+1)/n where k = number of predictors (1 for simple linear regression) and n = number of data points
  • if >2(k+1)/n where k = number of data points and n = number of predictors (1 for simple linear regression)
  • if >n(k+1)/2 where k = number of predictors (1 for simple linear regression) and n = number of data points
  • if >2(n+1)/k where k = number of predictors (1 for simple linear regression) and n = number of data points
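
The 2(k+1)/n rule of thumb from the answer options can be tried out on a small data set. The following is a minimal sketch, assuming simple linear regression (k = 1) and the standard leverage formula h_i = 1/n + (x_i - mean)^2 / Sxx; the data and function names are hypothetical.

```python
def leverage_values(x):
    """Leverage of each point in a simple linear regression on x."""
    n = len(x)
    mean_x = sum(x) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return [1 / n + (xi - mean_x) ** 2 / sxx for xi in x]


def leverage_cutoff(k, n):
    """Rule-of-thumb threshold 2(k+1)/n; k = predictors, n = data points."""
    return 2 * (k + 1) / n


x = [1, 2, 3, 4, 5, 20]  # hypothetical predictor values, one far from the rest
h = leverage_values(x)
cutoff = leverage_cutoff(k=1, n=len(x))
# Points whose leverage exceeds the cutoff are worth investigating.
flagged = [xi for xi, hi in zip(x, h) if hi > cutoff]
print(flagged)
```

Only the point far from the mean of x exceeds the cutoff here; the leverages always sum to k + 1.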

Question 11

Question
What does multiple linear regression do?
Answers
  • fits the data to a model that defines y as a function of 2 or more variables - determines the effect of an independent variable on the dependent variable taking account of other variables
  • provides an analysis of variance and determines if an interaction is present in the data
  • determines the dependence of a dependent variable on a predictor/independent variable and allows outliers to be identified from x, y plot or from standardised residual plot

Question 12

Question
With regard to multiple linear regression, what is the correct form of the equation for the model which is fitted? (all of the numbers are technically subscripts)
Answers
  • y = b0 + b1x2 + b2x1
  • y = b0 + b1x1 + b2x2 +....
  • y = b0 + b1 + b2x
  • y = b0 + b1x1 + b2x2

Question 13

Question
What does the F-ratio represent?
Answers
  • the average variability due to the model divided by the average variability due to the residuals
  • the unexplained variability divided by the variability due to the model
  • the signal to noise ratio multiplied by the number of data points
  • the variance in the model divided by the R-squared value

Question 14

Question
With regard to multiple linear regression, whenever you fit a predictor variable, that takes up...
Answers
  • one slope parameter
  • two degrees of freedom
  • one degree of freedom
  • one R-squared value

Question 15

Question
As collinearity increases, what effect does this have?
Answers
  • standard errors of b coefficients decrease therefore confidence increases
  • limits F-ratio value and variance inflation factor
  • coefficients become stable
  • standard errors of b coefficients increase and therefore confidence decreases

Question 16

Question
How do you interpret the variance inflation factor (VIF) when assessing multicollinearity?
Answers
  • A VIF > 5 or an average VIF > 2 is problematic
  • A VIF > 10 or an average VIF > 1 is problematic
  • A VIF > 2 or an average VIF > 1 is problematic
  • A VIF > 10 or an average VIF > 2 is problematic

Question 17

Question
How do you interpret the tolerance factor when assessing multicollinearity?
Answers
  • < 5 is problematic
  • < 10 is problematic
  • < 2 is problematic
  • < 0.1 is problematic
  • < 1 is problematic
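
To make the two diagnostics above concrete, here is a minimal sketch for the special case of exactly two predictors, where the R-squared from regressing one predictor on the other equals their squared Pearson correlation. It assumes the usual definitions tolerance = 1 - R^2 and VIF = 1 / tolerance; the data and function name are hypothetical.

```python
def vif_and_tolerance(x1, x2):
    """VIF and tolerance for a model with exactly two predictors."""
    n = len(x1)
    m1, m2 = sum(x1) / n, sum(x2) / n
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    r_squared = s12 ** 2 / (s11 * s22)  # squared correlation of the predictors
    tolerance = 1 - r_squared
    return 1 / tolerance, tolerance


x1 = [1, 2, 3, 4, 5]  # hypothetical predictor 1
x2 = [2, 4, 5, 4, 5]  # hypothetical predictor 2
vif, tolerance = vif_and_tolerance(x1, x2)
print(vif, tolerance)
```

The more strongly the predictors are correlated, the smaller the tolerance and the larger the VIF.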

Question 18

Question
When does multicollinearity truly pose a problem?
Answers
  • when predicting y using the multiple regression equation
  • when you want to look inside the model at the effect of individual predictors
  • when you want to perform separate correlations for each x variable
  • when you want to quantify the relationship between an independent and dependent variable

Question 19

Question
How do you help solve the problem of multicollinearity?
Answers
  • always take a collinear variable out
  • combine predictors into a single predictor (as long as it makes biological sense)
  • rely on automatic variable selection
  • remove all outliers

Question 20

Question
With regard to hierarchical multiple regression, what value do you use when comparing a new model to the previous model?
Answers
  • F-ratio
  • Cook's distance
  • R-squared
  • variance inflation factor
  • tolerance factor

Question 21

Question
For multiple linear regression assumptions, what must the variables be?
Answers
  • dependent variables = quantitative or categorical; predictor variable = qualitative and continuous
  • dependent variables = qualitative; predictor variable = continuous
  • dependent variables = qualitative and continuous; predictor variable = quantitative or categorical
  • dependent variables = continuous or categorical; predictor variable = quantitative or categorical

Question 22

Question
When considering multiple linear regression assumptions, how do you assess the independence of the residuals?
Answers
  • assess the DFFit and DFBeta values
  • use Welch's test
  • use Gabriel's test
  • use the Durbin-Watson test
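
For reference, the Durbin-Watson statistic named in the options is the sum of squared differences between successive residuals divided by the sum of squared residuals. It ranges from 0 to 4, with values near 2 suggesting no first-order autocorrelation. A minimal sketch with hypothetical residuals:

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for residuals taken in observation order."""
    numerator = sum(
        (residuals[i] - residuals[i - 1]) ** 2 for i in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


# Hypothetical residuals: neighbours alike (positive autocorrelation)
dw_trending = durbin_watson([1, 1, -1, -1])
# Hypothetical residuals: neighbours flip sign (negative autocorrelation)
dw_alternating = durbin_watson([1, -1, 1, -1])
print(dw_trending, dw_alternating)
```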

Question 23

Question
For multiple linear regression, how large should the sample size be?
Answers
  • 10 times the number of predictors tested
  • 5 times the number of predictors tested
  • at least 30
  • at least 40

Question 24

Question
What would an interaction among predictors look like in the form of an equation?
Answers
  • effect of height + effect of weight = overall effect on SBP
  • effect of height + overall effect on SBP = effect of weight
  • effect of height x effect of weight = overall effect on SBP

Question 25

Question
What is simple linear regression equal to?
Answers
  • paired t-test
  • unpaired t-test
  • unpaired, two-tailed t-test
  • paired, one-tailed t-test

Question 26

Question
A one-way ANOVA is the same as what?
Answers
  • unpaired t-test
  • paired t-test
  • simple linear regression
  • multiple regression

Question 27

Question
What does a one-way ANOVA do?
Answers
  • analyses how much of the overall variance can be explained by variation between group means compared to the unexplained variation within a group
  • fits data to a model that defines y as a function of 2 or more variables
  • performs separate correlations for each x variable
  • determines the dependence of a dependent variable on a predictor/independent variable

Question 28

Question
What does the total variability equal?
Answers
  • total squares divided by the degrees of freedom
  • the F-ratio
  • the difference between each individual data point and the overall mean
  • error mean squares divided by degrees of freedom

Question 29

Question
The F-ratio is:
Answers
  • higher the larger the difference of the group means from the overall mean and smaller the amount of random variability
  • lower the larger the difference of the group means from the overall mean and smaller the amount of random variability
  • higher the larger the difference of the group means from the overall mean and larger the amount of random variability
  • higher the smaller the difference of the group means from the overall mean and smaller the amount of random variability
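
How the F-ratio responds to the spread of the group means can be checked numerically. The sketch below assumes the standard one-way ANOVA definition, F = (between-group mean square) / (within-group mean square); the groups are hypothetical.

```python
def one_way_anova_f(groups):
    """F-ratio for a one-way ANOVA over a list of groups of observations."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]
    # Variability explained by differences between group means
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Unexplained (random) variability within each group
    ss_within = sum(
        (v - m) ** 2 for g, m in zip(groups, group_means) for v in g
    )
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


f_near = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
# Same within-group spread, but the third group mean is further away
f_far = one_way_anova_f([[1, 2, 3], [2, 3, 4], [8, 9, 10]])
print(f_near, f_far)
```

With the within-group variability held fixed, moving one group mean further from the overall mean raises F.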

Question 30

Question
When is the ANOVA most robust to deviations from normality and equality of variance?
Answers
  • when effect size is large
  • when the F-ratio is high
  • when the degrees of freedom are greater than 10
  • when the group sizes are equal

Question 31

Question
If group sizes are unequal and equality of variance is not met, which correction do you use?
Answers
  • Games-Howell's
  • Durbin-Watson's
  • Gabriel's
  • Tukey's
  • Welch's

Question 32

Question
What are post-hoc tests used for?
Answers
  • performing multiple comparisons whilst controlling overall type 2 error rate
  • performing multiple comparisons whilst controlling overall type 1 error rate
  • when there is a specific hypothesis to be tested
  • when group size is not equal

Question 33

Question
You use Tukey's test when which of the following is true? (multiple correct answers)
Answers
  • sample sizes are unequal
  • sample sizes are equal
  • you require good trade-off between type 1 and type 2 errors
  • you are interested in comparing all groups vs a single control group
  • you wish to cut down on the number of comparisons that you make

Question 34

Question
When would you use Bonferroni as a post-hoc test? (multiple correct answers)
Answers
  • when you don't need a high level of confidence
  • when you aren't performing multiple comparisons
  • when you require a conservative test
  • when you need a high level of confidence
  • when sample sizes are equal

Question 35

Question
When would you use Dunnett's as a post-hoc test? (multiple correct answers)
Answers
  • when interested in comparing all groups versus a single control group
  • when sample sizes are equal
  • when you want to cut down comparisons
  • when you want a good trade-off between type 1 and type 2 errors

Question 36

Question
Separate, unpaired t-tests to do comparisons will increase your chance of getting what?
Answers
  • a false negative
  • a type 2 error
  • a false positive
  • biased data

Question 37

Question
If you have one sample with an n of 10 and another sample with an n of 12, which post-hoc test should you use?
Answers
  • Gabriel's
  • Hochberg's GT2
  • Games-Howell
  • Tukey

Question 38

Question
If there is any doubt about equality of variance, which post-hoc test should you use?
Answers
  • Gabriel's
  • Hochberg's GT2
  • Games-Howell
  • Sidak

Question 39

Question
Complete this statement relating to planned contrasts: Always _________________________ than number of groups
Answers
  • ten times the number of contrasts
  • more contrasts
  • one more contrast
  • two times the number of contrasts
  • one fewer contrast

Question 40

Question
When doing orthogonal contrasts, the contrasts are independent so you can... (multiple correct answers)
Answers
  • enter weights for most of the variables
  • trust p-values as you aren't inflating the type 1 error rate
  • ignore the F-ratio value and R-squared value
  • not worry about doing any corrections for multiple comparisons

Question 41

Question
"tests for trends in the data, which cannot be obtained directly using post-hoc tests, when there is a logical order to the groups entered" To what is this statement referring?
Answers
  • Planned contrasts
  • Orthogonal contrasts
  • Polynomial contrasts

Question 42

Question
What does an independent factorial ANOVA do?
Answers
  • each level of one factor is tested against at least one level of the other
  • performs multiple comparisons whilst controlling overall type 1 error rate
  • divides total variability in the data set into different sources
  • each level of one factor is tested at each level of the other

Question 43

Question
Sidak is the best correction for what?
Answers
  • independent ANOVA
  • one-way ANOVA
  • repeated measures ANOVA
  • multiple linear regression

Question 44

Question
Standard contrasts and post-hoc tests are only available to examine main effects and are therefore most useful when:
Answers
  • there is no unnecessary variability in the data
  • there is no interaction
  • the sphericity assumption is met
  • group sizes are equal

Question 45

Question
A p-value of less than 0.05 means that...
Answers
  • there is a less than 0.5% chance of committing a type 1 error
  • there is a less than 5% chance of committing a type 2 error
  • there is a less than 5% chance of committing a type 1 error
  • there is a less than 0.5% chance of committing a type 2 error

Question 46

Question
Standard error of the proportion equals...

Question 47

Question
Power =
Answers
  • 1 - type 1 error rate
  • 1 - type 2 error rate
  • 1 - (type 1 + type 2 error rate)
  • type 2 error rate - type 1 error rate

Question 48

Question
Power can be increased by...
Answers
  • increasing effect size, decreasing random variation, decreasing sample size
  • increasing effect size, increasing random variation, increasing sample size
  • decreasing effect size, increasing random variation, increasing sample size
  • increasing effect size, decreasing random variation, increasing sample size
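
The direction of each effect on power can be explored with a simplified model. This is a rough sketch, assuming a one-sided, one-sample z-test with known standard deviation, where the standardised effect size is the mean difference divided by the standard deviation (so less random variation means a larger effect size). The function name and numbers are hypothetical.

```python
import math
from statistics import NormalDist


def z_test_power(effect_size, n, alpha=0.05):
    """Approximate power of a one-sided, one-sample z-test.

    effect_size is the true mean difference divided by the standard deviation.
    """
    z_crit = NormalDist().inv_cdf(1 - alpha)  # critical value for the test
    return NormalDist().cdf(effect_size * math.sqrt(n) - z_crit)


power_small_n = z_test_power(0.5, n=10)
power_large_n = z_test_power(0.5, n=40)         # larger sample
power_large_effect = z_test_power(0.8, n=10)    # larger standardised effect
power_no_effect = z_test_power(0.0, n=10)       # equals alpha when no effect
print(power_small_n, power_large_n, power_large_effect)
```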

Question 49

Question
This gives you a standardised effect size for a difference between means. What is it called?
Answers
  • Welch's correction
  • Cohen's d
  • Games-Howell test
  • Sidak correction
  • standard error of the proportion
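
As an illustration of a standardised effect size for a difference between means, the following sketch computes Cohen's d, assuming the common pooled-standard-deviation form for two independent samples. The samples are hypothetical.

```python
import math
import statistics


def cohens_d(a, b):
    """Cohen's d for two independent samples, using the pooled SD."""
    na, nb = len(a), len(b)
    var_a = statistics.variance(a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(b)
    pooled_sd = math.sqrt(((na - 1) * var_a + (nb - 1) * var_b) / (na + nb - 2))
    return (statistics.mean(a) - statistics.mean(b)) / pooled_sd


d = cohens_d([4, 5, 6], [2, 3, 4])  # hypothetical samples
print(d)
```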

Question 50

Question
How do you calculate expected frequency?
Answers
  • (row total + column total)/overall total
  • (row total - column total)/overall total
  • (row total x column total)/overall total

Question 51

Question
How do you calculate degrees of freedom from a contingency table?
Answers
  • df = (rows - 1) x (columns - 1)
  • df = (rows + 1) x (columns + 1)
  • df = (rows - 1) / (columns - 1)
  • df = (rows x 2) + (columns x 2)
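
The standard contingency-table formulas, expected frequency = (row total x column total) / overall total and df = (rows - 1) x (columns - 1), can be sketched as follows. The observed counts are hypothetical.

```python
def expected_frequencies(table):
    """Expected count for each cell under independence of rows and columns."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    return [[r * c / total for c in col_totals] for r in row_totals]


def contingency_df(table):
    """Degrees of freedom: (rows - 1) x (columns - 1)."""
    return (len(table) - 1) * (len(table[0]) - 1)


observed = [[10, 20], [30, 40]]  # hypothetical 2 x 2 table of counts
expected = expected_frequencies(observed)
df = contingency_df(observed)
print(expected, df)
```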

Question 52

Question
With regard to categorical data, what must be satisfied in order for the analysis to be reliable?
Answers
  • The assumption that at least 50% of expected frequencies must be more than or equal to 5
  • Dunnett's test
  • The assumption that at least 80% of expected frequency must be more than or equal to 5
  • The same assumptions as multiple linear regression

Question 53

Question
Which graph indicates an interaction?

Question 54

Question
What does simple effects analysis do? (multiple correct answers)
Answers
  • probes where a certain effect is happening
  • performs an ANOVA to allow you to reject/accept a null hypothesis
  • analyses the differences between levels of one variable
  • performs multiple comparisons whilst controlling overall type 2 error rate

Question 55

Question
Which multiple comparison correction should you choose after simple effects analysis in order to control the overall type 1 error rate?
Answers
  • Bonferroni
  • LSD
  • Sidak

Question 56

Question
Repeated measures ANOVA requires the data to have/be:
Answers
  • independent
  • not independent
  • naturally paired
  • sorted into even group sizes

Question 57

Question
What is the definition of sphericity?
Answers
  • “noise” in the relationship between the independent variables and the dependent variable is the same across all values of the independent variables
  • equality of differences between linked values in each group
  • well-modeled by a normal distribution and likely for a random variable underlying the data set to be normally distributed
  • residuals are (roughly) normal and (approximately) independently distributed with a mean of 0 and some constant variance

Question 58

Question
Which test provides a fix for sphericity?
Answers
  • Welch's correction
  • Games-Howell test
  • Gabriel's test
  • Mauchly's test

Question 59

Question
How can we adjust the degrees of freedom and change the significance level associated with the F-statistic?
Answers
  • Mauchly's test
  • Welch's correction
  • Greenhouse-Geisser correction
  • Gabriel's test

Question 60

Question
Which post-hoc test is most robust and most conservative for a repeated measures ANOVA?
Answers
  • Sidak
  • Tukey
  • Dunnett's
  • Bonferroni

Question 61

Question
If parametric assumptions are in doubt, we must use the non-parametric equivalent of a single-factor repeated measures ANOVA, which is:
Answers
  • Durbin-Watson test
  • Friedman test
  • Gabriel's test
  • Hochberg's GT2 test

Question 62

Question
For which analysis do BOTH the equality of variance assumption and the sphericity assumption apply?
Answers
  • Non-linear regression
  • Two-way ANOVA
  • Independent ANOVA
  • Mixed model ANOVA
  • Repeated measures ANOVA

Question 63

Question
In terms of polynomial regression, what happens if you add further terms to the polynomial? (multiple correct answers)
Answers
  • the fit will automatically improve
  • there is a risk of over-fitting the model
  • the significance level associated with the F-statistic changes
  • the R-squared value will increase

Question 64

Question
In terms of nonlinear regression, why would you want to try multiple starting parameters?
Answers
  • to ensure that the interaction between the variables is taken into account
  • to ensure that the computer has found the global minimum
  • to ensure that the computer has found the local minimum
  • to ensure that an accurate scientific relationship is found

Question 65

Question
How would you calculate the Sum of Squares (SS)?
Answers
  • add all the standard deviations together and square that value
  • square the mean from each sample and add those together
  • square each standard deviation and add them all together
  • square each standard deviation and add this to the variance

Question 66

Question
Variance is calculated by doing what?
Answers
  • dividing the standard deviations by the degrees of freedom
  • dividing the sum of squares by the degrees of freedom
  • multiplying the degrees of freedom by the mean
  • multiplying the standard deviations by the sum of squares
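
The relationship between sum of squares, degrees of freedom and variance can be sketched as below. This assumes the usual textbook definition of the sum of squares as the sum of squared deviations from the mean, and a sample variance with n - 1 degrees of freedom; the quiz itself does not state which definition it has in mind. The data are hypothetical.

```python
def sum_of_squares(values):
    """Sum of squared deviations of each value from the mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def sample_variance(values):
    """Sum of squares divided by the degrees of freedom (n - 1)."""
    return sum_of_squares(values) / (len(values) - 1)


data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical sample
ss = sum_of_squares(data)
variance = sample_variance(data)
print(ss, variance)
```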

Question 67

Question
How do we define the normal distribution curve?
Answers
  • the population mean is the height and the sum of squares is the distance from the midline of the curve to the edge
  • the variance is the height and the population mean is the distance from the midline of the curve to the edge
  • the population standard deviation is the height and the population mean is the distance from the midline of the curve to the edge
  • the population mean is the height and the population standard deviation is the distance from the midline of the curve to the edge

Question 68

Question
How do you calculate a z-score?
Answers
  • (x - mean)/sd
  • (x - sd)/mean
  • (mean - x)/sd
  • (x + mean)/sd
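
A z-score expresses how many standard deviations a value lies from the mean. A minimal sketch of the standard formula (x - mean)/sd, with hypothetical numbers:

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above the mean."""
    return (x - mean) / sd


z = z_score(130, mean=100, sd=15)  # hypothetical value, mean and sd
print(z)
```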

Question 69

Question
Choose all of the correct statements
Answers
  • approximately 99% of normally-distributed values lie within ±2 sds of the mean
  • approximately 95% of normally-distributed values lie within ±2 sds of the mean
  • approximately 99.9% of normally-distributed values lie within ±2.6 sds of the mean
  • approximately 99% of normally-distributed values lie within ±2.6 sds of the mean
  • approximately 99.9% of normally-distributed values lie within ±3 sds of the mean
  • approximately 95% of normally-distributed values lie within ±3 sds of the mean

Question 70

Question
How do you calculate SEM and, therefore, confidence intervals?
Answers
  • SEM = sd x square root of n and therefore a 95% CI would be ±1.96 x SEM
  • SEM = sd/square root of n and therefore a 95% CI would be ±3 x SEM
  • SEM = sd x square root of n and therefore a 95% CI would be ±2.6 x SEM
  • SEM = sd/square root of n and therefore a 95% CI would be ±1.96 x SEM
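
The SEM and confidence-interval calculation can be sketched as follows, assuming SEM = sd / sqrt(n) and the normal-approximation multiplier of 1.96 for a 95% interval (for small samples a t-distribution multiplier would normally be used instead). The data are hypothetical.

```python
import math
import statistics


def standard_error_of_mean(values):
    """Sample standard deviation divided by the square root of n."""
    return statistics.stdev(values) / math.sqrt(len(values))


def ci95(values):
    """Approximate 95% confidence interval for the mean: mean +/- 1.96 x SEM."""
    mean = statistics.mean(values)
    half_width = 1.96 * standard_error_of_mean(values)
    return mean - half_width, mean + half_width


data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical sample
sem = standard_error_of_mean(data)
low, high = ci95(data)
print(sem, low, high)
```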

Question 71

Question
Which statement is true?
Answers
  • P < 0.05 means that 5% of the results arose by chance if the null hypothesis is true
  • P < 0.05 means <5% probability of the results arising by chance if the null hypothesis is true
  • P < 0.05 means <0.05% probability of the results arising by chance if the null hypothesis is true
  • P < 0.05 means that <0.5% probability of the results arising by chance if the null hypothesis is true

Question 72

Question
Choose the correct statements
Answers
  • type 1 error rate is conventionally set to 5% (P < 0.05)
  • type 2 error rate is conventionally set to 5% (P < 0.05)
  • type 1 error rate = 1 - power
  • type 2 error rate = 1 - power
  • if you accept a statistical power of 80% it will mean that you have a type 2 error rate of 20%
  • if you accept a statistical power of 80% it will mean that you have a type 1 error rate of 20%

Question 73

Question
What happens if you design an experiment with 3 groups and are tempted to test for differences between the means using 3 separate t-tests? (multiple correct answers)
Answers
  • you will increase the chance of making a type 2 error
  • you will increase the chance of making a type 1 error
  • you will inflate your p-value
  • you will decrease your p-value
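
To see how running several tests affects the overall error rate, the sketch below computes the familywise type 1 error rate, 1 - (1 - alpha)^m. It assumes the m tests are independent and each uses alpha = 0.05, which is a simplification for pairwise t-tests on shared groups.

```python
def familywise_error_rate(alpha, n_tests):
    """Chance of at least one false positive across independent tests."""
    return 1 - (1 - alpha) ** n_tests


fwer_one = familywise_error_rate(0.05, 1)
fwer_three = familywise_error_rate(0.05, 3)  # three separate t-tests
print(fwer_one, fwer_three)
```

Under these assumptions, three tests at 0.05 each give roughly a 14% chance of at least one false positive.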

Question 74

Question
Which statements are correct regarding the Pearson correlation coefficient?
Answers
  • ±0.5 is a large effect
  • ±0.1 is a small effect
  • ±1 is a small effect
  • it measures how close the data points are to a straight line that best describes the linear relationship
  • r = +0.1 refers to a perfect straight line with a positive slope
  • r = -1 refers to a perfect straight line with a negative slope

Question 75

Question
How is the line of best fit created in simple linear regression?
Answers
  • by minimising the total sum of squares
  • by minimising the sum of squares of the residuals
  • by creating an equation which fits the model best
  • by entering the data into the computer in hierarchical form

Question 76

Question
How do you calculate R-squared?
Answers
  • 1 - (SS of the residuals/total SS)
  • 1 + (SS of the residuals/total SS)
  • 1 - (total SS/SS of the residuals)
  • 1 + (total SS/SS of the residuals)
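
The least-squares fit and the R-squared calculation can be sketched together. This assumes ordinary least squares for one predictor and R-squared = 1 - (SS of the residuals / total SS); the data and function names are hypothetical.

```python
def fit_line(x, y):
    """Least-squares intercept b0 and slope b1 for y = b0 + b1 * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x  # the line passes through (mean_x, mean_y)
    return b0, b1


def r_squared(x, y, b0, b1):
    """1 - (residual sum of squares / total sum of squares)."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


x = [1, 2, 3, 4, 5]  # hypothetical predictor values
y = [2, 4, 5, 4, 5]  # hypothetical outcome values
b0, b1 = fit_line(x, y)
r2 = r_squared(x, y, b0, b1)
print(b0, b1, r2)
```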

Question 77

Question
The Shapiro-Wilk test is used to...
Answers
  • check for sphericity
  • correct degrees of freedom
  • ascertain that residuals are random and normally distributed
  • minimise the sum of squares of the residuals

Question 78

Question
When entering more than 2 categories as dummy variables... (multiple correct answers)
Answers
  • the thing that you're comparing the baseline to gets a 1
  • the thing that you're comparing the baseline to gets a 0.1
  • 1 fewer dummy variables than number of categories
  • baseline condition gets a value of 0
  • baseline condition gets a value of 1.5
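
Dummy coding can be illustrated with a small sketch. It assumes the usual 0/1 scheme in which the baseline category is coded 0 on every dummy variable, giving one fewer dummy variable than there are categories. The category labels and function name are hypothetical.

```python
def dummy_code(values, baseline):
    """Return the non-baseline levels and one 0/1 row per observation."""
    levels = [lvl for lvl in sorted(set(values)) if lvl != baseline]
    rows = [[1 if v == lvl else 0 for lvl in levels] for v in values]
    return levels, rows


groups = ["control", "drug_a", "drug_b", "control"]  # hypothetical categories
levels, rows = dummy_code(groups, baseline="control")
print(levels)
print(rows)
```

Three categories produce two dummy columns, and the baseline rows contain only zeros.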

Question 79

Question
Bonferroni test on its own: the p-values need to be less than what to claim significance?
Answers
  • 0.05/number of categories
  • 0.05/n
  • 0.05/number of comparisons
  • 0.05/variance
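
The Bonferroni adjustment divides the significance level by the number of comparisons made. A minimal sketch with hypothetical p-values:

```python
def bonferroni_threshold(alpha, n_comparisons):
    """Per-comparison significance threshold under Bonferroni."""
    return alpha / n_comparisons


p_values = [0.001, 0.02, 0.04]  # hypothetical p-values from 3 comparisons
threshold = bonferroni_threshold(0.05, len(p_values))
significant = [p < threshold for p in p_values]
print(threshold, significant)
```

Here only the first comparison remains significant, although all three p-values are below the unadjusted 0.05.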

Question 80

Question
Normally distributed variables X and Y are significantly correlated with a p-value of 0.006 and a Pearson's correlation coefficient of 0.468. Approximately how much of the variability in X and Y can be explained by this correlation?
Answers
  • 32%
  • 47%
  • 22%
  • 13%
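
The arithmetic behind this question relies on the coefficient of determination being the square of the Pearson correlation coefficient. A short sketch using the value of r given in the question:

```python
r = 0.468  # Pearson correlation coefficient from the question
shared_variance = r ** 2  # proportion of variability explained
percent_explained = round(shared_variance * 100)
print(shared_variance, percent_explained)
```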