Between Subjects:
- different participants in each condition
- looks at the differences between groups
Within Subjects:
- same participants in each condition
- differences between the treatments
The dependent variable is measured in exactly the same way for each design
Problems for between subjects designs
Note:
Participant variables
Large group of participants required - impractical
Biases lead to false conclusions
- assignment, observer-expectancy, subject-expectancy
It is possible to assess the baseline measure
Problems for within subjects designs
Note:
Practice effects
- lack of naivety
- the more you do the task, the better you get
Longer testing sessions when there are many conditions.
Factorial Designs
Note:
one dependent variable
two or more independent variables.
Used when we suspect that more than one IV is contributing to a DV.
Allow exploration of complicated relationships between IVs and a DV
Main effect: how each IV individually affects the DV - the overall trend
Interactions: how IV factors combine to affect the DV
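A toy numerical illustration (all numbers invented), as a minimal sketch in Python: main effects compare marginal means, while the interaction is a difference of differences.

```python
import numpy as np

# Hypothetical cell means for a 2x2 design (rows: IV1 levels, columns: IV2 levels)
means = np.array([[10.0, 12.0],   # IV1 level 1
                  [14.0, 22.0]])  # IV1 level 2

# Main effects: compare marginal means, averaging over the other IV
main_iv1 = means[1].mean() - means[0].mean()        # 18 - 11 = 7
main_iv2 = means[:, 1].mean() - means[:, 0].mean()  # 17 - 12 = 5

# Interaction: does the effect of IV2 depend on the level of IV1?
interaction = (means[1, 1] - means[1, 0]) - (means[0, 1] - means[0, 0])  # 8 - 2 = 6

print(main_iv1, main_iv2, interaction)
```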
Between Subjects factorial ANOVA
Within Subjects factorial ANOVA
Mixed factorial ANOVA
Note:
Efficient use of participant numbers and individual participant time - reduces the drawbacks of the other designs.
One of the most common types of design.
Mixed factorial ANOVA assumptions and formulae are the same as for factorial ANOVA.
mix of between and within factors
Note:
at least one between subjects factor and one within subjects factor
Adding between subjects factors rapidly increases the number of participants required, which can make high-cost studies non-viable
Main effect and interaction formulae
Note:
SS, MS, and F values are calculated for each main effect and interaction, using the within subjects or between subjects formulae as appropriate.
Report each effect as: F(between df, within df) = F value, p = p value
Assumptions
Note:
1. Interval/ ratio data
2. Normal distribution
- check with a histogram
3. Homogeneity of variance
- applies to between subjects factors
- Levene's test - want it to be non-significant (see the sketch after this list)
4. Sphericity of covariance
- applies to within subjects factors
- Mauchly's test - want it to be non-significant
There are no nonparametric alternatives if these assumptions are violated.
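A minimal sketch (invented data; three hypothetical between subjects groups) of checking homogeneity of variance with Levene's test in Python via scipy:

```python
from scipy import stats

# Hypothetical scores from three between-subjects groups
group_a = [12, 15, 14, 10, 13, 11]
group_b = [22, 19, 24, 21, 20, 23]
group_c = [15, 18, 16, 17, 14, 19]

# Levene's test: a non-significant result (p > .05) means
# the homogeneity of variance assumption holds
stat, p = stats.levene(group_a, group_b, group_c)
print(f"Levene's statistic = {stat:.2f}, p = {p:.3f}")
```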
TWO RULES
use between subjects formulae for between subjects effects and within subjects formulae for within subjects effects
if there is a conflict, e.g. in interactions, use the within subjects formulae
N = total number of scores
n = number of scores within the condition
Correlation
Tests of Association
Note:
Tests of the relationships between two variables and are usually performed on continuous variables.
Tests whether there is a shared variance between any given pair of variables.
Looking for an association between the variables, not a difference (as in an independent samples t-test).
Also point-biserial correlation
- one continuous variable
- one categorical variable with 2 levels
And simple linear regression, and multiple linear regression.
Pearson's Correlation Assumptions (parametric)
1. linear relationship between variables
Note:
A linear relationship means that at any point a given change in x will lead to the same change in y.
If the scatterplot shows a clear nonlinear relationship, do not run a Pearson's correlation.
Data with a curved, nonlinear relationship are not suitable for Pearson's correlation analysis.
2. variables measure interval/ ratio data which are normally distributed
Note:
The mean and s.d. only accurately describe the average and dispersion of the data when the data are normally distributed.
If the frequency distributions show a non-normal distribution, do not run a Pearson's correlation.
3. Data should be free of statistical outliers
Note:
Outliers have a disproportionate influence on the correlation coefficient (r).
Including outliers misrepresents the data.
Either exclude them, or use a Spearman's correlation (nonparametric) if the outliers are systematic.
Spearman's Correlation Assumptions (nonparametric)
1. monotonic relationship between variables
- either a positive, negative or curved relationship that goes in one direction. Not a bell curve.
2. works on ordinal/ interval/ ratio data - no need to worry about the distribution
3. outliers can be included in Spearman's analysed data
Note:
outliers do not exert as much influence because Spearman's correlations do not use means or s.d.s but use ranks (see the sketch below).
Correlations tell us whether variables covary with other variables.
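A minimal sketch (invented data) of how one outlier inflates Pearson's r while Spearman's rank-based rho is far less affected:

```python
from scipy import stats

# Hypothetical paired scores with essentially no relationship (r is approx .10)
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [5, 3, 6, 2, 7, 1, 8, 4]

# Add one extreme outlier pair
x_out = x + [30]
y_out = y + [90]

r, _ = stats.pearsonr(x_out, y_out)     # inflated by the outlier (roughly .96)
rho, _ = stats.spearmanr(x_out, y_out)  # rank-based, so far less affected (roughly .37)
print(f"Pearson r = {r:.2f}, Spearman rho = {rho:.2f}")
```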
Pearson's correlation formula
Note:
a. For each case, subtract the mean from the score on the X variable; repeat for the mean and score on the Y variable; multiply these two values, then add together the products for all cases.
b. For each case, subtract the mean from the score on the X variable; square this difference; add together the squared value for all cases, and then find the square root. Repeat for the Y variable and multiply. Use this value to divide by.
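Written out, steps a and b give: r = Σ((X − x̄)(Y − ȳ)) / (√Σ(X − x̄)² × √Σ(Y − ȳ)²)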
Df = no. of pairs - 2
r(df) = r value, p = p value
r = correlation coefficient
indication of the strength of the relationship
r2 = coefficient of determination
measure of the strength of the relationship, describes the amount of variance explained
effect size
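A minimal sketch (invented data) computing r, df, and r², and printing them in the reporting format above, using scipy:

```python
from scipy import stats

# Hypothetical paired scores
x = [2, 4, 5, 7, 8, 10, 11, 13]
y = [1, 3, 5, 6, 9, 9, 12, 13]

r, p = stats.pearsonr(x, y)
df = len(x) - 2        # df = no. of pairs - 2
r_squared = r ** 2     # coefficient of determination (effect size)

# Report in the form r(df) = r value, p = p value
print(f"r({df}) = {r:.2f}, p = {p:.3f}, r2 = {r_squared:.2f}")
```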
Scatterplots
Note:
typically show relationships between pairs of variables.
Each point represents one pair of observations at each measurement point
Bottom left to top right = positive
Top left to bottom right = negative
the spread gives an indication of the strength of the relationship
Direction and Strength
If there is low or no spread between the data points then there is a very strong correlation between the variables
Note:
If there is a reasonable spread, then there is a strong correlation between the variables.
r value = 1/ -1
Note:
the points fall on a perfect diagonal line.
when there is a greater spread, the points deviate from the line and r deviates from 1/-1.
If there is a high spread then there is low or no correlation
Interpreting correlations; facts about correlation coefficients
range from -1 to 1.
no units
they are the same for x and y as for y and x
positive values: as one variable increases so does the other
negative values: as one variable increases, the other decreases
positive relationship - as one value decreases, so does the other
the more spread out the data are, the more the r value will deviate from -1 or 1
how close a value is to -1 or 1 indicates how close the two variables are to being perfectly linearly related
R values
Estimating r values
Note:
1. plot your scatterplot and divide it according to the mean x and y values in order to estimate your values.
2. count up number of points in each quadrant. A positive correlation will populate the +ve quadrants more than the -ve quadrants and vice versa.
Calculating r values - determining whether two variables are associated.
1. Plot the raw values against one another
Note:
scaling problems - different means and SDs. We don't care about the means etc., only the relationships. If all the values are along the bottom, we must try to look at the data in a way that accounts for the differing means and SDs of each axis - therefore convert to z scores.
2. Z scores give you values which have a mean of 0 and an SD of 1.
Note:
z score = (score-mean)/ SD
No scaling or unit problems.
Converting raw scores into z scores allows direct comparisons between scores even if they are measured on different scales, and thus enables a comparison of the relative probabilities of each.
Z scores are referred to as standard scores because measurement scales are converted into a standardised format (mean = 0, SD =1)
3. r = the adjusted average of the product for each standardised x-y coordinate pair
Note:
Points in the top-right and bottom-left quadrants produce positive products; points in the top-left and bottom-right quadrants produce negative products.
Calculate the area (the product of the z scores) for each point; do this for every x-y pair you are testing for a relationship.
Outliers artificially inflate the correlation value (r): a bigger area means a larger contribution to r, because the point is further away from the means.
the closer to the diagonal a point is, the more it contributes to the r value.
The further away from both means a point is, the more it contributes to r.
r = Σ(zX × zY) / (N − 1)
Note:
where zX = (X − x̄) / Sx
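A minimal sketch (invented data) implementing this z-score formula directly in Python; std(ddof=1) gives the sample SD:

```python
import numpy as np

# Hypothetical paired scores measured on different scales
x = np.array([2.0, 4.0, 5.0, 7.0, 8.0, 10.0])
y = np.array([110.0, 125.0, 131.0, 145.0, 148.0, 160.0])

# z score = (score - mean) / SD, using the sample SD (ddof=1)
zx = (x - x.mean()) / x.std(ddof=1)
zy = (y - y.mean()) / y.std(ddof=1)

# r = the sum of the products of the paired z scores, divided by N - 1
r = np.sum(zx * zy) / (len(x) - 1)
print(f"r = {r:.3f}")  # matches np.corrcoef(x, y)[0, 1]
```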
Limitations
Correlation does not equal causation
Note:
there can be a causal link, but correlation analyses do not allow us to conclude this.
To prove causation, the experiment would have to be controlled.
Regression
what is regression?
a family of inferential statistics
Test of association
Help make predictions about data
used when causal relationships are likely
Correlation does not tell you by how much to intervene
line of best fit
the formula of the line gives the exact answer
Predictions
it is possible to make predictions about how predictor variables will affect outcome variables
regression gives an indication of the:
unstandardised relationship
between outcome (y-axis) and predictor (x-axis) variables
using calculations of the intercept and gradient
expressed in the form Y = a + bX
a = intercept/ constant
b = gradient/ coefficient
in order to determine a, you need to calculate b first
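A minimal sketch (invented data) of the standard least-squares calculation: b is computed first, then a from the means; scipy's linregress gives the same slope and intercept:

```python
import numpy as np
from scipy import stats

# Hypothetical predictor (x) and outcome (y) scores
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 11.9])

# b (gradient/coefficient) is calculated first
b = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
# a (intercept/constant) then follows from the means
a = y.mean() - b * x.mean()

print(f"Y = {a:.2f} + {b:.2f}X")  # roughly Y = 0.08 + 1.98X
print(stats.linregress(x, y))     # same slope and intercept
```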
Assumptions
1. the data are linearly related
2. Homoscedasticity of data
residuals
residuals are the difference between the actual outcome score and the predicted score outcome
need same degree of variation across all predictor variable scores
if data are heteroscedastic, a regression isn't the appropriate analysis
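To make the residual idea concrete, a minimal sketch (continuing the invented data above) computing residuals and comparing their spread at low versus high predictor scores:

```python
import numpy as np

# Same invented data as above, with the approximate fitted values
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 11.9])
a, b = 0.08, 1.98  # intercept and gradient from the fit above

predicted = a + b * x
residuals = y - predicted  # actual outcome score minus predicted outcome score

# Homoscedasticity: the residual spread should be similar
# across low and high predictor scores
print("spread (low x):", residuals[:3].std(ddof=1).round(3))
print("spread (high x):", residuals[3:].std(ddof=1).round(3))
```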
simple regression
predicting one outcome variable from one predictor variable
Y = a + bX
SPSS output
1. descriptive statistics
2. correlation coefficient
3. variables entered and removed
variable entered = predictor variable
dependent variable = outcome variable
4. model summary (R values)
5. Check assumptions - graph tests of homoscedasticity
3 charts at the bottom
frequency plot of standardised residuals
histogram of residual values
want normal distribution
bars should approx fit the normal curve
good indication of homoscedasticity
normal plot of regression standardised residual
points should follow the diagonal line
scatterplot of regression standardised residual and regression standardised predicted value
DV = change
plots standardised predicted y values (x axis) against their corresponding residuals
want to see a diffused cloud - no distinct patterns
Determining whether the regression model is statistically valid - 3 R values
R = pooled correlation
R2 = amount of variance in the data that is explained by the model (%)
most important value
adjusted R2 = how much variance would be expected by chance
ANOVA table
test of whether the regression model is better than using the mean outcome value (y) for all cases
is the model significantly better at predicting the outcome than the mean-only model?
report the R2, then the ANOVA result
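As a sketch of what the ANOVA table tests, the model F can be recovered from R2 and the degrees of freedom (all values hypothetical; k = number of predictors, n = number of cases):

```python
from scipy import stats

# Hypothetical values: R2 from the model summary, k predictors, n cases
r2, k, n = 0.40, 1, 30

# F test of the model against the mean-only (null) model
f = (r2 / k) / ((1 - r2) / (n - k - 1))
p = stats.f.sf(f, k, n - k - 1)
print(f"R2 = {r2:.2f}, F({k}, {n - k - 1}) = {f:.2f}, p = {p:.4f}")
```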
Reporting Results
1. Check descriptives and correlations
2. Check that predictor and outcome variables show a linear relationship (scatterplot)
3. Check that the homoscedasticity assumption is not violated
Report the R2 in the text, and the ANOVA results
R2 = , F( , ) = , p <
Report the coefficients in a table
Multiple Regression
Predicting one outcome variable from more than one predictor variable
Formula: Y = a + b1X1 + b2X2 + b3X3
many participants are needed
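A minimal sketch (invented data and coefficients) fitting this formula with statsmodels, which reports R2, the ANOVA F, and the coefficients:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)

# Hypothetical data: two predictors and one outcome
n = 50
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 2.0 + 1.5 * x1 + 0.5 * x2 + rng.normal(size=n)

# Add the constant (a) and fit Y = a + b1X1 + b2X2
X = sm.add_constant(np.column_stack([x1, x2]))
model = sm.OLS(y, X).fit()

print(model.rsquared)                # R2
print(model.fvalue, model.f_pvalue)  # ANOVA F and p
print(model.params)                  # a, b1, b2
```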
Methods
predictors can be entered in many different orders
Simultaneous
all predictors are entered at the same time
use for exploratory analysis
Hierarchical
predictors are entered in a pre-defined order
used when regressions are informed by well-defined theory (see the sketch after this list)
Stepwise
predictors are entered in an order driven by how well they correlate with the outcome
not used often, as it is unstable
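For hierarchical entry, one option outside SPSS is to fit the models in the pre-defined order and test the improvement; a sketch using statsmodels' formula interface and anova_lm (data and variable names invented):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

rng = np.random.default_rng(1)
n = 60
data = pd.DataFrame({"x1": rng.normal(size=n), "x2": rng.normal(size=n)})
data["y"] = 1.0 + 0.8 * data["x1"] + 0.6 * data["x2"] + rng.normal(size=n)

# Step 1: the theory-driven predictor; step 2: add the second predictor
step1 = smf.ols("y ~ x1", data=data).fit()
step2 = smf.ols("y ~ x1 + x2", data=data).fit()

print(step2.rsquared - step1.rsquared)  # change in R2 at step 2
print(anova_lm(step1, step2))           # F test: does step 2 improve the model?
```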
SPSS output
1. Descriptive Statistics
2. Correlations
3. Assumptions - visual tests for homoscedasticity
4. Model summary
how good the model is, R2
ANOVA significance
Reporting Results
1. Check descriptives and correlations
2. Difficult to check for linear relationships
3. Check that homoscedasticity assumption is not violated
4. Report the R2 value
R2 = , F(df, df) = , p =
5. Report the coefficients in a table
multicollinearity occurs when predictor variables are highly correlated with each other. This is undesirable.
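One common screen for multicollinearity is the variance inflation factor (VIF); a sketch using statsmodels (invented data; VIF near 1 is good, values above roughly 5-10 are usually taken as a warning):

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(2)
n = 60
x1 = rng.normal(size=n)
x2 = 0.9 * x1 + 0.1 * rng.normal(size=n)  # deliberately correlated with x1

X = sm.add_constant(np.column_stack([x1, x2]))

# VIF for each predictor column (index 0 is the constant)
for i in (1, 2):
    print(f"VIF for predictor {i}: {variance_inflation_factor(X, i):.1f}")
```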
Summary
Regression analyses allow us to make predictions about outcome variables using predictor variables
All regressions assume homoscedasticity
Simple (bivariate) regression uses one predictor variable. Multiple regression uses more than one.
To report regressions:
i) report R2 and the ANOVA in the text
ii) report the coefficients in a table
Correlation is used to examine the relationship between variables
Regression is used to make predictions about scores on one variable based on knowledge of the values of others