A way of ESTIMATING the median
and quartiles of large sets of (normally
continuous) data.
How to draw a cumulative frequency graph
You normally start with a table like this:
Time waiting (min) | No. of passengers
0 < T ≤ 5          | 14
5 < T ≤ 10         | 35
10 < T ≤ 15        | 26
15 < T ≤ 20        | 18
20 < T ≤ 25        | 7
This needs to be changed into a cumulative frequency table, where each row
is a running total of all the frequencies so far:
Time waiting (min) | Cumulative no. of passengers
T ≤ 5              | 14
T ≤ 10             | 14 + 35 = 49
T ≤ 15             | 49 + 26 = 75
T ≤ 20             | 75 + 18 = 93
T ≤ 25             | 93 + 7 = 100
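
If you want to check your running totals, the same adding-up can be sketched in a few lines of Python. This is only an illustration: the variable names are made up here, and the numbers are the ones from the table above.

```python
from itertools import accumulate

# Upper end of each time interval (minutes) and the number of
# passengers in that interval, copied from the frequency table.
upper_bounds = [5, 10, 15, 20, 25]
frequencies = [14, 35, 26, 18, 7]

# accumulate() keeps a running total: each entry is the current
# frequency added to all the ones before it.
cumulative = list(accumulate(frequencies))

for bound, total in zip(upper_bounds, cumulative):
    print(f"T <= {bound}: {total}")
```

The last running total should always equal the total number of passengers (100 here), which is a quick way to spot an adding mistake.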
Now, plot the graph, with the cumulative frequency (number of
passengers) on the y axis, and the time waiting on the x axis.
Use a sensible scale, and don't forget to label the axes!
Join the points up with a smooth line that goes through every point.
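
As a rough sketch, the points you plot for this example can be listed like this in Python. The names are illustrative only, and the extra starting point at (0, 0) is an assumption (nobody has waited 0 minutes or less) that these notes do not state.

```python
# Upper end of each interval (x values) and the cumulative
# frequencies (y values) from the cumulative frequency table.
upper_bounds = [5, 10, 15, 20, 25]
cumulative = [14, 49, 75, 93, 100]

# Each point is (time waiting, cumulative number of passengers).
points = list(zip(upper_bounds, cumulative))

# Assumption: the curve starts at (0, 0), since no passenger
# has a waiting time of 0 minutes or less.
points_with_start = [(0, 0)] + points

for x, y in points_with_start:
    print(f"plot a point at x = {x}, y = {y}")
```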
To find the median using your graph...
Find the middle of the cumulative frequency by
dividing the total cumulative frequency by 2. In this
example, the total is 100, so you are looking for 50.
Draw a line with a ruler horizontally from the 50 on
the cumulative frequency axis along to where it meets
the curved line you drew on the graph to join up the
points.
From the point where these two lines meet,
draw another line directly downwards, and
whatever number this line goes to on the x
axis is the median time waiting.
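
The "across and down" reading can be imitated in Python as a rough check. This sketch is not the same as reading a hand-drawn smooth curve: it assumes the points are joined with straight lines, and that the graph starts at (0, 0). The function name read_across is made up for this example.

```python
def read_across(target, points):
    """Go across at height `target`, then down to the x axis.

    Assumes `points` are joined by straight lines, which only
    approximates a smooth hand-drawn curve.
    """
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y0 <= target <= y1:
            if y1 == y0:
                return x0
            # How far along this segment the target height lies.
            fraction = (target - y0) / (y1 - y0)
            return x0 + fraction * (x1 - x0)
    raise ValueError("target is outside the range of the graph")


# (time waiting, cumulative frequency); the (0, 0) start is an assumption.
points = [(0, 0), (5, 14), (10, 49), (15, 75), (20, 93), (25, 100)]

total = points[-1][1]                      # 100 passengers
median = read_across(total / 2, points)    # look for 50
print(f"Estimated median: {median:.2f} minutes")  # about 10.19
```

Your own reading from a smooth curve will probably differ slightly from this, which is fine, because the median from a cumulative frequency graph is only an estimate.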
To find the quartiles, do the same as for the
median, but with whatever one quarter of the
total cumulative frequency is (for the lower
quartile) and three quarters of it for the upper
quartile. In this example, that means reading
across from 25 and from 75.
To find the IQR (interquartile range), find both quartiles using the
method above, and then subtract the lower quartile (LQ) from the
upper quartile (UQ).
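
Here is a minimal Python sketch of the quartiles and IQR for this example. As an assumption, it joins the points with straight lines and starts the graph at (0, 0), so the answers will be close to, but not exactly the same as, readings from a smooth curve. The helper name time_at is hypothetical.

```python
def time_at(target, points):
    """Estimate the time at which the cumulative frequency reaches
    `target`, assuming straight lines between the plotted points."""
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y0 <= target <= y1 and y1 != y0:
            return x0 + (target - y0) / (y1 - y0) * (x1 - x0)
    raise ValueError("target is outside the range of the graph")


# (time waiting, cumulative frequency); the (0, 0) start is an assumption.
points = [(0, 0), (5, 14), (10, 49), (15, 75), (20, 93), (25, 100)]
total = points[-1][1]  # 100

lower_quartile = time_at(total / 4, points)      # read across from 25
upper_quartile = time_at(3 * total / 4, points)  # read across from 75
iqr = upper_quartile - lower_quartile            # UQ minus LQ

print(f"LQ  = {lower_quartile:.2f} min")
print(f"UQ  = {upper_quartile:.2f} min")
print(f"IQR = {iqr:.2f} min")
```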
When plotting, make sure you put the points in the
right place along the x axis: each point goes at the
upper end of its interval. E.g. for the first one, put
it on the line where 5 is, not anywhere before it.