business stats

Question	Answer
Categorical Data	Nominal Scale; Ordinal Scale
Quantitative Data	Interval Scale; Ratio Scale
Frequency Distribution	“A frequency distribution is a tabular summary of data showing the number (frequency) of items in each of several non-overlapping classes”
Summarizing Categorical Data	Frequency Distribution: “A frequency distribution is a tabular summary of data showing the number (frequency) of items in each of several non-overlapping classes”; Bar/Column Charts; Pie Charts
Summarizing Quantitative Data	Frequency Distribution: i. Number of classes; ii. Width of the classes; iii. Class limits. “Class limits must be chosen so that each data item belongs to one and only one class”
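The three choices above (number of classes, class width, class limits) can be sketched in a few lines of Python. The data values and the choice of four classes are made up for illustration, and rounding the width up is one common convention, not the only one.

```python
from collections import Counter

# Hypothetical data: ages of 12 customers (invented for this sketch).
data = [22, 25, 31, 34, 35, 38, 41, 44, 47, 52, 58, 59]

num_classes = 4                      # i. number of classes (assumed)
low, high = min(data), max(data)
# ii. approximate class width: (largest - smallest) / number of classes, rounded up
width = -(-(high - low) // num_classes)

def class_index(x):
    # iii. class limits: each value falls into one and only one class
    return min((x - low) // width, num_classes - 1)

freq = Counter(class_index(x) for x in data)
for i in range(num_classes):
    lower = low + i * width
    print(f"{lower}-{lower + width - 1}: {freq[i]}")
```

Every data item is counted exactly once, so the class frequencies add up to the number of observations.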
Cumulative Distributions	Each successive row shows a running total: its own frequency plus the frequencies of every previous row.
Ogive	A graphical, rather than tabular, presentation of a cumulative distribution
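A cumulative distribution is just a running total of the class frequencies. The sketch below uses made-up frequencies; plotting the cumulative values against the upper class limits would give the ogive.

```python
from itertools import accumulate

# Hypothetical class frequencies (invented for this sketch).
class_freqs = [3, 4, 2, 3]

# Each row includes every previous row.
cumulative = list(accumulate(class_freqs))

# Cumulative relative frequency: the last row always reaches 1.0.
cum_relative = [c / cumulative[-1] for c in cumulative]

print(cumulative)
print(cum_relative)
```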
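Why n-1?	Dividing the sum of squared deviations by n − 1 rather than n makes the sample variance an unbiased estimator of the population variance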
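To see the effect of the divisor, Python's standard `statistics` module offers both versions: `variance` divides by n − 1 and `pvariance` divides by n. The five data values are made up for illustration.

```python
from statistics import pvariance, variance

# Hypothetical sample of five class sizes (invented for this sketch).
data = [46, 54, 42, 46, 32]

sample_var = variance(data)       # divides by n - 1
population_var = pvariance(data)  # divides by n

print(sample_var, population_var)
```

Dividing by n gives the smaller number, which is why using n on a sample tends to understate the population variance.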
Why interquartile range?	To limit the influence of extreme values: the IQR (Q3 − Q1) measures the spread of the middle 50 percent of the data
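A quick sketch of why the IQR resists extreme values, using made-up data. Note that `statistics.quantiles` uses one particular quartile convention (its default "exclusive" method); textbooks and spreadsheets may compute slightly different quartiles.

```python
from statistics import quantiles

def iqr(values):
    # Quartiles via the standard library's default ("exclusive") method.
    q1, _, q3 = quantiles(values, n=4)
    return q3 - q1

# Hypothetical data; the second list swaps the largest value for an outlier.
clean = [10, 12, 13, 15, 16, 18, 20]
with_outlier = [10, 12, 13, 15, 16, 18, 200]

print(max(clean) - min(clean), iqr(clean))
print(max(with_outlier) - min(with_outlier), iqr(with_outlier))
```

The range jumps when the outlier appears, while the IQR stays the same.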
Mean	Measure of central tendency; affected by extreme values
Median	Measure of central tendency; not affected by extreme values
Distribution Shape	i. Skewed; ii. Symmetric
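Z-scores	A relative measure of distance from the mean: the number of standard deviations a value lies above or below the mean, z = (x − mean) / standard deviation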
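A minimal z-score sketch, assuming a small made-up sample and the sample standard deviation (n − 1 divisor).

```python
from statistics import mean, stdev

# Hypothetical sample of five class sizes (invented for this sketch).
data = [46, 54, 42, 46, 32]

x_bar = mean(data)
s = stdev(data)  # sample standard deviation

# z = (x - mean) / standard deviation, one score per observation
z_scores = [(x - x_bar) / s for x in data]

for x, z in zip(data, z_scores):
    print(x, round(z, 2))
```

A positive z-score means the value is above the mean, a negative one means it is below, and the scores always sum to zero.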
Random Variables	numerical description of the outcome of an experiment: In effect, a random variable associates a numerical value with each possible experimental outcome
Discrete Random Variable	assume[s] either a finite number of values [such as “0” or “1”] or an infinite sequence of values such as 0, 1, 2 …
DISCRETE PROBABILITY FUNCTION	The PROBABILITY FUNCTION for a random variable describes how probabilities are distributed over the values of the random variable. For a discrete random variable x, the probability distribution is defined by a PROBABILITY FUNCTION, denoted by f(x)
BINOMIAL PROBABILITY DISTRIBUTION Properties of a Binomial Distribution – or, how do I recognize that it is a Binomial Distribution?	1. There is a sequence of “n” identical trials; 2. There are two possible outcomes from each trial (the “Bi”); 3. The probability of “success,” (p) does not change from one trial to the next; 4. Each trial is independent
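When the four properties hold, the probability of x successes in n trials follows the standard binomial formula f(x) = C(n, x) p^x (1 − p)^(n − x). The sketch below assumes made-up values of n = 3 and p = 0.30; the function name is only illustrative.

```python
from math import comb

def binomial_pmf(x, n, p):
    # C(n, x) ways to place x successes, each with probability p^x * (1-p)^(n-x)
    return comb(n, x) * p**x * (1 - p) ** (n - x)

# Hypothetical example: 3 trials, success probability 0.30 on every trial.
n, p = 3, 0.30
probs = [binomial_pmf(x, n, p) for x in range(n + 1)]

print([round(pr, 3) for pr in probs])
```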
POISSON PROBABILITY DISTRIBUTION Properties of a Poisson Distribution	1. The interest is in the number of occurrences over an “area” or “interval”; 2. The probability of an occurrence is the same for any two “areas” or “intervals” of equal size; 3. The occurrence or non-occurrence in any area or interval is independent of the occurrence or non-occurrence in any other area or interval
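The standard Poisson probability function is f(x) = μ^x e^(−μ) / x!, where μ is the expected number of occurrences in the interval. The sketch below assumes a made-up average of 10 arrivals per interval; the function name is only illustrative.

```python
from math import exp, factorial

def poisson_pmf(x, mu):
    # Probability of exactly x occurrences when mu are expected per interval
    return mu**x * exp(-mu) / factorial(x)

# Hypothetical example: an average of 10 arrivals per 15-minute interval.
mu = 10
p_five = poisson_pmf(5, mu)

print(round(p_five, 4))
```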
extra HYPERGEOMETRIC PROBABILITY DISTRIBUTION Properties of a Hypergeometric Distribution	1. There is a sequence of “n” trials, drawn without replacement from a finite population; 2. There are two possible outcomes from each trial; 3. The probability of “success” (p) DOES change from one trial to the next; 4. The trials are NOT independent
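Because the draws are made without replacement, the hypergeometric probability is a ratio of counts: f(x) = C(r, x) C(N − r, n − x) / C(N, n). The symbols N (population size), r (successes in the population) and n (sample size) follow a common textbook convention and are assumptions here, as are the example numbers.

```python
from math import comb

def hypergeometric_pmf(x, N, r, n):
    # Ways to pick x successes and n - x failures, over all ways to pick n items
    return comb(r, x) * comb(N - r, n - x) / comb(N, n)

# Hypothetical example: 12 items, 5 of them "successes", sample of 3.
N, r, n = 12, 5, 3
probs = [hypergeometric_pmf(x, N, r, n) for x in range(n + 1)]

print([round(pr, 4) for pr in probs])
```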
CHARACTERISTICS OF THE NORMAL DISTRIBUTION	1. Normal distributions are differentiated by two parameters: (1) the mean and (2) the standard deviation; 2. The highest point (“peak”) of the normal curve is at the mean, which is also the median and the mode; 3. The mean of the distribution can be any value: negative, zero, or positive; 4. The normal distribution is symmetric; 5. The standard deviation determines how flat and wide the normal curve is; 6. Probabilities for the normal random variable are given by the area under the normal curve (the total area under the curve is equal to 1); 7. The percentage of values within a given number of standard deviations of the mean is commonly stated as: i. plus/minus one standard deviation, 68.3 percent; ii. plus/minus two standard deviations, 95.4 percent; iii. plus/minus three standard deviations, 99.7 percent
There are three “types” of probabilities we will compute:	1. the probability that the standard normal random variable z will be less than or equal to a given value; 2. the probability that z will be between two given values; and, 3. the probability that z will be greater than or equal to a given value
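All three types of probability come from the cumulative distribution function of the standard normal variable z. The sketch below uses `NormalDist` from Python's standard library; the cut-off value of 1.0 is an arbitrary choice for illustration.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, standard deviation 1

# 1. P(z <= 1.0): area to the left of the value
p_le = z.cdf(1.0)

# 2. P(-1.0 <= z <= 1.0): difference of two left-hand areas
p_between = z.cdf(1.0) - z.cdf(-1.0)

# 3. P(z >= 1.0): total area (1) minus the area to the left
p_ge = 1 - z.cdf(1.0)

print(round(p_le, 4), round(p_between, 4), round(p_ge, 4))
```

The "between" probability for plus/minus one standard deviation matches the roughly 68.3 percent figure for the normal distribution.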
1. Sampling from a finite population: A simple random sample	A simple random sample of size “n” from a finite population of size “N” is a sample selected such that each possible sample of size “n” has the same probability of being selected
2. Sampling from a process – or an infinite population: A random sample	A random sample is one in which each of the sampled elements is independent and follows the same probability distribution as the elements in the population. If a production process … is operating properly, then each unit produced is independent of each other unit and the differences in the units are only attributable to chance variation. In such a situation, we can select a random sample by selecting any “n” units produced while the process is operating properly
RELATIONSHIP BETWEEN SAMPLE SIZE AND THE SAMPLING DISTRIBUTION OF X-BAR (THE STANDARD ERROR OF THE MEAN)	Since the standard error of the mean (σ_x̄ = σ/√n) shrinks as the sample size grows, larger samples produce sample means that are grouped more closely around the true population mean and are therefore more likely to provide a good point estimate of the population parameter
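The relationship can be checked numerically with the standard formula σ/√n (for an infinite population or a large finite one). The population standard deviation of 20 is a made-up value.

```python
from math import sqrt

def standard_error(sigma, n):
    # Standard error of the mean: sigma divided by the square root of n
    return sigma / sqrt(n)

sigma = 20  # hypothetical population standard deviation
for n in (25, 100, 400):
    print(n, standard_error(sigma, n))
```

Quadrupling the sample size only halves the standard error, so the gain from larger samples comes with diminishing returns.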
PROBABILITY SAMPLING METHODS	1. Simple Random Sample: A sample selected such that each possible sample of size ‘n’ has the same probability of being selected. 2. Stratified Random Sampling: The elements in the population are first divided into groups called “strata”, such that each element in the population belongs to one and only one stratum; a simple random sample is then taken from each stratum. 3. Cluster Sampling: As with the strata above, clusters divide the population into sub-groups, and a simple random sample of clusters is then taken. The difference is that each cluster is intended to be representative of the population itself rather than of an individual characteristic of the population. 4. Systematic Sampling: choosing a sample from a large population by picking a random starting point among the first “k” elements of a list of the population and then selecting every “kth” element after it. Systematic sampling eliminates the necessity of identifying each element in the population
NON-PROBABILITY SAMPLING METHODS	5. Convenience Sampling: a “non-probability” sampling technique. As the name implies, the sample is identified primarily by convenience. Elements are included in the sample without pre-specified or known probabilities of being selected – but are selected instead based on their “convenience” or ease of access. 6. Judgement Sampling: elements of the population are selected based on an expert researcher’s knowledge of their representativeness of the population.
POTENTIAL ERRORS THAT ARISE FROM SAMPLING	1. Coverage Errors: Are all members of the population of interest ‘available’ for inclusion into the sample? 2. Non-Response Errors: Are all members of the population of interest sampled ‘responding’ to the survey? 3. Sampling Errors: Does the sample selected fully represent the diversity of the population of interest? 4. Measurement Errors: Has the data collected been accurately reported? Errors 1, 2, and 4 can be termed “non-sampling errors” as the size of the sample doesn’t “fix” the problem. Error 3 can often be mitigated through larger sample sizes as the larger sample size helps to reduce the “chance” of missing different groups in a population of interest.
In practice, the planning value, p*, can be chosen by one of the following procedures:	1. Use the sample proportion from a previous sample of the same or similar units; 2. Use a pilot study to select a preliminary sample. The sample proportion from this sample can be used as the planning value, p*; 3. Use judgement or a “best guess” for the planning value, p*; 4. If none of the preceding alternatives apply, use a planning value of p* = 0.50.
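Once p* is chosen, the usual sample-size formula for a proportion is n = z² p*(1 − p*) / E², rounded up, where E is the desired margin of error. The sketch below assumes 95 percent confidence and a margin of error of 0.03; the function name and these values are illustrative only.

```python
from math import ceil
from statistics import NormalDist

def sample_size_proportion(p_star, margin, confidence=0.95):
    # z value that leaves (1 - confidence) / 2 in the upper tail
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil(z**2 * p_star * (1 - p_star) / margin**2)

n_default = sample_size_proportion(0.50, 0.03)  # fallback planning value
n_prior = sample_size_proportion(0.30, 0.03)    # e.g. from a previous sample

print(n_default, n_prior)
```

p* = 0.50 maximises p*(1 − p*), so it gives the largest (safest) sample size, which is why it serves as the fallback.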
DETERMINING THE SAMPLE SIZE	If a desired margin of error is selected prior to sampling, the procedures in this section can be used to determine the sample size necessary to satisfy the margin of error requirement.
In practice, the planning value for σ can be chosen by one of the following procedures:	1. Use the estimate of the population standard deviation computed from data of previous studies as the planning value for σ; 2. Use a pilot study to select a preliminary sample. The sample standard deviation from the preliminary sample can be used as the planning value for σ; and, 3. Use judgement or a “best guess” for the value of σ. For example, we might begin by estimating the largest and smallest data values in the population. The difference between the largest and smallest values provides an estimate of the range for the data. Finally, the range divided by 4 is often suggested as a rough approximation of the standard deviation and thus an acceptable planning value for σ.
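With a planning value for σ, the usual sample-size formula for estimating a mean is n = (z σ / E)², rounded up. The sketch below uses the range-divided-by-4 rule with made-up largest and smallest values, a margin of error of 5, and 95 percent confidence; all of these numbers and the function name are assumptions.

```python
from math import ceil
from statistics import NormalDist

def sample_size_mean(sigma, margin, confidence=0.95):
    # z value that leaves (1 - confidence) / 2 in the upper tail
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z * sigma / margin) ** 2)

# Hypothetical best guess: largest value about 100, smallest about 20.
largest, smallest = 100, 20
sigma_planning = (largest - smallest) / 4  # range / 4 as a rough sigma

n_required = sample_size_mean(sigma_planning, margin=5)
print(sigma_planning, n_required)
```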

	Created by yimyo2020 over 9 years ago