This course is designed to acquaint the student with the principles of descriptive and inferential statistics. Topics will include: types of data, frequency distributions and histograms, measures of central tendency, measures of variation, probability, probability distributions including the binomial, normal, and Student's t distributions, standard scores, confidence intervals, hypothesis testing, correlation, and linear regression analysis. This course is open to any student interested in general statistics, and it will include applications pertaining to students majoring in athletic training, pre-nursing, and business.
Listed below are the top 10 annual salaries (in millions of dollars) of TV personalities.
Find the (a) mean, (b) median, (c) mode, and (d) midrange for the given sample data in millions of dollars.
(e) Given that these are the top 10 salaries, do we know anything about the salaries of TV personalities in general?
(f) Are such top 10 lists valuable for gaining insight into the larger population?
a. The mean is 20.60.
b. The median is 14.25.
c. There is no mode.
d. The midrange is 23.85.
e. Since the sample values are the 10 highest, they give almost no information about the salaries of TV personalities in general.
f. No, because such top 10 lists represent an extreme subset of the population rather than the larger population.
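The four measures of center asked for above can be sketched in a few lines of Python. The salary list itself is not reproduced in these notes, so the sample below is hypothetical and is there only to show the mechanics; the helper names midrange and modes are also just illustrative.

```python
from statistics import mean, median, multimode

def midrange(values):
    """Midrange: the value halfway between the minimum and the maximum."""
    return (min(values) + max(values)) / 2

def modes(values):
    """Return the mode(s); an empty list means no value repeats (no mode)."""
    if len(set(values)) == len(values):
        return []
    return multimode(values)

# Hypothetical sample, NOT the salary data from the exercise.
sample = [2.0, 3.5, 3.5, 5.0, 11.0]

print(mean(sample))      # 5.0
print(median(sample))    # 3.5
print(modes(sample))     # [3.5]
print(midrange(sample))  # 6.5
```

Because the midrange uses only the two extreme values, it is the measure most affected by a single very large or very small value.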
The mean of the data set is $52329.7.
The midrange of the data set is $52490.0.
The median of the data set is $52274.0.
The mode(s) of the data set is (are) $52297.
Nothing meaningful can be concluded from this information except that these are the largest tuitions of colleges in the country for a recent year.
An insurance institute conducted tests with crashes of new cars traveling at 6 mi/h. The total cost of the damages was found for a simple random sample of the tested cars and listed below.
Find the (a) mean, (b) median, (c) mode, and (d) midrange for the given sample data.
(e) Do the different measures of center differ very much?
Listed below are the lead concentrations (in mg/g) measured in different samples of a medicine.
Find the mean, midrange, median, and mode of the data set.
What do the results suggest about the safety of this medicine?
What do the decimal values of the listed amounts suggest about the precision of the measurements?
Listed below are the measured radiation emissions (in W/kg) corresponding to cell phones: A, B, C, D, E, F, G, H, I, J, and K respectively. The media often present reports about the dangers of cell phone radiation as a cause of cancer. Cell phone radiation must be 1.6 W/kg or less.
Find the a. mean, b. median, c. midrange, and d. mode for the data.
e. If you are planning to purchase a cell phone, are any of the measures of center the most important statistic? Is there another statistic that is most relevant? If so, which one?
Listed below are the errors between the predicted temperatures and actual temperatures of a certain city.
Find the mean and median for each of the two samples.
Do the means and medians indicate that the temperatures predicted one day in advance are more accurate than those predicted 5 days in advance, as we might expect?
The mean difference between actual high and the predicted high one day earlier is 0.5°.
The median difference between actual high and the predicted high one day earlier is 1.0°.
The mean difference between actual high and the predicted high five days earlier is -0.3°.
The median difference between actual high and the predicted high five days earlier is 2.0°.
No, the means and medians do not indicate any substantial difference in accuracy.
Statistics are sometimes used to compare or identify authors of different works. The lengths of the first 10 words in a book by Terry are listed with the first 10 words in a book by David.
Find the mean and median for each of the two samples.
Compare the two sets of results. Does there appear to be a difference?
The mean number of letters per word in Terry's book is 3.5.
The median number of letters per word in Terry's book is 3.0.
The mean number of letters per word in David's book is 3.1.
The median number of letters per word in David's book is 3.0.
Yes. Based on the results, words in Terry's book are longer than the words in David's book.
Waiting times (in minutes) of customers in a bank where all customers enter a single waiting line and a bank where customers wait in individual lines at three different teller windows are listed below.
Find the mean and median for each of the two samples.
Determine whether there is a difference between the two data sets that is not apparent from a comparison of the measures of center. If so, what is it?
The mean waiting time for customers in a single line is 7.11 minutes.
The median waiting time for customers in a single line is 7.10 minutes.
The mean waiting time for customers in individual lines is 7.11 minutes.
The median waiting time for customers in individual lines is 7.10 minutes.
The times for customers in individual lines are much more varied than the times for customers in a single line.
Notice that the mean and median waiting times for customers in single and individual lines are the same. To determine if there is a difference between the two data sets that is not apparent from the comparison of the means and medians, compare how the data values vary among themselves in each set.
Use the magnitudes (Richter scale) of the earthquakes listed in the data set below.
Find the mean and median of this data set.
Is the magnitude of an earthquake measuring 7.0 on the Richter scale an outlier (data value that is very far away from the others) when considered in the context of the sample data given in this data set? Explain.
The geometric mean is often used in business and economics for finding average rates of change, average rates of growth, or average ratios. Given n values (all of which are positive), the geometric mean is the nth root of their product.
The average growth factor for money compounded at annual interest rates of 14%, 6%, and 3% can be found by computing the geometric mean of 1.14, 1.06, and 1.03. Find that average growth factor.
The single percentage growth rate is found by subtracting 1 from the growth factor and then multiplying by 100%. What single percentage growth rate would be the same as having three successive growth rates of 14%, 6%, and 3%?
Is that result the same as the mean of 14%, 6%, and 3%?
The average growth factor is 1.0757.
(1.14 x 1.06 x 1.03)^(1/3) = 1.075678893
The single percentage growth rate that would be the same as having three successive growth rates of 14%, 6%, and 3% is 7.57%.
(1.0757 – 1) x 100 = 7.57
The mean of 14%, 6%, and 3% is 7.67%.
The single percentage growth rate is not the same as the mean of 14%, 6%, and 3%.
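As a quick check of the arithmetic above, here is a minimal Python sketch of the geometric mean applied to the three growth factors. The function name geometric_mean is illustrative; math.prod requires Python 3.8 or later.

```python
import math

def geometric_mean(values):
    """nth root of the product of n positive values."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("geometric mean needs at least one value, all positive")
    return math.prod(values) ** (1 / len(values))

growth_factors = [1.14, 1.06, 1.03]
average_factor = geometric_mean(growth_factors)

# Convert the growth factor back to a percentage rate.
single_rate_pct = (average_factor - 1) * 100

# Arithmetic mean of the three rates, for comparison.
arithmetic_mean_pct = (14 + 6 + 3) / 3

print(round(average_factor, 4))       # 1.0757
print(round(single_rate_pct, 2))      # 7.57
print(round(arithmetic_mean_pct, 2))  # 7.67
```

The arithmetic mean overstates the equivalent single rate because growth compounds multiplicatively rather than additively.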
Methods used that summarize or describe characteristics of data are called _______ statistics.
Which of the following is NOT a measure of center?
Listed below are the top 10 annual salaries (in millions of dollars) of TV personalities.
Find the range, variance, and standard deviation for the sample data.
Given that these are the top 10 salaries, is the standard deviation of the sample a good estimate of the variation of salaries of TV personalities in general?
Six different second-year medical students at Bellevue Hospital measured the blood pressure of the same person. The systolic readings (in mmHg) are listed below.
Find the range, variance, and standard deviation for the given sample data.
If the subject's blood pressure remains constant and the medical students correctly apply the same measurement technique, what should be the value of the standard deviation?
Listed below are the arrival delay times (in minutes) of randomly selected airplane flights from one airport to another. Negative values correspond to flights that arrived early before the scheduled arrival time, and positive values represent lengths of delays.
Find the range, variance, and standard deviation for the set of data.
Some of the sample values are negative, but can the standard deviation ever be negative?
Listed below are amounts (in millions of dollars) collected from parking meters by a security service company and other companies during similar time periods.
Find the coefficient of variation for each of the two samples, then compare the variation.
Do the limited data listed here show evidence of stealing by the security service company's employees? Consider a difference of greater than 1% to be significant.
The coefficient of variation for the amount collected by the security service company is 9.98%.
(0.1567021236 ÷ 1.57) x 100 = 9.981026981
The coefficient of variation for the amount collected by the other companies is 7.65%.
(0.1316561177 ÷ 1.72) x 100 = 7.654425448
Yes. There is a significant difference in the variation.
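The two coefficient of variation calculations above can be reproduced with a short Python sketch. It uses the standard deviations and means quoted in the worked answers; the function name is illustrative, and the 1% threshold comes from the question.

```python
def coefficient_of_variation(std_dev, mean):
    """CV as a percentage: standard deviation relative to the mean."""
    return std_dev / mean * 100

cv_security = coefficient_of_variation(0.1567021236, 1.57)
cv_others = coefficient_of_variation(0.1316561177, 1.72)

print(round(cv_security, 2))  # 9.98
print(round(cv_others, 2))    # 7.65

# The exercise treats a difference of more than 1 percentage point as significant.
significant = abs(cv_security - cv_others) > 1
print(significant)            # True
```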
Below are the range and standard deviation for a set of data.
Use the range rule of thumb and compare it to the standard deviation listed below.
Does the range rule of thumb produce an acceptable approximation? Suppose a researcher deems the approximation as acceptable if it has an error less than 15%.
A certain group of test subjects had pulse rates with a mean of 77.6 beats per minute and a standard deviation of 10.2 beats per minute.
Would it be "unusual" for one of the test subjects to have a pulse rate of 68.0 beats per minute?
Minimum "usual" value = 57.2 beats per minute
minimum "usual" value = (mean) – 2 x (standard deviation)
77.6 – 2(10.2) = 57.2
Maximum "usual" value = 98.0 beats per minute
maximum "usual" value = (mean) + 2 x (standard deviation)
77.6 + 2(10.2) = 98.0
No, because it is between the minimum and maximum "usual" values.
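The range rule of thumb used in this answer can be written as a small Python sketch. The function names usual_range and is_unusual are illustrative, not taken from any library.

```python
def usual_range(mean, std_dev):
    """Range rule of thumb: usual values lie within 2 standard deviations of the mean."""
    return mean - 2 * std_dev, mean + 2 * std_dev

def is_unusual(value, mean, std_dev):
    """A value is unusual if it falls outside the usual range."""
    low, high = usual_range(mean, std_dev)
    return value < low or value > high

low, high = usual_range(77.6, 10.2)
print(round(low, 1), round(high, 1))  # 57.2 98.0
print(is_unusual(68.0, 77.6, 10.2))   # False
```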
Cans of regular soda have volumes with a mean of 13.51 oz and a standard deviation of 0.12 oz.
Is it "unusual" for a can to contain 13.59 oz of soda?
Find the standard deviation, s, of sample data summarized in the frequency distribution table given below by using the formula below, where x represents the class midpoint, f represents the class frequency, and n represents the total number of sample values.
Compare the computed standard deviation to the standard deviation obtained from the original list of data values, 9.0. Consider a difference of 20% between two values of a standard deviation to be significant.
Standard deviation = 7.7
---------------------------------------------------------------------------------
(Use calculator)
L1 L2
midpt freq
1–Var Stats L1,L2
Sx = 7.679830596
---------------------------------------------------------------------------------
The computed value is not significantly different from the given value.
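The formula and frequency table referred to in this exercise are not reproduced in these notes. The Python sketch below assumes the usual shortcut formula for grouped data, s = sqrt([n Σ(f·x²) – (Σ f·x)²] ÷ [n(n – 1)]), and applies it to a hypothetical table of class midpoints and frequencies. It should give the same result as 1-Var Stats with midpoints in L1 and frequencies in L2.

```python
import math
import statistics

def grouped_std_dev(midpoints, frequencies):
    """Sample standard deviation from class midpoints x and class frequencies f."""
    n = sum(frequencies)
    sum_fx = sum(f * x for x, f in zip(midpoints, frequencies))
    sum_fx2 = sum(f * x * x for x, f in zip(midpoints, frequencies))
    return math.sqrt((n * sum_fx2 - sum_fx ** 2) / (n * (n - 1)))

# Hypothetical frequency table, NOT the one from the exercise.
midpoints = [24.5, 34.5, 44.5, 54.5]
frequencies = [2, 5, 6, 3]

s = grouped_std_dev(midpoints, frequencies)

# Cross-check: repeat each midpoint f times and use the ordinary sample formula.
expanded = [x for x, f in zip(midpoints, frequencies) for _ in range(f)]
print(math.isclose(s, statistics.stdev(expanded)))  # True
```

Using midpoints in place of the original values is why the grouped result is only an approximation of the standard deviation of the raw data.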
Heights of men on a baseball team have a bell-shaped distribution with a mean of 178 cm and a standard deviation of 8 cm.
Using the empirical rule, what is the approximate percentage of the men between the following values?
a. 154 cm and 202 cm
b. 170 cm and 186 cm
a. 99.73% of the men are between 154 cm and 202 cm.
(154 – 178) ÷ 8 = -3
(202 – 178) ÷ 8 = 3
3 SD... 99.73%
b. 68% of the men are between 170 cm and 186 cm.
(170 – 178) ÷ 8 = -1
(186 – 178) ÷ 8 = 1
1 SD... 68%
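The steps in this answer (convert each bound to a z score, then look up the percentage) can be sketched in Python as follows. The lookup table holds the usual rounded empirical rule values of 68%, 95%, and 99.7%; the answer above quotes the 3 SD figure with one more digit (99.73%). The function name is illustrative.

```python
import math

# Empirical rule for bell-shaped data: percentage within k standard deviations.
EMPIRICAL_RULE = {1: 68.0, 2: 95.0, 3: 99.7}

def empirical_percentage(low, high, mean, std_dev):
    """Approximate percentage between two bounds symmetric about the mean."""
    z_low = (low - mean) / std_dev
    z_high = (high - mean) / std_dev
    if not math.isclose(z_low, -z_high) or z_high not in EMPIRICAL_RULE:
        raise ValueError("bounds must be 1, 2, or 3 standard deviations either side of the mean")
    return EMPIRICAL_RULE[z_high]

print(empirical_percentage(154, 202, mean=178, std_dev=8))  # 99.7
print(empirical_percentage(170, 186, mean=178, std_dev=8))  # 68.0
```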
Which of the following is NOT a property of the standard deviation?
A. The value of the standard deviation is never negative.
B. When comparing variation in samples with very different means, it is good practice to compare the two sample standard deviations.
C. The standard deviation is a measure of variation of all data values from the mean.
D. The units of the standard deviation are the same as the units of the original data.
B. When comparing variation in samples with very different means, it is good practice to compare the two sample standard deviations.
It is good practice to compare two sample standard deviations only when the sample means are approximately the same. When comparing variation in samples with very different means, it is better to use the coefficient of variation, which is defined later in this section.
The Range Rule of Thumb roughly estimates the standard deviation of a data set as _______.
The square of the standard deviation is called the _______.
If your score on your next statistics test is converted to a z score, which of these z scores would you prefer: –2.00, –1.00, 0, 1.00, 2.00? Why?
The z score of 2.00 is most preferable because it is 2.00 standard deviations above the mean and would correspond to the highest of the five different possible test scores.
A z score (or standardized value) is the number of standard deviations that a given value x is above or below the mean. A negative z score corresponds to an x value less than the mean. A positive z score corresponds to an x value greater than the mean. The more negative the z score, the further the x value is below the mean. The more positive the z score, the further the x value is above the mean.
With a height of 68 in, George was the shortest president of a particular club in the past century. The club presidents of the past century have a mean height of 73.7 in and a standard deviation of 2.7 in.
a. What is the positive difference between George's height and the mean?
b. How many standard deviations is that [the difference found in part (a)]?
c. Convert George's height to a z score.
d. If we consider "usual" heights to be those that convert to z scores between –2 and 2, is George's height usual or unusual?
A particular group of men have heights with a mean of 173 cm and a standard deviation of 7 cm. Carl had a height of 180 cm.
a. What is the positive difference between Carl's height and the mean?
b. How many standard deviations is that [the difference found in part (a)]?
c. Convert Carl's height to a z score.
d. If we consider "usual" heights to be those that convert to z scores between –2 and 2, is Carl's height usual or unusual?
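Parts (a) through (d) of the George and Carl exercises follow the same four steps, which can be sketched in Python. The function names are illustrative; the "usual" criterion of z between –2 and 2 is the one stated in the questions.

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / std_dev

def is_usual(z):
    """Usual values convert to z scores between -2 and 2."""
    return -2 <= z <= 2

# George: height 68 in, club mean 73.7 in, standard deviation 2.7 in.
george_diff = abs(68 - 73.7)            # (a) positive difference from the mean
george_sds = george_diff / 2.7          # (b) that difference in standard deviations
george_z = z_score(68, 73.7, 2.7)       # (c) z score (negative: below the mean)
print(round(george_diff, 1), round(george_sds, 2), round(george_z, 2))  # 5.7 2.11 -2.11
print(is_usual(george_z))               # (d) False, so unusual

# Carl: height 180 cm, group mean 173 cm, standard deviation 7 cm.
carl_z = z_score(180, 173, 7)
print(round(carl_z, 2), is_usual(carl_z))  # 1.0 True
```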
IQ scores are measured with a test designed so that the mean is 100 and the standard deviation is 18. Consider the group of IQ scores that are unusual.
What are the z scores that separate the unusual IQ scores from those that are usual?
What are the IQ scores that separate the unusual IQ scores from those that are usual? (Consider a value to be unusual if its z score is less than –2 or greater than 2.)
In a recent year the magnitudes (Richter scale) of 10,594 earthquakes were recorded. The mean is 1.218 and the standard deviation is 0.584. Consider the magnitudes that are unusual.
What are the magnitudes that separate the unusual magnitudes from those that are usual? (Consider a value to be unusual if its z score is less than –2 or greater than 2.)
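Both boundary questions above run the z score formula in reverse: given z = –2 and z = 2, solve for x = mean + z x (standard deviation). A minimal Python sketch, with an illustrative function name:

```python
def value_from_z(z, mean, std_dev):
    """Invert the z score formula: x = mean + z * standard deviation."""
    return mean + z * std_dev

# IQ scores: mean 100, standard deviation 18.
iq_low = value_from_z(-2, 100, 18)
iq_high = value_from_z(2, 100, 18)
print(iq_low, iq_high)  # 64 136

# Earthquake magnitudes: mean 1.218, standard deviation 0.584.
quake_low = value_from_z(-2, 1.218, 0.584)
quake_high = value_from_z(2, 1.218, 0.584)
print(round(quake_low, 3), round(quake_high, 3))  # 0.05 2.386
```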
One of the tallest living men has a height of 261 cm. One of the tallest living women is 243 cm tall. Heights of men have a mean of 170 cm and a standard deviation of 8 cm. Heights of women have a mean of 159 cm and a standard deviation of 3 cm.
Relative to the population of the same gender, who is taller? Explain.
Which is relatively better: a score of 52 on a psychology test or a score of 46 on an economics test? Scores on the psychology test have a mean of 94 and a standard deviation of 15. Scores on the economics test have a mean of 55 and a standard deviation of 5.
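Values from different populations, such as the two heights and the two test scores above, can be compared by converting each to a z score. The sketch below does this in Python with the figures given in the questions; the larger z score is the relatively higher value.

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / std_dev

# Heights: compare each person with the population of the same gender.
z_man = z_score(261, 170, 8)
z_woman = z_score(243, 159, 3)
print(z_man, z_woman)  # 11.375 28.0

# Test scores: compare each score with its own test's distribution.
z_psychology = z_score(52, 94, 15)
z_economics = z_score(46, 55, 5)
print(round(z_psychology, 2), round(z_economics, 2))  # -2.8 -1.8
```

On these numbers the woman is relatively taller, and the economics score is relatively better because it lies fewer standard deviations below its mean.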
The third quartile Q_{3} is 60.
Quartiles are measures of location, denoted Q_{1}, Q_{2}, and Q_{3}, which divide a set of data into four groups with about 25% of the values in each group. Note that quartiles and percentiles are related (Q_{1} = P_{25}, Q_{2} = P_{50}, and Q_{3} = P_{75}).
L = k ÷ 100 x n
75 ÷ 100 x 24 = 18
Since L = 18 is a whole number, to find P_{75}, add the 18th value and the next value in the sorted set of data and divide the total by 2.
The 18th value is 59. The 19th value is 61.
59 + 61 = 120
120 ÷ 2 = 60
Since P_{75} = 60, the third quartile Q_{3} is 60.
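The locator procedure used above can be sketched in Python. The original 24 data values are not reproduced in these notes, so the list below is hypothetical and is built only so that its 18th and 19th sorted values are 59 and 61, as in the worked answer. The rule for a non-whole locator (round L up and take that value) is not stated above and is an assumption based on the usual textbook convention.

```python
import math

def percentile_value(sorted_data, k):
    """Value of the kth percentile using the locator L = k / 100 * n."""
    if not 0 < k < 100:
        raise ValueError("k must be between 0 and 100 (exclusive)")
    n = len(sorted_data)
    locator = k / 100 * n
    if locator == int(locator):
        # Whole number: average the Lth value and the next one (1-based positions).
        i = int(locator)
        return (sorted_data[i - 1] + sorted_data[i]) / 2
    # Assumed convention: otherwise round L up and take the value at that position.
    return sorted_data[math.ceil(locator) - 1]

# Hypothetical sorted data with n = 24; the 18th value is 59 and the 19th is 61.
data = list(range(40, 57)) + [59, 61] + list(range(62, 67))

q3 = percentile_value(data, 75)
print(len(data), q3)  # 24 60.0
```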
When a data value is converted to a standardized scale representing the number of standard deviations the data value lies from the mean, we call the new value a _______.
A data value is considered _______ if its z-score is less than –2 or greater than 2.
Whenever a data value is less than the mean, _______.
In modified boxplots, a data value is a(n) _______ if it is above Q_{3} + (1.5)(IQR) or below Q_{1} – (1.5)(IQR).