2/22/2018
CHAPTER 10 Comparing Two Populations or Groups 10.1 Comparing Two Proportions The Practice of Statistics, 5th Edition Starnes, Tabor, Yates, Moore
Bedford Freeman Worth Publishers
Comparing Two Proportions
Learning Objectives
After this section, you should be able to:
• DESCRIBE the shape, center, and spread of the sampling distribution of the difference of two sample proportions.
• DETERMINE whether the conditions are met for doing inference about p1 − p2.
• CONSTRUCT and INTERPRET a confidence interval to compare two proportions.
• PERFORM a significance test to compare two proportions.
Introduction
Suppose we want to compare the proportions of individuals with a certain characteristic in Population 1 and Population 2. Let’s call these parameters of interest p1 and p2. The ideal strategy is to take a separate random sample from each population and to compare the sample proportions with that characteristic.
What if we want to compare the effectiveness of Treatment 1 and Treatment 2 in a completely randomized experiment? This time, the parameters p1 and p2 that we want to compare are the true proportions of successful outcomes for each treatment. We use the proportions of successes in the two treatment groups to make the comparison.
The Sampling Distribution of a Difference Between Two Proportions
To explore the sampling distribution of the difference between two proportions, let’s start with two populations having a known proportion of successes. At School 1, 70% of students did their homework last night. At School 2, 50% of students did their homework last night. Suppose the counselor at School 1 takes an SRS of 100 students and records the sample proportion that did their homework. School 2’s counselor takes an SRS of 200 students and records the sample proportion that did their homework.
What can we say about the difference $\hat{p}_1 - \hat{p}_2$ in the sample proportions?
The Sampling Distribution of a Difference Between Two Proportions
Using Fathom software, we generated an SRS of 100 students from School 1 and a separate SRS of 200 students from School 2. The difference in sample proportions was then calculated and plotted. We repeated this process 1000 times.
What do you notice about the shape, center, and spread of the sampling distribution of $\hat{p}_1 - \hat{p}_2$?
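The repeated-sampling process described here can be imitated in a few lines of Python. This is only a rough sketch, not the Fathom simulation itself: the function name `simulate_differences` and the seed are made up for illustration, and it assumes each school is large enough that an SRS behaves like independent draws with a fixed chance of success.

```python
import random
import statistics

def simulate_differences(p1, n1, p2, n2, reps=1000, seed=1):
    """Simulate `reps` values of (p-hat 1) - (p-hat 2).

    Assumption: each population is large, so an SRS is approximated by
    independent draws that succeed with probability p1 (or p2).
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    diffs = []
    for _ in range(reps):
        # Count the students who did their homework in each sample.
        successes1 = sum(rng.random() < p1 for _ in range(n1))
        successes2 = sum(rng.random() < p2 for _ in range(n2))
        diffs.append(successes1 / n1 - successes2 / n2)
    return diffs

# School 1: p1 = 0.70, n1 = 100; School 2: p2 = 0.50, n2 = 200.
diffs = simulate_differences(0.70, 100, 0.50, 200)
print("mean of simulated differences:", round(statistics.mean(diffs), 3))
print("SD of simulated differences:  ", round(statistics.stdev(diffs), 3))
```

A histogram of `diffs` should look roughly symmetric and mound-shaped, centered near 0.70 − 0.50 = 0.20.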
The Sampling Distribution of a Difference Between Two Proportions
Both $\hat{p}_1$ and $\hat{p}_2$ are random variables. The statistic $\hat{p}_1 - \hat{p}_2$ is the difference of these two random variables. In Chapter 6, we learned that for any two independent random variables X and Y,
$$\mu_{X-Y} = \mu_X - \mu_Y \quad \text{and} \quad \sigma^2_{X-Y} = \sigma^2_X + \sigma^2_Y$$
The Sampling Distribution of the Difference Between Sample Proportions
Choose an SRS of size $n_1$ from Population 1 with proportion of successes $p_1$ and an independent SRS of size $n_2$ from Population 2 with proportion of successes $p_2$.
Shape: When $n_1 p_1$, $n_1(1 - p_1)$, $n_2 p_2$, and $n_2(1 - p_2)$ are all at least 10, the sampling distribution of $\hat{p}_1 - \hat{p}_2$ is approximately Normal.
Center: The mean of the sampling distribution of $\hat{p}_1 - \hat{p}_2$ is $p_1 - p_2$.
Spread: The standard deviation of the sampling distribution of $\hat{p}_1 - \hat{p}_2$ is
$$\sqrt{\frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}}$$
as long as each sample is no more than 10% of its population (10% condition).
Let’s watch the video from the text on page 615.
The Sampling Distribution of a Difference Between Two Proportions
Suppose that there are two large high schools, each with more than 2000 students, in a certain town. At School 1, 70% of students did their homework last night. Only 50% of the students at School 2 did their homework last night. The counselor at School 1 takes an SRS of 100 students and records the proportion that did homework. School 2’s counselor takes an SRS of 200 students and records the proportion that did homework.
a) Describe the shape, center, and spread of the sampling distribution of $\hat{p}_1 - \hat{p}_2$.
Because $n_1 p_1 = 100(0.7) = 70$, $n_1(1 - p_1) = 100(0.3) = 30$, $n_2 p_2 = 200(0.5) = 100$, and $n_2(1 - p_2) = 200(0.5) = 100$ are all at least 10, the sampling distribution of $\hat{p}_1 - \hat{p}_2$ is approximately Normal.
Its mean is p1 − p2 = 0.70 − 0.50 = 0.20.
Its standard deviation is
$$\sqrt{\frac{0.7(0.3)}{100} + \frac{0.5(0.5)}{200}} = 0.058.$$
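The same calculation can be checked with a short Python sketch. The helper name `diff_sampling_distribution` is made up for illustration; it applies the shape, center, and spread rules to the two-school example and assumes the 10% condition has already been verified (each school has more than 2000 students).

```python
from math import sqrt

def diff_sampling_distribution(p1, n1, p2, n2):
    """Return (mean, sd, approx_normal) for the sampling distribution
    of (p-hat 1) - (p-hat 2), given known population proportions."""
    mean = p1 - p2
    sd = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    # Shape: every expected count of successes and failures must be at least 10.
    counts = [n1 * p1, n1 * (1 - p1), n2 * p2, n2 * (1 - p2)]
    approx_normal = all(count >= 10 for count in counts)
    return mean, sd, approx_normal

mean, sd, approx_normal = diff_sampling_distribution(0.70, 100, 0.50, 200)
print("approximately Normal:", approx_normal)
print("mean:", round(mean, 2))
print("standard deviation:", round(sd, 3))
```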
Confidence Intervals for p1 – p2
When data come from two random samples or two groups in a randomized experiment, the statistic $\hat{p}_1 - \hat{p}_2$ is our best guess for the value of $p_1 - p_2$. We can use our familiar formula to calculate a confidence interval for $p_1 - p_2$:
statistic ± (critical value) ⋅ (standard deviation of statistic)
If the Normal condition is met, we find the critical value z* for the given confidence level from the standard Normal curve.
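Confidence Intervals for p1 – p2
Conditions for Constructing a Confidence Interval About a Difference in Proportions
• Random: The data come from two independent random samples or from two groups in a randomized experiment.
  o 10%: When sampling without replacement, check that n1 ≤ (1/10)N1 and n2 ≤ (1/10)N2.
• Large Counts: The counts of successes and failures in each sample or group, $n_1\hat{p}_1$, $n_1(1 - \hat{p}_1)$, $n_2\hat{p}_2$, and $n_2(1 - \hat{p}_2)$, are all at least 10.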
Because we don't know the values of the parameters $p_1$ and $p_2$, we replace them in the standard deviation formula with the sample proportions. The result is the standard error of the statistic $\hat{p}_1 - \hat{p}_2$:
$$\sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}$$
Confidence Intervals for p1 – p2 Two-Sample z Interval for a Difference Between Two Proportions
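As a hedged illustration, the interval can be assembled from the pieces already given: statistic ± (critical value) ⋅ (standard error). The function name `two_prop_z_interval` and the counts used in the example call (70 successes out of 100 and 100 out of 200) are made up for illustration and are not data from the text. The sketch assumes the Random, 10%, and Large Counts conditions have been checked separately.

```python
from math import sqrt
from statistics import NormalDist

def two_prop_z_interval(x1, n1, x2, n2, confidence=0.95):
    """Two-sample z interval for p1 - p2 from success counts x1 and x2."""
    p_hat1 = x1 / n1
    p_hat2 = x2 / n2
    # Standard error: the sample proportions replace the unknown parameters.
    se = sqrt(p_hat1 * (1 - p_hat1) / n1 + p_hat2 * (1 - p_hat2) / n2)
    # Critical value z* from the standard Normal curve.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    diff = p_hat1 - p_hat2
    margin = z_star * se
    return diff - margin, diff + margin

# Hypothetical sample results, for illustration only.
low, high = two_prop_z_interval(70, 100, 100, 200)
print("95% confidence interval for p1 - p2:", (round(low, 3), round(high, 3)))
```

If the resulting interval does not contain 0, the data give some evidence of a difference between the two proportions at that confidence level.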
Let’s watch the video for page 617 in the text.
Confidence Interval for a Difference in Proportions: see page 618.
Significance Tests for p1 – p2
An observed difference between two sample proportions can reflect an actual difference in the parameters, or it may just be due to chance variation in random sampling or random assignment. Significance tests help us decide which explanation makes more sense.
The null hypothesis has the general form
H0: p1 − p2 = hypothesized value
We’ll restrict ourselves to situations in which the hypothesized difference is 0. Then the null hypothesis says that there is no difference between the two parameters:
H0: p1 − p2 = 0 or, alternatively, H0: p1 = p2
The alternative hypothesis says what kind of difference we expect:
Ha: p1 − p2 > 0, Ha: p1 − p2 < 0, or Ha: p1 − p2 ≠ 0
Please see the video for the example on page 620.
Significance Tests for p1 – p2
Conditions for Performing a Significance Test About a Difference in Proportions
• Random: The data come from two independent random samples or from two groups in a randomized experiment.
  o 10%: When sampling without replacement, check that n1 ≤ (1/10)N1 and n2 ≤ (1/10)N2.
Significance Tests for p1 – p2
To do a test, standardize $\hat{p}_1 - \hat{p}_2$ to get a z statistic:
$$\text{test statistic} = \frac{\text{statistic} - \text{parameter}}{\text{standard deviation of statistic}}$$
$$z = \frac{(\hat{p}_1 - \hat{p}_2) - 0}{\text{standard deviation of statistic}}$$
If H0: p1 = p2 is true, the two parameters are the same. We call their common value p. We need a way to estimate p, so it makes sense to combine the data from the two samples. This pooled (or combined) sample proportion is:
$$\hat{p}_C = \frac{\text{count of successes in both samples combined}}{\text{count of individuals in both samples combined}} = \frac{X_1 + X_2}{n_1 + n_2}$$
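A small sketch may make the pooling step concrete. It assumes the usual pooled form of the standard deviation, in which the pooled proportion stands in for the common value p in both terms: $\sqrt{\hat{p}_C(1 - \hat{p}_C)(1/n_1 + 1/n_2)}$. The function name `pooled_z_statistic` and the counts in the example call are made up for illustration.

```python
from math import sqrt

def pooled_z_statistic(x1, n1, x2, n2):
    """Return (pooled proportion, z statistic) for testing H0: p1 - p2 = 0."""
    p_hat1 = x1 / n1
    p_hat2 = x2 / n2
    # Pooled proportion: all successes over all individuals.
    p_hat_c = (x1 + x2) / (n1 + n2)
    # Assumed pooled standard deviation: the same p-hat C is used for both groups.
    sd_pooled = sqrt(p_hat_c * (1 - p_hat_c) * (1 / n1 + 1 / n2))
    z = ((p_hat1 - p_hat2) - 0) / sd_pooled
    return p_hat_c, z

# Hypothetical counts, for illustration only: 70 of 100 and 100 of 200.
p_hat_c, z = pooled_z_statistic(70, 100, 100, 200)
print("pooled proportion:", round(p_hat_c, 3))
print("z statistic:", round(z, 2))
```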
Significance Tests for p1 – p2 Two-Sample z Test for the Difference Between Two Proportions
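Once a z statistic has been computed, the P-value comes from the standard Normal curve, and the tail used depends on the alternative hypothesis. The sketch below is one way to write that step. The function name `p_value` and the labels "greater", "less", and "two-sided" are made up for illustration; they correspond to Ha: p1 − p2 > 0, Ha: p1 − p2 < 0, and Ha: p1 − p2 ≠ 0.

```python
from statistics import NormalDist

def p_value(z, alternative):
    """P-value for a z statistic under the standard Normal distribution."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if alternative == "greater":      # Ha: p1 - p2 > 0, area to the right of z
        return 1 - std_normal.cdf(z)
    if alternative == "less":         # Ha: p1 - p2 < 0, area to the left of z
        return std_normal.cdf(z)
    if alternative == "two-sided":    # Ha: p1 - p2 != 0, area in both tails
        return 2 * (1 - std_normal.cdf(abs(z)))
    raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")

# Example with a hypothetical z statistic of 1.96.
print("one-sided P-value:", round(p_value(1.96, "greater"), 3))
print("two-sided P-value:", round(p_value(1.96, "two-sided"), 3))
```

A small P-value means a difference as large as the one observed would rarely occur by chance alone if H0: p1 = p2 were true.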
Please see the video for the example on page 622.
Also, please see page 624 in your text to review how to perform a significance test for a difference in proportions on your calculator.
Inference for Experiments
Many important statistical results come from randomized comparative experiments. Defining the parameters in experimental settings is more challenging.
• Most experiments on people use recruited volunteers as subjects.
• When subjects are not randomly selected, researchers cannot generalize the results of an experiment to some larger population of interest.
• Researchers can draw cause-and-effect conclusions that apply to people like those who took part in the experiment.
• Unless the experimental units are randomly selected, we don’t need to check the 10% condition when performing inference about an experiment.
Please review the video for the example on page 625.
Comparing Two Proportions
Section Summary
In this section, we learned how to…
• DESCRIBE the shape, center, and spread of the sampling distribution of the difference of two sample proportions.
• DETERMINE whether the conditions are met for doing inference about p1 − p2.
• CONSTRUCT and INTERPRET a confidence interval to compare two proportions.
• PERFORM a significance test to compare two proportions.