What are the two different types of confidence intervals for the difference between two means?

Is it true that the average price of a cup of coffee is different depending on the size of the city you live in? It certainly seems reasonable that the average price for a cup of coffee would be higher in a large city than in a small one, but how do you tell if that is really true? Confidence intervals for the difference of two means are the way to go to really be sure of your answer. So solve your coffee woes by reading further!

Even your coffee wants you to be happy!

Confidence Interval for the Difference of Two Means with Known Standard Deviations

If you were only interested in the average coffee price in one city, you could construct a confidence interval for a population mean. For that interval to be valid, you would need the following conditions:

  • Either the sample size is large enough (\(n \ge 30\)) or the population distribution is approximately normal.

  • The sample is random or it is reasonable to assume it is representative of the larger population.

If you know the population standard deviation, \(\sigma\), the confidence interval is given by

\[ \bar{x} \pm (z \text{ critical value})\left(\frac{\sigma}{\sqrt{n}}\right)\]

where \(\bar{x}\) is the sample mean.
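As a quick illustration of the one-sample formula, here is a minimal Python sketch. The function name `one_sample_z_interval` and the numbers passed to it are hypothetical and are not taken from the coffee data; the \(z\) critical value is computed from the standard normal distribution instead of being read from a table.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z_interval(x_bar, sigma, n, confidence=0.95):
    """Return (lower, upper) for a population mean when sigma is known."""
    # Two-sided critical value: leave (1 - confidence) / 2 in each tail.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / sqrt(n)
    return x_bar - margin, x_bar + margin


# Hypothetical sample: mean 4.00, known sigma 1.00, n = 100.
lower, upper = one_sample_z_interval(x_bar=4.00, sigma=1.00, n=100, confidence=0.95)
print(f"({lower:.3f}, {upper:.3f})")
```

Raising the confidence level makes the critical value larger, so the interval gets wider for the same data.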

But here you have two different cities and you want to compare the average coffee price, so how do you construct the confidence interval? Let's start by listing some of the notation used going forward.

First the population notation:

\[ \begin{array}{lcc} & \text{Population } 1 & \text{Population } 2 \\ \text{Population Mean} & \mu_1 & \mu_2 \\ \text{Population Standard Deviation} & \sigma_1 & \sigma_2 \end{array}\]

And now for the samples:

\[ \begin{array}{lcc} & \text{Sample from Population } 1 & \text{Sample from Population } 2 \\ \text{Sample Size} & n_1 & n_2 \\ \text{Sample Mean} & \bar{x}_1 & \bar{x}_2 \\ \text{Sample Standard Deviation} & s_1 & s_2 \end{array}\]

Then the conditions for constructing a confidence interval for the difference of two means are:

  • The samples are independent.

  • Either the sample sizes are large enough (\(n_1 \ge 30\) and \(n_2 \ge 30\)) or the population distributions are approximately normal.

  • The samples are random or it is reasonable to assume that the samples are representative of the larger population.

These conditions don't change even if you don't know the population standard deviations.

Because the samples are independent and random, you know that

\[ \mu_{\bar{x}_1 - \bar{x}_2} = \mu_1 - \mu_2\]

and that

\[ \sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2} }.\]
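To see where this standard deviation comes from, recall that the variances of independent random variables add, even when the variables themselves are subtracted. Each sample mean has variance \(\sigma_i^2 / n_i\), so

\[\begin{align} \sigma^2_{\bar{x}_1 - \bar{x}_2} &= \sigma^2_{\bar{x}_1} + \sigma^2_{\bar{x}_2} \\ &= \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2} , \end{align}\]

and taking the square root gives the standard deviation of the difference in sample means.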

Then the confidence interval for the difference in the two population means is

\[\bar{x}_1 - \bar{x}_2 \pm (z \text{ critical value})\sqrt{\frac{\sigma_1^2}{n_1} +\frac{\sigma_2^2}{n_2} } .\]

In general you aren't going to know what the population standard deviations are, but let's look at an example illustrating the use of the formulas.

You do a survey of \(40\) small town coffee shops and \(49\) big city coffee shops, and find that the mean price of a large cup of coffee is \(\$3.75\) in the small towns and \(\$4.50\) in the big cities. You also know that the population standard deviation is \(1.20\) in small towns and \(0.98\) in big cities.

Construct a \(99\%\) confidence interval for the difference of their two means, and draw conclusions from it.

Answer

It helps to lay out the information you have. Call the small towns Population \(1\) and the big cities Population \(2\). Then you know that

\[ \begin{array}{lll} n_1 = 40 & \bar{x}_1 = 3.75 & \sigma_1 = 1.20 \\ n_2 = 49 & \bar{x}_2 = 4.50 & \sigma_2 = 0.98 . \end{array}\]

You know that the \(z\) critical value for a \(99\%\) confidence interval is \(2.58\). Then calculating the confidence interval for the difference in the means,

\[\begin{align} & \bar{x}_1 - \bar{x}_2 \pm (z \text{ critical value})\sqrt{\frac{\sigma_1^2}{n_1} +\frac{\sigma_2^2}{n_2} } \\ & \qquad = 3.75-4.50 \pm 2.58 \sqrt{\frac{(1.20)^2}{40} +\frac{(0.98)^2}{49} } \\ & \qquad = -0.75 \pm 2.58\sqrt{0.036 + 0.0196} \\ & \qquad \approx -0.75 \pm 0.61 \\ & \qquad = (-1.36, -0.14) .\end{align}\]

Now what can you conclude from this? First, you can conclude that the method used to construct this interval estimate is successful in capturing the actual difference in the population means about \(99\%\) of the time.

More importantly, you can conclude with \(99\%\) confidence that the actual difference in the mean price of a large cup of coffee is between \(-\$1.36\) and \(-\$0.14\). Because both endpoints of the confidence interval are negative, you can estimate that the mean price of a large cup of coffee is between \(\$0.14\) and \(\$1.36\) lower in a small town than it is in a big city.
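The arithmetic in this example can be double-checked with a short Python sketch. The function name `two_sample_z_interval` is hypothetical; the inputs are the coffee survey values from the example. The script computes the \(z\) critical value directly (about \(2.576\)) rather than using the rounded table value \(2.58\), and the endpoints still agree with the hand calculation to two decimal places.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z_interval(x1, x2, sigma1, sigma2, n1, n2, confidence):
    """Return (lower, upper) for mu_1 - mu_2 with known population SDs."""
    # Two-sided critical value from the standard normal distribution.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Standard deviation of the difference in sample means.
    se = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    diff = x1 - x2
    return diff - z * se, diff + z * se


# Small towns are Population 1, big cities are Population 2.
lower, upper = two_sample_z_interval(3.75, 4.50, 1.20, 0.98, 40, 49, 0.99)
print(f"({lower:.2f}, {upper:.2f})")  # prints (-1.36, -0.14)
```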

Notice that in the previous example both ends of the confidence interval were negative. What happens if one end is negative and one end is positive? That implies that \(0\) is inside the confidence interval, so in other words it would be plausible that there was no difference in the two means.
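Reading an interval this way is easy to automate. The sketch below uses hypothetical helper names (`contains_zero`, `describe`), and the second interval is made up purely to show the case where zero falls inside.

```python
def contains_zero(lower, upper):
    """True when 0 lies inside the closed interval [lower, upper]."""
    return lower <= 0 <= upper


def describe(lower, upper):
    """Summarize what an interval for mu_1 - mu_2 suggests."""
    if contains_zero(lower, upper):
        return "no difference in the means is plausible"
    if upper < 0:
        return "mean 1 appears lower than mean 2"
    return "mean 1 appears higher than mean 2"


print(describe(-1.36, -0.14))  # interval from the coffee example
print(describe(-0.40, 0.25))   # hypothetical interval that straddles zero
```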

Confidence Interval for the Difference of Two Independent Population Means

If you don't know the population standard deviations, but you do know that your samples are independent (meaning that choosing a member of the first population doesn't affect your choice for a member of the second population), then you can calculate the confidence interval using the formula:

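\[\bar{x}_1 - \bar{x}_2 \pm (t \text{ critical value})\sqrt{\frac{s_1^2}{n_1} +\frac{s_2^2}{n_2} } ,\]

where \(n_1\) and \(n_2\) are the sample sizes, \(s_1\) and \(s_2\) are the sample standard deviations, and \(\bar{x}_1\) and \(\bar{x}_2\) are the sample means. The \(t\) critical value depends on the degrees of freedom, which statistical software usually computes for you; a conservative choice by hand is the smaller of \(n_1 - 1\) and \(n_2 - 1\).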
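Here is a minimal Python sketch of this interval. It makes several assumptions that are not part of the article: the degrees of freedom are estimated with the Welch-Satterthwaite approximation (one common choice), the function names are hypothetical, the coffee numbers are reused with \(1.20\) and \(0.98\) treated as sample standard deviations, and the critical value `t_crit=2.64` is an approximate table value for \(99\%\) confidence at roughly \(75\) degrees of freedom. In practice you would look up the critical value for the degrees of freedom the script reports.

```python
from math import sqrt


def welch_df(s1, s2, n1, n2):
    """Welch-Satterthwaite approximation to the degrees of freedom."""
    a = s1**2 / n1
    b = s2**2 / n2
    return (a + b) ** 2 / (a**2 / (n1 - 1) + b**2 / (n2 - 1))


def two_sample_t_interval(x1, x2, s1, s2, n1, n2, t_crit):
    """Return (lower, upper) for mu_1 - mu_2 using sample SDs.

    t_crit is supplied by the caller, e.g. from a t table.
    """
    se = sqrt(s1**2 / n1 + s2**2 / n2)
    diff = x1 - x2
    return diff - t_crit * se, diff + t_crit * se


# Hypothetical: coffee data with the SDs treated as sample values.
df = welch_df(1.20, 0.98, 40, 49)
lower, upper = two_sample_t_interval(3.75, 4.50, 1.20, 0.98, 40, 49, t_crit=2.64)
print(f"df = {df:.1f}, interval = ({lower:.2f}, {upper:.2f})")
```

Because the \(t\) critical value is larger than the corresponding \(z\) critical value, this interval is slightly wider than the one built from known population standard deviations.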

What are the two types of confidence intervals?

There are two types of estimates for each population parameter: the point estimate and the confidence interval (CI) estimate.

What are confidence intervals for the difference between means?

A confidence interval (C.I.) for a difference between means is a range of values that is likely to contain the true difference between two population means with a certain level of confidence.

What is the confidence interval estimate of the difference between the two population means?

The confidence interval gives us a range of reasonable values for the difference in population means \(\mu_1 - \mu_2\). We call this the two-sample \(t\)-interval or the confidence interval to estimate a difference in two population means.

What does it mean for a confidence interval for the difference of two means to contain zero?

If your confidence interval for a difference between groups includes zero, then zero is a plausible value for the true difference. In other words, at that confidence level the data do not give convincing evidence that the two population means differ.
