Difference between revisions of "Probstat/notes/confidence intervals"

Latest revision as of 03:00, 5 December 2557 BE (2014 CE)

This is part of probstat

Suppose that we take a sample of size $n$ , $X_{1},X_{2},\ldots ,X_{n}$ , from a population which is normally distributed. Also suppose that the population has mean $\mu$ and variance $\sigma ^{2}$ . In this section, we assume that we do not know $\mu$ but we do know the variance $\sigma ^{2}$ . The case where the variance is unknown is discussed here.

We would like to estimate the mean $\mu$ . To do so, we compute the sample mean ${\bar {X}}$ . It is almost certain that ${\bar {X}}\neq \mu$ , but we hope that it will be close to $\mu$ . In this section, we try to quantify how close the sample mean is to the real mean. More precisely, we would like to find an error range $\beta$ such that we have some confidence that

${\bar {X}}-\beta \leq \mu \leq {\bar {X}}+\beta$ ,

i.e., that $\mu$ lies within ${\bar {X}}\pm \beta$ (or in the range $({\bar {X}}-\beta ,{\bar {X}}+\beta )$ ).

Definitions

When computing $\beta$ , we usually specify the level of confidence $1-\alpha$ that we want to get.

Two-sided confidence interval. Suppose that we take the sample $X_{1},X_{2},\ldots ,X_{n}$ of size $n$ and compute ${\bar {X}}$ . We say that an interval $({\bar {X}}-\beta ,{\bar {X}}+\beta )$ is a confidence interval with confidence level $1-\alpha$ if the probability that the real mean $\mu$ is in the range $({\bar {X}}-\beta ,{\bar {X}}+\beta )$ is $1-\alpha$ . That is,

$P\left\{{\bar {X}}-\beta <\mu <{\bar {X}}+\beta \right\}=1-\alpha$ .

If we know the distribution of ${\bar {X}}$ , we can use that to find $\beta$ for the required confidence level.

As discussed in the last section, since the population is normal, the random variable ${\bar {X}}$ is a normal random variable with mean $\mu$ and s.d. $\sigma /{\sqrt {n}}$ , i.e.,

${\bar {X}}\sim Normal(\mu ,\sigma ^{2}/n)$

Remarks: When we say that $A\sim Normal(a,b)$ we mean that a random variable $A$ is normally distributed with mean $a$ and variance $b$ .

Therefore, we have that

${\frac {{\bar {X}}-\mu }{\sigma /{\sqrt {n}}}}={\sqrt {n}}({\bar {X}}-\mu )/\sigma$

is a unit normal random variable. We can then use the standard normal table to find probabilities related to this random variable.
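As a rough stand-in for the table lookup, the probabilities of a unit normal random variable can also be computed numerically. The sketch below assumes Python 3.8 or later and uses `statistics.NormalDist` from the standard library; the helper name `prob_within` is made up for this illustration.

```python
from statistics import NormalDist

# Unit (standard) normal random variable Z: mean 0, s.d. 1.
Z = NormalDist(mu=0.0, sigma=1.0)


def prob_within(z: float) -> float:
    """Return P{-z < Z < z} for the unit normal Z (hypothetical helper)."""
    return Z.cdf(z) - Z.cdf(-z)


# The three table values used in this section.
for z in (1.96, 1.64, 1.28):
    print(f"P{{-{z} < Z < {z}}} = {prob_within(z):.4f}")
```

The printed values are close to, but not exactly, 0.95, 0.9 and 0.8, because the table entries are rounded to two decimal places.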

Examples

EX1: If we look at the standard normal table, we can observe that

$P\left\{-1.96<{\sqrt {n}}({\bar {X}}-\mu )/\sigma <1.96\right\}=0.95$ ,

which means that

$P\left\{{\bar {X}}-1.96\sigma /{\sqrt {n}}<\mu <{\bar {X}}+1.96\sigma /{\sqrt {n}}\right\}=0.95$ .

From our definition, we have that the interval

$({\bar {X}}-1.96\sigma /{\sqrt {n}},{\bar {X}}+1.96\sigma /{\sqrt {n}})$

is a confidence interval with 95 percent confidence.
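The same argument works for any confidence level $1-\alpha$ . As a sketch, write $z_{\alpha /2}$ for the value such that $P\{Z>z_{\alpha /2}\}=\alpha /2$ for a unit normal random variable $Z$ (this notation is not used elsewhere in these notes and is introduced here only for convenience). By symmetry of the normal distribution,

$P\left\{-z_{\alpha /2}<{\sqrt {n}}({\bar {X}}-\mu )/\sigma <z_{\alpha /2}\right\}=1-\alpha$ ,

and rearranging the inequalities gives

$P\left\{{\bar {X}}-z_{\alpha /2}\sigma /{\sqrt {n}}<\mu <{\bar {X}}+z_{\alpha /2}\sigma /{\sqrt {n}}\right\}=1-\alpha$ ,

so the error range is $\beta =z_{\alpha /2}\sigma /{\sqrt {n}}$ . The 95 percent interval above is the case $\alpha =0.05$ with $z_{0.025}=1.96$ .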

EX2: Suppose that we know that the population has variance $\sigma ^{2}=5$ . We compute a mean from a sample of size 10. Find the confidence interval with 90% confidence.

Solution: Let $Z$ be a unit normal random variable. If we look at the standard normal table, we observe that

$P\{-1.64<Z<1.64\}=0.9$

Consider ${\sqrt {n}}({\bar {X}}-\mu )/\sigma =Z$ . We have that

$P\left\{{\bar {X}}-1.64\sigma /{\sqrt {n}}<\mu <{\bar {X}}+1.64\sigma /{\sqrt {n}}\right\}=P\{-1.64<Z<1.64\}=0.9$ .

Plugging in $\sigma ={\sqrt {5}}$ and $n=10$ , we have that $1.64\sigma /{\sqrt {n}}=1.64\cdot {\sqrt {5}}/{\sqrt {10}}\approx 1.16$ . Thus, the confidence interval with 90% confidence is

$({\bar {X}}-1.16,{\bar {X}}+1.16)$
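The arithmetic in EX2 can be checked numerically. The following is a minimal sketch, assuming Python 3.8 or later; it compares the table value 1.64 with the more precise critical value returned by `statistics.NormalDist().inv_cdf`. The variable names are chosen for this illustration only.

```python
from math import sqrt
from statistics import NormalDist

sigma = sqrt(5)      # population s.d., from sigma^2 = 5
n = 10               # sample size
confidence = 0.90    # required confidence level 1 - alpha

# Critical value z with P{-z < Z < z} = 0.90, computed instead of looked up.
z_exact = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
z_table = 1.64       # value read from the standard normal table in EX2

beta_exact = z_exact * sigma / sqrt(n)
beta_table = z_table * sigma / sqrt(n)

print(f"computed z = {z_exact:.4f}, beta = {beta_exact:.4f}")
print(f"table    z = {z_table:.4f}, beta = {beta_table:.4f}")
```

Both error ranges round to 1.16, so the rounding in the table does not change the answer at two decimal places.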

EX3: Consider the previous population. Suppose that we want the error range to be small. More precisely, we want the sample mean to be within 0.05 of $\mu$ (an interval of total width 0.1) with 80% confidence level, i.e.,

$P\left\{{\bar {X}}-0.05<\mu <{\bar {X}}+0.05\right\}=0.8$

What is the size of the sample that we have to take?

Solution: We first look at the standard normal table, and find out that, for unit normal variable $Z$ ,

$P\{-1.28<Z<1.28\}=0.8.$

Set $Z={\sqrt {n}}({\bar {X}}-\mu )/\sigma$ .

$P\{-1.28<{\sqrt {n}}({\bar {X}}-\mu )/\sigma <1.28\}=P\{{\bar {X}}-1.28\sigma /{\sqrt {n}}<\mu <{\bar {X}}+1.28\sigma /{\sqrt {n}}\}$

Therefore we want $1.28\sigma /{\sqrt {n}}<0.05$ . This is true when ${\sqrt {n}}>1.28\cdot {\sqrt {5}}/0.05\approx 57.243$ , i.e., $n>3276.8$ . Since $n$ must be an integer, a sample of size at least 3277 suffices.
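The sample-size calculation in EX3 can be sketched in code as follows, assuming Python 3.8 or later. The function name `required_sample_size` is hypothetical; it simply solves $z\sigma /{\sqrt {n}}\leq \beta$ for the smallest integer $n$ .

```python
from math import ceil, sqrt
from statistics import NormalDist


def required_sample_size(sigma: float, half_width: float, z: float) -> int:
    """Smallest integer n with z * sigma / sqrt(n) <= half_width.

    Hypothetical helper; it uses <= rather than the strict < of the text,
    which only matters when the bound is exactly an integer.
    """
    return ceil((z * sigma / half_width) ** 2)


sigma = sqrt(5)  # population s.d., from sigma^2 = 5

# Using the table value 1.28 from EX3.
n_table = required_sample_size(sigma, half_width=0.05, z=1.28)

# Using the unrounded critical value for 80% two-sided confidence.
z_exact = NormalDist().inv_cdf(0.90)
n_exact = required_sample_size(sigma, half_width=0.05, z=z_exact)

print(n_table, n_exact)
```

With the table value the result is 3277. The unrounded critical value is slightly larger than 1.28, so it gives a slightly larger sample size; because $n$ grows with the square of $z$ , small rounding in the table is amplified here.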

One-sided confidence intervals

In many cases, we only want a guarantee on one side, either an upper bound or a lower bound. For example, we may want to say that the real mean is not too far above the sample mean, i.e.,

$P\{\mu <{\bar {X}}+\beta \}=1-\alpha .$

In this case, we want to compute the one-sided confidence interval using essentially the same approach as in the two-sided case.

EX1: Suppose that we know that the population has variance $\sigma ^{2}=5$ . We compute a mean from a sample of size 10. Find the value $\beta$ such that $(-\infty ,{\bar {X}}+\beta )$ is a confidence interval with 80% confidence level, i.e., the real mean $\mu$ is within this interval with 80% confidence.

Solution: Let $Z$ be a unit normal random variable. If we look at the standard normal table, we observe that

$P\{Z>-0.84\}=0.8$ .

From this, we can say that

$P\{{\sqrt {n}}({\bar {X}}-\mu )/\sigma >-0.84\}=P\{{\bar {X}}+0.84\sigma /{\sqrt {n}}>\mu \}=P\{\mu <{\bar {X}}+0.84\sigma /{\sqrt {n}}\}=0.8$ .

Thus, the interval that we want is $(-\infty ,{\bar {X}}+0.84\sigma /{\sqrt {n}})=(-\infty ,{\bar {X}}+0.594)$ .
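The one-sided bound can be computed in the same way as the two-sided one, except that all of $\alpha$ goes to one tail. The following is a minimal sketch, assuming Python 3.8 or later; `one_sided_upper_beta` is a name made up for this illustration.

```python
from math import sqrt
from statistics import NormalDist


def one_sided_upper_beta(sigma: float, n: int, confidence: float) -> float:
    """Return beta such that P{mu < Xbar + beta} = confidence (sigma known)."""
    # z with P{Z > -z} = confidence; by symmetry this is inv_cdf(confidence).
    z = NormalDist().inv_cdf(confidence)
    return z * sigma / sqrt(n)


beta = one_sided_upper_beta(sigma=sqrt(5), n=10, confidence=0.80)
print(f"beta = {beta:.3f}")
```

The computed value is about 0.595 rather than 0.594, because the code uses the unrounded critical value (about 0.8416) instead of the table value 0.84.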

Remarks

Be careful when interpreting the probability associated with a confidence interval. We can talk about the probability that the sample mean is close to the actual mean only before we take a sample. After we take the sample and compute the value of ${\bar {X}}$ , it no longer makes sense to talk about probability, because the resulting interval either contains the mean or does not contain it. Therefore, at that point, we can only say that the interval has, for example, a 90% confidence level.
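One way to see what the confidence level means is to repeat the sampling many times and count how often the computed interval contains the real mean. The simulation below is only a sketch under assumed values: the true mean 3.0, the seed, and the number of trials are arbitrary choices for illustration, and it requires Python 3.8 or later.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def coverage(mu: float, sigma: float, n: int, confidence: float,
             trials: int, seed: int = 0) -> float:
    """Fraction of simulated samples whose interval contains mu."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    beta = z * sigma / sqrt(n)          # error range, sigma known
    hits = 0
    for _ in range(trials):
        # Draw one sample of size n from Normal(mu, sigma^2).
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = fmean(sample)
        # Each individual interval either contains mu or it does not.
        if xbar - beta < mu < xbar + beta:
            hits += 1
    return hits / trials


frac = coverage(mu=3.0, sigma=sqrt(5), n=10, confidence=0.90, trials=20000)
print(f"fraction of intervals containing mu: {frac:.3f}")
```

Each single interval in the loop either contains $\mu$ or does not; the 90% describes the long-run fraction over repeated samples, which is why the printed fraction should be close to 0.9.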

