- This is part of probstat
Consider a certain distribution. The mean $\mu$ of the distribution is the expected value of a random variable $X$ sampled from the distribution, i.e.,
$$\mu = E[X].$$
Also recall that the variance of the distribution is
$$\sigma^2 = E[(X - \mu)^2].$$
And finally, the standard deviation is $\sigma = \sqrt{\sigma^2}$.
Sample Statistics
Suppose that you take $n$ samples $X_1, X_2, \ldots, X_n$ independently from this distribution. (Note that $X_1, X_2, \ldots, X_n$ are random variables.)
Sample means
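The statistic
$$\bar{X} = \frac{X_1 + X_2 + \cdots + X_n}{n}$$
is called a sample mean. Since $X_1, \ldots, X_n$ are random variables, the sample mean $\bar{X}$ is also a random variable.
We hope that $\bar{X}$ approximates $\mu$ well. We can compute:
$$E[\bar{X}] = \frac{E[X_1] + \cdots + E[X_n]}{n} = \frac{n\mu}{n} = \mu,$$
and since $X_1, \ldots, X_n$ are independent, we have that
$$\mathrm{Var}(\bar{X}) = \frac{\mathrm{Var}(X_1) + \cdots + \mathrm{Var}(X_n)}{n^2} = \frac{n\sigma^2}{n^2} = \frac{\sigma^2}{n}.$$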
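The two facts about the sample mean (expected value $\mu$, variance $\sigma^2/n$) can be checked numerically. The following is only a rough simulation sketch: the function name `simulate_sample_means`, the choice of a normal population, and the parameter values are illustrative assumptions, not part of the notes.

```python
import random
import statistics

def simulate_sample_means(mu, sigma, n, trials, seed=0):
    """Draw `trials` independent samples of size n and return their sample means."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    means = []
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        means.append(sum(sample) / n)
    return means

# Hypothetical population parameters, chosen only for illustration.
mu, sigma, n = 5.0, 2.0, 10
means = simulate_sample_means(mu, sigma, n, trials=20000)

empirical_mean = statistics.fmean(means)      # should be close to mu
empirical_var = statistics.pvariance(means)   # should be close to sigma^2 / n
theoretical_var = sigma ** 2 / n

print("mean of sample means:", empirical_mean, "vs mu =", mu)
print("variance of sample means:", empirical_var, "vs sigma^2/n =", theoretical_var)
```

The empirical values should land near $\mu$ and $\sigma^2/n$ respectively, with small fluctuations that shrink as the number of trials grows.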
Sample variances and sample standard deviations
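We can also use the sample to estimate $\sigma^2$.
The statistic
$$S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n-1}$$
is called a sample variance. The sample standard deviation is $S = \sqrt{S^2}$.
Note that the denominator is $n-1$ instead of $n$.
We can show that $E[S^2] = \sigma^2$.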
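We start by expanding the square inside the expectation:
$$E[S^2] = \frac{1}{n-1} \sum_{i=1}^{n} E[(X_i - \bar{X})^2] = \frac{1}{n-1} \sum_{i=1}^{n} \left( E[X_i^2] - 2E[X_i \bar{X}] + E[\bar{X}^2] \right).$$
The first term is $E[X_i^2] = \mathrm{Var}(X_i) + (E[X_i])^2 = \sigma^2 + \mu^2$.
We note that since $X_i$ and $X_j$ are independent for $i \neq j$, we have that
$$E[X_i X_j] = E[X_i] E[X_j] = \mu^2.$$
Let's deal with the middle term here.
$$E[X_i \bar{X}] = \frac{1}{n} \sum_{j=1}^{n} E[X_i X_j] = \frac{1}{n} \left( (\sigma^2 + \mu^2) + (n-1)\mu^2 \right) = \mu^2 + \frac{\sigma^2}{n}.$$
Let's work on the third term, which ends up being the same as the middle term.
$$E[\bar{X}^2] = \frac{1}{n^2} \sum_{i=1}^{n} \sum_{j=1}^{n} E[X_i X_j] = \frac{1}{n^2} \left( n(\sigma^2 + \mu^2) + n(n-1)\mu^2 \right) = \mu^2 + \frac{\sigma^2}{n}.$$
Let's put everything together:
$$E[S^2] = \frac{1}{n-1} \sum_{i=1}^{n} \left( (\sigma^2 + \mu^2) - 2\left(\mu^2 + \frac{\sigma^2}{n}\right) + \left(\mu^2 + \frac{\sigma^2}{n}\right) \right) = \frac{1}{n-1} \cdot n \cdot \frac{n-1}{n} \sigma^2 = \sigma^2.$$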
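To see why the denominator matters, one can compare the average of the estimator with denominator $n-1$ against the one with denominator $n$ over many simulated samples. This is a minimal sketch under assumed conditions: the helper `average_variance_estimates`, the normal population, and the parameter values are hypothetical choices made for illustration.

```python
import random

def average_variance_estimates(mu, sigma, n, trials, seed=0):
    """Return the average of the (n-1)-denominator and n-denominator estimators."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    total_unbiased = 0.0
    total_biased = 0.0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = sum(sample) / n
        sum_sq = sum((x - xbar) ** 2 for x in sample)
        total_unbiased += sum_sq / (n - 1)  # sample variance S^2
        total_biased += sum_sq / n          # same sum divided by n instead
    return total_unbiased / trials, total_biased / trials

# Hypothetical parameters: the true variance is sigma^2 = 4.
mu, sigma, n = 0.0, 2.0, 5
avg_unbiased, avg_biased = average_variance_estimates(mu, sigma, n, trials=40000)

print("average with denominator n-1:", avg_unbiased)
print("average with denominator n:  ", avg_biased)
print("sigma^2:", sigma ** 2, " (n-1)/n * sigma^2:", (n - 1) / n * sigma ** 2)
```

The first average should sit near $\sigma^2$, while the second should sit near $\frac{n-1}{n}\sigma^2$, matching the factor that appears in the derivation.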
Summary
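Sample means: $\bar{X} = \frac{X_1 + \cdots + X_n}{n}$, with $E[\bar{X}] = \mu$ and $\mathrm{Var}(\bar{X}) = \frac{\sigma^2}{n}$.
Sample variance: $S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n-1}$, with $E[S^2] = \sigma^2$.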
Properties of sample means and sample variances
Distribution of sample means
While we know basic properties of the sample mean $\bar{X}$, if we want to perform other statistical calculations (e.g., computing confidence intervals or testing hypotheses), it is very useful to know the exact distribution of $\bar{X}$.
For a general population, it will be hard to deal with the distribution of $\bar{X}$ exactly. However, if the population is normal, we are in very good shape.
Recall the definition of $\bar{X}$:
$$\bar{X} = \frac{X_1 + X_2 + \cdots + X_n}{n}.$$
Therefore, $\bar{X}$ is a sum of independent normally distributed random variables (each term $X_i/n$ is normal). A nice property of normal random variables is that the sum of independent normally distributed random variables remains a normal random variable. Since a normal random variable is uniquely determined by its mean and variance, we have the following observation.
If the population is normal with mean $\mu$ and variance $\sigma^2$, then the sample mean $\bar{X}$ is normal with mean $\mu$ and variance $\sigma^2/n$.
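As a rough numerical check of the claim that the sample mean of a normal population is itself normal with mean $\mu$ and variance $\sigma^2/n$, one can compare an empirical probability with the value given by that normal distribution. The function name `empirical_cdf_of_sample_mean`, the cut-off `t`, and the parameter values below are assumptions made only for this sketch.

```python
import random
from math import sqrt
from statistics import NormalDist

def empirical_cdf_of_sample_mean(mu, sigma, n, t, trials, seed=0):
    """Estimate P(sample mean <= t) by simulation for a normal population."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    hits = 0
    for _ in range(trials):
        xbar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if xbar <= t:
            hits += 1
    return hits / trials

# Hypothetical parameters, chosen only for illustration.
mu, sigma, n = 3.0, 2.0, 8
t = 3.5

empirical = empirical_cdf_of_sample_mean(mu, sigma, n, t, trials=20000)
# Probability under a normal distribution with mean mu and variance sigma^2 / n.
predicted = NormalDist(mu, sigma / sqrt(n)).cdf(t)

print("simulated P(sample mean <= t):", empirical)
print("normal model P(sample mean <= t):", predicted)
```

The two printed probabilities should agree up to simulation noise.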
Examples
Ex1. Suppose that the population has mean and variance . If you select a sample of size 20, what is the probability that the sample mean is greater than 17?
Solution:
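The sample mean $\bar{X}$ is normal with mean $\mu$ and variance $\sigma^2/20$. Therefore,
$$\frac{\bar{X} - \mu}{\sigma/\sqrt{20}}$$
is unit normal.
Note that
$$P(\bar{X} > 17) = P\left( \frac{\bar{X} - \mu}{\sigma/\sqrt{20}} > \frac{17 - \mu}{\sigma/\sqrt{20}} \right).$$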
We can look at the standard normal table and find out that , for a unit normal random variable Z. Thus, the probability
which is roughly 1%.
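The same kind of calculation can be done without a table. The sketch below standardizes the threshold and evaluates the upper tail of the unit normal. The function `prob_sample_mean_exceeds` is a hypothetical helper, and the numbers passed to it are placeholder values chosen for illustration. They are not the values from Ex1.

```python
from math import sqrt
from statistics import NormalDist

def prob_sample_mean_exceeds(mu, sigma2, n, threshold):
    """P(sample mean > threshold) for a normal population with mean mu, variance sigma2."""
    standard_error = sqrt(sigma2 / n)       # standard deviation of the sample mean
    z = (threshold - mu) / standard_error   # standardized threshold
    return 1.0 - NormalDist().cdf(z)        # upper tail of the unit normal

# Placeholder inputs (assumptions for illustration, not the data of Ex1).
p = prob_sample_mean_exceeds(mu=10.0, sigma2=25.0, n=20, threshold=12.0)
print("P(sample mean > threshold) =", p)
```

Substituting a population's actual mean and variance, the sample size, and the threshold gives the corresponding tail probability directly.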
Ex2.
- To be added...
Why do we use normal distributions?
Normal random variables appear very often in our treatment of statistics. This is not just a coincidence. See limit theorems.