Probstat/notes/balls and bins

''Jittat (talk | contribs), current revision as of 07:38, 18 September 2014''
: ''This is part of [[probstat]]. These notes are meant to be used in complement to the video lectures. They contain only a summary of the materials discussed in the videos. Please don't use them to avoid watching the clips.''

: ''The materials in this part are from [http://www.cs.berkeley.edu/~satishr/cs174/ this course] at Berkeley.''
We consider a balls-and-bins experiment where we throw ''n'' balls independently into ''n'' bins uniformly at random.
{{กล่องเทา|'''Question 1:''' ''How many possible outcomes are there?''
Since each ball has ''n'' choices and the choices are independent, there are <math>n^n</math> outcomes.
}}
{{กล่องเทา|'''Question 2:''' ''What is the probability that bin 1 is empty?''
In this case, each ball has only ''n'' - 1 choices (because it has to avoid bin 1); therefore there are <math>(n - 1)^n</math> outcomes where bin 1 is empty. Since each outcome is equally likely, the probability that bin 1 is empty is <math>\frac{(n-1)^n}{n^n} = \left(\frac{n-1}{n}\right)^n = \left(1-\frac{1}{n}\right)^n</math>.
}}
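As a sanity check, the closed form <math>\left(1-\frac{1}{n}\right)^n</math> can be compared against a quick Monte Carlo simulation. The sketch below is illustrative (the function name and trial count are our choices, not part of the notes):

```python
import random

def prob_bin1_empty(n, trials=100_000):
    """Estimate, by simulation, the probability that bin 1 ends up empty
    after n balls are thrown uniformly at random into n bins."""
    empty = 0
    for _ in range(trials):
        # Bin 1 is represented by index 0; it stays empty iff no ball picks 0.
        if all(random.randrange(n) != 0 for _ in range(n)):
            empty += 1
    return empty / trials

n = 10
print((1 - 1 / n) ** n)    # exact probability, about 0.3487
print(prob_bin1_empty(n))  # simulated estimate, close to the exact value
```

With 100,000 trials the estimate typically agrees with the formula to two decimal places.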
== Number of empty bins ==
Let random variable ''X'' be the number of empty bins.
{{กล่องเทา|'''Question 1:''' ''What is <math>P\{X = 0\}</math>?''
If there is no empty bin, every ball must land in a different bin. In this case, the first ball has ''n'' choices, the second ball has ''n'' - 1 choices, and in general the ''i''-th ball has ''n'' - ''i'' + 1 choices. Therefore, there are <math>n\cdot (n-1)\cdot(n-2)\cdots(2)(1) = n!</math> outcomes. The probability is <math>\frac{n!}{n^n}</math>.
}}
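For small ''n'', the value <math>\frac{n!}{n^n}</math> can be verified exactly by enumerating all <math>n^n</math> outcomes. A brute-force sketch (the helper name is ours):

```python
from itertools import product
from math import factorial

def p_no_empty_bin(n):
    """Exact P{X = 0}: enumerate all n^n throws and count those
    in which every ball lands in a different bin."""
    favorable = sum(1 for outcome in product(range(n), repeat=n)
                    if len(set(outcome)) == n)
    return favorable / n ** n

n = 5
print(p_no_empty_bin(n))      # 0.0384
print(factorial(n) / n ** n)  # 0.0384, matching the formula n!/n^n
```

The enumeration has <math>n^n</math> terms, so this check is only feasible for very small ''n''.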
It will be very hard to compute E[''X''] directly. (Just computing ''P''{ ''X'' = 3 } seems to be exceedingly hard.)
+ | |||
+ | === The Linearity of Expectation === | ||
+ | One of the reason why expectations are rather easy to work with is that they have a certain linear property stated as follows. | ||
+ | |||
+ | {{กล่องฟ้า|'''Linearity of Expectation.''' | ||
+ | |||
+ | For random variables ''X'' and ''Y'', we have that | ||
+ | |||
+ | <center><math>\mathrm{E}[X+Y] = \mathrm{E}[X] + \mathrm{E}[Y].</math></center> | ||
+ | |||
+ | This is also true for more than one random variable. Let <math>X_1,X_2,\ldots,X_k</math> be random variables. Thus, | ||
+ | |||
+ | <center><math>\mathrm{E}\left[\sum_{i=1}^k X_i \right] = \sum_{i=1}^k \mathrm{E}[X_i].</math></center> | ||
+ | }} | ||
+ | |||
+ | This property is always true even when they are dependent. | ||
+ | |||
+ | This also leads to various nice properties of expectation. A very useful fact is this: for constants ''a'' and ''c'', <math>E[aX + c] = a\cdot E[X] + c</math>. | ||
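Both facts are easy to check empirically. The sketch below (names and sample sizes are illustrative) estimates E[''X''], E[''Y''], and E[''X'' + ''Y''] from the same samples, where ''X'' is a die roll and ''Y'' = ''X''² is completely dependent on ''X'':

```python
import random

random.seed(1)

# X is a fair die roll; Y = X * X is completely dependent on X.
xs = [random.randint(1, 6) for _ in range(200_000)]
ys = [x * x for x in xs]

def mean(vals):
    return sum(vals) / len(vals)

print(mean(xs))                               # close to E[X] = 3.5
print(mean(ys))                               # close to E[Y] = 91/6
print(mean([x + y for x, y in zip(xs, ys)]))  # matches mean(xs) + mean(ys)
```

Despite the dependence between ''X'' and ''Y'', the sample average of ''X'' + ''Y'' matches the sum of the individual averages, as linearity promises.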
+ | |||
+ | === Using the linearity of expectation === | ||
+ | |||
+ | In order to deal with complicated random variables such as ''X'' above, we can break it down into "simpler" random variables. | ||
+ | |||
+ | For bin ''i'', we define a random variable ''X<sub>i</sub>'' to be 1 if bin ''i'' is empty and 0 otherwise. Note that with this definition, we have that | ||
+ | |||
+ | <center><math>X = \sum_{i=1}^n X_i</math>.</center> | ||
+ | |||
+ | (You should think for a few minutes so that you understand this.) | ||
+ | |||
+ | This implies that <math>\mathrm{E}[X] = \mathrm{E}\left[\sum_{i=1}^n X_i\right]</math>, because the expectations of two equal things must be equal. | ||
+ | |||
+ | Note that we use each random variable ''X<sub>i</sub>'' to indicate if each bin is empty. This type of random variables is generally referred to as '''indicator random variables'''. | ||
+ | |||
+ | Since we know that ''X'' is the sum of other random variables, the linearity of expectation tells us that | ||
+ | |||
+ | <center><math>\mathrm{E}\left[\sum_{i=1}^n X_i\right] = \sum_{i=1}^n \mathrm{E}[X_i]</math>.</center> | ||
+ | |||
+ | Therefore, if we can find <math>\mathrm{E}[X_i]</math> then we are done. So let's consider indicator random variable ''X<sub>i</sub>''. From the definition, we have that | ||
+ | |||
+ | <center><math>\mathrm{E}[X_i] = 0\cdot P\{X_i = 0\} + 1\cdot P\{X_i = 1\}</math>,</center> | ||
+ | |||
+ | which clearly is <math>P\{X_i = 1 \}.</math> | ||
+ | |||
+ | So, when does ''X<sub>i</sub>'' is 1? This, by the definition of ''X<sub>i</sub>'', is equivalent to the event that bin ''i'' is empty that we have already calculated in Question 2. Thus, | ||
+ | |||
+ | <center><math>\mathrm{E}[X_i] = \left(1 - \frac{1}{n}\right)^n</math>.</center> | ||
+ | |||
+ | This implies that | ||
+ | <center><math> | ||
+ | \begin{array}{rcl} | ||
+ | \mathrm{E}[X] &=& \mathrm{E}\left[\sum_{i=1}^n X_i\right] \\ | ||
+ | &=& \sum_{i=1}^n \mathrm{E}[X_i]\\ | ||
+ | &=& \sum_{i=1}^n \left(1 - \frac{1}{n}\right)^n\\ | ||
+ | &=& n\cdot \left(1 - \frac{1}{n}\right)^n. | ||
+ | \end{array} | ||
+ | </math> | ||
+ | </center> | ||
+ | |||
+ | Again, the first step is true because <math>X=\sum_{i=1}^n X_i</math> and the second step is true because of the linearity of expectation. | ||
+ | |||
+ | Note that as ''n'' gets close to infinity, <math>\left(1 - 1/n\right)^n</math> is close to <math>1/e</math>. Thus, <math>\mathrm{E}[X]\approx n/e</math>, or there are about 36.7% empty bins. | ||
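The whole calculation can be checked by simulation. The sketch below (the function name and trial count are illustrative) estimates the average number of empty bins and compares it with <math>n\left(1 - 1/n\right)^n</math> and <math>n/e</math>:

```python
import random
from math import e

def avg_empty_bins(n, trials=20_000):
    """Estimate E[X], the expected number of empty bins, by simulating
    n balls thrown uniformly at random into n bins."""
    total_empty = 0
    for _ in range(trials):
        occupied = {random.randrange(n) for _ in range(n)}  # bins hit by some ball
        total_empty += n - len(occupied)
    return total_empty / trials

n = 100
print(avg_empty_bins(n))     # simulated estimate of E[X]
print(n * (1 - 1 / n) ** n)  # exact value, about 36.60
print(n / e)                 # limiting approximation, about 36.79
```

Already at ''n'' = 100 the exact expectation and the <math>n/e</math> approximation differ by less than 0.2 bins.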
+ | |||
+ | {{กล่องเทา|'''Remark:''' For an indicator random variable ''A'', <math>\mathrm{E}[A] = P\{A = 1\}</math>. }} | ||
== Fullest bins ==
In this section, we will try to give a guarantee on the number of balls in the fullest bin.

=== Alternative analysis ===