mean of geometric distribution: *intuitive* reason
Suppose you roll a fair six-sided die repeatedly, and you count the number of rolls it takes to see your favorite side (say your favorite number is 5) at least once.
The number of required rolls follows the well-known geometric distribution, using the first of the two slightly different definitions given there.
The mean or expected value of the number of rolls, in this case, is 6. In general, it's 1/p, where p is the probability of "success" on each roll.
Is there an intuitive reason that it's 6? I mean, I believe that it is, and I can rearrange infinite sums and everything.
And it's intuitive that it's in the ballpark of 6, and if you increased the number of sides of the die, the expected number of rolls would increase, and I suppose it's believable that it would increase about linearly.
But is there a clear, simple, obvious "trick" to just "see" that the expected number of rolls is 6 in this case?
Re: mean of geometric distribution: *intuitive* reason
By linearity of expectation, since the expected number of times a given face appears in one roll of an n-sided die is 1/n, the expected number of times it appears in n rolls is 1. Thus n is the number of rolls needed to have an expectation of a given face appearing once.
Any fewer rolls and you expect the face to appear less than once, any more rolls and you expect it to appear more than once. So the expected number of rolls to see the face once, is n. (Yes, I know this is not precise and I glossed over a lot. You asked for an intuitive reason.)
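For anyone who wants to poke at the two quantities in this argument, here is a rough simulation sketch in Python. It is not a proof, and the function names, trial count and seed are my own choices for illustration. It estimates both the average number of rolls until a chosen face first appears and the average number of times that face appears in exactly n rolls.

```python
import random

def rolls_until_face(n_sides, rng):
    """Roll an n_sides-sided die until the chosen face (coded as 0) shows."""
    count = 1
    while rng.randrange(n_sides) != 0:
        count += 1
    return count

def hits_in_n_rolls(n_sides, rng):
    """Count how often the chosen face shows in exactly n_sides rolls."""
    return sum(rng.randrange(n_sides) == 0 for _ in range(n_sides))

def estimate(n_sides=6, trials=50_000, seed=1):
    """Return (average waiting time, average hits in n_sides rolls)."""
    rng = random.Random(seed)
    mean_wait = sum(rolls_until_face(n_sides, rng) for _ in range(trials)) / trials
    mean_hits = sum(hits_in_n_rolls(n_sides, rng) for _ in range(trials)) / trials
    return mean_wait, mean_hits

if __name__ == "__main__":
    # Expect values near n_sides and near 1, up to sampling noise.
    print(estimate())
```

The first estimate should hover around n and the second around 1, which are the two statements the intuitive argument ties together.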
Re: mean of geometric distribution: *intuitive* reason
That's good, I like that.
The little intuitive leap I wasn't making last night: One can ask about the expected number of successes on a single roll, even though it's not an integer. And that expected number of successes is 1/n.
As far as "intuitive" explanations go, maybe yours is pretty much the "right" one. (And as I said, obviously it's possible to prove it rigorously via algebra by just manipulating sums in the right way.)
However, here's a minor problem I'm still having with making it "intuitive".
Intuitively, when we average over all "universes", the expected or average number of successes in the first n rolls is 1.
But is it intuitive that "average number of successes in first n rolls is 1" is the same as "average of number of rolls needed to see the first success is n"?
I guess there's a subtlety there.
Now that I read your second paragraph again, though, I think your explanation is reasonably intuitive. For illustration, let's say it's a ten-sided die. Suppose you have your mind made up that you're only going to roll the die seven times. If you repeat this experiment again and again, the expected number of successes is 7/10, which is less than 1. So "on average", you haven't had your first success yet. And if you replace 7 with a fixed number larger than 10 (let's say 12), then "on average", you expect to have "already" had your first success somewhere before the end of the 12 rolls.
In any case, thank you. Your intuitive explanation is pretty good, and there's always a certain amount of subjectivity in what counts as "enough" of an intuitive explanation.

Re: mean of geometric distribution: *intuitive* reason
This might be less intuitive, depending on how your intuition works. But you can also observe that the average needed is the same as 1 plus 5/6 of the average needed (you always need at least 1 roll, then 5/6 of the time you're back where you started). So 1/6th of the average is 1, so the average is 6.
This also has a visualization in terms of the corresponding geometric distribution. If you shift the whole distribution left by 1, and delete the part of the distribution that's now on 0 (because its mass doesn't contribute anything to the sum-product for the average), that's the same as scaling the distribution down uniformly to 5/6 of its original height, so the sum-product drops by a sixth. The shift itself lowers the sum-product by exactly 1 (every unit of mass moves one step left), so a sixth of the average is 1, and the whole thing averages 6.
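Written out, the "1 plus 5/6 of the average" argument is a one-line equation. This is just a sketch of the same reasoning in symbols, where E is the expected number of rolls, X is the number of rolls needed, and p is the per-roll success probability (p = 1/6 for the die):

```latex
\begin{align*}
E &= 1 + (1-p)\,E
  && \text{one roll, then with probability } 1-p \text{ you start over} \\
pE &= 1 \\
E &= \frac{1}{p} = 6 \quad \text{for } p = \tfrac{1}{6}.
\end{align*}
% The shifting picture is the same identity:
% shifting left by 1 turns the sum-product E into E - 1, and because
% P(X = k) = (1-p)\,P(X = k-1), the shifted distribution is (1-p) times
% the original, so its sum-product is (1-p)E.
\[
\sum_{k \ge 2} (k-1)\,P(X = k) \;=\; E - 1 \;=\; (1-p)\,E .
\]
```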
 dudiobugtron
Re: mean of geometric distribution: *intuitive* reason
I like lightvector's approach, that is a pretty good way to understand it.

I think that it is actually a pretty unintuitive result, though. I think this is because, in practice, the average as you see it will always end up seeming slightly less than 6. The average is only exactly '6' if you factor in large, ridiculous results. You can't actually roll a die arbitrarily many times in real life.
The average position that a 5 first appears, out of all finite sets of (no more than N) rolls where it does appear at some point, will of course be less than 6.
If you roll a die 100 times and don't get a 5, you're not going to keep rolling it, you're going to think the die is loaded.

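To put rough numbers on the "less than 6 in practice" point, here is a small sketch that computes the average position of the first 5, counting only experiments where a 5 shows up within a fixed cap of rolls. The function name and the particular caps are my own choices, and exact fractions are used only to sidestep rounding.

```python
from fractions import Fraction

def truncated_mean(p, max_rolls):
    """Mean position of the first success, given that it occurs within max_rolls."""
    q = 1 - p
    # Sum of k * P(first success on roll k) for k = 1..max_rolls.
    weighted = sum(k * q ** (k - 1) * p for k in range(1, max_rolls + 1))
    # Probability that a success occurs at all within max_rolls.
    prob_seen = 1 - q ** max_rolls
    return weighted / prob_seen

if __name__ == "__main__":
    p = Fraction(1, 6)
    for cap in (6, 12, 24, 100):
        print(cap, float(truncated_mean(p, cap)))
```

Each capped average is strictly below 6 and creeps up toward 6 as the cap grows, since the cap throws away exactly the long runs that pull the mean up.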
Re: mean of geometric distribution: *intuitive* reason
I like lightvector's approach too. Thanks, everybody.
Somewhat in the spirit of dudiobugtron's remarks: note that the median of the geometric distribution, as opposed to the mean, is less intuitive.
The formula for the median appears on the Wikipedia page I linked in my first post. It's the ceiling of -log(2)/log(1-p).
That median happens to be 4 for a 6-sided die, 7 for a 10-sided die, 14 for a 20-sided die, and 69 for a 100-sided die.
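The median values are easy to check. Below is a minimal sketch with two hypothetical helper functions of my own naming: one uses the closed-form ceiling formula, the other searches directly for the smallest k at which the chance of having seen a success reaches one half.

```python
import math
from fractions import Fraction

def median_by_formula(p):
    """Closed form: ceiling of -log(2) / log(1 - p), for 0 < p < 1."""
    return math.ceil(-math.log(2) / math.log(1 - p))

def median_by_search(p):
    """Smallest k with P(no success in k rolls) <= 1/2, using exact fractions."""
    p = Fraction(p)
    k = 1
    no_success = 1 - p
    while no_success > Fraction(1, 2):
        k += 1
        no_success *= 1 - p
    return k

if __name__ == "__main__":
    for sides in (6, 10, 20, 100):
        print(sides, median_by_formula(1 / sides), median_by_search(Fraction(1, sides)))
```

For these four dice the two approaches agree. The floating-point formula could in principle be off by one when -log(2)/log(1-p) lands exactly on an integer, which is why the exact search is included as a cross-check.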
Re: mean of geometric distribution: *intuitive* reason
I usually think of it as: if you rolled a die a large number of times, about 1/6 of the rolls would be 6s, so the average gap between them must be 6.
Re: mean of geometric distribution: *intuitive* reason
mikel wrote:I usually think of it as: if you rolled a die a large number of times, about 1/6 of the rolls would be 6s, so the average gap between them must be 6.
That reminds me of a seemingly paradoxical scenario I read once.
You go into an American casino, and go up to a roulette wheel. The probability of the ball landing on a green number (0 or 00) is 2 in 38, or 1/19. Therefore you would expect the next occurrence of a 0/00 result to be about 19 throws away on average.
You can however also make the same argument with time reversed, so if you ask the croupier how long ago the previous 0/00 throw was, you would expect that on average the answer would be about 19 throws before you joined the table. This means that you expect the previous and the next 0/00 throws to be 38 throws apart.
How does this fit in with the fact that they should normally be about 19 apart?
Re: mean of geometric distribution: *intuitive* reason
jaap wrote:That reminds me of a seemingly paradoxical scenario I read once.
So, it is in fact a contradiction?
jaap wrote:You can however also make the same argument with time reversed, so if you ask the croupier how long ago the previous 0/00 throw was, you would expect that on average the answer would be about 19 throws before you joined the table.
Of course therein lies the problem. The expected value for having at least one success in n outcomes increases much faster than the number of successes in n outcomes.
Re: mean of geometric distribution: *intuitive* reason
Flumble wrote:jaap wrote:That reminds me of a seemingly paradoxical scenario I read once.
So, it is in fact a contradiction?
jaap wrote:You can however also make the same argument with time reversed, so if you ask the croupier how long ago the previous 0/00 throw was, you would expect that on average the answer would be about 19 throws before you joined the table.
Of course therein lies the problem. The expected value for having at least one success in n outcomes increases much faster than the number of successes in n outcomes.
I don't understand what you mean. Where did I mention 'at least one'?
No, when you join the table it really is the case that the expected number of throws between the previous and the next occurrence of 0/00 is 37. How can that be?
Re: mean of geometric distribution: *intuitive* reason
Odds of joining table in between consecutive zeros: 1/19²
Odds of joining with one other number between zeros: 2∙18/19³
Odds of joining in a run of exactly two nonzeros: 3∙18²/19⁴
In general, odds of joining in a run of k nonzeros: (k+1)(18/19)^k/19²
The first factor is how many different positions within the run you could have joined.
You do the math.
Re: mean of geometric distribution: *intuitive* reason
Qaanol wrote:Odds of joining table in between consecutive zeros: 1/19²
Odds of joining with one other number between zeros: 2∙18/19³
Odds of joining in a run of exactly two nonzeros: 3∙18²/19⁴
In general, odds of joining in a run of k nonzeros: (k+1)(18/19)^k/19²
The first factor is how many different positions within the run you could have joined.
You do the math.
Exactly. You're more likely to join the table during a long run between zeroes than a short one, and that weighting makes the expected length of the run you join nearly double the ordinary average run length (37 rather than 19).
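"Doing the math" can also be done numerically. The sketch below sums Qaanol's weights (k+1)(18/19)^k/19² and the corresponding gap between zeros, k+1, over a long but finite range of k. The function names and the cutoff are my own assumptions, and the cutoff is large enough that the neglected tail is negligible.

```python
def run_weights(p, max_k=5000):
    """Yield (k, probability of joining inside a run of k nonzeros)."""
    q = 1.0 - p
    for k in range(max_k + 1):
        # (k + 1) possible joining positions, k nonzeros, a zero at each end.
        yield k, (k + 1) * q ** k * p ** 2

def gap_when_joining(p, max_k=5000):
    """Return (total probability, expected gap between the surrounding zeros)."""
    total_prob = 0.0
    expected_gap = 0.0
    for k, prob in run_weights(p, max_k):
        total_prob += prob
        expected_gap += (k + 1) * prob  # zeros are k + 1 throws apart
    return total_prob, expected_gap

if __name__ == "__main__":
    p = 1 / 19
    total_prob, gap = gap_when_joining(p)
    # Compare with the closed form (2 - p) / p and the ordinary mean 1 / p.
    print(total_prob, gap, (2 - p) / p, 1 / p)
```

The weights sum to 1, and the length-biased expected gap comes out at (2 - p)/p, which is 37 for p = 1/19, against an ordinary mean gap of 1/p = 19.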
Re: mean of geometric distribution: *intuitive* reason
jaap wrote:Exactly. You're more likely to join the table during a long run between zeroes than a short one, and that weighting makes the expected length of the run you join nearly double the ordinary average run length (37 rather than 19).
It's sort of related to the following.
You don't find the average family size by asking people how many siblings they have. (People from large families are overrepresented.)
You don't find the average number of customers in your local fast food place by taking note of the number of customers when you're there. You're a customer, and you're more likely to be there at the times that tend to have more customers.
Re: mean of geometric distribution: *intuitive* reason
And your friends have more friends than average (they have you)
Re: mean of geometric distribution: *intuitive* reason
jaap wrote:No, when you join the table it really is the case that the expected number of throws between the previous and the next occurrence of 0/00 is 37. How can that be?
Damn, I really should pay attention during statistics lectures next time. Not only did my intuition lead me astray, but I also couldn't figure out why the expected number of throws between zeros would be 37.
Luckily Qaanol gave an excellent explanation. (And I ran a few simulations and plots for visualisation.)