## Dice probability question (some number appears at most once)

For the discussion of math. Duh.

Moderators: gmalivuk, Moderators General, Prelates

skullturf
Posts: 556
Joined: Thu Dec 07, 2006 8:37 pm UTC
Location: Chicago
Contact:

### Dice probability question (some number appears at most once)

If a fair die, having k sides, is rolled n times, what's the probability that there's at least one side of the die that comes up at most once?

As a starting point, consider k = 6.

For instance,
Spoiler:
say you roll 12 times. If you get 314156226354, then there's no side that came up at most once, but if you get 314156226355, then the side labeled 4 came up at most once.

Note that in the case k = 6, then for n < 12, the probability is 1. (If you make fewer than 12 rolls, then the average number of times a side comes up is less than 2, and at least one side of the die must make an at-most-average number of appearances.)

With a somewhat lengthy argument involving inclusion-exclusion, I think I have a formula for the answer, but it's a rather complicated sum and isn't especially enlightening to look at. Maybe I've overlooked something, or maybe the exact answer is intrinsically complicated.

So I'm wondering:

(1) have I missed a clever shortcut?

(2) even if the exact answer is complicated, is there a clever way to get a reasonable approximation or reasonable bounds? (Edit: One way to get some kind of bounds is to use that in inclusion-exclusion, the partial sums are alternately overestimates and underestimates.)

(3) does this overlap with anything well-known or well-studied (e.g. an exercise in a classic text)? I'll also investigate this the next time I'm on campus, but I thought it would be worth asking in case any of you recognize the problem offhand.

Thanks!

By the way,
Spoiler:
what inspired this was an offhand comment elsewhere on the internet where people were discussing the birthday problem. One person claimed that in their school of 3500 students, they were the only one with their birthday, and asked "What are the odds of that?" I interpreted "that" to mean: What's the probability that among 3500 students, there's at least one birthday that appears at most once?

Tirian
Posts: 1891
Joined: Fri Feb 15, 2008 6:03 pm UTC

### Re: Dice probability question (some number appears at most o

Without knowing what your argument is, how can we say whether you missed a shortcut? I'll go ahead and guess that you didn't -- I don't think IEP can be skipped and that makes solutions unpleasant to look at.

If I were looking for a quick estimate, I would ignore the possibility that there could be more than one such day. Certainly, with a pool of 3500 students, the probability of one day having at most one match is really really small as it is, and the probability of two would be bounded above by its square if my mind is working properly. I have to think your calculation gets more error from the assumption that birthdays are evenly distributed throughout the year.

Charlie!
Posts: 2035
Joined: Sat Jan 12, 2008 8:20 pm UTC

### Re: Dice probability question (some number appears at most o

This is 1 - P(each side shows up at least twice).

Which is actually pretty easy, I think.
Spoiler:
You just do counting: the total number of possibilities is k^n, while the number of ways to have each side at least twice is, I believe, k^(n-2*k) * n!/((n-2*k)!*(2^k)). 2*k rolls are set aside to be doubles, and then the first term in the formula is just how many ways you can arrange the other rolls. The second term is then how many ways you can insert the k interchangeable pairs into those other rolls.

The end answer is then 1 - k^(-2*k)* n!/((n-2*k)!*(2^k))
Some people tell me I laugh too much. To them I say, "ha ha ha!"

skullturf
Posts: 556
Joined: Thu Dec 07, 2006 8:37 pm UTC
Location: Chicago
Contact:

### Re: Dice probability question (some number appears at most o

Charlie! wrote:This is 1 - P(each side shows up at least twice).

Which is actually pretty easy, I think.
Spoiler:
You just do counting: the total number of possibilities is k^n, while the number of ways to have each side at least twice is, I believe, k^(n-2*k) * n!/((n-2*k)!*(2^k)). 2*k rolls are set aside to be doubles, and then the first term in the formula is just how many ways you can arrange the other rolls. The second term is then how many ways you can insert the k interchangeable pairs into those other rolls.

The end answer is then 1 - k^(-2*k)* n!/((n-2*k)!*(2^k))

I'm pretty sure this results in overcounting. Let's consider k=6 and n=13.

Your formula for the number of ways to have each side at least twice gives 583783200 in that case.

With 13 rolls, having each side at least twice means that some number appears 3 times and the remaining 5 numbers each appear 2 times.

There are 6 ways to choose the number that appears 3 times, and then 13!/(3!2!2!2!2!2!) ways to arrange the 13 rolls. (For example, there are 13!/(3!2!2!2!2!2!) ways to arrange the rolls 1122334455566.)

This gives 6*13!/(3!2!2!2!2!2!) ways to have each side at least twice, which works out to 194594400, which is exactly one third of your answer.

I think in your method, the rolls that are "set aside to be doubles" are effectively being treated as special in some way, or something like that.

Yakk
Poster with most posts but no title.
Posts: 11129
Joined: Sat Jan 27, 2007 7:27 pm UTC
Location: E pur si muove

### Re: Dice probability question (some number appears at most o

Why don't you post your argument? This looks like homework, and helping someone with homework is much easier if you display what you have already done.
One of the painful things about our time is that those who feel certainty are stupid, and those with any imagination and understanding are filled with doubt and indecision - BR

Last edited by JHVH on Fri Oct 23, 4004 BCE 6:17 pm, edited 6 times in total.

skullturf
Posts: 556
Joined: Thu Dec 07, 2006 8:37 pm UTC
Location: Chicago
Contact:

### Re: Dice probability question (some number appears at most o

It's not homework. I'm not a student.

My formula for the probability that at least one number comes up at most once is:

Code: Select all

`sum((-1)^(m-1)*binomial(k,m)*sum(binomial(m,j)*(k-m)^(n-j)*product(n-i,i=0..j-1),j=0..m),m=1..k)/k^n`

I'm using Maple syntax. As in the original post, n is the number of rolls and k is the number of sides the die has.

My formula seems to give the right answer for k=6 and n up to 13.

Here's how I obtained that formula:

Spoiler:
We can think of the n rolls of the die as generating a "word" of length n over the "alphabet" {1,...,k}. Denote the set of all such words by [k]^n. Then [k]^n consists of precisely k^n words.

For i = 1,...,k, we'll say i is "rare" in a word w if i appears at most once in w. We'll say w is "good" if there's at least one i such that i is rare in w.

For i = 1,...,k, define W_i to be the number of words w in [k]^n such that i is rare in w.

The number of "good" words in [k]^n is

| W_1 \cup W_2 \cup \cdots \cup W_k |

which, by inclusion-exclusion, is equal to

|W_1| + |W_2| + \cdots + |W_k|
- ( |W_1 \cap W_2| + |W_1 \cap W_3| + \cdots + |W_{k-1} \cap W_k| )
+ ( |W_1 \cap W_2 \cap W_3| + |W_1 \cap W_2 \cap W_4| + \cdots + |W_{k-2} \cap W_{k-1} \cap W_k| )
- ... (finite sum, of course).

Now what is |W_i|? The set W_i is the disjoint union of:
--the set of words such that i appears exactly 0 times
--the set of words such that i appears exactly 1 time.
The former contains (k-1)^n words, and the latter contains n*(k-1)^{n-1} words
(first choose the position containing the 1 appearance of i, then fill the other positions with things that aren't i).

What is |W_i \cap W_j|? The set W_i \cap W_j is the disjoint union of:
--the set of words such that i appears 0 times and j appears 0 times
--the set of words such that i appears 0 times and j appears 1 time
--the set of words such that i appears 1 time and j appears 0 times
--the set of words such that i appears 1 time and j appears 1 time.
These four sets respectively contain (k-2)^n words, n*(k-2)^{n-1} words,
n*(k-2)^{n-1} words, and n*(n-1)*{k-2}^{n-2} words.

From the previous two paragraphs, we have (switching to Maple syntax on the right side)
|W_i| = binomial(1,0)*(k-1)^n + binomial(1,1)*(k-1)^(n-1)
|W_i \cap W_j| = binomial(2,0)*(k-2)^n + binomial(2,1)*n*(k-2)^(n-1) + binomial(2,2)*n*(n-1)*(k-2)^(n-2)
This pattern continues.

So the number of good words is

binomial(k,1)*( binomial(1,0)*(k-1)^n + binomial(1,1)*n*(k-1)^(n-1) )
- binomial(k,2)*( binomial(2,0)*(k-2)^n + binomial(2,1)*n*(k-2)^(n-1) + binomial(2,2)*n*(n-1)*(k-2)^(n-2) )
+ ...

or more compactly

binomial(k,1)*sum( binomial(1,j)*(k-1)^(n-j)*product(n-i,i=0..j-1), j = 0..1 )
- binomial(k,2)*sum( binomial(2,j)*(k-2)^(n-j)*product(n-i,i=0..j-1), j = 0..2 )
+ binomial(k,3)*sum( binomial(3,j)*(k-3)^(n-j)*product(n-i,i=0..j-1), j = 0..3 )
- ...

which is where I got my formula from.

From the time I first posted, I was fairly sure my formula was correct.

Furthermore, it might be essentially the "right" way to do the problem. As mentioned earlier in the thread, the fact that we use inclusion-exclusion often tends to mean the counting "really is" tricky -- loosely, it's hard to "get around" inclusion-exclusion.

But the fact that it's a complicated sum, where the outermost sum has k terms, is sort of annoying. Even if it turns out there's no shortcut to an exact answer, I'm interested in the question of how to get reasonable approximations reasonably quickly.

Yakk
Poster with most posts but no title.
Posts: 11129
Joined: Sat Jan 27, 2007 7:27 pm UTC
Location: E pur si muove

### Re: Dice probability question (some number appears at most o

Here is an approach. No clue if it is better, but it is at least slightly different.

First figure out the chance that at one side comes up zero times, two sides come up zero times, etc. (that is pretty easy)

Now add in a random die to a random spot.

This additional die will either match one of the "zero times", or not. This is also pretty easy to calculate.

If it matches, it upgrades it to "one time". In that case (here comes the beauty) you have counted it in one of the other "zero time" plus a die cases, unless you are completely out of "zero times" sides.

That last case ends up being trickier. There is some double counting, where there was a singleton before you added your random die, and now you have two singletons. That case is counted in an "upgrade" from both null sides (and similarly for three, four and five "singletons").

The first attack I can think of ends up being an inclusion-exclusion around having singletons recursed on n. Pretty ugly. :/
One of the painful things about our time is that those who feel certainty are stupid, and those with any imagination and understanding are filled with doubt and indecision - BR

Last edited by JHVH on Fri Oct 23, 4004 BCE 6:17 pm, edited 6 times in total.