## Osteosarcoma cluster case in West Salem

polymer
Posts: 125
Joined: Mon Feb 04, 2008 7:14 am UTC

### Osteosarcoma cluster case in West Salem

I have taken a probability sequence, but I have not taken a statistics sequence. Over the past 5 years there have been 5 cases of osteosarcoma in my home town of West Salem: http://www.kgw.com/news/neighborhood-news/salem/EPA-to-look-into-Salem-cancer-cases-182107571.html.

My family asked me if they should be worried about external causes. Generally, a community's response to cluster cases can be rash, and so I'm inclined to downplay the problem. I really didn't know though, so I considered this simple model:

Approximately 400 people under the age of 20 get osteosarcoma each year, which gives 2000 cases over 5 years. There are roughly 80 million individuals under the age of 20 in the United States. Approximately 1700 students attend West Salem High School each year, so over 5 years approximately 4000 unique students go through West Salem High School.

I'm curious what the probability is for 5 or more cases of osteosarcoma to crop up at some school. I'll assume all schools can be binned into groups of 4000 students, and that every person under the age of 20 is a student.

The probability of a student getting osteosarcoma is $p = P(O) = 2000 / (8 \times 10^{7}) = 0.25 \times 10^{-4}$.

The probability of $n$ kids getting osteosarcoma at a school is $f(n) = \binom{4000}{n} p^{n} (1-p)^{4000-n}$.

The probability of 4 or fewer kids getting osteosarcoma at a school is f(0) + f(1) + f(2) + f(3) + f(4).

The probability of 4 or fewer kids getting osteosarcoma at every school, with $8 \times 10^{7} / 4000 = 2 \times 10^{4}$ schools in total, is $K = \left(f(0) + f(1) + f(2) + f(3) + f(4)\right)^{2 \times 10^{4}}$.

Finally the probability of some school having 5 or more kids with osteosarcoma is 1 - K.

Putting this into a TI calculator, I calculate 1 - K to be 0.002.
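For anyone who wants to check the arithmetic without a calculator, here is a minimal Python sketch of the same simplified model. It assumes exactly what the steps above assume (every school is a bin of 4000 students, and 20,000 such bins cover the 80 million under-20s); the variable names are just illustrative.

```python
from math import comb

cases = 2000            # osteosarcoma cases under age 20 over 5 years
population = 8e7        # people under age 20 in the United States
school_size = 4000      # unique students passing through one school in 5 years
n_schools = int(population / school_size)   # 20,000 equal-sized bins

p = cases / population  # per-student probability, 2.5e-5


def f(n):
    """Binomial probability of exactly n cases in one school."""
    return comb(school_size, n) * p**n * (1 - p) ** (school_size - n)


at_most_4 = sum(f(n) for n in range(5))   # one school has 4 or fewer cases
K = at_most_4 ** n_schools                # every school has 4 or fewer cases
some_school_5_plus = 1 - K                # at least one school has 5 or more

print(f"P(some school has 5+ cases) = {some_school_5_plus:.4f}")
```

Under these assumptions the result comes out near 0.0015, which is consistent with the rounded 0.002 figure.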

So the probability that this happens with no unusual causal factor is 0.002. Is my reasoning sensible, and is this number "small"? This event is definitely unusual, but doesn't seem impossible either. I'm curious what other more statistically minded folks might have to say.

tomandlu
Posts: 1111
Joined: Fri Sep 21, 2007 10:22 am UTC
Location: London, UK

### Re: Osteosarcoma cluster case in West Salem

Is there not a problem with selection bias? Is your calculation asking "if I pick a random location, what are the chances?" or "if I pick a location with a high number of cases, what are the chances?" For example, are you asking "if I pick a random group of people, what are the chances one of them will be struck twice by lightning?" or "if I pick a group of people who've all been struck by lightning, what are the chances one of them will be struck again?"
How can I think my way out of the problem when the problem is the way I think?

GeoffreyY
Posts: 69
Joined: Sat Mar 02, 2013 2:41 pm UTC
Location: Center of MY Observable Universe

### Re: Osteosarcoma cluster case in West Salem

Calculating the probability of a certain event AFTER it has happened does NOT make sense.
George Carlin wrote: Think of how stupid an average person is, and realize half of them are stupider than that.

tomandlu
Posts: 1111
Joined: Fri Sep 21, 2007 10:22 am UTC
Location: London, UK

### Re: Osteosarcoma cluster case in West Salem

GeoffreyY wrote: Calculating the probability of a certain event AFTER it has happened does NOT make sense.

That's not quite how I'd phrase it. Certainly trying to calculate the probability for equally unlikely events (e.g. lottery numbers) is meaningless, but a high frequency of unlikely events can still be instructive. If I look at a record of a series of 20 coin tosses, HHTHTTHHHTTHTHHHTTTT, I can make no meaningful statement about probability, but that doesn't stop HHHHHHHHHHHHHHHHHHHH indicating that something's up...
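To put rough numbers on the coin example, here is a small Python sketch. Both sequences have exactly the same probability as specific sequences; what differs is how extreme the head count is under a fair coin. The helper name `tail_at_least` is made up for this illustration, and the fair-coin assumption is the hypothesis being tested.

```python
from math import comb


def tail_at_least(k, n=20, p=0.5):
    """P(at least k heads in n tosses) for a coin with heads-probability p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


mixed = "HHTHTTHHHTTHTHHHTTTT"
all_heads = "H" * 20

for seq in (mixed, all_heads):
    heads = seq.count("H")
    exact = 0.5 ** len(seq)          # same for any specific 20-toss sequence
    tail = tail_at_least(heads)      # chance of a head count at least this high
    print(f"{seq}: heads={heads}, P(sequence)={exact:.2e}, P(>= heads)={tail:.2e}")
```

The mixed record has 10 heads, which a fair coin matches or beats more than half the time, while 20 heads out of 20 has a tail probability of about one in a million.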

We had a court case in the UK a few years ago, where a woman had two children die from cot death. At trial, she was found guilty of murder partly on the basis of a claimed 1 in 8500*8500 (about 1 in 73,000,000) probability that someone could have two children die of cot death. Squaring the 1 in 8500 figure treats the two deaths as independent, which is not a safe assumption within one family. Her conviction was later overturned on appeal.

http://en.wikipedia.org/wiki/Sally_Clark

I'm also reminded of the Love Canal and Cancer Alley. This is worth a read too... http://en.wikipedia.org/wiki/Cancer_cluster

Meteoric
Posts: 333
Joined: Wed Nov 23, 2011 4:43 am UTC

### Re: Osteosarcoma cluster case in West Salem

GeoffreyY wrote: Calculating the probability of a certain event AFTER it has happened does NOT make sense.

Sure it does. If a gambler wins a game of chance fifty times in a row, the casino doesn't say "Oh well, I guess that was a 100% chance!" They throw him out for cheating - or, if we lived in a world populated entirely by statisticians, they'd analyze that outcome and say something like "we conclude with 95% confidence that this gambler is cheating", and then throw him out.
The ability to analyze how likely something was, even after it happened, is the reason we can gather statistical data to support or reject a hypothesis.

I'm not really sure how to begin analyzing such a situation; 1 in 500 is pretty unlikely, but there are lots of other diseases or other calamities that would also be alarming to have a sudden rash of. My uninformed guess would be that investigation is warranted, but public outcry/panic is probably premature.
No, even in theory, you cannot build a rocket more massive than the visible universe.

Tass
Posts: 1909
Joined: Tue Nov 11, 2008 2:21 pm UTC
Location: Niels Bohr Institute, Copenhagen.

### Re: Osteosarcoma cluster case in West Salem

Indeed there seems to be selection bias.

There are hundreds of schools in the USA. There are hundreds of possible diseases that could cluster. A 0.002 event happens one time out of five hundred. You'd expect to find many such events across the country.
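A quick Python sketch of that multiple-comparisons effect, under loud assumptions: the figure of 100 diseases is purely hypothetical, each disease is assumed to have the same 0.002 chance of producing such a cluster somewhere, and the diseases are treated as independent.

```python
def chance_of_any_cluster(p_cluster, n_diseases):
    """P(at least one of n independent diseases shows a chance cluster)."""
    return 1 - (1 - p_cluster) ** n_diseases


p_cluster = 0.002     # per-disease figure from the calculation in this thread
n_diseases = 100      # hypothetical count of diseases that would make the news

expected = p_cluster * n_diseases
p_any = chance_of_any_cluster(p_cluster, n_diseases)

print(f"expected chance clusters: {expected:.2f}")
print(f"P(at least one chance cluster): {p_any:.3f}")
```

With these made-up inputs, some alarming-looking cluster turns up by chance alone in roughly 18% of five-year windows.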

That maybe warrants a bit of investigation, but definitely not mass panic.

Posts: 5654
Joined: Wed Jun 11, 2008 11:03 am UTC
Location: The Netherlands

### Re: Osteosarcoma cluster case in West Salem

What you really want to know is not "What are the odds of X occurring" but "Given that X is occurring, what are the odds that it is due to chance".

How often do clusters of rare diseases occur due to random chance? That's a straightforward calculation, as done in this thread. How often do they occur due to factors other than chance? That's the big unknown. Let's call the former number A and the latter number B. If B is much smaller than A, then regardless of the size of A, it's overwhelmingly likely to be due to chance, and the size of A just says something about how unlucky you were. By the same token, if B is much larger than A, then regardless of the size of A it's very probably not due to chance.
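As a sketch of that comparison in Python: if A and B are the rates of chance clusters and caused clusters, the share of observed clusters that are due to chance is A / (A + B). The values of B below are invented purely to show the two regimes; nobody in this thread knows the real number.

```python
def prob_due_to_chance(a, b):
    """Share of observed clusters that are chance events.

    a: rate of clusters arising from chance alone
    b: rate of clusters arising from a real external cause
    """
    return a / (a + b)


A = 0.002                      # chance-cluster figure from this thread
for B in (0.00002, 0.2):       # hypothetical caused-cluster rates
    print(f"B={B}: P(chance | cluster) = {prob_due_to_chance(A, B):.4f}")
```

With B a hundred times smaller than A the cluster is about 99% likely to be chance, and with B a hundred times larger it is about 1% likely, whatever the absolute size of A.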
It's one of those irregular verbs, isn't it? I have an independent mind, you are an eccentric, he is round the twist
- Bernard Woolley in Yes, Prime Minister