## Knowledge from group averages

## What do you think the population of Indonesia is?

| Guess | Votes | Share |
| --- | --- | --- |
| < 1,000,000 | 0 | 0% |
| 1,000,001 - 10,000,000 | 0 | 0% |
| 10,000,001 - 20,000,000 | 5 | 6% |
| 20,000,001 - 30,000,000 | 4 | 5% |
| 30,000,001 - 40,000,000 | 5 | 6% |
| 40,000,001 - 50,000,000 | 5 | 6% |
| 50,000,001 - 60,000,000 | 3 | 4% |
| 60,000,001 - 70,000,000 | 1 | 1% |
| 70,000,001 - 80,000,000 | 5 | 6% |
| 80,000,001 - 90,000,000 | 3 | 4% |
| 90,000,001 - 100,000,000 | 3 | 4% |
| 100,000,001 - 120,000,000 | 6 | 7% |
| 120,000,001 - 140,000,000 | 5 | 6% |
| 140,000,001 - 160,000,000 | 2 | 2% |
| 160,000,001 - 180,000,000 | 3 | 4% |
| 180,000,001 - 200,000,000 | 6 | 7% |
| 200,000,001 - 250,000,000 | 11 | 13% |
| 250,000,001 - 300,000,000 | 5 | 6% |
| 300,000,001 - 350,000,000 | 3 | 4% |
| 350,000,001 - 400,000,000 | 1 | 1% |
| 400,000,001 - 450,000,000 | 2 | 2% |
| 450,000,001 - 500,000,000 | 1 | 1% |
| > 500,000,001 | 3 | 4% |

Blatm
Posts: 638
Joined: Mon Jun 04, 2007 1:43 am UTC

### Knowledge from group averages

If I were to ask many people to guess some fact (not necessarily numerical) they didn't know beforehand and averaged their answers in some way, would they converge to the correct answer? I've included a poll in an effort to gather some data, but I don't expect it to be very conclusive. If someone could provide some more convincing evidence one way or the other, it would be appreciated.

If you answered > 500,000,001, I'd like it if you posted your guess. Thanks.

Ulc
Posts: 1301
Joined: Sun Jun 21, 2009 8:05 pm UTC
Location: Copenhagen university

### Re: Knowledge from group averages

Blatm wrote: If I were to ask many people to guess some fact (not necessarily numerical) they didn't know beforehand and averaged their answers in some way, would they converge to the correct answer?

If you state that none of them knows the answer, why in the world do you assume that they would converge on anything at all? And even if they do converge, why would you assume it's on the right answer by anything but chance?

Until you have some sort of method by which the majority somehow guesses correctly, without knowing the answer or parts thereof, this poll is essentially pointless "wonder if [insert own crackpot theory] happens to be true" work.
Patashu
Posts: 378
Joined: Mon Mar 12, 2007 8:54 am UTC
### Re: Knowledge from group averages

I think the idea is that no one's memorized the population of Indonesia, but our individual hunches about what it might be will average around the true population.

Personally, I think that it would have a log-normal distribution, if anything. That is, if the true value is a 7-digit number, people are just as likely to guess a 6-digit number as an 8-digit one.
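
To make the log-normal idea concrete, here is a minimal sketch, assuming (purely for illustration) a made-up true value and guesses that miss by a random power of ten. None of the numbers come from the poll. Under that assumption the arithmetic mean is dragged upward by the few huge guesses, while the geometric mean and the median stay near the truth.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

TRUE_VALUE = 5_000_000  # hypothetical 7-digit "true" answer
SIGMA_LOG10 = 1.0       # assumed spread: about one order of magnitude

# Each guess is the truth multiplied by a random power of ten,
# so a 6-digit guess is as likely as an 8-digit one.
guesses = [TRUE_VALUE * 10 ** random.gauss(0, SIGMA_LOG10) for _ in range(10_000)]

arithmetic_mean = statistics.fmean(guesses)
geometric_mean = statistics.geometric_mean(guesses)
median = statistics.median(guesses)

print(f"arithmetic mean / truth: {arithmetic_mean / TRUE_VALUE:.2f}")
print(f"geometric mean / truth:  {geometric_mean / TRUE_VALUE:.2f}")
print(f"median / truth:          {median / TRUE_VALUE:.2f}")
```

If guesses really are spread this way, "averaging in some way" would have to mean averaging the logarithms of the guesses, not the guesses themselves.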

Velifer
Posts: 1132
Joined: Wed Dec 26, 2007 4:05 pm UTC
Location: 40ºN, 83ºW

### Re: Knowledge from group averages

Your methods will only answer a very narrow question: "what do people on this site who choose to answer your survey think is the population of Indonesia." Until they have the ability to alter reality with their minds, your knowledge is mostly limited to information about the respondents, not the population of Indonesia. Oh, and telling them they're wrong could really upset them. You wouldn't like them when they're angry.

The question is one where people have some prior information. They might know the answer. They might know something about populations and relative sizes of countries. It's not naive. Then we get into survey and sample methodologies--your categories provide information too. The response could contain some small amount of information about the true population of Indonesia, but unless you already know a great deal about your respondents, it's going to tell you more about them.

A question where nobody really has information is going to have a uniform distribution.

That said, the "ask the audience" cheat will work if enough of them have an expectation of knowing the answer. If you surveyed a group of Indonesian demographers, you could get close. You now have a population that has a certain expectation of knowing the answer.
fooliam
Posts: 73
Joined: Thu Apr 01, 2010 9:23 pm UTC

### Re: Knowledge from group averages

It seems possible that even if people don't have that specific information, they may have enough tangential knowledge to infer a relatively accurate guess. But it seems like it would depend too much on the accuracy of people's assumptions based on that tangential knowledge to really be accurate.

Technical Ben
Posts: 2986
Joined: Tue May 27, 2008 10:42 pm UTC

### Re: Knowledge from group averages

The example I saw of a similar experiment was "how much does this melon weigh". However, in that question, you can pick up the melon, look at it, and use your intuition to estimate (calculate via observation) the weight. This lets the average of the guesses end up very close to the true value. But a totally random "what are the lottery numbers tomorrow" attempt will fail, as there is no calculation or prior knowledge to help.
fooliam
Posts: 73
Joined: Thu Apr 01, 2010 9:23 pm UTC

### Re: Knowledge from group averages

Technical Ben wrote: The example I saw of a similar experiment was "how much does this melon weigh". However, in that question, you can pick up the melon, look at it, and use your intuition to estimate (calculate via observation) the weight. This lets the average of the guesses end up very close to the true value. But a totally random "what are the lottery numbers tomorrow" attempt will fail, as there is no calculation or prior knowledge to help.

I'm not prepared to make that conclusion without first conducting an experiment. Let's get everyone in the state of California to guess next week's lottery numbers and see if more than the predicted number of people are able to correctly guess the lottery numbers.

Probably wouldn't happen...but if an unexpectedly high number of people guessed correctly, it would certainly be interesting.

Velifer
Posts: 1132
Joined: Wed Dec 26, 2007 4:05 pm UTC
Location: 40ºN, 83ºW

### Re: Knowledge from group averages

fooliam wrote:I'm not prepared to make that conclusion without first conducting an experiment.
...
Probably wouldn't happen...but if an unexpectedly high number of people guessed correctly, it would certainly be interesting.

Ha! 5% of my huge number of samples were significant at p=.05, so I just found out what they had in common and wrote up the paper...

Back to the OP, you're assuming that you have a whole bunch of flawed measuring instruments with random variance, and the central limit theorem will swoop in and save the day. The margin of error should be proportional to $n^{-1/2}$, right? Now this would be really cool if it worked, because then I could have a random number generator converge on any question I could ever think to ask.

But you're violating the assumptions of the theorem.
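
Spelled out as a sketch, under the assumption (not stated in the thread) that each guess is $X_i = \theta + b + \varepsilon_i$, where $\theta$ is the true value, $b$ is a bias shared by the respondents, and the $\varepsilon_i$ are independent with mean 0 and variance $\sigma^2$:

$$
\mathbb{E}[\bar{X}_n] = \theta + b, \qquad \operatorname{sd}(\bar{X}_n) = \frac{\sigma}{\sqrt{n}}, \qquad \sqrt{\mathbb{E}\big[(\bar{X}_n - \theta)^2\big]} = \sqrt{b^2 + \frac{\sigma^2}{n}}
$$

Only the $\sigma^2/n$ term shrinks as more people answer. The $n^{-1/2}$ rate describes how fast the average settles down, and it says nothing about whether it settles on $\theta$.
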
mbrigdan
False Alarm! There's more rum.
Posts: 109
Joined: Wed Jun 25, 2008 2:45 am UTC

### Re: Knowledge from group averages

Velifer wrote:
Back to the OP, you're assuming that you have a whole bunch of flawed measuring instruments with random variance, and the central limit theorem will swoop in and save the day. The margin of error should be proportional to $n^{-1/2}$, right? Now this would be really cool if it worked, because then I could have a random number generator converge on any question I could ever think to ask.

But you're violating the assumptions of the theorem.

Well, not really. I know nearly nothing about Indonesia, but I can say with reasonable certainty that it has more than 1 million people and fewer than 500 million. Knowing where it's located, I can guess that it will have a reasonably high population. So, if a lot of people have a general idea, and a few have a better idea, it should at least give a better answer than one person's guess.
Charlie!
Posts: 2035
Joined: Sat Jan 12, 2008 8:20 pm UTC

### Re: Knowledge from group averages

Hmm, from lottery data on the number of winners per number, you could eventually figure out which numbers people are most likely to pick.

You'd need rather a lot of samples though.
fooliam
Posts: 73
Joined: Thu Apr 01, 2010 9:23 pm UTC

### Re: Knowledge from group averages

Charlie! wrote:Hmm, from lottery data on the number of winners per number, you could eventually figure out which numbers people are most likely to pick.

You'd need rather a lot of samples though

On a tangential point, I've always been curious whether it would be worthwhile to construct a retrospective statistical model of past winning numbers to see if certain numbers have a higher likelihood of being chosen. I know it's assumed all the numbers have an equal probability of being chosen, but I don't have enough faith in people to assume that any method they come up with is going to perfectly fit the theoretical model.

poxic
Eloquently Prismatic
Posts: 4749
Joined: Sat Jun 07, 2008 3:28 am UTC

### Re: Knowledge from group averages

It's been done. Apparently the number most likely to be drawn in our main national lottery (Lotto 6/49) is 33, by some small margin. So yeah, people who study this tend to pick sets of numbers with a 33 in them.

More practically, it means you'll share the jackpot with more people if you do that. (Even more practically, you've just spent \$2 to buy a small piece of paper that you'll be throwing out in a few days.)

ETA: and the "numbers people pick" thing has been studied, too. People like numbers with 1s and 3s in them, and odd numbers in general IIRC.
Josephine
Posts: 2142
Joined: Wed Apr 08, 2009 5:53 am UTC

### Re: Knowledge from group averages

poxic wrote:It's been done. Apparently the number most likely to be drawn in our main national lottery (Lotto 6/49) is 33, by some small margin. So yeah, people who study this tend to pick sets of numbers with a 33 in them.

More practically, it means you'll share the jackpot with more people if you do that. (Even more practically, you've just spent \$2 to buy a small piece of paper that you'll be throwing out in a few days.)

ETA: and the "numbers people pick" thing has been studied, too. People like numbers with 1s and 3s in them, and odd numbers in general IIRC.

7 and 37 are the top 1 and 2 digit numbers, I believe.
gmalivuk
GNU Terry Pratchett
Posts: 26724
Joined: Wed Feb 28, 2007 6:02 pm UTC
Location: Here and There

### Re: Knowledge from group averages

fooliam wrote: Let's get everyone in the state of California to guess next week's lottery numbers and see if more than the predicted number of people are able to correctly guess the lottery numbers.
Right. Which is to say, let's implement the California state lotto...
Zamfir
I built a novelty castle, the irony was lost on some.
Posts: 7588
Joined: Wed Aug 27, 2008 2:43 pm UTC
Location: Nederland

### Re: Knowledge from group averages

Back to the OP, you're assuming that you have a whole bunch of flawed measuring instruments with random variance, and the central limit theorem will swoop in and save the day. The margin of error should be proportional to $n^{-1/2}$, right? Now this would be really cool if it worked, because then I could have a random number generator converge on any question I could ever think to ask.

But you're violating the assumptions of the theorem.

If you want to cast this in a CLT framework, you could say that perhaps people's estimates can be modeled as draws from a distribution that has the correct value as its mean. That's clearly not a perfect model, but there is something to be said for it.

On the other hand, a random number generator can be very well modeled as draws from a distribution, but there is no mechanism at all to relate its mean to the value of interest. So those two cases are not comparable.
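
A small simulation can illustrate the difference between the two cases. This is only a sketch: the true value, the spread, and the size of the bias are all invented, and normally distributed guesses are an assumption chosen for simplicity.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

TRUE_VALUE = 230.0  # millions; hypothetical value, chosen only for illustration
SPREAD = 100.0      # assumed standard deviation of individual guesses


def crowd_mean(n, bias):
    """Average n guesses drawn around TRUE_VALUE + bias."""
    return statistics.fmean(
        random.gauss(TRUE_VALUE + bias, SPREAD) for _ in range(n)
    )


# Case 1: the guess distribution is centred on the truth.
unbiased_error = abs(crowd_mean(100_000, bias=0.0) - TRUE_VALUE)

# Case 2: the distribution is centred somewhere else (here, 80 million low).
biased_error = abs(crowd_mean(100_000, bias=-80.0) - TRUE_VALUE)

print(f"error with unbiased crowd: {unbiased_error:.2f} million")
print(f"error with biased crowd:   {biased_error:.2f} million")
```

More respondents shrink the first error toward zero, while the second stays pinned near the size of the bias no matter how many people answer.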

InNombreDeQuién
Posts: 16
Joined: Sun Jul 11, 2010 6:05 pm UTC
Location: Tscherrmanny

### Re: Knowledge from group averages

Hans Rosling did show that a group of chimpanzees answering five questions about which country has the higher infant death rate, covering 10 countries, reached an average score of 2.5 out of 5... Swedish professors scored just 2.1.
Does that mean monkeys are more intelligent than professors?
In my eyes it shows two things:
1) only 50% of the population understands statistics
2) biases affect the brain substantially

P.S.: search for TED Talks and "Hans Rosling" and you will find several videos of him, each more interesting than the last...

Velifer
Posts: 1132
Joined: Wed Dec 26, 2007 4:05 pm UTC
Location: 40ºN, 83ºW

### Re: Knowledge from group averages

Zamfir wrote: If you want to cast this in a CLT framework, you could say that perhaps people's estimates can be modeled as draws from a distribution that has the correct value as its mean. That's clearly not a perfect model, but there is something to be said for it.

On the other hand, a random number generator can be very well modeled as draws from a distribution, but there is no mechanism at all to relate its mean to the value of interest. So those two cases are not comparable.

I do want to cast this in the CLT framework, to show exactly that people's estimates are draws from a distribution that has no mechanism[1] to relate its mean to the value of interest.

[1] People are going to be better guessers than a random number generator, but the amount of information in any guess is going to be very small. I agree with mbrigdan's point above, in that many very flawed guesses might (should) get closer to the true value than one guess, assuming respondents use some information when answering. I did post above about prior information.
scarecrovv
It's pronounced 'double u'
Posts: 674
Joined: Wed Jul 30, 2008 4:09 pm UTC
Location: California

### Re: Knowledge from group averages

We keep arguing as though we have no data. In fact, we have 48 data points last time I looked. Interestingly, the mode is the bin with the correct answer, and the adjacent bins are also above average. The correct answer wins on both counts: it is the highest single bin, and it is the bin with the highest total of its own value plus the values of the two adjacent bins.

I suppose this sort of makes sense. Those people who know what they're talking about will answer approximately correctly. Those people who don't are random number generators. Perhaps I should start another poll to test the "largest relatively tight grouping is approximately correct" hypothesis.

I'm ashamed to say I low-balled the correct answer by nearly an order of magnitude.

fooliam
Posts: 73
Joined: Thu Apr 01, 2010 9:23 pm UTC

### Re: Knowledge from group averages

gmalivuk wrote:
fooliam wrote: Let's get everyone in the state of California to guess next week's lottery numbers and see if more than the predicted number of people are able to correctly guess the lottery numbers.
Right. Which is to say, let's implement the California state lotto...

oh. right.

gmalivuk
GNU Terry Pratchett
Posts: 26724
Joined: Wed Feb 28, 2007 6:02 pm UTC
Location: Here and There

### Re: Knowledge from group averages

scarecrovv wrote: Interestingly, the mode is the bin with the correct answer, and the adjacent bins are also above average. For both the highest single bin, and the bin with the highest total of its value and the value of the two adjacent bins, the correct answer wins on both counts.
Sure, but the median is in the 120-140 bin, and the mean is about 150M. Also, the fact that results seem clustered around the right answer comes partly from the fact that those three bins suddenly account for a range of 120 million people, whereas all the bins below that are smaller.

So sure, it just so happens that the most common answer is for the 50-million-person range that includes the correct answer, but the other two most common measures of location give significantly lower results. Furthermore, this is almost certainly because there are more bins over that lower range than there are above the correct answer, and if the correct answer was in e.g. the lowest bin available, probably very few people would have picked it.
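
These measures can be recomputed from the poll at the top of the page. The sketch below uses the 82 votes shown there, which is more than the 48 mentioned earlier in the thread, so the figures need not match the ones quoted in the posts. Two assumptions are mine: every vote is placed at the midpoint of its bin, and the open-ended top bin is capped at 600 million so that it has a midpoint.

```python
# (low, high, votes), with population in millions.
# The top bin is open-ended in the poll; the 600 cap is an assumption.
BINS = [
    (0, 1, 0), (1, 10, 0), (10, 20, 5), (20, 30, 4), (30, 40, 5),
    (40, 50, 5), (50, 60, 3), (60, 70, 1), (70, 80, 5), (80, 90, 3),
    (90, 100, 3), (100, 120, 6), (120, 140, 5), (140, 160, 2),
    (160, 180, 3), (180, 200, 6), (200, 250, 11), (250, 300, 5),
    (300, 350, 3), (350, 400, 1), (400, 450, 2), (450, 500, 1),
    (500, 600, 3),
]

total_votes = sum(votes for _, _, votes in BINS)

# Mode by raw vote count: the bin that collected the most votes.
mode_bin = max(BINS, key=lambda b: b[2])


def median_bin(bins):
    """Return the (low, high) range of the bin containing the middle vote."""
    half = sum(votes for _, _, votes in bins) / 2
    running = 0
    for low, high, votes in bins:
        running += votes
        if running >= half:
            return (low, high)


# Mean with every vote placed at the midpoint of its bin.
mean_estimate = sum((low + high) / 2 * votes for low, high, votes in BINS) / total_votes

# Votes per million of bin width, which corrects for unequal bin sizes.
densest_bin = max(BINS, key=lambda b: b[2] / (b[1] - b[0]))

print("total votes:", total_votes)
print("mode bin (by count):", mode_bin[:2])
print("median bin:", median_bin(BINS))
print(f"midpoint mean: {mean_estimate:.0f} million")
print("densest bin (votes per unit width):", densest_bin[:2])
```

Ranking bins by votes per unit of width rather than by raw count is one way to check how much of the peak comes from the wider bins.
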
khanofmongols
Posts: 21
Joined: Wed Apr 21, 2010 2:43 am UTC

### Re: Knowledge from group averages

Wouldn't it have been better to use a logarithmic sort of binning? Less bias towards larger numbers?
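
As a rough sketch of what logarithmic binning could look like: the function name, the range, and the number of bins per factor of ten below are all hypothetical choices, not anything used in the poll.

```python
import math


def log_bin_edges(low, high, bins_per_decade):
    """Return bin edges where each bin spans the same ratio, not the same width."""
    decades = math.log10(high / low)
    count = round(decades * bins_per_decade)
    return [low * 10 ** (i / bins_per_decade) for i in range(count + 1)]


# Hypothetical poll range: 1 million to 1 billion, 4 bins per factor of ten.
edges = log_bin_edges(1_000_000, 1_000_000_000, bins_per_decade=4)

for lower, upper in zip(edges, edges[1:]):
    print(f"{lower:>15,.0f} - {upper:>15,.0f}")
```

Every bin then covers the same multiplicative range, so a guess that is off by a factor of two lands the same number of bins away whether it is too high or too low.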

meatyochre
Posts: 1524
Joined: Mon Apr 05, 2010 7:09 am UTC
Location: flying with the Conchords

### Re: Knowledge from group averages

idk, I just picked one that looked like it was right about in the middle. I don't know if your knowledge of my methodology will help you or not, but there you have it.
Technical Ben
Posts: 2986
Joined: Tue May 27, 2008 10:42 pm UTC

### Re: Knowledge from group averages

Perhaps I can throw another spanner in the works?
Assuming a totally random number is picked by 99% of the participants, you would have roughly an equal number of people in each "bin". Then you only need one person who does know the answer to make the average/mode/mean select the correct bin.
Does this prove that averages will give you the correct answer from random data, or only that one person knew the answer?
gmalivuk
GNU Terry Pratchett
Posts: 26724
Joined: Wed Feb 28, 2007 6:02 pm UTC
Location: Here and There

### Re: Knowledge from group averages

That will only make the mode the right answer (as it is in this case). The median and mean of random guessing will be determined by how the bins are distributed in the first place.
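
A toy version of that scenario, as a sketch only: the bin count, the votes per bin, and the position of the correct bin are invented, and a perfectly even spread stands in for the random guessers.

```python
import statistics

NUM_BINS = 21      # hypothetical number of equal-width bins, indexed 0..20
VOTES_PER_BIN = 5  # perfectly even spread standing in for random guessers
TRUE_BIN = 16      # hypothetical index of the bin holding the right answer

# Every bin gets the same number of uninformed votes...
votes = [b for b in range(NUM_BINS) for _ in range(VOTES_PER_BIN)]
# ...plus one vote from the single person who knows the answer.
votes.append(TRUE_BIN)

mode = statistics.mode(votes)
median = statistics.median(votes)
mean = statistics.fmean(votes)

print("mode bin:  ", mode)
print("median bin:", median)
print(f"mean bin:   {mean:.2f}")
```

The one informed vote is enough to tip the mode, while the median and mean stay at the middle of whatever range the bins happen to cover.
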
Duban
Posts: 352
Joined: Fri May 01, 2009 1:22 pm UTC

### Re: Knowledge from group averages

I believe it's something like 4th among the top 10 most populous countries, below the US but above Pakistan/Bangladesh. So 200-250 million.
Various Varieties
Posts: 505
Joined: Tue Mar 04, 2008 7:24 pm UTC

### Re: Knowledge from group averages

Technical Ben wrote: The example I saw of a similar experiment was "how much does this melon weigh". However, in that question, you can pick up the melon, look at it, and use your intuition to estimate (calculate via observation) the weight. This lets the average of the guesses end up very close to the true value. But a totally random "what are the lottery numbers tomorrow" attempt will fail, as there is no calculation or prior knowledge to help.

Interesting that the two examples you mention are guessing an object's weight and predicting lottery numbers - because when Derren Brown did his lottery number prediction stunt last year, his "explanation" revolved around taking the idea of wisdom of crowds when estimating the number of pebbles in a jar, and linking it to the ability to make a prediction about what the lottery numbers would be.

It was utter nonsense, dragged out for an hour - very disappointing, because I normally enjoy going along for the ride with the psychological/charlatan-debunking explanations he gives in his shows (even though I'm aware they're largely just modern patter for fairly well-established magic tricks).