## 2118: "Normal Distribution"

keithl
Posts: 661
Joined: Mon Aug 01, 2011 3:46 pm UTC

### 2118: "Normal Distribution"

title text: "It's the NORMAL distribution, not the TANGENT distribution"

50%? No way.

Much fewer than 50% of the pixels are gray, and more than 80% are white. Very few are black. The pixel production department is running out of white pixels, and HR warns us that we may be sued for pixel hiring bias.

80-watt Hamster
Posts: 12
Joined: Tue May 13, 2014 1:17 pm UTC

### Re: 2118: "Normal Distribution"

keithl wrote:Much fewer than 50% of the pixels are gray, and more than 80% are white.

Is that for the whole image? To the eye, the distribution looks much more even under the curve.

Any input as to what, exactly, about this would annoy statisticians? As a non-statistician, it confuses me because I don't know what someone would be trying to demonstrate by highlighting the given region.

Flumble
Yes Man
Posts: 2249
Joined: Sun Aug 05, 2012 9:35 pm UTC

### Re: 2118: "Normal Distribution"

Normal normal for comparison:

80-watt Hamster wrote:Any input as to what, exactly, about this would annoy statisticians? As a non-statistician, it confuses me because I don't know what someone would be trying to demonstrate by highlighting the given region.

Normally you would mark a horizontal sweep around the middle to indicate a useful range of values (in which 50% of random samples will lie, or where a random sample will end up with 50% chance). And the middle helpfully corresponds to the mean value.
Now the vertical midpoint is 1/(2σ√(2π)), half the peak height of 1/(σ√(2π)), and both the height of the lines and the corner points, I think, will tell you nothing. But the worst part is that it makes no sense as a visual aid; rather, it's closer to visual AIDS.
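
For contrast, here is a minimal sketch of the conventional "central 50%" marking described above, using Python's standard `statistics.NormalDist`. The helper name `central_interval` and its default arguments are made up for illustration; only `NormalDist`, `inv_cdf` and `cdf` come from the standard library.

```python
from statistics import NormalDist

def central_interval(mass=0.5, mu=0.0, sigma=1.0):
    """Return (lo, hi): the interval centred on mu that holds `mass` of the probability."""
    dist = NormalDist(mu, sigma)
    tail = (1.0 - mass) / 2.0          # probability left over in each tail
    return dist.inv_cdf(tail), dist.inv_cdf(1.0 - tail)

lo, hi = central_interval()            # standard normal, central 50%
print(f"{lo:.4f} {hi:.4f}")            # -0.6745 0.6745
```

So for a standard normal the usual vertical cuts sit at roughly ±0.67 standard deviations, and they scale with whatever `sigma` is passed in.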

I wonder how far apart those lines are, though.

Rysto
Posts: 1460
Joined: Wed Mar 21, 2007 4:07 am UTC

### Re: 2118: "Normal Distribution"

80-watt Hamster wrote:Any input as to what, exactly, about this would annoy statisticians? As a non-statistician, it confuses me because I don't know what someone would be trying to demonstrate by highlighting the given region.

It doesn't demonstrate anything at all, which is why it would annoy statisticians. Randall has shaded an arbitrary portion of the distribution as if it's meaningful, but it's not.

keithl
Posts: 661
Joined: Mon Aug 01, 2011 3:46 pm UTC

### Re: 2118: "Normal Distribution"

80-watt Hamster wrote:Any input as to what, exactly, about this would annoy statisticians? As a non-statistician, it confuses me because I don't know what someone would be trying to demonstrate by highlighting the given region.

Statisticians sometimes highlight the "50%" center of the distribution with vertical lines and a gray area to the left and right of the peak. The horizontal cuts are abnormal and meaningless. If this graph represents the likelihood of some variable - say, the probability density (vertical axis) of a person being X centimeters tall (horizontal axis), then the shaded area doesn't represent something measurable, because probability is an aggregate, not an individual, quantity.

Either that, or the lines indicate where the lost airplane is presumed to have crashed into the mountain, and where to send the search parties first.

Note: I sold electronic products based on statistics. I've learned that, with enough data, no distributions are exactly Gaussian, and typically the "tails" are fatter on one side or both. If a fat tail represents too-far-from-average transistors on a billion-transistor silicon integrated circuit, that leads to excessive production test failures and expensive field returns. Bart Kosko's popular science book "Noise" is an excellent introduction to "fat tails".

rhhardin
Posts: 80
Joined: Fri Apr 09, 2010 2:11 pm UTC

### Re: 2118: "Normal Distribution"

It's independent of the variance, is the great thing.

rick.s
Posts: 16
Joined: Mon Sep 02, 2013 9:29 pm UTC

### Re: 2118: "Normal Distribution"

With the midpoint at 52.7%, I think this is a clear example of grade inflation.

rmsgrey
Posts: 3633
Joined: Wed Nov 16, 2011 6:35 pm UTC

### Re: 2118: "Normal Distribution"

keithl wrote:I've learned that, with enough data, no distributions are exactly Gaussian

Something I've learned is that, with enough processing and discarding of outliers, all distributions are Gaussian to within any reasonable margin of error...

Fungo4
Posts: 15
Joined: Mon Oct 13, 2014 2:48 pm UTC

### Re: 2118: "Normal Distribution"

rmsgrey wrote:
keithl wrote:I've learned that, with enough data, no distributions are exactly Gaussian

Something I've learned is that, with enough processing and discarding of outliers, all distributions are Gaussian to within any reasonable margin of error...

What if we made a plot of all those distributions based on how gaussian they are...

hamjudo
Posts: 110
Joined: Wed Feb 16, 2011 6:56 pm UTC

### Re: 2118: "Normal Distribution"

https://en.wikipedia.org/wiki/Bean_machine
Galton boards, also known as bean machines, are in quite a few museums, but none of them demonstrates the XKCD Demarcation of the Normal Distribution.

We could make a modified Galton board that did split the marbles into two roughly equal groups using Randall's arbitrary technique.

Since it would be meaningless, we could install the machine in an art museum, rather than a science museum.

We won't do this because it would require a lot of tedious effort, and not demonstrate anything interesting. Separating the central 52% without losing marbles would be a bit of an engineering challenge.

Soupspoon
You have done something you shouldn't. Or are about to.
Posts: 4060
Joined: Thu Jan 28, 2016 7:00 pm UTC
Location: 53-1

### Re: 2118: "Normal Distribution"

`                ⊥                   ⊥                                                              ⊥  ⊥                     ⊥                                                                ⊥           ⊥                   ⊥      ⊥             ⊥   ⊥        ⊥                     ⊥⊥               ⊥        ⊥       ⊥                   ⊥           ⊥                        ⊥`

keithl
Posts: 661
Joined: Mon Aug 01, 2011 3:46 pm UTC

### Re: 2118: "Normal Distribution"

keithl wrote:Either that, or the lines indicate where the lost airplane is presumed to have crashed into the mountain, and where to send the search parties first.

Good news! A search party found the airplane! All the passengers survived, except for the three who were eaten. The passengers also ate the search party. Then the passengers were eaten by the other search parties.
We will publish a paper on the nutritional value of searchers and passengers soon. There will be gaussian distributions for all the essential nutrients, after we eat enough of the researchers who disagree.

sotanaht
Posts: 238
Joined: Sat Nov 27, 2010 2:14 am UTC

### Re: 2118: "Normal Distribution"

rmsgrey wrote:
keithl wrote:I've learned that, with enough data, no distributions are exactly Gaussian

Something I've learned is that, with enough processing and discarding of outliers, all distributions are Gaussian to within any reasonable margin of error...

Of course. If you discard all the data that doesn't fit your distribution, your distribution is perfect.

cellocgw
Posts: 2055
Joined: Sat Jun 21, 2008 7:40 pm UTC

### Re: 2118: "Normal Distribution"

sotanaht wrote:
rmsgrey wrote:
keithl wrote:I've learned that, with enough data, no distributions are exactly Gaussian

Something I've learned is that, with enough processing and discarding of outliers, all distributions are Gaussian to within any reasonable margin of error...

Of course. If you discard all the data that doesn't fit your distribution, your distribution is perfect.

Does a single data point fit a Gaussian distribution? Why or why not?

To be considered, all answers must be written on the back of a \$100 bill and mailed to the contest address by April 1st ± sigma

Soupspoon
You have done something you shouldn't. Or are about to.
Posts: 4060
Joined: Thu Jan 28, 2016 7:00 pm UTC
Location: 53-1

### Re: 2118: "Normal Distribution"

sotanaht wrote:Of course. If you discard all the data that doesn't fit your distribution, your distribution is perfect.

Ironically, my little 'joke' above (while creating it I realised I hadn't even added the Math::Trig module here until now) came from the third run of the script. The first two runs showed a clear and tight x=y grouping of the scatter. Given it was a randomised scattering in polar coordinates, I knew it wasn't an error in my implementation (assuming no peculiar resonances of the inbuilt PRNG values around what effectively became 45 and 225 degrees on every second call for a value!), but I kept re-running until it lost the pattern anyway. Because that was the mood I was in. And I still trusted a script to be more random in distribution (with all the possibilities of a pattern sneaking through via fluke) than any attempt to make it up manually.

cubicquitous
Posts: 4
Joined: Mon Mar 04, 2019 7:45 pm UTC

### Re: 2118: "Normal Distribution"

I'm surprised no-one's explained Randall's maths.

Call h the height of the Normal distribution pdf at zero (the mean), ie h = 1/sqrt(2pi).

Then the horizontal lines meet the pdf at, say,
(+/-a, h(1-p)/2) and
(+/-b, h(1+p)/2)
(lower line and upper line respectively), so the band is centred on h/2 and spans a fraction p of the peak height.

If p = 0.52682, then a = 1.69790 and b = 0.73479 (assumed sd of 1), and using the cumulative distribution, give or take a couple of rectangles, we get that the shaded area is 0.50000 as claimed. p rounds to 52.7%.
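
For anyone who wants to check those numbers, here is a rough sketch in Python. It reads the two lines as sitting at heights h(1-p)/2 and h(1+p)/2, i.e. a band centred on half the peak height. The function names `shaded_area` and `solve_p`, the bisection approach and the `sigma` parameter are my own choices for illustration, not anything taken from the comic.

```python
from math import log, sqrt
from statistics import NormalDist

def shaded_area(p, sigma=1.0):
    """Area under the pdf between the lines y = h(1-p)/2 and y = h(1+p)/2, for 0 <= p < 1."""
    dist = NormalDist(0.0, sigma)
    h = dist.pdf(0.0)                       # peak height, 1/(sigma*sqrt(2*pi))
    y_lo = h * (1.0 - p) / 2.0
    y_hi = h * (1.0 + p) / 2.0
    # pdf(x) = h * exp(-x^2 / (2 sigma^2))  =>  x = sigma * sqrt(-2 ln(y / h))
    a = sigma * sqrt(-2.0 * log(y_lo / h))  # where the lower line meets the curve
    b = sigma * sqrt(-2.0 * log(y_hi / h))  # where the upper line meets the curve
    rectangle = 2.0 * b * (y_hi - y_lo)     # full-height slab for |x| < b
    # for b < |x| < a the band is capped by the curve itself
    wings = 2.0 * ((dist.cdf(a) - dist.cdf(b)) - (a - b) * y_lo)
    return rectangle + wings, a, b

def solve_p(target=0.5, sigma=1.0, steps=60):
    """Bisect on p in (0, 1); the shaded area grows as the band widens."""
    lo, hi = 0.0, 1.0
    for _ in range(steps):
        mid = (lo + hi) / 2.0
        area, _, _ = shaded_area(mid, sigma)
        if area < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

p = solve_p()
area, a, b = shaded_area(p)
print(f"p={p:.5f} a={a:.5f} b={b:.5f} area={area:.5f}")
print(f"p with sigma=3: {solve_p(sigma=3.0):.5f}")
```

Passing a different `sigma` rescales a and b but should leave p unchanged, since stretching the curve horizontally and squashing it vertically preserves areas.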

There you go, a bit of Maths pulled me out of 10+ years of lurking on this thread to actually register and post!