How do you work this out?
Say you have weighed a number of rocks.
For example, you weigh 5 rocks, and their weights are 6, 7, 7, 8, 9.
How would I work out the 95% confidence interval for the weight I will get when I weigh the next rock?
I get a sample mean of 7.4, a sample variance of 1.04 and an unbiased estimate for the population variance of 1.3.
Now if I knew for certain what the population mean was I could just use a t-distribution, but I can't since I don't know what the population mean is.
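For anyone who wants to check the three numbers above, here is a minimal sketch using only Python's standard library. The variable names are just for illustration; the only thing taken from the post is the list of weights.

```python
import statistics

weights = [6, 7, 7, 8, 9]

# Sample mean.
sample_mean = statistics.mean(weights)

# "Sample variance" in the post's sense: sum of squared deviations divided by n.
sample_variance = statistics.pvariance(weights)

# Unbiased estimate of the population variance: divides by n - 1 instead.
unbiased_variance = statistics.variance(weights)

print(sample_mean, sample_variance, unbiased_variance)
```

These should agree with the 7.4, 1.04 and 1.3 quoted in the question.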
Statistics question.
http://officeofstrategicinfluence.com/spam/
That link kills spam
Re: Statistics question.
That's not even possible unless you make some assumption about (or know) the distribution of the rocks' weights. What you can do is calculate a 95% confidence interval for the population mean, that is, an interval built by a procedure that captures the true mean in 95% of repeated samples. Be aware though that a sample size of just 5 is too small to make precise statements.
Please be gracious in judging my english. (I am not a native speaker/writer.)
http://decodedarfur.org/
Re: Statistics question.
You can still use the t-distribution, as long as the rocks are normally distributed. Check out this wikipedia junk. The t-value's distribution is independent of the population's true mean and standard deviation as long as you assume the samples are normal and independent. You don't need to know anything about the true mean to use it, and that's the magic. You want to come up with a number a so that Pr(xbar - a <= mu <= xbar + a) >= 0.95.
Now start manipulating the inequalities inside the Pr:
-a <= mu - xbar <= a
-a/(s/sqrt(n)) <= (mu - xbar)/(s/sqrt(n)) <= a/(s/sqrt(n))
-a/(s/sqrt(n)) <= t <= a/(s/sqrt(n))
(Here t = (xbar - mu)/(s/sqrt(n)). The middle term of the previous line is -t, but the bounds are symmetric, so the sign doesn't matter.)
So we want to go look in our table for the t-distribution with n-1 degrees of freedom and find the number C so that
Pr(-C <= t <= C) >= 0.95
We then solve C = a/(s/sqrt(n)) for a, and that's the radius of our confidence interval. Note that the s in the formula for t is the sample standard deviation (the square root of the unbiased variance estimate), so you know what it is.
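To make the recipe concrete, here is a rough sketch in Python applied to the five rock weights from the first post. The critical values are copied from a standard two-sided 95% t-table, and the names T_CRIT_95 and mean_confidence_interval are made up for this example. Note that this gives an interval for the population mean, under the normality assumption above.

```python
import math
import statistics

# Two-sided 95% critical values C of Student's t (the 97.5th percentile),
# copied from a standard t-table and keyed by degrees of freedom.
T_CRIT_95 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571}


def mean_confidence_interval(sample):
    """Return (low, high) of the 95% confidence interval for the mean."""
    n = len(sample)
    xbar = statistics.mean(sample)
    s = statistics.stdev(sample)   # sqrt of the unbiased variance estimate
    c = T_CRIT_95[n - 1]           # n - 1 degrees of freedom
    a = c * s / math.sqrt(n)       # solve C = a / (s / sqrt(n)) for a
    return xbar - a, xbar + a


low, high = mean_confidence_interval([6, 7, 7, 8, 9])
print(f"95% CI for the mean: ({low:.2f}, {high:.2f})")
```

For this sample the radius a comes out at about 1.42, so the interval is roughly 5.98 to 8.82.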
What they (mathematicians) define as interesting depends on their particular field of study; mathematical analysts find pain and extreme confusion interesting, whereas geometers are interested in beauty.
Re: Statistics question.
z4lis wrote:You can still use the t-distribution, as long as the rocks are normally distributed.
Be aware that that is a big assumption to make. Always consider where your data comes from. Rock weights, for example, are usually not normally distributed. (There are a lot more tiny pebbles than big boulders.) At the very least you should do a normality test.
Re: Statistics question.
z4lis wrote:You can still use the t-distribution, as long as the rocks are normally distributed. Check out this wikipedia junk. The t-value's distribution is independent of the sample distribution's true mean and standard deviation as long as you assume the samples are normal and independent. You don't need to know anything about the true mean to use it, and that's the magic. You want to come up with a number a so that Pr(xbar - a <= mu <= xbar + a) >= 0.95.
No, that tells me the 95% confidence interval for the mean.
Re: Statistics question.
Oh, I did not check the formula provided. It is indeed possible to calculate a prediction interval with the t-distribution from a sample, assuming the population is normal.
Your prediction interval is the sample mean plus/minus t*s*sqrt(1+1/n), where t is the relevant percentile of the t-distribution with n-1 degrees of freedom (the 97.5th percentile for a two-sided 95% interval) and s is the sample standard deviation. Wikipedia explains how this works.
Your sample is still very small and rocks are still rarely normally distributed.
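As a worked example of that formula, here is a small Python sketch for the five weights from the first post. The constant 2.776 is the 97.5th percentile of the t-distribution with 4 degrees of freedom, taken from a t-table; the names T_975_DF4 and prediction_interval are just illustrative. It only makes sense under the normality assumption discussed above.

```python
import math
import statistics

# 97.5th percentile of Student's t with n - 1 = 4 degrees of freedom,
# read off a standard t-table (an assumption specific to n = 5).
T_975_DF4 = 2.776


def prediction_interval(sample, t_crit):
    """Return (low, high) of the prediction interval for one new observation."""
    n = len(sample)
    xbar = statistics.mean(sample)
    s = statistics.stdev(sample)                  # sample standard deviation
    radius = t_crit * s * math.sqrt(1 + 1 / n)    # t*s*sqrt(1+1/n)
    return xbar - radius, xbar + radius


low, high = prediction_interval([6, 7, 7, 8, 9], T_975_DF4)
print(f"95% prediction interval: ({low:.2f}, {high:.2f})")
```

Here the radius is about 3.47, so the next rock would be predicted to weigh roughly between 3.93 and 10.87, which is much wider than the interval for the mean.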
Re: Statistics question.
blademan9999 wrote:No, that tells me the 95% confidence interval for the mean.
I'm assuming this is a low-level stats course, so you can probably assume normality simply because that's what all these classes do at that level. If you know the actual population mean and standard deviation, the 95% interval is just about two standard deviations (1.96, to be more exact) in each direction from the mean. This means that about 95% of rocks in your population will be in this interval.
As others have said, rocks are likely not distributed normally, and the sample size is small, but I'll wager that this is an intro stats problem and thus we're meant to just blindly plow ahead with normal distributions and small, easy-to-calculate samples (a bit like how intro physics always neglects air resistance).
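A quick sketch of the "two standard deviations" rule, using NormalDist from Python's statistics module. The values of mu and sigma below are hypothetical stand-ins for a known population mean and standard deviation, picked only for illustration.

```python
from statistics import NormalDist

# Hypothetical KNOWN population parameters (not estimated from a sample).
mu, sigma = 7.4, 1.14

# Exact multiplier for a central 95% interval of a normal distribution.
z = NormalDist().inv_cdf(0.975)
low, high = mu - z * sigma, mu + z * sigma

# Fraction of the population within exactly two standard deviations.
population = NormalDist(mu, sigma)
coverage_two_sd = population.cdf(mu + 2 * sigma) - population.cdf(mu - 2 * sigma)

print(f"z = {z:.3f}, interval = ({low:.2f}, {high:.2f})")
print(f"coverage of mean +/- 2 sd = {coverage_two_sd:.4f}")
```

The multiplier comes out near 1.96, and two full standard deviations cover a bit more than 95% of the population.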
"With malleus aforethought, mammals got an earful of their ancestor's jaw" - J. Burns, Biograffiti
Re: Statistics question.
blademan9999 wrote:No, that tells me the 95% confidence interval for the mean.
Whoops, my formula is off by a factor of sqrt(n+1) then! (Compare a = t*s/sqrt(n) for the mean with t*s*sqrt(1+1/n) for the next observation.)
Re: Statistics question.
I would expect rock sizes / masses to be better modeled by a power law than a normal distribution. A power law works fairly well for asteroids and other space rocks, but I guess it'd be a bit more complicated for terrestrial rocks, although geologists use power laws to model rock fracture sizes, too.
From Asteroid size distribution
Wikipedia wrote:The number of asteroids decreases markedly with size. Although this generally follows a power law, there are 'bumps' at 5 km and 100 km, where more asteroids than expected from a logarithmic distribution are found.
FWIW, this power law also applies to meteoroids and hence to craters on airless bodies. According to the documentation of pamcrater, a crater simulation program originally written a couple of decades ago by John Walker:
John Walker wrote:The number of craters of a given size varies as the reciprocal of the area as described on pages 31 and 32 of Peitgen and Saupe, "The Science Of Fractal Images"; cratered bodies in the Solar System are observed to obey this relationship. The formula used to obtain crater radii governed by this law from a uniformly distributed pseudorandom sequence was developed by Rudy Rucker.
From Extent of power-law scaling for natural fractures in rock
Geology wrote:Abstract
New data sets from natural faults and extension fractures exhibit simple power-law scaling across 3.4–4.9 orders of magnitude, regardless of rock type or movement mode. The data show no evidence of natural gaps or scaling changes. Each data set consists of independent measurements made at different observational scales; a power-law regression to the subset of smaller fractures in each case provides an extrapolation that accurately predicts associated larger fractures. Consequently, data representing a limited range of fracture sizes may be used to characterize a much broader spectrum of fracture sizes.
Also see
On the Size Distributions of Asteroid Taxonomic Classes: The Collisional Interpretation
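To get a feel for how different a power law is from a normal distribution, here is a toy simulation in Python. It draws weights from a Pareto distribution, which is one simple power-law model; the exponent ALPHA and minimum weight X_MIN are arbitrary values picked for illustration and are not taken from any of the sources above.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

ALPHA = 2.5   # hypothetical power-law exponent
X_MIN = 1.0   # hypothetical smallest rock weight

# random.paretovariate(alpha) returns values >= 1 with a power-law tail.
weights = [X_MIN * random.paretovariate(ALPHA) for _ in range(10_000)]

mean_w = statistics.mean(weights)
median_w = statistics.median(weights)

# Share of rocks lighter than twice the minimum weight.
share_small = sum(w < 2 * X_MIN for w in weights) / len(weights)

print(f"mean = {mean_w:.2f}, median = {median_w:.2f}")
print(f"share below 2 * X_MIN = {share_small:.2f}")
```

Most of the simulated rocks sit close to the minimum weight while a few heavy ones pull the mean above the median, so a symmetric interval around the mean would describe such data poorly.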

Re: Statistics question.
lorb wrote:Oh, I did not check the formula provided. It is indeed possible to calculate a prediction interval with the t-distribution from a sample, assuming the population is normal.
Your prediction interval is the sample mean plus/minus t*s*sqrt(1+1/n), where t is the relevant percentile from the t-distribution and s is the sample standard deviation. Wikipedia explains how this works.
Your sample is still very small and rocks are still rarely normally distributed.
I figured that was probably how it worked.
Well, now I know for sure how to do these types of questions. Also that example with the rocks was just that, an example. I really just wanted to know how to do those types of questions. Thanks.