standard deviation on number of events.

>-)
Posts: 527
Joined: Tue Apr 24, 2012 1:10 am UTC

standard deviation on number of events.

Given the times of a series of events, I can find the distribution of the amount of time elapsed between each event. Given this, I can predict the mean number of events which will happen in a certain time-frame -- but how can I find the standard deviation of that prediction? I'm not sure how to approach or solve this problem.

doogly
Dr. The Juggernaut of Touching Himself
Posts: 5538
Joined: Mon Oct 23, 2006 2:31 am UTC
Location: Lexington, MA

Re: standard deviation on number of events.

It sounds like what you want is the standard error:
http://en.wikipedia.org/wiki/Standard_error

It is the standard deviation of the values in your sample, divided by sqrt(sample size)
LE4dGOLEM: What's a Doug?
Noc: A larval Doogly. They grow the tail and stinger upon reaching adulthood.

Keep waggling your butt brows Brothers.
Or; Is that your eye butthairs?

cyanyoshi
Posts: 418
Joined: Thu Sep 23, 2010 3:30 am UTC

Re: standard deviation on number of events.

You might be looking for the standard error of the mean. The usual way to estimate it is to divide the sample standard deviation by the square root of the number of data points. Let's say you found your estimate of the mean, xbar = (x_1 + x_2 + ... + x_N)/N . The sample standard deviation is given by s = {[(x_1-xbar)^2 + (x_2-xbar)^2 + ... + (x_N-xbar)^2]/(N-1)}^0.5 . You can then usually get a decent estimate of the "standard deviation" of the distribution of xbar about the true mean by computing s/(N^0.5) . I'm glossing over the finer details, but this is only valid if you have good reason to assume that your observations are independent from each other.
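As a rough illustration of that recipe, here is a minimal Python sketch. The function name and the sample intervals are made up for the example, and it assumes the observations are independent, as noted above.

```python
import math

def standard_error_of_mean(samples):
    """Return (xbar, s, s / sqrt(N)) for a list of observations."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two observations")
    xbar = sum(samples) / n
    # Sample standard deviation with the N-1 denominator.
    s = math.sqrt(sum((x - xbar) ** 2 for x in samples) / (n - 1))
    return xbar, s, s / math.sqrt(n)

# Hypothetical inter-event times in seconds (illustrative data only).
intervals = [2, 4, 4, 4, 5, 5, 7, 9]
xbar, s, sem = standard_error_of_mean(intervals)
print(xbar, s, sem)
```

The third value is the estimated spread of xbar itself, not the spread of the individual intervals.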

...aaaand ninja'd.

>-)
Posts: 527
Joined: Tue Apr 24, 2012 1:10 am UTC

Re: standard deviation on number of events.

I'm afraid I wasn't making myself clear enough. If I knew the times between the events had a mean of say, 5 and a standard deviation of 3, then the standard error of the mean would tell me that the average of the times between nine events would have a deviation of 3/sqrt(9) = 1.

But I want to know, in 20 seconds, how many events will happen and the standard deviation of that prediction. I know how to find the mean of that -- 4 -- but not the standard deviation.

Tirian
Posts: 1891
Joined: Fri Feb 15, 2008 6:03 pm UTC

Re: standard deviation on number of events.

>-)
Posts: 527
Joined: Tue Apr 24, 2012 1:10 am UTC

Re: standard deviation on number of events.

I'm not sure I understand.

I understand vaguely/intuitively why the standard deviation of the harmonic mean would help -- but once I have it, what do I do with it?

Tirian
Posts: 1891
Joined: Fri Feb 15, 2008 6:03 pm UTC

Re: standard deviation on number of events.

I'm not 100% sure either. My hunch from way over here is that if your interval data was a, b, c, d, ..., that you'd find the harmonic mean and deviation of 20/a, 20/b, 20/c, ... to get the distribution for the number of events per period.

But I confess that statistics makes my head spin a little at this depth. If X is a normally distributed random variable, then 20/X would not have a normal distribution. But I've never figured out whether it is illuminating or distracting to pretend that it did.

doogly
Dr. The Juggernaut of Touching Himself
Posts: 5538
Joined: Mon Oct 23, 2006 2:31 am UTC
Location: Lexington, MA

Re: standard deviation on number of events.

I doubt the harmonic mean has anything to do with the original question.

cyanyoshi
Posts: 418
Joined: Thu Sep 23, 2010 3:30 am UTC

Re: standard deviation on number of events.

>-) wrote:I'm afraid I wasn't making myself clear enough. If I knew the times between the events had a mean of say, 5 and a standard deviation of 3, then the standard error of the mean would tell me that the average of the times between nine events would have a deviation of 3/sqrt(9) = 1.

But I want to know, in 20 seconds, how many events will happen and the standard deviation of that prediction. I know how to find the mean of that -- 4 -- but not the standard deviation.

Let's get this straight. You have a known distribution for the estimated mean (xbar). You are now trying to find the distribution for 20/xbar. Is that more-or-less accurate?

In that scenario, you could maybe get away with linearizing your 1/x function about the expected mean. If the true mean is μ, then xbar=μ+ε, where ε is the error (assumed to be really small). The standard error is simply the square root of the expected value of ε^2. Now,

20/xbar = 20/(μ+ε)
= (20/μ)/(1+ε/μ)
= (20/μ)*(1 - ε/μ + (ε/μ)^2 - ...)
~ 20/μ - 20*ε/μ^2

The new standard error then would be (E[20^2*ε^2/μ^4])^0.5 = 20*σ/μ^2 , or roughly 20 times your estimated standard error divided by your estimated mean squared.

Using the numbers you gave earlier, xbar = 5 and s_xbar = 1. The number of events you expect to occur in a 20-second span is 20/5 = 4, and the standard error of this prediction is 20*1/5^2 = 4/5.
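The linearized estimate can be written as a short Python sketch. The function and argument names are hypothetical, and the result is only a first-order approximation that assumes the error in xbar is small compared with the mean.

```python
def count_and_standard_error(window, mean_interval, se_mean_interval):
    """First-order estimate of window / xbar and its standard error."""
    expected_count = window / mean_interval
    # |d/dx (window / x)| evaluated at the mean is window / mean^2.
    se_count = window * se_mean_interval / mean_interval ** 2
    return expected_count, se_count

# Numbers from the thread: 20-second window, xbar = 5, standard error 1.
count, se = count_and_standard_error(20, 5, 1)
print(count, se)
```

Keeping the derivative factor explicit makes it easy to see why the approximation degrades once the standard error is no longer small relative to the mean.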

Sorry if I screwed up those symbols. It's past my bedtime.

>-)
Posts: 527
Joined: Tue Apr 24, 2012 1:10 am UTC

Re: standard deviation on number of events.

Yes, that's what I was looking for -- thank you.

edit: what happens if I have a large standard deviation, so that ε > μ?
This makes (20/μ)*(1 - ε/μ + (ε/μ)^2 - ...) diverge, and you end up with ???

Tirian
Posts: 1891
Joined: Fri Feb 15, 2008 6:03 pm UTC

Re: standard deviation on number of events.

Then you read the top comment to the top answer to the link that I posted that discusses the limitations of the answer that I already gave you.

cyanyoshi
Posts: 418
Joined: Thu Sep 23, 2010 3:30 am UTC

Re: standard deviation on number of events.

Then things get more complicated. You would have to go through with calculating a nonlinear function of a random variable. This might help if you are interested: https://www.cis.rit.edu/class/simg713/L ... 713-04.pdf . Long story short, if you have a probability density function φ(x), then the distribution of y=g(x) is given by:

φ(x_1)/|g'(x_1)| + ... + φ(x_m)/|g'(x_m)| where {x_i} is every x such that g(x) = y.

Since g(x) = 20/x has a nice inverse, this is rather straightforward. The mean of this new distribution is generally not even 20/μ, and you have to be smart about what you assume φ(x) to be. Is it possible for you to measure a negative time until the next event? If not, then φ(x) should be zero for all negative x.
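To make the formula concrete, here is a minimal Python sketch for g(x) = 20/x. The choice of φ is purely an assumption for illustration: a gamma density with mean 5 and standard deviation 1, which is zero for negative x. The function names and the integration grid are also made up for the example.

```python
import math

WINDOW = 20.0              # length of the time window, from the thread
SHAPE, SCALE = 25.0, 0.2   # assumed gamma parameters: mean 5, std 1

def phi(x):
    """Assumed density of x: gamma(SHAPE, SCALE), zero for x <= 0."""
    if x <= 0:
        return 0.0
    log_pdf = ((SHAPE - 1) * math.log(x) - x / SCALE
               - math.lgamma(SHAPE) - SHAPE * math.log(SCALE))
    return math.exp(log_pdf)

def density_of_y(y):
    """Density of y = WINDOW / x, i.e. phi(x) / |g'(x)| at x = WINDOW / y."""
    if y <= 0:
        return 0.0
    x = WINDOW / y                   # the only root of g(x) = y
    return phi(x) * x * x / WINDOW   # |g'(x)| = WINDOW / x^2

def moments_of_y(upper=40.0, steps=40000):
    """Midpoint-rule estimates of total probability, mean and std of y."""
    h = upper / steps
    total = mean = second = 0.0
    for i in range(steps):
        y = (i + 0.5) * h
        w = density_of_y(y) * h
        total += w
        mean += y * w
        second += y * y * w
    return total, mean, math.sqrt(second - mean ** 2)

total, mean_y, std_y = moments_of_y()
print(total, mean_y, std_y)
```

Under this assumed φ the mean of y comes out near 20/4.8 ≈ 4.17 rather than 20/5 = 4, and the standard deviation is somewhat larger than the linearized 4/5. A different φ would give different numbers.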

>-)
Posts: 527
Joined: Tue Apr 24, 2012 1:10 am UTC

Re: standard deviation on number of events.

with some algebra i think you can find it for e > u.
let e be epsilon, t be time (20), u be mu.

t/(u+e) = t/u + [ - t/u + t/(u+e) ]
= t/u + [tu - t (u+e)]/(ue+u^2)
= t/u - te/(u^2+ue)

so std = ts/(u^2 + us) where s is the original std.
which is remarkably similar to the other answer, ts/u^2, other than the us term in the denominator

cyanyoshi
Posts: 418
Joined: Thu Sep 23, 2010 3:30 am UTC

Re: standard deviation on number of events.

>-) wrote:with some algebra i think you can find it for e > u.
let e be epsilon, t be time (20), u be mu.

t/(u+e) = t/u + [ - t/u + t/(u+e) ]
= t/u + [tu - t (u+e)]/(ue+u^2)
= t/u - te/(u^2+ue)

so std = ts/(u^2 + us) where s is the original std.
which is remarkably similar to the other answer, ts/u^2, other than the us term in the denominator

That second term is not the standard deviation. Remember that you are working with probability distributions rather than simple numbers, because you will not know the true value of e ahead of time. That trick with replacing e with s only more-or-less works on linear functions, and te/(u^2+ue) is very much nonlinear. It even has a singularity!

Let me make a few things clear. The mean of a continuous random variable X with probability density function f(x) is defined as μ=E[X], the expected value of X, which is the integral from -∞ to +∞ of x*f(x) dx. The standard deviation is then defined as σ=(E[(X - μ)^2])^(0.5). Note that E[g(X)] is absolutely not equal to g(E[X]) in general. Using g(E[X]) can sometimes be a good first-order approximation, but the two are rarely exactly equal (one exception is when g is a linear function). You can't even claim that the mean of your new distribution is t divided by the mean of the old distribution, let alone say anything about the standard deviation, without more information.
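A tiny exact example may help here. The two-point distribution below is a made-up toy, not anything from the original data; it only shows that E[g(X)] and g(E[X]) differ for g(x) = 20/x but agree for a linear g.

```python
from fractions import Fraction

# Hypothetical toy distribution: X is 2 or 8, each with probability 1/2.
outcomes = {Fraction(2): Fraction(1, 2), Fraction(8): Fraction(1, 2)}

def expectation(g, dist):
    """E[g(X)] for a discrete distribution given as {value: probability}."""
    return sum(p * g(x) for x, p in dist.items())

mean_x = expectation(lambda x: x, outcomes)   # E[X] = 5

nonlinear = lambda x: 20 / x
linear = lambda x: 3 * x + 1

print(expectation(nonlinear, outcomes), nonlinear(mean_x))  # 25/4 versus 4
print(expectation(linear, outcomes), linear(mean_x))        # 16 versus 16
```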

mfb
Posts: 950
Joined: Thu Jan 08, 2009 7:48 pm UTC

Re: standard deviation on number of events.

Depending on the situation, the whole approach could be flawed. Are the times between your events independent of each other? If they are: fine, see above.
If they are not, you are probably overestimating (or underestimating) the uncertainty. Let's consider an extreme example: the events actually come exactly every 5 seconds, but your measurements have an (independent) uncertainty of ~1-2 seconds. That is in agreement with your previous description, but it changes the prediction a lot. For 100 seconds, for example, you'll see on average 20 events, maybe 19 or 21, but certainly not 15, which would be perfectly possible for independent events.
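A quick simulation can illustrate the contrast. This is only a sketch under stated assumptions: the independent case draws gaps from a gamma distribution with mean 5 and standard deviation 3, the periodic case uses Gaussian timing noise of 1.5 seconds with a random starting phase, and all function names are invented for the example.

```python
import random
import statistics

def count_renewal(window, mean=5.0, std=3.0, rng=random):
    """Events in (0, window] when gaps are independent gamma draws."""
    shape = (mean / std) ** 2
    scale = std ** 2 / mean
    t, n = rng.gammavariate(shape, scale), 0
    while t <= window:
        n += 1
        t += rng.gammavariate(shape, scale)
    return n

def count_jittered_clock(window, period=5.0, jitter=1.5, rng=random):
    """Events in [0, window) for a strict clock observed with timing noise."""
    phase = rng.uniform(0.0, period)
    n, k = 0, -5
    # Include ticks a few periods outside the window so noise can move them in.
    while phase + period * k < window + 5 * period:
        observed = phase + period * k + rng.gauss(0.0, jitter)
        if 0.0 <= observed < window:
            n += 1
        k += 1
    return n

def summarize(counter, window=100.0, trials=5000, seed=0):
    """Mean and standard deviation of the event count over many trials."""
    rng = random.Random(seed)
    counts = [counter(window, rng=rng) for _ in range(trials)]
    return statistics.mean(counts), statistics.pstdev(counts)

for name, counter in [("independent gaps", count_renewal),
                      ("jittered clock", count_jittered_clock)]:
    print(name, summarize(counter))
```

Both cases average close to 20 events in 100 seconds, but the spread of the count should be several times larger when the gaps are independent, which is the point of the example above.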