I know there are algorithms like the Box-Muller transform, the Marsaglia polar method, and the ziggurat algorithm that transform uniformly-distributed inputs to produce normally-distributed outputs. However, unless I am missing something huge, I am rather certain that those algorithms, as implemented, end up "missing" substantial portions of the representable floating-point numbers. That is, there are lots of floats/doubles that simply cannot be the output of those functions, regardless of the input. So, while the output is normally distributed among its possible outputs, it is not normally distributed across the whole range of representable outputs.

I am looking for a way to generate normally-distributed values such that, for any representable x, the probability of the algorithm outputting x is exactly tail(mean(x, nextafterf(x, -FLT_MAX))) - tail(mean(x, nextafterf(x, FLT_MAX))), where tail(z) gives the area under the normal PDF to the right of z. (The listed formula is for floats. For doubles, do the same thing with nextafter and DBL_MAX, carrying out the intermediate math in long double instead of double.)

The only way I've come up with to do so is inefficient and calculation-intensive (not to mention requiring the use of doubles to produce float values, and long doubles to produce doubles) and goes like this:

Spoiler:

For each possible exponent of a float, precalculate the probability of getting that exponent from a true normal distribution. For floats there are 255 exponents to check, and for doubles there are 2047, so this is pretty fast to calculate using erfc (or erfcl). Also precalculate the odds of getting the smallest (hence most likely) value with that exponent, using the area under the PDF for all values closer to that number than any other representable number.

Then when you need a value, generate a uniform random number with your favorite method, and use it to check which exponent to use, going by the precalculated odds (might need two random numbers to get enough precision to differentiate the boundaries.)

Now that you know what exponent your float/double will have, use rejection sampling to get a number with that exponent. First generate the rest of the bits uniformly at random (so, 23 mantissa bits for a float, 52 for a double, plus a sign bit; the leading significand bit is implicit once the exponent is fixed). Then calculate the probability of getting closer to that number than any other representable number from the normal distribution, as a fraction of the odds of getting the smallest number with that exponent (which was precalculated), to use as a rejection threshold.

Then generate another uniform value and check if it is less than the rejection threshold just calculated. If it is, return that value. If not, repeat to generate another candidate with the same exponent.

As you can imagine, this procedure is quite slow. Is there any known efficient way to produce normally-distributed float/double values that have exactly the ideal probability of producing each representable value?

headprogrammingczar wrote:Applying a known transformation to a larger space than float/double, and then converting, might be a more tractable solution.

That will certainly smooth out a lot of the granularity, but if I’m going to generate normally-distributed doubles with a standard method, I’d probably just do all my calculations at double precision rather than casting to float right away.

What's your motivation here: what are the limitations of any of the standard algorithms that you're trying to fix? Anywhere beyond about 10 sigma the Gaussian tails are so small that they will probably be dominated in any physical process by other error sources. Far out on the Gaussian tails the probabilities get absurdly small, and if you're doing any sort of transformations from a finite-period PRNG you won't be able to represent these probabilities.

That is, suppose you have a PRNG with period T ~ 2^t, taking roughly t bits of state --- e.g., the Mersenne twister MT19937 has t=19937 --- and so giving you probability discretized in chunks of 1/T. The single-tail Gaussian integral for x > M >> 1 is

Q(M) = integral from M to infinity of e^(-x^2/2)/sqrt(2 pi) dx ~ e^(-M^2/2) / (M sqrt(2 pi)),

which falls below the smallest representable chunk 1/T ~ 2^(-t) once M^2/2 exceeds t ln 2 ---

i.e., you can't accurately model a tail any farther out than roughly the square root of the number of state bits: so for MT19937, with ~19937 bits of state, you can't model the tail past x ~ sqrt(19937) ~ 140.

You can of course build a bigger PRNG, but (1) the scaling is pretty bad, and (2) I can't see any real application for it.

Also, a nitpick: what you probably want for the probability of outputting x is the probability integrated over either [x,nextafter(x)] or [nextbefore(x),x] or [(x+nextbefore(x))/2,(x+nextafter(x))/2], not [nextbefore(x),nextafter(x)]; the latter double-counts the probability.

Mostly it’s just my aesthetic as a mathematician. Certainly for any macro-scale interval, the standard algorithms do a stand-up job of approximating the Gaussian probability of landing in that interval. But when you zoom in to the scale of representable floats, suddenly the PDF that is actually employed looks chaotic: nearby floats have vastly different probabilities of being selected. Yes, the running average over a reasonable window is nice and smooth, but for individual floats the odds jump up and down substantially. And I’m talking about the whole distribution, not just the tails: instead of being a textbook bell curve, the PDF is a spiky monstrosity that jumps from zero up to many times higher than the bell curve and back down to zero, all the time.

Isn't that true of all operations on floats, though? I mean, if you take a uniformly distributed float between 2 and 3, and then subtract 1, for all practical purposes you'll get a uniformly distributed float between 1 and 2. But with 64-bit floats, there are 2^52 numbers between 1 and 2 and only 2^51 between 2 and 3, so some clustering is inevitable. And this is just after an operation as simple as subtracting 1.

If your aesthetic as a mathematician makes you distrust micro-scale patterns beyond your control, you should probably just stay away from floats altogether. It's a chronic problem with them. You should also avoid any contact with floating-point in a system that needs to be cryptographically secure.

Generating random integers is "easy" enough, so my suggestion: if you want a random float, take a transcendental number like pi, multiply it by a random int, divide by another random int, and convert to the necessary range.