hotaru wrote: it can actually be improved more than that, by getting 32 bits at a time from arc4random() instead of 1 at a time from arc4random_uniform(), so one call to arc4random() is only needed 1 in 4096 times, a second only 1 in 17592186044416 times, a third only 1 in 75557863725914323419136 times, etc.:

```c
void randomDoubleArray(double *v, const int n)
{ assert(sizeof(double) * CHAR_BIT == 64);
  arc4random_buf(v, n * sizeof(double));
  for(int i = 0; i < n; ++i)
  { uint64_t *const p = (uint64_t *)v + i,
             e = 1022;
    uint32_t r = *p >> 52;
    int i = 12;
    while(e && r & 1)
    { --e; --i; r >>= 1;
      if(!i)
      { r = arc4random();
        i = 32; }}
    *p &= 0x000ffffffffffffful;
    *p |= e << 52; }}
```
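As an aside for readers without arc4random() on hand, the same trick can be sketched portably with a stand-in PRNG. The xorshift32 generator and the function name below are mine, not hotaru's, and the shadowed inner `i` is renamed for clarity:

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for arc4random(): a tiny xorshift32 PRNG, used here only
   so the sketch is portable and deterministic. Not cryptographic. */
static uint32_t rng_state = 0x12345678u;
static uint32_t next32(void)
{
    rng_state ^= rng_state << 13;
    rng_state ^= rng_state >> 17;
    rng_state ^= rng_state << 5;
    return rng_state;
}

void randomDoubleArraySketch(double *v, const int n)
{
    assert(sizeof(double) * CHAR_BIT == 64);
    for (int i = 0; i < n; ++i) {
        /* 64 random bits; the top 12 seed the trailing-ones count */
        uint64_t bits = ((uint64_t)next32() << 32) | next32();
        uint64_t e = 1022;              /* biased exponent for [1/2, 1) */
        uint32_t r = (uint32_t)(bits >> 52);
        int left = 12;                  /* random bits remaining in r */
        while (e && (r & 1)) {
            --e; --left; r >>= 1;
            if (!left) { r = next32(); left = 32; }
        }
        bits &= 0x000ffffffffffffful;   /* keep the 52 mantissa bits */
        bits |= e << 52;                /* install the new exponent */
        memcpy(&v[i], &bits, sizeof bits);
    }
}
```

Each trailing one bit halves the value's subinterval, so the exponent field ends up geometrically distributed, exactly as in the quoted version.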

Yeah, I realized that and implemented it shortly after I posted. Be careful with your two different variables “i” though; the inner `int i = 12;` shadows the loop counter. My version looks like this inside the “if” from my previous post:

```c
e = 1010; bool proceeding = true;
while (proceeding && e) {
    uint32_t b = arc4random();
    proceeding = (b == (uint32_t)-1);  /* all 32 bits are ones: more needed */
    while (b & 1) { b >>= 1; e--; }
    e = (e > 1024) ? 0 : e;            /* e underflowed: clamp to zero */
}
```

However, testing reveals no significant speed difference from the previous version. Then again, the difference can only come up 1 time in 8,192, and half of those times it merely saves a single extra function call.

hotaru wrote: i'm pretty sure using nextafter like that would make the results (very slightly) non-uniform.

No more so than it already is. Currently, what we are doing has the same outcome as this process:

1. Partition [0, 1) into half-open subintervals bounded by the representable doubles.

2. Choose a real in [0, 1) uniformly at random.

3. Find the lower bound of the subinterval the chosen number is in.

4. Return that lower bound.

Inserting a single “nextafter” transforms it into:

1. <as above>

2. <as above>

3. Find the upper bound of the subinterval the chosen number is in.

4. Return that upper bound.

Essentially, as an approximation for a true uniform distribution, the “standard” version is slightly on the low side, and the “nextafter” version is slightly on the high side.

If we wanted to make the standard version as “perfect” as possible, we’d have to do something pretty subtle. I’m not even sure exactly what. If there were only one bit so the options were 0 and ½, then we’d always have to pick ½ to make the average correct, but that would blow up our variance. If the options were 0, ¼, ½, and ¾, then…I haven’t thought it through and I’m not about to.

hotaru wrote: if generating subnormals is a problem, replacing "e" in the while statement with "e > 1" should fix that, and produce values in [2^{-1022}, 1).

True, though that prevents both 0 and 1 from being rolled.

KnightExemplar wrote: Can't you do a 2nd pass for subnormals?

Basically, a two pass algorithm that does:

Pass 1: What you did in your original post

Pass 2:

Poisson distribution for subnormals. Basically the pseudocode would be:

```
for (i = 0; i < array_length; i += poisson_distribution(1/2^52)) {
    array[i] = random_subnormal();
}
```

And random_subnormal() would be just a simple random-bits with the exponent set to... whatever the magic number for subnormals was. The Poisson distribution would not have to be "efficient", because it would be generated _very_ rarely. In fact, subnormals will only happen once every 32 petabytes on average (once every 2^52 doubles, and each double takes up 8 bytes). But if the Poisson distribution did its magic, even the subnormals would be generated at the correct rate with the above pseudocode.

The other thing you can do is declare your code correct "on the average" for all buffer sizes under 1 TB (which has roughly a 2^37/2^52 = 1/2^15 chance of containing a subnormal, since 1 TB holds 2^37 doubles).

I’m not quite sure I follow. If my original function doesn’t produce any doubles with granularity finer than 1/2^{52}, why would I add to it the ability to generate subnormals specifically?

In any case, the reason I am taking the natural log of numbers in (0, 1] in the first place is to generate exponentially-distributed random variates (either for their own use, or as a step in the Box-Muller transform to generate normal variates). So it seems a little silly to use a Poisson distribution in the production of the initial uniform variates.

I suppose I could switch to a ziggurat algorithm for more complex distributions, but frankly I’d rather not at this point.

KnightExemplar wrote: Something I would be worried about though is that smaller and smaller numbers have less precision. One in eight numbers (i.e. all numbers 0.125 and lower) generated by your example will only have 49 random bits in the mantissa (the bottom 3 bits would be left to the rounder). So your numbers get less and less precision the smaller they get. Maybe doing a 2nd pass with some extra random bits would do the trick? I think your methodology is very good at choosing the correct exponent efficiently. But tacking on some extra bits at the end can't hurt.

While you are technically correct, the 49 bits in your example represent exactly the value that I want represented. There is no “rounding” when going from 1.125 to 0.125. Yes, the latter has more trailing zeros, but both represent exact values: namely, some multiple of 2^{−52}.

All multiples of 2^{−52} are equally likely, and none get skipped, so I don’t see the problem. That said, the more I think about it the more I like Hotaru’s approach. Especially for my purposes, where finer granularity at smaller values corresponds directly to a smoother distribution of output from the natural log, and thus a more representative mix of exponential variates.