Analysis of the Java RNG

A place to discuss the science of computers and programs, from algorithms to computability.

Formal proofs preferred.

Moderators: phlip, Moderators General, Prelates

>-)
Posts: 527
Joined: Tue Apr 24, 2012 1:10 am UTC

Analysis of the Java RNG

Postby >-) » Sat Apr 21, 2018 11:39 am UTC

Given a seed, the java code for setSeed is:

Code: Select all

seed = (seed ^ 0x5DEECE66DL) & ((1L << 48) - 1)


The code to get the next n bits is

Code: Select all

seed = (seed * 0x5DEECE66DL + 0xBL) & ((1L << 48) - 1)
result = (int)(seed >>> (48 - n))


And finally the code to get the nextLong (which is 64 bits) is

Code: Select all

result = ((long)next(32) << 32) + next(32)

(it may be useful to know that java evaluates the next(32) on the left before the next(32) on the right)

Suppose I pick an arbitrary starting seed m, then call setSeed and call nextLong. Call the returned value x. Given the lower 48 bits of x, I want to recover the upper 16 bits of x. Is it possible?

Hopefully, this should be relatively easy since setSeed already throws away 16 bits of m away, leaving 48 bits of uncertainty. In order to accomplish this task, it would be sufficient to figure out the lower 48 bits of m.

Another question: What can we say about the lower 48 bits of x? Are all possible values covered, just most of them, or almost none of them?

I don't have much ideas on what to try other than throwing everything into a SAT solver. Is there a smarter way?

Some motivating background: It's relatively easy to recover the lower 48 bits of a minecraft world's seed. However, the upper 16 bits are more difficult. The full 64 bit seed is generated by the procedure described in this problem. In addition, can we rule out parts of the lower-48-bits-space to make the search easier?

Tub
Posts: 399
Joined: Wed Jul 27, 2011 3:13 pm UTC

Re: Analysis of the Java RNG

Postby Tub » Mon Apr 23, 2018 10:23 am UTC

There's plenty of research about Linear Congruential Generators, if you feel like diving in. They have a few weaknesses, but AFAIK none that would help you here.

The state of the RNG has 48 bits, thus only 2^48 of the possible 2^64 different longs can ever be returned by nextLong. According to the docs, they're "uniformly distributed", and we have no further guarantees than that.

Let me reorder your questions a little:
Another question: What can we say about the lower 48 bits of x? Are all possible values covered, just most of them, or almost none of them?

With no further knowledge than "uniformly distributed", let's ask stochastics: You pick 2^48 numbers at random from a pool of 2^48 numbers. How likely is it that x was picked at least once?
p(x is in the sequence) = 1 - (1 - 1/2^48)^(2^48) = ~63%

Every number has a ~63% chance to be picked, so we'd expect 63% of all possible values of the lower 48 bits to appear at least once, and ~37% of the values to never appear.

Gaining deeper insight into the structure of the RNG (either mathematically or by brute forcing) would yield a more accurate answer, because a LCG is not perfectly uniformly distributed. But it's extremely unlikely for every value to be picked exactly once.

Suppose I pick an arbitrary starting seed m, then call setSeed and call nextLong. Call the returned value x. Given the lower 48 bits of x, I want to recover the upper 16 bits of x. Is it possible?

As per your previous question, there may be multiple possibilities for the upper 16 bits.

You can just advance the java RNG 2^48 times and compare all 2^48 possible long values with your lower 48 bits. But that would take a few hundred years, or $100.000 for a cloud provider.

You can reduce that. Your RNG goes through three states for nextLong(), s0 -> s1 -> s2. The upper 32 bits of s1 are the upper 32 bits of your long; the upper 32 bits of s2 are the lower 32 bits of your long.
Thus, you know the middle 16 bits of s1. You can try all 2^32 possible states for s1 with the known middle bits and advance them once, comparing the new state to the known 32 bits of s2. If you find a match, then you have both s1 and s2, so you can derive the full value of x. Again, you may get multiple x, and then you'll have to manually check.

Brute forcing 2^32 advances should be possible in a couple of hours, if you use optimized code. This is one where I'd dedust the old c compiler.

Some motivating background: It's relatively easy to recover the lower 48 bits of a minecraft world's seed. However, the upper 16 bits are more difficult. The full 64 bit seed is generated by the procedure described in this problem. In addition, can we rule out parts of the lower-48-bits-space to make the search easier?

So there's just 2^16 possible worlds that'd match your lower 48 bits, and you could generate those and compare them to your savegame. Not the fastest or easiest solution, but it's another possible approach.

User avatar
Xanthir
My HERO!!!
Posts: 5327
Joined: Tue Feb 20, 2007 12:49 am UTC
Location: The Googleplex
Contact:

Re: Analysis of the Java RNG

Postby Xanthir » Mon Apr 23, 2018 9:48 pm UTC

Tub wrote:
But it's extremely unlikely for every value to be picked exactly once.

And, due the way LCGs work, once you *do* repeat a value, you loop the sequence. Anything not hit before the first repeat will never be hit.
(defun fibs (n &optional (a 1) (b 1)) (take n (unfold '+ a b)))

User avatar
Thesh
Made to Fuck Dinosaurs
Posts: 6271
Joined: Tue Jan 12, 2010 1:55 am UTC
Location: Colorado

Re: Analysis of the Java RNG

Postby Thesh » Mon Apr 23, 2018 11:36 pm UTC

Only if the size is of the state equals the output. I believe in this case there should be 2^16 unique sequences of digits for a 32 bit number.
Summum ius, summa iniuria.

User avatar
Xanthir
My HERO!!!
Posts: 5327
Joined: Tue Feb 20, 2007 12:49 am UTC
Location: The Googleplex
Contact:

Re: Analysis of the Java RNG

Postby Xanthir » Tue Apr 24, 2018 6:31 pm UTC

Oh duh, yeah, sorry, was assuming that the output was the state. If the output is trimmed from the state, you can def have repeated outputs that still have unique state and thus different successors.
(defun fibs (n &optional (a 1) (b 1)) (take n (unfold '+ a b)))


Return to “Computer Science”

Who is online

Users browsing this forum: No registered users and 6 guests