Hmm. A fast XdY function could be fun.

Here is an attempt:

Use a binomial distribution trick to calculate how many of the X dice with Y faces land on 1 (each die does so with probability 1/Y).

Then eliminate that many dice from your pool, leaving X' dice. The dice that remain are effectively dice with Y-1 faces, so do the same for the number of those X' dice that land on their lowest face -- this is the number of 2s you get.

Repeat until you eat all of the faces (i.e., when you have X'''''..''' dice with 1 face, all of them land on it, so it is guaranteed to terminate).
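A rough sketch of that loop, with std::binomial_distribution standing in for the "binomial trick" (that substitution is my assumption, and roll_by_faces is a made-up name):

```cpp
#include <cassert>
#include <random>

// Sum of `count` dice with `faces` sides, sampled one face value at a time.
// Hypothetical helper: the name and signature are not from any library.
template <class Rng>
long long roll_by_faces(long long count, int faces, Rng& rng) {
    assert(count >= 0 && faces >= 1);
    long long total = 0;
    long long remaining = count;  // dice not yet assigned a face
    for (int face = 1; face < faces && remaining > 0; ++face) {
        // Each die still in the pool shows `face` with probability
        // 1 / (number of faces not yet eliminated).
        const int faces_left = faces - face + 1;
        std::binomial_distribution<long long> hits(remaining, 1.0 / faces_left);
        const long long n = hits(rng);
        total += n * face;
        remaining -= n;
    }
    // Only one face is left, so every remaining die shows it.
    total += remaining * faces;
    return total;
}
```

Something like roll_by_faces(100, 4, rng) with a std::mt19937_64 would then stand in for 100d4, using at most Y-1 binomial draws instead of 100 uniform ones.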

Now you can precalculate the exact binomial distributions and store them in a table of dimension X x Y x (X+1): dice count, die size, and the number Q of dice landing on 1 (which runs from 0 to X).

If the largest dice you use are d20s, and you use them in clumps of up to 100 dice, that's a table with about 200,000 entries (100 x 20 x 101), each of them a probability of (X dice with Y sides) having (exactly Q dice landing on a 1).
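Building that table could look something like this. It is only a sketch: ones_pmf, build_table and the nested-vector layout are my own choices for illustration, and a flat array would likely be faster in practice.

```cpp
#include <cassert>
#include <vector>

// pmf[q] = probability that exactly q of `dice` dice with `faces` sides show a 1.
// Hypothetical helper, built up one die at a time to avoid big factorials.
std::vector<double> ones_pmf(int dice, int faces) {
    assert(dice >= 0 && faces >= 1);
    const double p = 1.0 / faces;
    std::vector<double> pmf(dice + 1, 0.0);
    pmf[0] = 1.0;  // zero dice: certainly zero 1s
    for (int n = 1; n <= dice; ++n) {
        // P_n(q) = P_{n-1}(q) * (1 - p) + P_{n-1}(q - 1) * p,
        // updated from high q to low q so old values are still available.
        for (int q = n; q >= 1; --q)
            pmf[q] = pmf[q] * (1.0 - p) + pmf[q - 1] * p;
        pmf[0] *= (1.0 - p);
    }
    return pmf;
}

// table[n][y - 1][q]: n dice, y faces, exactly q dice showing a 1.
using BinomialTable = std::vector<std::vector<std::vector<double>>>;

BinomialTable build_table(int max_dice, int max_faces) {
    BinomialTable table(max_dice + 1);
    for (int n = 0; n <= max_dice; ++n) {
        table[n].reserve(max_faces);
        for (int y = 1; y <= max_faces; ++y)
            table[n].push_back(ones_pmf(n, y));
    }
    return table;
}
```

With build_table(100, 20), the row table[100][3] would hold the 101 probabilities for "how many of 100d4 came up 1".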

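For a given XdY roll, you break X down into chunks of 100. Draw a uniform random number. Accumulate probabilities up the table row until you pass that number; the index where you stop is how many dice landed on that face. Repeat for each face of the die.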
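The "accumulate up the table" step is just an inverse-CDF scan over one row. A minimal sketch, assuming the row is a vector of probabilities indexed by Q and that u is a uniform number in [0, 1) (pick_count is a hypothetical name):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the smallest q whose cumulative probability exceeds u.
// `pmf` is one table row; `u` is a uniform random number in [0, 1).
std::size_t pick_count(const std::vector<double>& pmf, double u) {
    assert(!pmf.empty() && u >= 0.0 && u < 1.0);
    double cumulative = 0.0;
    std::size_t last_possible = 0;  // highest q seen with nonzero probability
    for (std::size_t q = 0; q < pmf.size(); ++q) {
        if (pmf[q] > 0.0) last_possible = q;
        cumulative += pmf[q];
        if (u < cumulative) return q;
    }
    // Only reached if rounding left the row summing to slightly less than 1.
    return last_possible;
}
```

The value of u could come from std::uniform_real_distribution<double>(0.0, 1.0); one such draw and one scan per face is all a chunk needs.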

This gives you mathematically exact results (up to floating-point rounding) in O(Y) table scans for up to 100 dice being rolled; each scan touches at most 101 entries.

For beyond 100 dice... you could just scale the result naively (so 15000d4 is 150 * 100d4), or you could do a double-scaling of both magnitude and variance: 15000d4 = {[(100d4 - 100*2.5)*SD_scale_factor] + 100*2.5} * 150. Then fuzz it with a random +/- (SD_scale_factor * 150)/2 term thrown on top (so you don't get discrete jumps every SD_scale_factor * 150 units).
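Here is a sketch of the double-scaling part only, without the fuzz term. I'm assuming SD_scale_factor should be 1/sqrt(150) here: the standard deviation of 150 independent chunks added together grows by sqrt(150), while multiplying one chunk by 150 grows it by 150. The function name is made up.

```cpp
#include <cassert>
#include <cmath>

// Stretches one rolled chunk (e.g. 100d4) into an approximation of a much
// larger roll (e.g. 15000d4). Hypothetical helper, not an exact roll.
double scale_chunk_roll(double chunk_sum, int chunk_dice,
                        long long total_dice, int faces) {
    assert(chunk_dice > 0 && total_dice >= chunk_dice && faces >= 1);
    const double copies = static_cast<double>(total_dice) / chunk_dice;  // 150
    const double chunk_mean = chunk_dice * (faces + 1) / 2.0;            // 100 * 2.5
    // Assumption: shrink the deviation so the final spread matches that of
    // `copies` independent chunks added together.
    const double sd_scale_factor = 1.0 / std::sqrt(copies);
    return ((chunk_sum - chunk_mean) * sd_scale_factor + chunk_mean) * copies;
}
```

A chunk that lands exactly on its mean of 250 maps to 37500, and a chunk that is 10 over maps to roughly 37500 + 122, rather than the 37500 + 1500 the naive scaling would give.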

And for lower amounts (like 1d4), you just roll 1d4.

In C++, I'd implement this as a die roll description object. I.e., die( 4 ) would be a 1d4, unrolled.

Calling .roll() would do the actual work.

die(4)^100 might be 100d4 (the interior term rolled 100 times, and added).

die(4)+50 would be 1d4+50.

die(4)*50 would be 50*1d4.

( 2*die(4)+die(8)+2 )^10 is 10d4 *2 + 10d8 + 20.

The object would have a collection of (die size), (die count) and (multiplier) triples, plus a constant term (equivalently, a d1 term).

^ number scales the die count of the term (and the constant term).

* and / scales the multipliers (and the constant term).

+ adds two sets of terms.

.roll() then runs something like the above, doing different things for low die-totals than huge die totals, in order to generate a reasonably fast result.
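To make the operator rules concrete, here is a bare-bones sketch of such an object. Term, DiceExpr and the field names are all made up for illustration, division is left out, and roll() just loops over the dice naively; the faster strategies would slot in there.

```cpp
#include <cassert>
#include <random>
#include <vector>

// One group of identical dice: `count` dice with `faces` sides,
// whose summed result is multiplied by `multiplier`.
struct Term {
    int faces;
    long long count;
    long long multiplier;
};

struct DiceExpr {
    std::vector<Term> terms;
    long long constant = 0;

    // Naive placeholder: rolls every die individually.
    template <class Rng>
    long long roll(Rng& rng) const {
        long long total = constant;
        for (const Term& t : terms) {
            std::uniform_int_distribution<int> face(1, t.faces);
            long long sum = 0;
            for (long long i = 0; i < t.count; ++i) sum += face(rng);
            total += sum * t.multiplier;
        }
        return total;
    }
};

// die(4) describes a single unrolled d4.
inline DiceExpr die(int faces) {
    assert(faces >= 1);
    DiceExpr e;
    e.terms.push_back({faces, 1, 1});
    return e;
}

// + adds two sets of terms (or bumps the constant).
inline DiceExpr operator+(DiceExpr a, const DiceExpr& b) {
    a.terms.insert(a.terms.end(), b.terms.begin(), b.terms.end());
    a.constant += b.constant;
    return a;
}
inline DiceExpr operator+(DiceExpr a, long long k) {
    a.constant += k;
    return a;
}

// * scales the multipliers and the constant term.
inline DiceExpr operator*(DiceExpr a, long long k) {
    for (Term& t : a.terms) t.multiplier *= k;
    a.constant *= k;
    return a;
}
inline DiceExpr operator*(long long k, DiceExpr a) { return a * k; }

// ^ scales the die counts and the constant term.
inline DiceExpr operator^(DiceExpr a, long long n) {
    for (Term& t : a.terms) t.count *= n;
    a.constant *= n;
    return a;
}
```

With this, ( 2*die(4)+die(8)+2 )^10 ends up holding a d4 term (count 10, multiplier 2), a d8 term (count 10, multiplier 1) and a constant of 20. One catch: in C++ ^ binds more loosely than + and *, so die(4)^100 + 5 parses as die(4)^(100+5) unless you add parentheses.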

If I was really crazy, I'd extend it to support (1d4)d6. (not that tricky, but probably not worth it).

The best part would be that the 'using' code would now look more natural, and I could change how dice are rolled willy-nilly without ever touching client code again (by modifying how .roll() works).

But I like coding this kind of thing.