@wumpus: Thanks! That was quite helpful. My code is now

Code:

```python
import multiprocessing as mp
from time import sleep

def multifactor(n):
    # Note that the multiprocessing incurs overhead to the tune of 0.1s.
    # This is generally negligible compared to the time to factor the sort
    # of numbers this is fed and the time already spent in the first two
    # phases of the factoring.
    # I'm defining a function inside another function. Something's wrong with me.
    def factory(method, n, output):
        output.put(method(n))
    facs = mp.Queue()
    p = mp.Process(target=factory, args=(pollardRho_brent, n, facs))
    e = mp.Process(target=factory, args=(ecm, n, facs))
    m = mp.Process(target=factory, args=(mpqs, n, facs))
    p.start(); e.start(); m.start()
    while facs.empty(): sleep(0.01)
    p.terminate(); e.terminate(); m.terminate()
    return facs.get()
```

And as for the premature optimization... this parallel function is called only after the number of interest has already been chugged on for several million rounds of trial division and Pollard's rho. What remains in the unfactored portion of the number should therefore be only large primes (which `ecm` does a rather good job of extracting) and very large primes (which is `mpqs`'s specialty). I'm including `pollardRho_brent` in the mix just in case a small factor or two somehow escaped the first few million rounds (which occasionally happens in my testing). I would call these without parallelism, but to do so effectively would involve determining the size of the factors of the input beforehand, and the only real way to do that is to just factor the number, so we're back where we started.
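For anyone else wanting to try this "race the algorithms, take the first answer" pattern, here's a minimal self-contained sketch. The `slow_method`/`fast_method` names are stand-ins I made up for illustration, not the actual factoring routines; `worker` is defined at module level so it pickles under any multiprocessing start method.

```python
import multiprocessing as mp
from time import sleep

def slow_method(n):
    sleep(0.5)   # stand-in for an algorithm that takes a while on this input
    return 101

def fast_method(n):
    sleep(0.05)  # stand-in for an algorithm that finds a factor quickly
    return 3

def worker(method, n, output):
    # Run one method and report its result through the shared queue.
    output.put(method(n))

def race(methods, n):
    """Start every method on n in parallel; return whichever answers first."""
    results = mp.Queue()
    procs = [mp.Process(target=worker, args=(m, n, results)) for m in methods]
    for p in procs:
        p.start()
    while results.empty():   # poll until some process reports a result
        sleep(0.01)
    for p in procs:          # kill the losers; they haven't touched the queue
        p.terminate()
    return results.get()

if __name__ == "__main__":
    print(race([slow_method, fast_method], 303))  # prints 3
```

The polling loop mirrors the one in `multifactor`; a `Queue.get()` with blocking would also work, but polling keeps the terminate-the-losers step in the parent's hands.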

Also, thanks for pointing out PyPy. It cut the time on my benchmarks by a factor of 10!