## Torsion subgroups for elliptic curves over Q

For the discussion of math. Duh.

Moderators: gmalivuk, Moderators General, Prelates

LucasBrown
Posts: 299
Joined: Thu Apr 15, 2010 2:57 am UTC
Location: Poway, CA

### Torsion subgroups for elliptic curves over Q

TL;DR: Is there a way faster than the Nagell-Lutz theorem to tell whether the torsion subgroup of the elliptic curve y2 = x3 + ax + b over Q, where a and b are integers, is trivial?

Context:

As a personal project I wrote some code to analyze an elliptic curve over the rationals to extract and classify its torsion subgroup. It includes some lines to collect statistics about how often the 15 possibilities enumerated in Mazur's theorem are encountered.

It loops over all curves of the form y2 = x3 + ax + b (where a and b are integers) with |a| + |b| = 0, then all curves of that form with |a| + |b| = 1, then all curves of that form with |a| + |b| = 2, etc.

So far, it's examined 680,030,280 curves, and of those, it's found 1 torsion subgroup of order 5, 1 of order 9, 2 of order 7, 2 of order 8 (both 2x4, neither cyclic), 30 of order 6, 994 of order 3, 1636 of order 4 (287 cyclic, 1349 2x2), and 330794 of order 2. The remaining 679696820 torsion subgroups (99.95%) were trivial.

Currently, my code computes all points in the torsion subgroup via the Nagell-Lutz theorem. While this is pretty quick for individual curves, it's a bit slow when applied to such large numbers of curves. To that end, I am looking for a faster way to tell whether a given curve's torsion subgroup is trivial.

Code: Select all

#! /usr/bin/env python3

from fractions import Fraction as Frac
from labmath import divisors, factorint, isqrt
from itertools import count, chain

class ECPoint(object):
def __init__(self, a, b, x, y):
self.a, self.b, self.x, self.y, self.isinf = a, b, Frac(x), Frac(y), False
if y**2 != x**3 + a*x + b: raise Exception("The point (%s, %s) is not on the given curve (%d, %d)." % (x, y, a, b))

def __eq__(self, Q):
if self.a != Q.a or self.b != Q.b: return False
if Q.isinf or self.isinf: return Q.isinf and self.isinf
return self.x == Q.x and self.y == Q.y

def __str__(self): return "(%s, %s) on (%d, %d)" % (self.x, self.y, self.a, self.b)

def __neg__(self): return ECPoint(self.a, self.b, self.x, -self.y)

if (self.a, self.b) != (Q.a, Q.b): raise Exception("Cannot add points on different curves.")
if isinstance(Q, ECPinf): return self
x1, y1, x2, y2 = self.x, self.y, Q.x, Q.y
if (x1, y1) == (x2, y2):
if y1 == 0: return ECPinf(self.a, self.b)
m = (3*x1*x1 + self.a) / (2*y1)
else:
if x1 == x2: return ECPinf(self.a, self.b)    # vertical line
m = (y2 - y1) / (x2 - x1)
x3 = m*m - x2 - x1
y3 = m*(x3 - x1) + y1
return ECPoint(self.a, self.b, x3, -y3)

def __sub__(self, Q): return self + -Q

def __mul__(self, n):
if not isinstance(n, int): raise Exception("Points can only be multiplied by integers.")
if n < 0: return -self * -n
if n == 0: return ECPinf(self.a, self.b)
Q = self
R = self if n & 1 == 1 else ECPinf(self.a, self.b)
i = 2
while i <= n:
Q = Q + Q
if n & i == i: R = Q + R
i = i << 1
return R

def __rmul__(self, n): return self * n

class ECPinf(ECPoint):
def __init__(self, a, b): self.a, self.b, self.isinf = a, b, True
#def __eq__(self, Q): return (self.a, self.b) == (Q.a, Q.b) and Q.isinf    # TODO exception if different curves?
def __str__(self): return "(\u221e,\u221e) on (%d, %d)" % (self.a, self.b) # That's unicode for infinity.
def __neg__(self): return self
if (self.a, self.b) != (Q.a, Q.b): raise Exception("Cannot add points on different curves.")
return Q
def __mul__(self, n):
if not isinstance(n, int): raise Exception("Points can only be multiplied by integers.")
return self

ttypes = ('1', '2', '3', '4', '5', '6', '7', '8', '9', '10', '12', '2x2', '2x4', '2x6', '2x8')
tstats = {x:0 for x in ttypes}
facs = {}

def spiral():
global facs, tstats
yield (0,0)
for s in count(1):
#print('\b'*42, s, end='', flush=True)
if s % 10 == 0: facs = {}
for x in range(s-1): yield (s-x-1,x)
for x in range(s-1): yield (-x,s-x-1)
for x in range(s-1): yield (1+x-s,-x)
for x in range(s-1): yield (x,1+x-s)
print('\b'*180 + str(s) + '    ' + '    '.join("%s: %d" % (x,tstats[x]) for x in ttypes), end='', flush=True)

def divs(n, facs):
if n not in facs: facs[n] = factorint(n)
yield from divisors(facs[n])

def torstruct(a, b, torsion):
if len(torsion)+1 in (1, 2, 3, 5, 6, 7, 9, 10): return str(len(torsion)+1)
if len(torsion)+1 == 16: return "2x8"
for (x,y) in torsion:
P = ECPoint(a, b, x, y)
Q, n = P, 1
while Q != ECPinf(a,b):
Q += P
n += 1
#assert Q == n*P
if n == len(torsion)+1: return str(len(torsion)+1)
return "2x%d" % ((len(torsion)+1)//2)

for (a,b) in spiral():
#print('\b'*42, abs(a)+abs(b), a, b, end='    ', flush=True)
D = abs(4*a*a*a + 27*b*b)
if D == 0: continue

# y^2 == x^3 + a*x + b is an elliptic curve over Q.  Now we find integer points on it.

Z = set()   # This will be the set of all points on the curve with finite order.
for y in chain([0], divs(D, facs)):
if y != 0 and D % (y*y) != 0: continue

# We must now find the integer roots of x^3 + a*x + b - y^2.
# By the rational roots theorem, we just have to test 0 and divisors of b-y^2.
for x in divs(abs(b-y*y), facs):
if b + a*x + x*x*x == y*y: Z |= {( x,y), ( x,-y)}
if b - a*x - x*x*x == y*y: Z |= {(-x,y), (-x,-y)}
if b == y*y:    # The rational roots theorem doesn't help us when the constant term is zero.
# We're looking for roots of x^3 + ax.  These are 0, sqrt(-a), and -sqrt(-a).
Z |= {(0,y), (0,-y)}
ar = isqrt(-a)
if ar*ar == -a: Z |= {(ar,y), (-ar,y), (ar,-y), (-ar,-y)}

# Z should now contain all points on the curve with finite order, and possibly some other points.

torsion = set()
for (x,y) in Z:
P = ECPoint(a, b, x, y)
# The denominators on these points get very big very fast.  It's much more efficient to work up to 13P in 1P increments
# and bail early than to jump straight to 13P.
Q = inf = ECPinf(a, b)
for n in range(1, 13):
Q += P
if Q == inf: torsion.add((x,y)); break  # P is a torsion point.
if Q.x.denominator != 1 != Q.y.denominator: break   # P is not a torsion point.
else: assert False  # My understanding of torsion points needs work.
ts = torstruct(a, b, torsion)
tstats[ts] += 1
if ts not in ('1,2,3,4,2x2'): print('\b'*180, a, b, ' '*8, len(Z), '  ', ts, ' '*8, *sorted(torsion), ' '*64)

print()

eta oin shrdlu
Posts: 451
Joined: Sat Jan 19, 2008 4:25 am UTC

### Re: Torsion subgroups for elliptic curves over Q

One thing that occurs to me is that for sufficiently large b-y2, rather than factoring b-y2 and using the rational-root theorem, it's faster to estimate the real zeroes of the cubic x3+ax+(b-y2) (e.g., using the closed-form solution to the cubic) and then check nearby integers.

Your torsion-point test (compute 2P, ..., 13P until reaching either "inf" (==0) or a noninteger point) could in principle have some shortcuts: e.g., you can stop if Q is a torsion point previously found (or if x(Q) is the x coordinate of a torsion point previously found). It may also be faster to use the duplication formula to directly compute x(P), x(2P), x(4P), ... until reaching either a noninteger x (=> P not torsion) or an x either infinite or previously seen in the list (=> P torsion). But in practice I don't think these will speed things up much if at all, since integer nontorsion points are fairly rare.

LucasBrown
Posts: 299
Joined: Thu Apr 15, 2010 2:57 am UTC
Location: Poway, CA

### Re: Torsion subgroups for elliptic curves over Q

Computing 12P directly is quite inefficient when P isn't a torsion point: for non-torsion points, the denominators of nP are about twice the size (in terms of bit length) of the denominators of (n-1)P, which slows down the computation dramatically. So it's much more efficient for non-torsion points to compute P, 2P, 3P, etc. and bail when the denominators cease to be 1 than to jump directly to 12P. And since the overwhelming majority of elliptic curves have trivial torsion, this is definitely the way to go.

I did some profiling of the code and found that the single biggest time-consumer was the factoring (which is about what I expected), so your point about estimating the roots via the cubic formula is a good idea. I'll look into that.

There is also a small speedup to be had by recognizing that P is a torsion point if and only if -P is also a torsion point. I have modified the code to take advantage of this fact, but this provides only a small constant-factor speedup. A quick way to identify curves of trivial torsion without using Nagell-Lutz would be a much more significant optimization.

eta oin shrdlu
Posts: 451
Joined: Sat Jan 19, 2008 4:25 am UTC

### Re: Torsion subgroups for elliptic curves over Q

LucasBrown wrote:Computing 12P directly is quite inefficient when P isn't a torsion point: for non-torsion points, the denominators of nP are about twice the size (in terms of bit length) of the denominators of (n-1)P, which slows down the computation dramatically. So it's much more efficient for non-torsion points to compute P, 2P, 3P, etc. and bail when the denominators cease to be 1 than to jump directly to 12P. And since the overwhelming majority of elliptic curves have trivial torsion, this is definitely the way to go.
I agree that you don't want to compute 12P directly. What I was suggesting was to compute (P,2P,4P,8P,...) instead of (P,2P,3P,4P,...), still stopping if the denominator is not 1. Your "arithmetic sequence" of points is guaranteed to include 0P=(inf,inf) if P is torsion (at or before the 12th element) and guaranteed to eventually produce a noninteger point if not. My "geometric sequence" will also eventually produce a noninteger point if P is not torsion; if P is torsion, the sequence will either include 0P (if P is of order 2, 4, or 8) or will have a periodic tail (e.g., if P is of order 10 then the sequence will be P,2P,4P,8P,6P,2P,...).

The geometric sequence is guaranteed to repeat after at most six elements for a torsion point (the worst case is a point of order 9, with sequence (P,2P,4P,8P,7P,5P,P,...)), so at most six doublings are required to check a point. As with your note we can recognize that -P is torsion iff P is, so that if the sequence contains -P then we can stop (the duplication formula gives x(2P) as a rational function of x(P), and since x(P)=x(-P) this already identifies P and -P), and the sequence of x coordinates is guaranteed to repeat or become zero after at most three elements (for torsion P). So this is pretty efficient.

A quick way to identify curves of trivial torsion without using Nagell-Lutz would be a much more significant optimization.
Right. Having to factor the discriminant, and then consider all of its square divisors, is a lot of work once the numbers get big. There is an alternative, which is essentially to directly compute the possibly-rational torsion points for the elliptic curve over the complex numbers C, and then check to see if they actually are rational. Over C an elliptic curve is doubly periodic, and there's a doubly-periodic map (based on the Weierstrass p function for the complex lattice) from the complex plane to the complex points of the curve. Finding the torsion points of order dividing n just means computing the images of the n*n-divided sublattice, so then you just have to find out whether these are rational; the practical complication is that you have to make sure your code for computing the p function is sufficiently accurate. To test whether the torsion is trivial it would be enough to check n=2,3,5,7. [This is "Doud's algorithm"; e.g. see p.13 of http://wstein.org/projects/kleinerman_senior_thesis.pdf.]