Brent-Pollard ρ factorisation

% Remco Bloemen % 2009-12-05, last updated 2014-03-04

In the previous post the Pollard ρ method was by far the fastest of the methods tested to factor an 64 bit integer. In this post I implement an improved version by John Pollard and Richard Brent. It is described in this paper:

Richard P. Brent (1980) – An Improved Monte Carlo Factorization Algorithm

The straightforward implementation in C++ is:

uint64 brent_pollard_factor(uint64 n)
{
        const uint64 m = 1000;
        uint64 a, x, y, ys, r, q, g;
        do
                a = random() % n;
        while(a==0||a==n-2);
        y = random() % n;
        r = 1;
        q = 1;

        do {
                x = y;
                for(uint64 i=0; i < r; i++) {
                        // y = y² + a mod n
                        y = mul(y, y, n);
                        y += a;
                        if(y < a)
                                y += (max_uint64 - n) + 1;
                        y %= n;
                }

                uint64 k =0;
                do {
                        for(uint64 i=0; i < m && i < r-k; i++) {
                                ys = y;

                                // y = y² + a mod n
                                y = mul(y, y, n);
                                y += a;
                                if(y < a)
                                        y += (max_uint64 - n) + 1;
                                y %= n;

                                // q = q |x-y| mod n
                                q = mul(q, (x>y)?x-y:y-x, n);
                        }
                        g = gcd(q, n);
                        k += m;
                } while(k < r && g == 1);

                r <<= 1;
        } while(g == 1);

        if(g == n) {
                do {
                        // ys = ys² + a mod n
                        ys = mul(ys, ys, n);
                        ys += a;
                        if(ys < a)
                                ys += (max_uint64 - n) + 1;
                        ys %= n;

                        g = gcd((x>ys)?x-ys:ys-x, n);
                } while(g == 1);
        }

        return g;
}

The parameter m has been fixed to 1000, this seemed to be the fastest choice.

Given a composite number n it will return a non-trivial factor of n. To turn this method into a prime factorisation method this factor must be further split up in factors until the numbers are prime. I develop the function listed below to do this.

It maintains a stack of possibly composite factors and a list of prime already discovered. It pops a possible composite from the stack, checks its primality (with this method). If the number is prime, it is removed from all the other factors on the stack and put on the prime list, else, the number is split using either trial division, if it is small, or brent_pollard_factor.

vector<uint64> prime_factors(uint64 n)
{
        vector<uint64> factors;
        vector<uint64> primes;

        uint64 factor = brent_pollard_factor(n);
        factors.push_back(n / factor);
        factors.push_back(factor);

        do {
                uint64 m = factors[factors.size() - 1];
                factors.pop_back();

                if(m == 1)
                        continue;

                if(is_prime(m)) {
                        primes.push_back(m);

                        // Remove the prime from the other factors
                        for(int i=0; i < factors.size(); i++) {
                                uint64 k = factors[i];
                                if(k % m == 0) {
                                        do
                                                k /= m;
                                        while(k % m == 0);
                                        factors[i] = k;
                                }
                        }
                } else {
                        factor = (m < 100) ? small_factor(m) : brent_pollard_factor(m);
                        factors.push_back(m / factor);
                        factors.push_back(factor);
                }
        } while(factors.size());

        return primes;
}

Benchmark and conclusion

Compared to the methods posted previously, this method can perform the task in 0.3 s. A tenfold improvement over the previous best and three thousand times faster than trial division. Also, the time spent in gcd is less than 20%, much less than the previous 90%.

Remco Bloemen
Math & Engineering
https://2π.com