Miller-Rabin primality test (now in 64 bit!)

% Remco Bloemen % 2009-11-27, last updated 2014-03-03

In the previous post I showed an implementation of Miller-Rabin that overflowed when the number was more that 32 bits. In this post I present a way to fix this.

I could remember an instruction I used before, mulq which does 128 bit multiplication. It multiplies two 64 bit number and stores the least significant 64 bits in rax and the significant half in rdx.

Now we still need to reduce this number modulo $m$. Originally I planned to create a table of $2^n\!\!\mod m$, use the bits of rdx to add them and then add rax % m. But after consulting this x86-64 manual I found out about divq, which takes a 128 bit number stored in rdx:rax, a 64 bit number $m$ and p roduces the quotient and remainder. The whole thing can be coded as:

/// Modular multiplication
/// @param a The first factor, a < m
/// @param a The second factor, b < m
/// @param m The modulus
/// @return The reduced product, a b mod m < m
uint64 mul(uint64 a, uint64 b, uint64 m)
{
        // Perform 128 multiplication and division
        uint64 q; // q = ⌊a b / m⌋
        uint64 r; // r = a b mod m
        asm("mulq %3;"
            "divq %4;"
            : "=a"(q), "=d"(r)
            : "a"(a), "rm"(b), "rm"(m));
        return r;
}

This should work fast, but I still need to look into Montgomery reduction. Perhaps I should benchmark the two.

Modular exponentiation can now be implemented using exponentiation by squaring:

/// Modular exponentiation
/// @param b The base, b < m
/// @param e The exponent
/// @param m The modulus
/// @returns The reduced power of a, a^b mod m
uint64 pow(uint64 b, uint64 e, uint64 m)
{
        uint64 r = 1;
        for(; e; e >>= 1) {
                if(e & 1)
                        r = mul(r, b, m);
                b = mul(b, b, m);
        }
        return r;
}

And the Miller-Rabin test becomes:

/// Miller-Rabin probabilistic primality testing
/// @param n The number to test for  primality
/// @param k The witness for primality
/// @returns True iff when n is a k-stong pseudoprime
bool MillerRabin(uint64 n, uint64 k)
{
        // Factor n-1 as d*2^s
        uint64 s = 0;
        uint64 d = n - 1;
        for(; !(d & 1); s++)
                d >>= 1;

        // Verify x = k^(d 2^i) mod n != 1
        uint64 x = pow(k % n, d, n);
        if(x == 1 || x == n-1)
                return true;
        while(s-- > 1) {
                // x = x^2 mod n
                x = mul(x, x, n);
                if(x == 1)
                        return false;
                if(x == n-1)
                        return true;
        }
        return false;
}

Together with the prime sieve I created earlier a fast probablilistic prime checking procedure can be developed:

const vector<uint64> small_primes = prime_sieve(1<<20);

/// Miller-Rabin probabilistic primality testing
/// @param n The number to test for  primality
/// @returns False when n is not a prime
bool is_prime(uint64 n)
{
        // Handle small primes fast
        for(int i = 0; i < small_primes.size(); i++) {
                uint64 p = small_primes[i];
                if(n == p)
                        return true;
                if(n % p == 0)
                        return false;
        }

        // Do a few Miller-Rabin rounds
        for(int i = 0; i < 10; i++)
                if(!MillerRabin(n, small_primes[i]))
                        return false;

        // Assume prime
        return true;
}

This algorithm correctly identifies the first ten primes less than $2^{64}$. But then again, so does a single Miller-Rabin round with $K= 17$. Perhaps I should implement the Baillie-PSW algorithm, there are rumours that it has no false positives for 64 bits!

Remco Bloemen
Math & Engineering
https://2π.com