2009-11-23 (updated 2014-03-04)

Sieve of Erastosthenes in C++ and STL

The von Staudt-Clausen implementation in one of my previous posts required a list of small primes. A fast way to calculate them is using the Sieve of Erastosthenes. Here’s an efficient implementation in C++ and STL:

vector<uint64> prime_sieve(uint64 max)
{
        vector<bool> sieve((max+1)/2);
        vector<uint64> primes;
        primes.reserve(floor(boost::math::expint(log(max))));
        primes.push_back(2);
        for(uint64 i=3; i <= max; i+=2) {
                if(sieve[(i-3)/2])
                        continue;
                for(uint64 m=i*i; m <= max; m+=2*i)
                        sieve[(m-3)/2] = true;
                primes.push_back(i);
        }
        return primes;
}

This implementation employs a number of well known tricks:

Using a packed bitmap to store the list of numbers. This one is easy, since vector<bool> is one of the best examples of template specialization.
Reserving space for the prime list. The logarithmic integral function provides a tight upper bound for the number of primes, this prevents a reallocations. (Well, it’s not always an upper bound, see Skewes' number, but it is for $x < 1.397\!\cdot\!10^{316}$ , which can not be attained in 64 bits). This line can be removed to drop the boost dependency.
Skipping all even numbers. The only even prime, two, is special cased and the loop is run two steps at a time.
For every prime $p$ only multiples $\geq p^2$ are marked, since the other numbers have smaller prime factors and are already marked
The previous trick also ensures that the algorithm stops marking as soon as the loop reaches $\sqrt{\mathrm{max}}$ .
There’s no separate prime collecting phase, primes are pushed on the list as soon as they are found. Don’t know if this is really faster if you take cache locality into account, but it gives a nice and short implementation.

If you need something faster than this, you could look into the Sieve of Atkin and D. J. Bernsteins implementation.