Since there weren't any takers for the 'Sundaram vs. Eratosthenes' problem, I sat down and analysed it. Result: classic Sundaram's has strictly higher overdraw than an odds-only Eratosthenes; if you apply an obvious, small optimisation then the overdraw is exactly the same - for reasons that shall become obvious. If you fix Sundaram's to avoid overdraw entirely then you get something like Pritchard's Sieve, which is massively more complicated.
The exposition of Sundaram's Sieve in Lucky's Notes is probably the best by far; slightly rewritten to use a hypothetical (i.e. non-standard and not supplied here) type bitmap_t
it looks somewhat like this. In order to measure overdraw the bitmap type needs an operation corresponding to the BTS
(bit test and set) CPU instruction, which is available via the _bittestandset() intrinsic with Wintel compilers and with MinGW editions of gcc. The intrinsics are very bad for performance but very convenient for counting overdraw.
Note: for sieving all primes up to N one would call the sieve with max_bit = N/2; if bit i of the resulting bitmap is set then the number (2 * i + 1) is composite. The function has '31' in its name because the index math breaks for bitmaps greater than 2^31; hence this code can only sieve numbers up to 2^32-1 (corresponding to max_bit <= 2^31-1).
uint64_t Sundaram31_a (bitmap_t &bm, uint32_t max_bit)
{
assert( max_bit <= UINT32_MAX / 2 );
uint32_t m = max_bit;
uint64_t overdraw = 0;
bm.set_all(0);
for (uint32_t i = 1; i < m / 2; ++i)
{
for (uint32_t j = i; j <= (m - i) / (2 * i + 1); ++j)
{
uint32_t k = i + j + 2 * i * j;
overdraw += bm.bts(k);
}
}
return overdraw;
}
Lucky's bound on j
is exact but the one on i
is very loose. Tightening it and losing the m
alias that I had added to make the code look more like common expositions on the net, we get:
uint64_t Sundaram31_b (bitmap_t &bm, uint32_t max_bit)
{
uint32_t i_max = uint32_t(std::sqrt(double(2 * max_bit + 1)) - 1) / 2;
uint64_t overdraw = 0;
bm.set_all(0);
for (uint32_t i = 1; i <= i_max; ++i)
{
for (uint32_t j = i; j <= (max_bit - i) / (2 * i + 1); ++j)
{
uint32_t k = i + j + 2 * i * j;
overdraw += bm.bts(k);
}
}
return overdraw;
}
The assert
was dumped in order to reduce noise but it is actually still valid and necessary. Now it's time for a bit of strength reduction, turning multiplication into iterated addition:
uint64_t Sundaram31_c (bitmap_t &bm, uint32_t max_bit)
{
uint32_t i_max = uint32_t(std::sqrt(double(2 * max_bit + 1)) - 1) / 2;
uint64_t overdraw = 0;
bm.set_all(0);
for (uint32_t i = 1; i <= i_max; ++i)
{
uint32_t n = 2 * i + 1;
uint32_t k = n * i + i; // <= max_bit because that's how we computed i_max
uint32_t j_max = (max_bit - i) / n;
for (uint32_t j = i; j <= j_max; ++j, k += n)
{
overdraw += bm.bts(k);
}
}
return overdraw;
}
Transforming the loop condition to use k
allows us to lose j
; things should be looking exceedingly familiar by now...
uint64_t Sundaram31_d (bitmap_t &bm, uint32_t max_bit)
{
uint32_t i_max = uint32_t(std::sqrt(double(2 * max_bit + 1)) - 1) / 2;
uint64_t overdraw = 0;
bm.set_all(0);
for (uint32_t i = 1; i <= i_max; ++i)
{
uint32_t n = 2 * i + 1;
uint32_t k = n * i + i; // <= max_bit because that's how we computed i_max
for ( ; k <= max_bit; k += n)
{
overdraw += bm.bts(k);
}
}
return overdraw;
}
With things looking the way they do, it's time to analyse whether a certain obvious small change is warranted by the math. The proof is left as an exercise for the reader...
uint64_t Sundaram31_e (bitmap_t &bm, uint32_t max_bit)
{
uint32_t i_max = unsigned(std::sqrt(double(2 * max_bit + 1)) - 1) / 2;
uint64_t overdraw = 0;
bm.set_all(0);
for (uint32_t i = 1; i <= i_max; ++i)
{
if (bm.bt(i)) continue;
uint32_t n = 2 * i + 1;
uint32_t k = n * i + i; // <= m because we computed i_max to get this bound
for ( ; k <= max_bit; k += n)
{
overdraw += bm.bts(k);
}
}
return overdraw;
}
The only thing that still differs from classic odds-only Eratosthenes (apart from the name) is the initial value for k
, which is normally (n * n) / 2
for the old Greek. However, substituting 2 * i + 1
for n
the difference turns out to be 1/2, which rounds to 0. Hence, Sundaram's is odds-only Eratosthenes without the 'optimisation' of skipping composites to avoid at least some crossings-out of already crossed-out numbers. The value for i_max
is the same as the Greek's max_factor_bit
, just arrived at using completely different logical steps and computed using a marginally different formula.
P.S.: after seeing overdraw
in the code so many times, people will probably want to know what it actually is... Sieving the numbers up to 2^32-1 (i.e. a full 2^31 bit bitmap) Sundaram's has an overdraw of 8,643,678,027 (roughly 2 * 2^32) or 444.6%; with the small fix that turns it into odds-only Eratosthenes the overdraw becomes 2,610,022,328 or 134.2%.