I wrote a Java program for Project Euler #35: Circular Primes:
The number, 197, is called a circular prime because all rotations of the digits: 197, 971, and 719, are themselves prime.
There are thirteen such primes below 100: 2, 3, 5, 7, 11, 13, 17, 31, 37, 71, 73, 79, and 97.
How many circular primes are there below one million?
My code compiles and runs fine, however, it gives different results depending on the data structure I use.
The algorithm works like this:
Get pre-calculated primes. This is the call to
MathUtils.getPrimes(1000000)
, which gets all primes equal to or less than a million. I store this in anotherSet
because it is implemented by returning a subset, and performance is ridiculously terrible unless I copy the primes into their own data structure.While the set of primes is not empty, get the next prime.
Get all rotations of that prime. E.g. 197, 971, 719. These rotations need not be prime themselves, because I need to verify them anyway.
If the prime set contains all of the rotations, add the count of the rotations to a running total.
Remove all of the rotations from the set of primes if they exist.
I noticed two weird things about this code. If I use a TreeSet
to store the primes the performance is very fast and it produces correct results:
Answer: 55
Time: 76ms
If I switch to a HashSet
the performance is a lot worse and the results are incorrect.
Answer: 50
Time: 2527ms
I put code at the top to double check that the sets both contain the same values before the code runs, and they always do.
Why does using the
HashSet
produce incorrect results as compared to theTreeSet
? There are no nulls or other weird values, only positive, distinctInteger
instances. The sets start out containing the exact same data. The algorithm is the same, because it is the exact same code. Due to ordering differences between the implementations and the size of the data, it is nigh impossible to compare the state of the algorithms as they are running. If I reduce the input size, the two produce the same results up to 100,000.Why does the
TreeSet
perform so much faster than theHashSet
, when it has to perform all of those removals and tree rotations that are not applicable to aHashSet
? Looking at the code forHashMap
which backsHashSet
, there is no resizing or shuffling of contents going on other than that localized to a specific bin. Furthermore, the primes are fairly evenly distributed. While there is no easy way to validate, I would expect that there would not be the worst-case performance issue of many items occupying a small number of bins in the table.
Code follows. You can switch the Set
implementation by swapping the variable names at the top.
import java.util.Collection;
import java.util.HashSet;
import java.util.LinkedList;
import java.util.NavigableSet;
import java.util.TreeSet;
public class Problem_0035 {
public static void main(String[] args) {
// Swap these two variable names to compare.
Collection<Integer> primes = new TreeSet<>(sieve(1000000));
Collection<Integer> primes2 = new HashSet<>(sieve(1000000));
if (!primes.containsAll(primes2) || !primes2.containsAll(primes)
|| (primes.size() != primes2.size())) {
System.out.println("Primes are not the same!");
}
final long start = System.currentTimeMillis();
int result = 0;
// Keep getting a prime and checking for its rotations. Remove the primes checked.
while (!primes.isEmpty()) {
Integer next = primes.iterator().next();
Collection<Integer> rotations = getRotations(next);
if (primes.containsAll(rotations)) {
result += rotations.size();
}
primes.removeAll(rotations);
}
System.out.println("Answer: " + result);
// 55
System.out.println("Time: " + (System.currentTimeMillis() - start) + "ms");
}
/** Enumerate all rotations of the given integer. */
private static Collection<Integer> getRotations(Integer argValue) {
Collection<Integer> results = new LinkedList<>();
final int start = argValue.intValue();
// Count the digits
int magnitude = 1;
for (int i = start; i > 9; i /= 10) {
magnitude *= 10;
}
int current = start;
do {
results.add(Integer.valueOf(current));
current = ((current % 10) * magnitude) + (current / 10);
} while (current != start);
return results;
}
/** Sieve of Eratosthenes. */
private static Collection<Integer> sieve(int argCeiling) {
NavigableSet<Integer> primes = new TreeSet<>();
for (int i = 2; i <= argCeiling; ++i) {
primes.add(Integer.valueOf(i));
}
for (Integer number = primes.first(); number != null; number = primes.higher(number)) {
int n = number.intValue();
for (int i = n * 2; i <= argCeiling; i += n) {
primes.remove(Integer.valueOf(i));
}
}
return primes;
}
//
// Filter the set through this method to remove the problematic primes.
// See answers for an explanation.
//
/**
* Any prime number with a zero or five anywhere in its number cannot have prime
* rotations, since no prime can end in five or zero. Filter those primes out.
*/
private static Collection<Integer> filterImpossiblePrimes(Collection<Integer> in) {
Collection<Integer> out = new TreeSet<>();
for (Integer prime : in) {
if (!willBeRotatedComposite(prime)) {
out.add(prime);
}
}
return out;
}
/** If the prime is guaranteed to be rotated to a composite, return true. */
private static boolean willBeRotatedComposite(Integer prime) {
int p = prime.intValue();
boolean result = false;
if (p > 10) {
while (p > 0) {
// Primes must end in 1, 3, 7, or 9. Filter out all evens and 5s.
if ((p % 5 == 0) || (p % 2 == 0)) {
result = true;
break;
}
p /= 10;
}
}
return result;
}
}