You can let the object itself remember whether it has been put into that HashSet. Add a boolean field that records whether the object was added to the set; then you do not need to call contains on the HashSet at all, you just read the field of your object. This only works if the object is put into exactly one HashSet, and that set is the one that maintains and checks the boolean field.
This can be extended to a constant number of HashSets by storing a java.util.BitSet in the contained object instead of a boolean. Every HashSet is then identified by a unique integer, which works when the number of HashSets is known before the algorithm starts.
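A minimal sketch of the BitSet variant, under stated assumptions: the class names MultiSetAwareObject and IdentifiedSet and the mark/unmark/isIn methods are hypothetical and not part of the original answer, and every object handed to the sets is assumed to be the single pooled instance for its value.

```java
import java.util.BitSet;
import java.util.HashSet;

/** Hypothetical element that records, per set id, whether it is currently a member. */
class MultiSetAwareObject {
    private final int state;
    // Bit i is set while this object is a member of the set with id i.
    private final BitSet membership = new BitSet();

    MultiSetAwareObject(int state) { this.state = state; }

    void mark(int setId) { membership.set(setId); }
    void unmark(int setId) { membership.clear(setId); }
    boolean isIn(int setId) { return membership.get(setId); }

    @Override public int hashCode() { return 31 + state; }

    @Override public boolean equals(Object obj) {
        return obj instanceof MultiSetAwareObject
                && ((MultiSetAwareObject) obj).state == state;
    }
}

/** Hypothetical HashSet that mirrors membership into the element under its own id. */
class IdentifiedSet extends HashSet<MultiSetAwareObject> {
    private final int id;

    IdentifiedSet(int id) { this.id = id; }

    @Override public boolean add(MultiSetAwareObject e) {
        boolean added = super.add(e);
        e.mark(id);
        return added;
    }

    @Override public boolean remove(Object o) {
        boolean removed = super.remove(o);
        if (o instanceof MultiSetAwareObject) {
            ((MultiSetAwareObject) o).unmark(id);
        }
        return removed;
    }

    // Assumes pooled instances: membership is answered from the bit alone.
    @Override public boolean contains(Object o) {
        return o instanceof MultiSetAwareObject && ((MultiSetAwareObject) o).isIn(id);
    }
}

class BitSetMembershipDemo {
    public static void main(String[] args) {
        MultiSetAwareObject x = new MultiSetAwareObject(7);
        IdentifiedSet open = new IdentifiedSet(0);
        IdentifiedSet closed = new IdentifiedSet(1);
        open.add(x);
        System.out.println(open.contains(x) + " " + closed.contains(x));
    }
}
```

The ids must be unique per set, and like the boolean version this sketch does not keep the bits in sync for retainAll, clear or Iterator.remove.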
Since you say that you call contains frequently, it makes sense to replace newly generated objects with equal existing objects (object pooling): the overhead of pooling is amortized because contains becomes a single field read.
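The pooling step on its own could look roughly like this. CanonicalPool and canonical are hypothetical names chosen for illustration; the sketch assumes the pooled type has consistent equals and hashCode and that the pool is used from a single thread.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper that maps every value to one canonical instance. */
class CanonicalPool<T> {
    private final Map<T, T> pool = new HashMap<>();

    /** Returns the pooled instance equal to candidate, registering candidate if none exists yet. */
    T canonical(T candidate) {
        // putIfAbsent returns the previously stored value, or null if there was none.
        T existing = pool.putIfAbsent(candidate, candidate);
        return existing != null ? existing : candidate;
    }
}

class CanonicalPoolDemo {
    public static void main(String[] args) {
        CanonicalPool<String> pool = new CanonicalPool<>();
        String first = new String("key");
        String second = new String("key");
        // Both lookups yield the same instance, so a flag stored on it is shared.
        System.out.println(pool.canonical(first) == pool.canonical(second));
    }
}
```

Every newly generated object would be passed through canonical before it is added to the set or tested with contains, so that the flag is always read from the same instance.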
As requested, here is some sample code. On my machine the special set implementation is about 4 times faster than a normal HashSet. The open question is how well this code reflects your use case.

    import java.util.Collection;
    import java.util.HashMap;
    import java.util.HashSet;
    import java.util.Map;
    import java.util.Random;
    import java.util.Set;

    public class FastSetContains {

        /** Element that remembers whether it is currently in the (single) set. */
        public static class SetContainedAwareObject {
            private final int state;
            private boolean contained;

            public SetContainedAwareObject(int state) {
                this.state = state;
            }

            public void markAsContained() {
                contained = true;
            }

            public boolean isContained() {
                return contained;
            }

            public void markAsRemoved() {
                contained = false;
            }

            @Override
            public int hashCode() {
                return 31 + state;
            }

            @Override
            public boolean equals(Object obj) {
                if (this == obj)
                    return true;
                if (obj == null || getClass() != obj.getClass())
                    return false;
                return state == ((SetContainedAwareObject) obj).state;
            }
        }

        /**
         * HashSet that keeps the element's flag in sync with membership.
         * Not covered: retainAll and Iterator.remove leave stale flags.
         */
        public static class FastContainsSet extends HashSet<SetContainedAwareObject> {
            @Override
            public boolean contains(Object o) {
                // Fast path: a single field read for objects that carry the flag.
                if (o instanceof SetContainedAwareObject
                        && ((SetContainedAwareObject) o).isContained()) {
                    return true;
                }
                // Slow path: equal but non-pooled instances, or foreign types.
                return super.contains(o);
            }

            @Override
            public boolean add(SetContainedAwareObject e) {
                boolean added = super.add(e);
                e.markAsContained();
                return added;
            }

            @Override
            public boolean addAll(Collection<? extends SetContainedAwareObject> c) {
                boolean changed = super.addAll(c);
                for (SetContainedAwareObject o : c) {
                    o.markAsContained();
                }
                return changed;
            }

            @Override
            public boolean remove(Object o) {
                boolean removed = super.remove(o);
                // Only the instance passed in is unmarked, so callers should pass pooled instances.
                if (o instanceof SetContainedAwareObject) {
                    ((SetContainedAwareObject) o).markAsRemoved();
                }
                return removed;
            }

            @Override
            public boolean removeAll(Collection<?> c) {
                boolean changed = super.removeAll(c);
                for (Object o : c) {
                    if (o instanceof SetContainedAwareObject) {
                        ((SetContainedAwareObject) o).markAsRemoved();
                    }
                }
                return changed;
            }

            @Override
            public void clear() {
                // Reset the flags first; otherwise cleared elements would still report true.
                for (SetContainedAwareObject o : this) {
                    o.markAsRemoved();
                }
                super.clear();
            }
        }

        private static final Random random = new Random(1234L);
        private static final int additionalObjectsPerIteration = 10;
        private static final int iterations = 100000;
        private static final int differentObjectCount = 100;
        private static final int containsCountPerIteration = 50;
        private static long nanosSpentForContains;
        // Sink for the contains results so the JIT is less likely to drop the calls as dead code.
        private static long containsHits;

        public static void main(String[] args) {
            Map<SetContainedAwareObject, SetContainedAwareObject> objectPool = new HashMap<>();
            // Swap the comments to use the other Set implementation.
            //Set<SetContainedAwareObject> set = new FastContainsSet();
            Set<SetContainedAwareObject> set = new HashSet<>();

            // Warm up.
            for (int i = 0; i < 100; i++) {
                addAdditionalObjects(objectPool, set);
                callSetContainsForSomeObjects(set);
            }
            objectPool.clear();
            set.clear();
            nanosSpentForContains = 0L;
            containsHits = 0L;

            // Measured run.
            for (int i = 0; i < iterations; i++) {
                addAdditionalObjects(objectPool, set);
                callSetContainsForSomeObjects(set);
            }
            System.out.println("nanos spent for contains: " + nanosSpentForContains);
            System.out.println("contains hits: " + containsHits);
        }

        /** Times contains for randomly chosen elements that are already in the set. */
        private static void callSetContainsForSomeObjects(
                Set<SetContainedAwareObject> set) {
            int containsCount = Math.max(set.size(), containsCountPerIteration);
            int[] indexes = new int[containsCount];
            for (int i = 0; i < containsCount; i++) {
                indexes[i] = random.nextInt(set.size());
            }
            Object[] elements = set.toArray();
            long start = System.nanoTime();
            for (int index : indexes) {
                if (set.contains(elements[index])) {
                    containsHits++;
                }
            }
            long end = System.nanoTime();
            nanosSpentForContains += (end - start);
        }

        /** Creates random objects, replaces them with their pooled equals, and adds those to the set. */
        private static void addAdditionalObjects(
                Map<SetContainedAwareObject, SetContainedAwareObject> objectPool,
                Set<SetContainedAwareObject> set) {
            for (int i = 0; i < additionalObjectsPerIteration; i++) {
                SetContainedAwareObject object = new SetContainedAwareObject(
                        random.nextInt(differentObjectCount));
                SetContainedAwareObject pooled = objectPool.get(object);
                if (pooled == null) {
                    objectPool.put(object, object);
                    pooled = object;
                }
                set.add(pooled);
            }
        }
    }

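Another edit:
Using the following as the Set.contains implementation makes it about 8 times faster than a normal HashSet. Note that it is only correct if every object passed to contains is the pooled instance, because an equal but different instance has its own flag: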

    @Override
    public boolean contains(Object o) {
        SetContainedAwareObject obj = (SetContainedAwareObject) o;
        return obj.isContained();
    }

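To illustrate when the flag-only contains gives a different answer than a hash lookup, here is a small self-contained sketch. FlaggedItem, FlagOnlySet and containsByHash are hypothetical names used only for this demonstration; they mirror the classes above in reduced form.

```java
import java.util.HashSet;

/** Reduced, hypothetical stand-in for SetContainedAwareObject. */
class FlaggedItem {
    final int state;
    boolean contained;

    FlaggedItem(int state) { this.state = state; }

    @Override public int hashCode() { return 31 + state; }

    @Override public boolean equals(Object o) {
        return o instanceof FlaggedItem && ((FlaggedItem) o).state == state;
    }
}

/** Set whose contains reads only the flag, as in the faster variant. */
class FlagOnlySet extends HashSet<FlaggedItem> {
    @Override public boolean add(FlaggedItem e) {
        boolean added = super.add(e);
        e.contained = true;
        return added;
    }

    @Override public boolean contains(Object o) {
        return o instanceof FlaggedItem && ((FlaggedItem) o).contained;
    }

    /** Regular hash-based lookup, kept for comparison. */
    boolean containsByHash(Object o) { return super.contains(o); }
}

class FlagOnlyPitfallDemo {
    public static void main(String[] args) {
        FlagOnlySet set = new FlagOnlySet();
        FlaggedItem pooled = new FlaggedItem(42);
        set.add(pooled);

        // Equal to pooled, but a separate instance that was never added.
        FlaggedItem stranger = new FlaggedItem(42);

        System.out.println(set.contains(pooled));          // true
        System.out.println(set.contains(stranger));        // false: its own flag was never set
        System.out.println(set.containsByHash(stranger));  // true: equals/hashCode match
    }
}
```

So the flag-only version trades the fallback lookup for speed and relies on every caller going through the object pool first.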
EDIT:
This technique has something in common with the class enhancement of OpenJPA. OpenJPA's enhancement enables a class to track its own persistent state, which is then used by the entity manager. The suggested method likewise enables an object to track whether it is contained in a set, which is then used by the algorithm.