Comparing
You can achieve this without any library, using only Java's Comparator.
For instance, with the following object:
public class AA {
private String a;
private Double b;
private String c;
private int d;
// getters and setters
}
You can use a comparator like:
Comparator<AA> comparator = Comparator.comparing(AA::getA)
.thenComparing(AA::getB)
.thenComparingInt(AA::getD);
This compares the fields a, b and the int d, skipping c.
The only problem here is that this won't work with null values: a null field makes the comparison throw a NullPointerException.
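To make the failure concrete, here is a minimal sketch. The Item class and the throwsNpe helper are made up for illustration (a reduced stand-in for AA), not part of the original code:

```java
import java.util.Comparator;

class NullKeyDemo {
    // Hypothetical stand-in for the AA class, reduced to two fields.
    static final class Item {
        private final String a;
        private final Double b;
        Item(String a, Double b) { this.a = a; this.b = b; }
        String getA() { return a; }
        Double getB() { return b; }
    }

    // Same shape as the comparator above, without any null handling.
    static final Comparator<Item> COMPARATOR =
            Comparator.comparing(Item::getA).thenComparing(Item::getB);

    // Returns true when comparing the two items throws a NullPointerException.
    static boolean throwsNpe(Item x, Item y) {
        try {
            COMPARATOR.compare(x, y);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }
}
```

Calling throwsNpe(new Item("k", null), new Item("k", 1.0)) returns true, because the tie on a forces the comparator to call compareTo on the null b.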
Comparing nulls
One possible solution that allows fine-grained configuration, that is, allowing nulls only for specific fields, is a Comparator class similar to:
// Comparator for properties only; written only to be used with Comparator#comparing
public final class PropertyNullComparator<T extends Comparable<? super T>>
implements Comparator<Object> {
private PropertyNullComparator() { }
public static <T extends Comparable<? super T>> PropertyNullComparator<T> of() {
return new PropertyNullComparator<>();
}
@Override
public int compare(Object o1, Object o2) {
if (o1 != null && o2 != null) {
if (o1 instanceof Comparable) {
@SuppressWarnings({ "unchecked" })
Comparable<Object> comparable = (Comparable<Object>) o1;
return comparable.compareTo(o2);
} else {
// this will throw a ClassCastException when the object is not Comparable
@SuppressWarnings({ "unchecked" })
Comparable<Object> comparable = (Comparable<Object>) o2;
return comparable.compareTo(o1) * -1; // * -1 to keep order
}
} else {
return o1 == o2 ? 0 : (o1 == null ? -1 : 1); // nulls first
}
}
}
This way you can build a comparator that specifies which fields are allowed to be null:
Comparator<AA> comparator = Comparator.comparing(AA::getA)
.thenComparing(AA::getB, PropertyNullComparator.of())
.thenComparingInt(AA::getD);
If you don't want to define a custom comparator, you can use something like:
Comparator<AA> comparator = Comparator.comparing(AA::getA)
.thenComparing(AA::getB, Comparator.nullsFirst(Comparator.naturalOrder()))
.thenComparingInt(AA::getD);
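As a small sketch of what Comparator.nullsFirst(Comparator.naturalOrder()) does on its own, the hypothetical helper below (the class and method names are mine) sorts a list of Double values that may contain nulls:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class NullsFirstDemo {
    // Sorts a copy of the input in natural order, placing nulls first.
    static List<Double> sortNullsFirst(List<Double> values) {
        List<Double> copy = new ArrayList<>(values);
        copy.sort(Comparator.nullsFirst(Comparator.naturalOrder()));
        return copy;
    }
}
```

For the input [2.0, null, 1.0] this returns [null, 1.0, 2.0]. Comparator.nullsLast can be used in the same way when nulls should go to the end.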
Difference method
The difference (A - B) method could be implemented using two TreeSets.
static <T> TreeSet<T> difference(Collection<T> c1,
Collection<T> c2,
Comparator<T> comparator) {
TreeSet<T> treeSet1 = new TreeSet<>(comparator); treeSet1.addAll(c1);
if (treeSet1.size() > c2.size()) {
treeSet1.removeAll(c2);
} else {
TreeSet<T> treeSet2 = new TreeSet<>(comparator); treeSet2.addAll(c2);
treeSet1.removeAll(treeSet2);
}
return treeSet1;
}
Note: a TreeSet makes sense here since we are talking about uniqueness under a specific comparator. It could also perform better: the contains method of TreeSet is O(log(n)), compared to that of a common ArrayList, which is O(n).
Why is only one TreeSet used when treeSet1.size() > c2.size()? Because in that case TreeSet#removeAll calls remove on the first TreeSet for each element of c2, which goes through the comparator. When the condition is not met, TreeSet#removeAll uses the contains method of the second collection instead. That second collection could be any Java collection, and its contains method is not guaranteed to work exactly the same as the contains of the first TreeSet (with its custom comparator).
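The size-dependent behavior can be sketched with a case-insensitive comparator. RemoveAllDemo and plainRemoveAll are names I made up for this illustration; difference repeats the logic of the method above so the block is self-contained:

```java
import java.util.Collection;
import java.util.Comparator;
import java.util.TreeSet;

class RemoveAllDemo {
    // Plain removeAll on a comparator-based TreeSet; which code path runs
    // depends on the relative sizes of the set and the argument.
    static TreeSet<String> plainRemoveAll(Collection<String> c1, Collection<String> c2) {
        TreeSet<String> set = new TreeSet<>(String.CASE_INSENSITIVE_ORDER);
        set.addAll(c1);
        set.removeAll(c2);
        return set;
    }

    // Wraps c2 in a TreeSet with the same comparator whenever removeAll
    // would otherwise rely on c2.contains.
    static <T> TreeSet<T> difference(Collection<T> c1,
                                     Collection<T> c2,
                                     Comparator<T> comparator) {
        TreeSet<T> treeSet1 = new TreeSet<>(comparator);
        treeSet1.addAll(c1);
        if (treeSet1.size() > c2.size()) {
            treeSet1.removeAll(c2);
        } else {
            TreeSet<T> treeSet2 = new TreeSet<>(comparator);
            treeSet2.addAll(c2);
            treeSet1.removeAll(treeSet2);
        }
        return treeSet1;
    }
}
```

With c1 = [a, b] and c2 = [A], both methods return [b]. With c2 = [A, B, C], difference returns an empty set, whereas plainRemoveAll, on JDKs where AbstractSet#removeAll still has the size-dependent implementation, falls back to c2.contains (plain equals for a list) and leaves the set unchanged.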
Edit (given the additional context of the question)
Since collection1 is a set that could contain repeated elements according to the custom equals (not the equals of the object), the solution already provided in the question could be used, since it does exactly that without modifying any of the input collections, creating a new output set instead.
So you can create your own static function (I, at least, am not aware of a library that provides a similar method) and use the Comparator or a BiPredicate.
static <T> Set<T> difference(Collection<T> collection1,
Collection<T> collection2,
Comparator<T> comparator) {
return collection1.stream()
.filter(element1 -> !collection2.stream()
.anyMatch(element2 -> comparator.compare(element1, element2) == 0))
.collect(Collectors.toSet());
}
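The BiPredicate variant mentioned above could look like the following sketch. The class name DifferenceDemo and the parameter name sameElement are my own choices; the logic mirrors the Comparator version:

```java
import java.util.Collection;
import java.util.Set;
import java.util.function.BiPredicate;
import java.util.stream.Collectors;

class DifferenceDemo {
    // Keeps every element of collection1 that has no matching element in
    // collection2, where "matching" is decided by the supplied predicate.
    static <T> Set<T> difference(Collection<T> collection1,
                                 Collection<T> collection2,
                                 BiPredicate<T, T> sameElement) {
        return collection1.stream()
                .filter(e1 -> collection2.stream()
                        .noneMatch(e2 -> sameElement.test(e1, e2)))
                .collect(Collectors.toSet());
    }
}
```

For example, difference(Arrays.asList("a", "b", "c"), Arrays.asList("A", "C"), String::equalsIgnoreCase) returns a set containing only "b". Note that this approach compares every pair, so it is O(n * m), which is fine for small collections but slower than the TreeSet version for large ones.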
Edit (To Eugene)
"Why would you want to implement a null safe comparator yourself"
As far as I know, there is no ready-made comparator for comparing fields that may simply be null. The closest that I know of (to replace my suggested PropertyNullComparator.of() [a clearer/shorter/better name can be used]) is:
Comparator.nullsFirst(Comparator.naturalOrder())
So you would have to write that line for every field that you want to compare. Is this doable? Of course it is. Is it practical? I think not.
Easy solution: create a helper method.
static class ComparatorUtils {
public static <T extends Comparable<? super T>> Comparator<T> shnp() { // super short null comparator
return Comparator.nullsFirst(Comparator.<T>naturalOrder());
}
}
Does this work? Yes, it works. Is it practical? It looks like it. Is it a great solution? Well, that depends: many people consider the exaggerated (and/or unnecessary) use of helper methods an anti-pattern (see a good old article by Nick Malik). There are some reasons listed there, but in short, this is an OO language, so OO solutions are normally preferred to static helper methods.
"As stated in the documentation : Note that the ordering maintained by a set (whether or not an explicit comparator is provided must be consistent with equals if it is to correctly implement the Set interface. Further, the same problem would arise in the other case, when size() > c.size() because ultimately this would still call equals in the remove method. So they both have to implement Comparator and equals consistently for this to work correctly"
The TreeSet javadoc says the following, but with a clear "if":
Note that the ordering maintained by a set (whether or not an explicit comparator is provided) must be consistent with equals if it is to correctly implement the Set interface
It then says:
See Comparable or Comparator for a precise definition of consistent with equals
If you go to the Comparable javadoc, it says:
It is strongly recommended (though not required) that natural orderings be consistent with equals
If we go back from Comparable and continue reading the TreeSet javadoc (in the same paragraph as the first quote), it says the following:
This is so because the Set interface is defined in terms of the equals operation, but a TreeSet instance performs all key comparisons using its compareTo (or compare ) method, so two keys that are deemed equal by this method are, from the standpoint of the set, equal. The behavior of a set is well-defined even if its ordering is inconsistent with equals; it just fails to obey the general contract of the Set interface.
From this last quote, and with a very simple debugging session or even just by reading the code, you can see that TreeSet uses an internal TreeMap and that all its derived methods are based on the comparator, not on the equals method.
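A quick way to check this without a debugger is the sketch below, which assumes a case-insensitive comparator as the example ordering (the class and method names are hypothetical):

```java
import java.util.TreeSet;

class TreeSetEqualityDemo {
    // Builds a TreeSet ordered by String.CASE_INSENSITIVE_ORDER.
    static TreeSet<String> caseInsensitiveSet(String... values) {
        TreeSet<String> set = new TreeSet<>(String.CASE_INSENSITIVE_ORDER);
        for (String value : values) {
            set.add(value);
        }
        return set;
    }
}
```

caseInsensitiveSet("a", "A") has size 1 and contains("A") returns true, even though "a".equals("A") is false: add and contains are decided by the comparator alone.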
"Why is this so implemented? because there is a difference when removing many elements from a little set and the other way around, as a matter of fact same stands for addAll"
If you go to the definition of removeAll, you can see that its implementation is in AbstractSet; it is not overridden. This implementation calls contains on the argument collection when the argument is at least as large as the set. The behavior of that contains is uncertain: it is not necessary (nor probable) that the received collection (e.g. list, queue, etc.) has or can define the same comparator.
Update 1:
This JDK bug is being discussed (and considered for a fix) here: https://bugs.openjdk.java.net/browse/JDK-6394757