27

I'm trying to make a shallow copy of a HashSet of Points called myHash. As of now, I have the following:

HashSet<Point> myNewHash = (HashSet<Point>) myHash.clone();

This code gives me an unchecked cast warning however. Is there a better way to do this?

Tim
  • 4,295
  • 9
  • 37
  • 49

2 Answers2

51

You can try this:

HashSet<Point> myNewHash = new HashSet<Point>(myHash);
Ted Hopp
  • 232,168
  • 48
  • 399
  • 521
  • 2
    yet this code is equal to `addAll(myHash)`, so it iterates over the collection and adds every element by hand, while `clone()` does a faster recreation of the underlying Map using package-private method `putMapEntries()` ... *sigh* that's exactly why I wrote my own HashMap/HashSet ages ago, and that's exactly why so many Collection replacements exist to day... –  Apr 07 '15 at 02:07
  • @vaxquis - Yes, Oracle's implementation for this constructor simply creates an empty `HashSet` and then calls `addAll`. The only differences between using the constructor and calling `addAll` by hand are readability and that the constructor packages up the capacity calculation to achieve the desired loading factor of 0.75. Do you have any performance data regarding your approach versus the stock constructor? – Ted Hopp Apr 07 '15 at 11:11
  • @TedHopp yup, I had a bottleneck in my parser engine, and it boiled down to HashMap efficiency. Replacing HashMap-based HashSet with a real one (not storing the value, only the key), replacing both with a custom-tailored versions with e.g. unsafe iterators, providing a *real* immutable versions (doing precalcs on data, possible because of using with builder pattern) of them and doing set/map copying by means of System.arraycopy(). Got 20% performance boost, that's good enough for me. Also, as a side note, Microsoft/C# uses higher load factors, up to 1.0, for some data sets it's quite better. –  Apr 07 '15 at 13:13
  • @vaxquis - 20% performance improvement is huge; well done. For some reason, the Java team decided to make the 75% load factor part of the spec. They did provide a two-argument constructor that allows controlling the load factor of the new `HashSet`. Using `clone()` will preserve whatever loading factor happened to be in effect with the original set (good or bad). – Ted Hopp Apr 07 '15 at 15:32
  • @TedHopp it's just that `java.util` sucks in terms of speed, and it's not even remotely hard to beat it, being far from the tuning of e.g. C++ stdlib - I made only some simple hacks, while e.g. Trove guys (http://trove.starlight-systems.com/) did all what I did and a lot of more, their code can get even 2x-3x faster in some situations AFAIR. http://stackoverflow.com/questions/629804/what-is-the-most-efficient-java-collections-library clearly shows there's a good room for Collections improvement. –  Apr 11 '15 at 21:44
12

A different answer suggests using new HashSet<Point>(myHash). However, the intent of clone() is to obtain a new object of the same type. If myHash is an instance of a subclass of HashSet, any additional behavior added by subclassing will be lost by using new HashSet<Point>(myHash).

An unchecked cast warning is just a warning. There are many situations in which the cast is safe, but the compiler just isn't smart enough to determine that it is safe. You can, however, isolate the warning into a single method that can be annotated with @SuppressWarnings("unchecked"):

@SuppressWarnings("unchecked")
static <T implements Cloneable> clone(T o) { return (T)(o.clone()); }
Atsby
  • 2,277
  • 12
  • 14