2
import java.util.HashSet;
import java.util.Set;

public class SomeClass {
    private HashSet<SomeObject> contents = new HashSet<SomeObject>();
    private Set<SomeObject> contents2 = new HashSet<SomeObject>();
}

What's the difference? In the end they are both a HashSet, aren't they? The second one looks just wrong to me, but I have seen it frequently used, accepted and working.

Tom
The Surrican
  • I suggest brushing up on general object oriented programming concepts. – colithium Oct 14 '10 at 19:00
  • To expand on what others have said, the second is programming to an interface, the first is using a specific implementation. – aperkins Oct 14 '10 at 19:03
  • http://pragmaticjava.blogspot.com/2008/08/program-to-interface-not-implementation.html – helpermethod Oct 14 '10 at 19:05
  • @colithium - I think that's exactly what the OP is doing... – Tony Ennis Oct 14 '10 at 19:09
  • Okay, thanks guys, I got it :) although I find it a bit contradictory that the majority supports a not so strict declaration. Doesn't this somehow undermine the concept? Please bear with me, I come from the world of PHP where you can multiply strings ... – The Surrican Oct 14 '10 at 19:38
  • @Joe, for private variables only visible inside of classes, it honestly doesn't matter. But for public facing properties, it gives you some leeway in that you can completely switch out how it's implemented without affecting the code that is using your class. Also, when accepting parameters, callers like it when you accept the most general thing possible so they don't have to finagle their data structure to match yours exactly. – colithium Oct 14 '10 at 19:57
  • @Tony To be on topic, answers to this question can't go into depth about the relevant OO concepts. A general primer would be helpful in my opinion. – colithium Oct 14 '10 at 19:58

5 Answers

22

Set is an interface, and HashSet is a class that implements the Set interface.

Declaring the variable as type HashSet means that only a HashSet (or one of its subclasses, such as LinkedHashSet) can be assigned to it; no other implementation of Set may be used. You may want this if you need specific functionality of HashSet.

If you do not need any specific functionality from HashSet, it is better to declare the variable as type Set. This leaves the exact implementation open to change later. You may find that for the data you are using, a different implementation works better. By using the interface, you can make this change later if needed.
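As a minimal sketch of that flexibility, assume a hypothetical Inventory class (not from the question) whose field is declared as Set. Switching the implementation from HashSet to TreeSet then touches only the line that constructs the set:

```java
import java.util.Set;
import java.util.TreeSet;

class SetFieldDemo {
    // Hypothetical class for illustration; the field type is the Set interface.
    static class Inventory {
        // The only line that names an implementation. Suppose it used to read
        // new HashSet<String>() and was changed to get sorted iteration.
        private final Set<String> items = new TreeSet<String>();

        boolean add(String item) {
            return items.add(item);       // declared by Set
        }

        boolean contains(String item) {
            return items.contains(item);  // declared by Set
        }

        int size() {
            return items.size();
        }
    }

    public static void main(String[] args) {
        Inventory inventory = new Inventory();
        inventory.add("pear");
        inventory.add("apple");
        inventory.add("apple"); // duplicate, rejected by any Set
        System.out.println(inventory.size()); // prints 2
    }
}
```

None of the methods had to change, because they only call methods that Set itself declares.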

You can see more details here: When should I use an interface in java?

Alan Geleynse
  • I would just add a link to this question: When best to use an interface in java, http://stackoverflow.com/questions/2586389/when-best-to-use-an-interface-in-java – amirouche Oct 14 '10 at 19:01
  • Thanks, I added the link since it seems to have additional details that I did not include about when to use each. – Alan Geleynse Oct 14 '10 at 19:03
  • @Tony Ennis - Thanks! This is by far the most votes I have had on an answer so far. – Alan Geleynse Oct 14 '10 at 20:29
3

Set is a collection interface that HashSet implements.

The second option is usually the better choice because it is more general: the variable can later refer to any Set implementation, not only a HashSet.

Tyler Treat
3

Since the HashSet class implements the Set interface, it's legal to assign a HashSet to a Set variable. You could not go the other way, however (assign a Set to a more specific HashSet variable), without an explicit cast.
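The two directions can be sketched as follows. The class and method names here are made up for illustration; the commented-out line is the assignment the compiler rejects:

```java
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

class AssignmentDemo {
    // Returns the argument as a HashSet if it is one (or a subclass), else null.
    static HashSet<String> asHashSet(Set<String> set) {
        // HashSet<String> h = set;  // does not compile: a Set is not necessarily a HashSet
        if (set instanceof HashSet) {
            return (HashSet<String>) set; // explicit downcast, checked at runtime
        }
        return null;
    }

    public static void main(String[] args) {
        HashSet<String> concrete = new HashSet<String>();
        Set<String> general = concrete; // widening: always legal, no cast needed

        System.out.println(asHashSet(general) == concrete);            // prints true
        System.out.println(asHashSet(new TreeSet<String>()) == null);  // prints true
    }
}
```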

stew
3

Set is an interface that HashSet implements, so if you do this:

Set<E> mySet = new HashSet<E>();

The object still behaves as a HashSet, although only the methods declared by Set are accessible through the variable. In exchange you get the flexibility to replace the concrete instance with an instance of another Set class in the future, such as LinkedHashSet or TreeSet, or another implementation.

The first declaration uses a concrete class, so the variable can only hold a HashSet or an instance of one of its subclasses, which is less flexible. For example, a TreeSet could not be assigned if your variable type was HashSet.

This is Item 52 from Joshua Bloch's Effective Java, 2nd Edition.

Refer to Objects by their interfaces

... You should favor the use of interfaces rather than classes to refer to objects. If appropriate interface types exist, then parameters, return values, variables, and fields should all be declared using interface types. The only time you really need to refer to an object's class is when you're creating it with a constructor...

// Usually Good - uses interface as type

List<T> tlist = new Vector<T>();

// Typically Bad - uses concrete class as type!

Vector<T> vec = new Vector<T>();

This practice does carry some caveats - if the implementation you want has special behavior not guaranteed by the generic interface, then you have to document your requirements accordingly.

For example, Vector<T> is synchronized, whereas ArrayList<T> (also an implementer of List<T>) is not, so if you required synchronized containers in your design (or not), you would need to document that.
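One way to record such a requirement is a comment on the field itself. The sketch below uses a hypothetical SharedLog class, and it swaps in Collections.synchronizedList around an ArrayList rather than Vector; either would meet the documented requirement:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class SharedLog {
    /**
     * Requirement that the List type alone does not express: entries are
     * appended from several threads, so the implementation must be
     * synchronized. Vector would satisfy this too.
     */
    private final List<String> entries =
            Collections.synchronizedList(new ArrayList<String>());

    void append(String entry) {
        entries.add(entry);
    }

    int size() {
        return entries.size();
    }

    // Starts `threads` writers that each append `perThread` entries,
    // waits for all of them, and returns the final size.
    static int fill(int threads, int perThread) {
        final SharedLog log = new SharedLog();
        List<Thread> workers = new ArrayList<Thread>();
        for (int t = 0; t < threads; t++) {
            Thread worker = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    log.append("entry " + i);
                }
            });
            workers.add(worker);
            worker.start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return log.size();
    }

    public static void main(String[] args) {
        System.out.println(fill(4, 500)); // prints 2000
    }
}
```

With a plain ArrayList in place of the synchronized wrapper, the same code would still compile, which is exactly why the requirement has to be written down.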

wkl
2

One thing worth mentioning is that the interface vs. concrete class rule matters most for types exposed in an API, e.g. method parameters or return types. For private fields and local variables it only ensures you aren't using any methods specific to the concrete implementation (i.e. HashSet); since those are private, the choice doesn't really matter.
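To illustrate the API case, here is a small sketch with a made-up TagApi.common method. Because its parameters and return type are Set, callers can pass whichever implementation they already have:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

class TagApi {
    // Parameters and return type are the interface, so any Set is accepted
    // and the returned implementation can change without breaking callers.
    static Set<String> common(Set<String> left, Set<String> right) {
        Set<String> result = new HashSet<String>(left); // internal detail
        result.retainAll(right);                        // keep only shared elements
        return result;
    }

    public static void main(String[] args) {
        Set<String> sorted = new TreeSet<String>(Arrays.asList("java", "set", "oop"));
        Set<String> hashed = new HashSet<String>(Arrays.asList("oop", "php", "java"));
        System.out.println(common(sorted, hashed).size()); // prints 2
    }
}
```

Had the parameters been declared as HashSet, the TreeSet caller would first have to copy its data into a HashSet.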

Another point is that referencing an additional type slightly increases the size of your compiled class. Most people won't care, but these things add up.

Eugene Kuleshov