20

EnumSet, as old as the enum itself (both since Java 5), is supposed to be an uncompromising replacement for bitfields: as fast and lean as a bitfield (well, except for not being a primitive type), and typesafe to boot. On the other hand, the most recent and for years the most anticipated Java API, the Streams API, unashamedly employs bitfields for Spliterator's characteristics.

Should I consider the above as a clear admission by the core Java experts that EnumSet is not that good after all? Should I reconsider the common best-practice advice to never use bitfields?
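For readers who have not looked at the API in question, here is a minimal sketch of what the `int` bitfield looks like from the caller's side. `Spliterator.characteristics()`, `hasCharacteristics(int)` and the `ORDERED`/`SIZED`/`DISTINCT` constants are the real JDK API; the class name and the `hasAll` helper are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Spliterator;

final class SpliteratorFlagsDemo {
    // Hypothetical helper: true if every bit in 'mask' is set in the
    // spliterator's characteristics bitfield.
    static boolean hasAll(Spliterator<?> s, int mask) {
        return (s.characteristics() & mask) == mask;
    }

    public static void main(String[] args) {
        Spliterator<String> list =
                new ArrayList<>(Arrays.asList("a", "b", "c")).spliterator();
        Spliterator<String> set =
                new HashSet<>(Arrays.asList("a", "b", "c")).spliterator();

        // Manual masking against the raw int...
        int flags = list.characteristics();
        System.out.println("list ORDERED:  " + ((flags & Spliterator.ORDERED) != 0));
        // ...or the convenience method the interface provides.
        System.out.println("list SIZED:    " + list.hasCharacteristics(Spliterator.SIZED));
        System.out.println("set DISTINCT:  " + hasAll(set, Spliterator.DISTINCT));
    }
}
```

Several flags are combined with `|` and tested with `&`, exactly the idiom that EnumSet was meant to replace.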

assylias
Marko Topolnik
  • 1
    Do you mean `Collector.Characteristics`? – fge Mar 23 '14 at 20:17
  • 2
    I don't really understand -- when you speak about bitfields, do you mean the actual implementations of the stream classes? Also, internally an EnumSet is just a bitfield (even a single long if there are at most 64 values) – fge Mar 23 '14 at 20:20
  • 1
    @fge, Rohit: sorry for the confusion, the question is constrained only to Collector characteristics. See [`Spliterator.characteristics()`](http://download.java.net/jdk8/docs/api/java/util/Spliterator.html#characteristics--), for example. – Marko Topolnik Mar 23 '14 at 20:22

2 Answers

24

I was rather surprised to see that it uses bitfields rather than EnumSet. The rationale, though, is discussed in this mailing list thread. It seems the reason was to be able to set and unset various characteristics without affecting the caller's copy. With an EnumSet, one would need to create a new EnumSet object every time the characteristics change between stages. I guess this is why bitfields win the race there.
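To make the trade-off concrete, here is a rough sketch of the two styles. It is not taken from the JDK sources: the `Characteristic` enum and both `withoutSorted` methods are hypothetical and exist only to show where the extra allocation comes from. `Spliterator.SORTED` and `EnumSet.copyOf` are the real API.

```java
import java.util.EnumSet;
import java.util.Spliterator;

final class FlagDerivationDemo {
    // Hypothetical enum mirroring a few Spliterator constants (illustration only).
    enum Characteristic { ORDERED, DISTINCT, SORTED, SIZED }

    // int flags: the argument is already a private copy of the caller's value,
    // so clearing a bit allocates nothing and cannot affect the caller.
    static int withoutSorted(int flags) {
        return flags & ~Spliterator.SORTED;
    }

    // EnumSet: the argument refers to the caller's mutable object, so a
    // defensive copy (a new object) is needed before removing anything.
    static EnumSet<Characteristic> withoutSorted(EnumSet<Characteristic> flags) {
        EnumSet<Characteristic> copy = EnumSet.copyOf(flags);
        copy.remove(Characteristic.SORTED);
        return copy;
    }

    public static void main(String[] args) {
        int callerFlags = Spliterator.ORDERED | Spliterator.SORTED;
        int derivedFlags = withoutSorted(callerFlags);
        System.out.println("int derived still SORTED: "
                + ((derivedFlags & Spliterator.SORTED) != 0));

        EnumSet<Characteristic> callerSet =
                EnumSet.of(Characteristic.ORDERED, Characteristic.SORTED);
        EnumSet<Characteristic> derivedSet = withoutSorted(callerSet);
        System.out.println("caller set: " + callerSet + ", derived set: " + derivedSet);
    }
}
```

In a pipeline where every stage derives its own characteristics from the previous stage, the second variant creates one short-lived object per stage, which is the cost the thread was weighing against type safety.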

The concluding sentence of that thread essentially anticipates your question here:

The presence of such flags in a Java 8 API would (and should) raise a lot of eyebrows, because it goes against what people have been told for well over a decade. If it's adopted as is, there had better be a good explanation for doc readers of why alternatives were rejected. "We were comfortable with int flags and nothing else significantly better suggested itself" won't cut it. "We know int flags aren't great for an API, but we tried very hard to find better alternatives, to no avail" would (if it were true).

Marko Topolnik
Rohit Jain
  • 2
    Great reference, thanks! Seems like I hit a real sore spot---Josh Bloch defending his enums vs. Doug Lea the master of C :) – Marko Topolnik Mar 23 '14 at 20:45
  • 1
    @MarkoTopolnik You're welcome :) I'm enjoying the discussion myself. Two greats there. Great source of learning. – Rohit Jain Mar 23 '14 at 20:46
  • I find it interesting that Java programmers are apparently viewed as "people that must be told something". – Ingo Mar 23 '14 at 21:00
  • 2
    @ingo I sure want to be told how the creators of Java intend their language to be used. If they say "use EnumSet", rest assured I will want to use it. – Marko Topolnik Mar 23 '14 at 21:03
  • 1
    @Ingo Actually it's the same with other languages too. Take Python for example. When I started learning it, it seemed to me like there was a set of rules that every Python programmer is expected to follow. It was in fact strange for me. Talking about this one, I guess the introduction of `EnumSet` in Java was a clear indication that we as Java programmers can avoid using bitfields, as there is a better alternative. The above post in the thread is only talking about the situation where people might conclude from the source code that we should prefer bitfields over `EnumSet`. – Rohit Jain Mar 23 '14 at 21:04
  • 3
    @RohitJain Now that you mentioned it, let me give you an opposite example: ruby. The advice there is "do whatever you like, and here is a complete arsenal of intriguingly imaginative ways to shoot yourself in the foot". – Marko Topolnik Mar 23 '14 at 21:06
  • Simon Peyton Jones, the author of the Glasgow Haskell Compiler (GHC), recently gave a talk about Edward Kmett's "Lens" package. On that occasion he expressed astonishment about what can be done in Haskell, things he himself never imagined. So the notion that "the creators" are somehow all-knowing and all-anticipating is quite a naive one, and probably applies only to very primitive things. Java, of course, is not one of those primitive things. I say this without the intention to offend anyone. – Ingo Mar 23 '14 at 22:44
-2

Should I reconsider the common best-practice advice to never use bitfields?

Yes. You should generally reconsider any advice that contains the words "always" or "never", no matter if it is "common" or not so common.

Ingo
  • 4
    As a general statement, that sure works. However, I am asking specifically about EnumSets, which should by design be virtually a *replacement* for bitfields. It should be as strong as the advice to always use `ArrayList` instead of `Vector`. – Marko Topolnik Mar 23 '14 at 20:26
  • When it is about speed, you just want a shift and a mask operation instead of `invokevirtual`. I find it ok as long as you don't expose the bitfield. – Ingo Mar 23 '14 at 20:34
  • But consider that `invokevirtual` will in practice result in inlined code after JITting---and then it will look almost exactly the same as if there was an instance bitfield variable. – Marko Topolnik Mar 23 '14 at 20:36
  • The fact that a bitfield is a primitive type sounds to me like the largest difference. However, I do find it atypical for the general Java mindset to so uncompromisingly choose performance over every other concern. – Marko Topolnik Mar 23 '14 at 20:37
  • The JIT can inline some code, but this code must go through an extra reference. (It would be cool if the JIT could inline data also!) The general Java mindset is to know next to nothing about bits, bytes and so on, but I guess the JDK maintainers are a bit more elevated. For some people, shifts and bit masks are as natural as + and -, and nobody argues that one should not use `int`s, so there you have it that the advice against bitfields is nonsensical, or at least non-consequential. – Ingo Mar 23 '14 at 20:57
  • 2
    Actually, the thread which Rohit found and linked in his answer tells a completely different story---basically it anticipates my worries here. It has been the position of the *core Java team themselves* that EnumSet is a surefire replacement for bitfields. Josh Bloch chimed in to protest against the bitfield in the Streams API. – Marko Topolnik Mar 23 '14 at 21:01
  • 1
    Oh yes, another interesting topic in that thread (it's really worth a read!) is that those guys---like you here---lament the lack of value types, and quote exactly that as the tipping argument in favor of bitfields. – Marko Topolnik Mar 23 '14 at 21:16
  • @MarkoTopolnik They know their clientele. Nevertheless I do not understand why then we're not supposed to `new Integer(42).plus(new Integer(2))`. We see here on SO that the concept of primitive types is a deep mystery to some Java programmers. – Ingo Mar 23 '14 at 22:24
  • 1
    It's because `int`s are typesafe when used as integers, but not so when used for enum constants or bitfields. This is not at all about object vs. primitive type. – Marko Topolnik Mar 23 '14 at 22:26
  • @MarkoTopolnik This, of course, is not quite true. Java's type system lacks some useful features here, like the possibility to say something like `type Minutes = int;` so that minutes can only be added to minutes. Int is not per se typesafe: nothing prevents you from adding dollars and durations and subtracting the cardinality of a set from the result. – Ingo Mar 23 '14 at 22:36
  • And how would the use of `Integer` help? – Marko Topolnik Mar 23 '14 at 22:39
  • It doesn't. It is the counter-argument for another claim. The type safety with bitsets is "don't have elements of different enums in a set". It is on the same level as adding dollar-cent amounts and number of chickens. Have you ever "been told" not to do this? Or not to use `int` lest it happens by accident? So, it comes down to "Let us give millions of programmers easy recipes." And when it smells like that, I shudder. – Ingo Mar 23 '14 at 22:50
  • So do you claim that an EnumSet has the same type-safety issues which you attribute to ints? What mistake exactly could I make if the API used EnumSets instead of bitfields? – Marko Topolnik Mar 23 '14 at 23:11
  • No, the other way. Bitfields and ints have the same type safety. EnumSet is safer, but, unfortunately, not as lightweight as one would wish if one *needs* a bit set. – Ingo Mar 23 '14 at 23:53
  • By "bitfield" I assumed the usage of a primitive int as a bitfield. Anyway, I think you have only shown that the gap in terms of type safety between an int and an EnumSet is even wider than I had in mind, which means that the decision to use an int needs an even stronger justification. – Marko Topolnik Mar 24 '14 at 06:25
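The type-safety gap discussed in these comments can be sketched as follows. Everything here (the class, the flag constants, the `Characteristic` enum, the chicken count) is hypothetical and only meant to show which mistakes the compiler does and does not catch.

```java
import java.util.EnumSet;
import java.util.Set;

final class TypeSafetyDemo {
    // Hypothetical int flag constants.
    static final int ORDERED = 0x1;
    static final int SIZED = 0x2;

    // An unrelated int that has nothing to do with flags.
    static final int NUMBER_OF_CHICKENS = 7;

    // Hypothetical enum equivalent of the flags above.
    enum Characteristic { ORDERED, SIZED }

    // Accepts any int at all; the compiler cannot tell flags from chickens.
    static boolean isOrdered(int flags) {
        return (flags & ORDERED) != 0;
    }

    // Accepts only sets of Characteristic; anything else is a compile error.
    static boolean isOrdered(Set<Characteristic> flags) {
        return flags.contains(Characteristic.ORDERED);
    }

    public static void main(String[] args) {
        // Compiles and runs, although the argument is meaningless as a flag set.
        System.out.println(isOrdered(NUMBER_OF_CHICKENS));

        System.out.println(isOrdered(EnumSet.of(Characteristic.SIZED)));

        // The EnumSet overload rejects the same mistake at compile time, e.g.
        // isOrdered(EnumSet.of(Thread.State.NEW)) does not compile.
    }
}
```

This is the sense in which an `int` used as a bitfield is no safer than any other `int`, while the EnumSet variant pays for its safety with an object per set.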