19

If a `String` object is immutable (and thus cannot change its length), why is `length()` a method rather than simply a `public final int length` field, as arrays have?

Is it simply a getter method, or does it make some sort of calculation?

Just trying to see the logic behind this.

Acidic
  • 6
    Why don't you look at the source and find out? – skaffman Jan 03 '12 at 23:40
  • 1
    What benefit would obtain if it were a field? Conversely, why didn't you ask why arrays have a `length` field rather than a `length()` method? – erickson Jan 03 '12 at 23:44
  • 1
    The classes that date back to the early versions of the JDK don't show much consistency – Strings have a `length()` method, arrays have a field, Collections have `size()`, neither expose JavaBeans getters for these (while the DOM `NodeList` does.) I'd chalk this up to a design wart preserved for the sake of compatibility but can't really back the opinion with anything. – millimoose Jan 03 '12 at 23:47
  • 1
    @erickson because for me it is obvious that a public field provides better performance than an accessor method (at least before the method gets inlined, if at all). As far as I understand, there is no reason to create a getter on a `final` variable (unless the getter does something more than simply return the variable, which doesn't seem to be the case). – Acidic Jan 03 '12 at 23:47
  • 3
    @skaffman, that wouldn't necessarily explain the design decision. It's a good question, actually. Perhaps you can take a look at the source code and explain why it's `length()` and not `length` - I don't see anything to indicate the "why". – Paul Jan 03 '12 at 23:48
  • @erickson Not violating conventions of the platform / programming paradigm would be one benefit. Methods (at least in Javaland) should be verbs, not nouns. – millimoose Jan 03 '12 at 23:49
  • 1
    It can provide better performance, but in practice, it makes no difference. Direct field references break encapsulation, and are a bad design choice---especially when driven by unfounded beliefs about performance. – erickson Jan 03 '12 at 23:50
  • @Inerdial If a collection's size is not permanent, then I think it is obvious that a getter is required since the variable behind it will not be `final` and thus cannot be public. – Acidic Jan 03 '12 at 23:50
  • @erickson so a getter method can never be slower than accessing a public member directly? – Acidic Jan 03 '12 at 23:53
  • Can someone post a link to the String source code? – calebds Jan 03 '12 at 23:53
  • 2
    @Acidic do you want an answer/discussion or do you want to argue about nanoseconds? – user949300 Jan 03 '12 at 23:54
  • @Acidic Most people would prefer having a consistent way of getting the number of elements from collection-like objects rather than trying to indicate (im)mutability by the choice of final field vs. getter. Anyway, what I was trying to say there might not be *any* reasonable explanation. JDK 1.0 was rough around the edges, and some of these just weren't ever smoothed over. There's a good chance the reason is "someone coded it this way and nobody bothered to change it". – millimoose Jan 03 '12 at 23:56
  • 1
    @user949300 I simply want to understand the logic behind the decision of making `length()` a method. Nanoseconds or not, writing MORE code for LESS performance with NO benefits seems odd to me. Unless this is not the case - which is what I am trying to find out. – Acidic Jan 03 '12 at 23:56
  • 4
    CharSequence is an example of why their decision was correct. – user949300 Jan 03 '12 at 23:58
  • Hmm, how do you interpret, "It *can* provide better performance," as meaning that field access can't be faster than a getter? – erickson Jan 03 '12 at 23:58
  • @user949300 Unfortunately, I fail to understand the benefit of that. – Acidic Jan 03 '12 at 23:59
  • 1
    The benefit is in usage, not performance. Libraries are written for the benefit of programmers. – erickson Jan 03 '12 at 23:59
  • @erickson I simply wish to make sure that I understand what you are saying; I am not trying to be aggressive, I am simply flabbergasted. Back to the subject, I still see no reason to create a basic getter for a `final` variable, when in similar situations (such as Arrays) a `public final` variable is indeed provided. – Acidic Jan 04 '12 at 00:02
  • 3
    @Acidic The (after the fact) benefit is that you can have multiple implementations of a `CharSequence`. One of these is `StringBuffer`, which is obviously mutable, and couldn't use a field to expose its length. This lets you write code that can work with both `String` and `StringBuffer` instances transparently. While this isn't common, the Wicket framework uses this in its API. – millimoose Jan 04 '12 at 00:05
  • 2
    @Acidic Arrays being "special" and requiring manual conversion to / from collections instead of at least partially implementing `List` (like .NET's arrays do) is, to me, a major annoyance in Java, and not something I'd hold up as a benefit. – millimoose Jan 04 '12 at 00:07
  • @Inerdial I did not bring an array as an example to why it SHOULD be a member, simply as an example to an alternative within Java itself. – Acidic Jan 04 '12 at 00:17
  • @erickson, Array *do not* have `length` field. – bestsss Jan 07 '12 at 10:43
  • @Acidic, `length()` should be a function; `count` may not even exist, and String could instead use the underlying `array.length`, which would suffice most of the time. Presently, `length()` is the same as accessing the field directly once the JIT takes care of it. Moreover, the JIT tends to generate even more optimized code (SSE instructions on x64). As a rule of thumb, fields should not be exposed since they bind the implementation and its internal details. I.e. `length()` is a good decision, but CharSequence, which was introduced to help NIO's CharBuffer (1.4), is not good support for the idea. – bestsss Jan 07 '12 at 10:49
  • @bestsss According to *[The Java Language Specification](http://java.sun.com/docs/books/jls/third_edition/html/arrays.html#64347),* as well as the common knowledge of myriads of Java programmers, you are completely wrong. Java arrays have field `public final int length`. – erickson Jan 07 '12 at 23:49
  • @erickson, try this: `new int[0].getClass().getDeclaredField("length")` arrays do not have any fields. While accessed as field, the byte code OP is `ARRAYLENGTH`, not `GETFIELD`. `length` field does not have a reflective counter part and it's handled rather poorly by the spec. by being a syntax only. – bestsss Jan 08 '12 at 00:26
  • @bestsss That is a good point, but I don't find it convincing; the specification is explicit in its language, language designers utilize the field terminology, etc. Obviously there are differences between an array and an `Object`, between a `Class` for an array and a `Class` for other objects. I'm more inclined to say that's simply an overlooked case in reflection. Anyway, can you explain your point? Is there a useful distinction? – erickson Jan 08 '12 at 05:14
  • @erickson, it's "overlooked" in JNI as well. Yes, there is a subtle distinction. There is no difference between the Class objects of arrays and those of other objects. However, the field `length` just does not exist as a normal field. It's a part of the object header where the value of `System.identityHashCode` is stored. So is the `Class` reference. In JNI there is an extra function to get the array length, unlike the offsets used for the rest of the fields. I admit `length` looks like a field in Java source code, but outside the source it's not treated as a field anywhere else. – bestsss Jan 08 '12 at 11:42
  • ...continuing on the reflection part: constructors are handled properly even though each is a method named `<init>`. The reflection code removes the declared `<init>` (and `<clinit>`, class initializer) methods in order to expose them properly as `java.lang.reflect.Constructor`. Even though constructors are declared as methods but invoked a bit differently (`INVOKESPECIAL`), the API designers felt obliged to provide a proper interface. Arrays got a special class (`java.lang.reflect.Array`) w/ exclusively static methods to carry the length/new instance and so on, truly treated differently. I was a bit surprised by the length omissi – bestsss Jan 08 '12 at 11:51
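
The reflection point raised in the comments above is easy to check. The sketch below uses made-up class and method names; it asks an array class for a declared `length` field and then reads the length through `java.lang.reflect.Array.getLength`. It shows what the standard reflection API reports and is not meant to settle the terminology debate.

```java
import java.lang.reflect.Array;

// Hypothetical demo class; the name is an assumption, not part of the thread.
class ArrayLengthReflection {

    // Returns true if reflection reports a declared field called "length".
    static boolean hasDeclaredLengthField(Class<?> type) {
        try {
            type.getDeclaredField("length");
            return true;
        } catch (NoSuchFieldException e) {
            return false;
        }
    }

    // Reads an array's length reflectively, without the `length` syntax.
    static int reflectiveLength(Object array) {
        return Array.getLength(array);
    }

    public static void main(String[] args) {
        int[] numbers = new int[3];
        System.out.println(hasDeclaredLengthField(numbers.getClass()));      // false
        System.out.println(numbers.getClass().getDeclaredFields().length);   // 0
        System.out.println(reflectiveLength(numbers));                       // 3
    }
}
```

So `length` reads like a field in source code, while reflection goes through a dedicated static helper instead of a `Field` object.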

8 Answers

21

Java is a standard, not just an implementation. Different vendors can license and implement Java differently, as long as they adhere to the standard. By making the standard call for a field, that limits the implementation quite severely, for no good reason.

A method is also much more flexible as a class evolves. Except in some very early Java classes, a final value that differs from instance to instance is almost never exposed as a public field rather than through a method.

The `length()` method long predates the `CharSequence` interface; it has probably been there since the first version of `String`. Look how well that worked out. Years later, without any loss of backwards compatibility, the `CharSequence` interface was introduced and fit in nicely. This would not have been possible with a field.
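
To make the `CharSequence` point concrete, here is a minimal sketch. The class and method names are made up for illustration; the only real APIs used are `CharSequence.length()`, `StringBuffer` and `java.nio.CharBuffer.wrap`. One helper handles immutable and mutable sequences alike because the length is reached through a method.

```java
import java.nio.CharBuffer;

// Hypothetical demo class; the name is an assumption.
class CharSequenceLengthDemo {

    // Works for any CharSequence because length() is declared on the interface.
    static int totalLength(CharSequence... parts) {
        int total = 0;
        for (CharSequence part : parts) {
            total += part.length();
        }
        return total;
    }

    public static void main(String[] args) {
        String fixed = "abc";                           // immutable, length 3
        StringBuffer growing = new StringBuffer("de");  // mutable, length 2
        growing.append('f');                            // length is now 3
        CharBuffer wrapped = CharBuffer.wrap("gh");     // 2 remaining chars

        System.out.println(totalLength(fixed, growing, wrapped)); // 8
    }
}
```

A `public final int length` field on `String` could not have been pulled up into the interface this way, and `StringBuffer` could not have offered it at all.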

So let's really invert the question (which is what you should do when you design a class intended to remain unchanged for decades): What does a field gain here? Why not simply make it a method?

Yishai
  • 2
    That is an interesting thought. A field, for me, is less code with equivalent - or better performance. – Acidic Jan 04 '12 at 00:04
  • 5
    With a tradeoff, of less encapsulation. It can't exist as part of an interface. (Interfaces have constants, but that isn't the same thing) and you can't change your mind about the field. Anyway, any supposed performance benefit is a premature optimization. http://en.wikiquote.org/wiki/Donald_Knuth – Yishai Jan 04 '12 at 00:10
  • "..you can't change your mind.." But if a `String` is immutable, what else can its number of characters be denoted by other than a `final` member? – Acidic Jan 04 '12 at 00:16
  • Good point. If Strings ever wanted to become mutable objects, length() would ensure backwards compatibility, where a final member would complicate such an evolution. – calebds Jan 04 '12 at 00:22
  • @Acidic: Well, the String is a `char[]`, an offset, and a length, right? Couldn't it also be a `char[]`, an offset, and a last index? In that case, length would be `end - start`. As a user, you shouldn't care which approach they take, and as a maintainer, you'd want the flexibility of changing approaches down the line. Fields don't give you that, so you end up locked into an approach which may limit you down the line. – yshavit Jan 04 '12 at 00:24
  • 1
    I don't want to sound like a broken record, but I see no reason to implement a method of calculating an accessible (to the `String` class itself) and constant value. I logically see no reason why a constant should ever be more than it is - a constant. – Acidic Jan 04 '12 at 00:28
  • @Acidic The reason would be that you're uncomfortable specifying a property of an object to be immutable (not constant) for the entire future of your codebase. If you're absolutely certain that a given property will never need to change to a mutable one and are willing to commit to this via an irreversible design choice, go for it. Given how the immutability of Strings is a very important property that will never be changed, whoever originally designed their API could've chosen this option. For whichever reason, he did not, and this choice was later justified by the `CharSequence` interface. – millimoose Jan 04 '12 at 00:47
  • 1
    @Acidic @Inerdial: Even then, don't go for it. Make an accessor method, if for no other reason than for consistency. Why should users of your library have to remember that such-and-such property is a field, but so-and-so is a method? If you want to retroactively add an interface (e.g. `CharSequence`), why make users wonder what's the difference between `length` and `length()`? You can't make all properties be fields, but you can make them all be methods, so just play by good OO rules and use accessor methods unless you have a really good reason to do otherwise. – yshavit Jan 04 '12 at 01:46
  • @Acidic, perhaps this will give you some further insight: http://martinfowler.com/bliki/UniformAccessPrinciple.html – Yishai Jan 04 '12 at 16:17
  • By the way, one advantage of calculating it vs. having a field store it is in an environment where memory is very constrained. Forcing that to be stored for every string, instead of calculated when needed (potentially much less often) may cause problems. I wouldn't worry about it, but as a standard, it forces the implementation into a corner to specify a field instead of a method. – Yishai Jan 04 '12 at 16:19
4

Perhaps a .length() method was considered more consistent with the corresponding method for a StringBuffer, which would obviously need more than a final member variable.

The String class was probably one of the very first classes defined for Java, ever. It's possible (and this is just speculation) that the implementation used a .length() method before final member variables even existed. It wouldn't take very long before the use of the method was well-embedded into the body of Java code existing at the time.

Greg Hewgill
  • 2
    Your original argument holds some water with `StringBuffer` in place of `StringBuilder`. – erickson Jan 03 '12 at 23:55
  • @erickson: Oh, right! It's been so long since I've used a `StringBuffer`... thanks. – Greg Hewgill Jan 03 '12 at 23:57
  • That seems logical, though I would guess that `Array` was also one of the earliest classes in Java and yet it uses a field instead. – Acidic Jan 04 '12 at 00:08
  • 3
    Arrays in Java are special, they're sort of classes but not really. You'll notice that arrays don't have any *methods* at all. – Greg Hewgill Jan 04 '12 at 00:10
4

This is a fundamental tenet of encapsulation.

Part of encapsulation is that the class should hide its implementation from its interface (in the "design by contract" sense of an interface, not in the Java keyword sense).

What you want is the String's length -- you shouldn't care if this is cached, calculated, delegates to some other field, etc. If the JDK people want to change the implementation down the road, they should be able to do so without you having to recompile.

yshavit
3

Perhaps because `length()` comes from the `CharSequence` interface. A method is a more sensible abstraction than a variable if it's going to have multiple implementations.
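
As an illustration of "multiple implementations", here is a sketch of a custom `CharSequence`. `RepeatedChar` is a hypothetical class invented for this example; only the `CharSequence` methods and `StringBuilder.append(CharSequence)` are real library APIs. An interface can require a `length()` method, but it cannot require an instance field.

```java
// Hypothetical CharSequence that repeats a single character `count` times.
final class RepeatedChar implements CharSequence {
    private final char c;
    private final int count;

    RepeatedChar(char c, int count) {
        if (count < 0) {
            throw new IllegalArgumentException("count must be >= 0");
        }
        this.c = c;
        this.count = count;
    }

    @Override
    public int length() {
        return count; // no backing array needed at all
    }

    @Override
    public char charAt(int index) {
        if (index < 0 || index >= count) {
            throw new IndexOutOfBoundsException("index: " + index);
        }
        return c;
    }

    @Override
    public CharSequence subSequence(int start, int end) {
        if (start < 0 || end > count || start > end) {
            throw new IndexOutOfBoundsException();
        }
        return new RepeatedChar(c, end - start);
    }

    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder(count);
        for (int i = 0; i < count; i++) {
            sb.append(c);
        }
        return sb.toString();
    }
}

// Hypothetical demo class.
class RepeatedCharDemo {
    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder("x");
        sb.append(new RepeatedChar('-', 3)); // accepted as a CharSequence
        System.out.println(sb);              // x---
    }
}
```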

calebds
  • 3
    `CharSequence` was introduced in Java 1.4, long after `String` was defined. – Greg Hewgill Jan 03 '12 at 23:48
  • `CharSequence` came along long after the `String` API was defined. However, it's a good point, because if `length` were a field, the `String` class would have to have been retrofitted with a redundant method in order to support an interface (which specifies methods only). Instance fields are not object-oriented, and should not be part of a public API. Objects that use them, like `Point` and arrays are old and inconsistent with later work. – erickson Jan 03 '12 at 23:52
3

You should always use accessor methods in public classes rather than public fields, regardless of whether they are final or not (see Item 14 in Effective Java, second edition).

When you allow a field to be accessed directly (i.e. it is public), you lose the benefit of encapsulation: you can't change the representation without changing the API (you break people's code if you do), and you can't perform any action when the field is accessed.

Effective Java provides a really good rule of thumb:

> If a class is accessible outside its package, provide accessor methods, to preserve the flexibility to change the class's internal representation. If a public class exposes its data fields, all hope of changing its representation is lost, as client code can be distributed far and wide.

Basically, it is done this way because it is good design practice to do so. It leaves room to change the implementation of String at a later stage without breaking code for everyone.

Deco
2

`String` uses encapsulation to hide its internal details from you. An immutable object is still free to have mutable internal values as long as its externally visible state doesn't change. The length could be lazily computed. I encourage you to take a look at `String`'s source code.
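
A minimal sketch of that idea follows: an object that is immutable from the outside but keeps a mutable cache inside. `JoinedText` is a hypothetical class made up for this example, not how `String` stores its length. As far as I know, OpenJDK's `String` uses a similar trick for `hashCode()`.

```java
import java.util.Objects;

// Hypothetical immutable pair of strings whose combined length is computed lazily.
final class JoinedText {
    private final String left;
    private final String right;

    // Mutable internal cache; -1 means "not computed yet".
    // A racing thread would at worst recompute the same value.
    private int cachedLength = -1;

    JoinedText(String left, String right) {
        this.left = Objects.requireNonNull(left);
        this.right = Objects.requireNonNull(right);
    }

    // Callers cannot tell whether this is stored or calculated.
    int length() {
        int len = cachedLength;
        if (len < 0) {
            len = left.length() + right.length();
            cachedLength = len;
        }
        return len;
    }

    public static void main(String[] args) {
        JoinedText text = new JoinedText("ab", "cde");
        System.out.println(text.length()); // 5, computed on this first call
        System.out.println(text.length()); // 5, served from the cache
    }
}
```

With a `public final int length` field, the value would have to be computed eagerly in the constructor; the method leaves that choice to the implementation.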

Steve Kuo
  • 3
    I don't see how creating a getter method on a supposedly `final` variable provides any encapsulation. – Acidic Jan 03 '12 at 23:43
  • 1
    The final variable `count` is internal implementation that it doesn't (and shouldn't) expose. Furthermore, length() is specified by CharSequence, which String implements. – Steve Kuo Jan 03 '12 at 23:48
  • If by definition this `final int count` that you speak of is always equal to the return value of `length()`, using a method should provide no benefits and only slow down the system. (until it gets inlined, if it does at all) – Acidic Jan 03 '12 at 23:52
  • 1
    But String doesn't define `count`, it's not part of String's definition. The purpose of `length()` is encapsulation. See https://en.wikipedia.org/wiki/Encapsulation_%28object-oriented_programming%29 – Steve Kuo Jan 03 '12 at 23:55
  • `CharSequence` only exists since 1.4. The abstraction is a valid reason for why one might want to make this a method, but it doesn't explain why it was done this way in 1.0. – millimoose Jan 03 '12 at 23:59
  • 1
    As Yishai points out above, `CharSequence` was made possible in part because `String.length()` was done this way in 1.0. That's not an accident -- the Java designers knew that exposing methods instead of fields can have those sorts of benefits. – yshavit Jan 04 '12 at 00:27
1

Checking the source code of `String` in OpenJDK, it's only a getter.

But as @SteveKuo points out, this could differ depending on the implementation.

tidbeck
1

In most current JVM implementations, a substring references the char array of the original String for its content and needs start and length fields to define its own content, so the `length()` method is used as a getter. However, this is not the only possible way to implement String.

In a different possible implementation, each String could have its own char array. Since char arrays already have a length field with the correct length, it would be redundant to keep another one in the String object. Because `String.length()` is a method, we don't have to do that and can just reference the internal `array.length`.

These are two possible implementations of String, both with their own good and bad parts and they can replace each other because the length() method hides where the length is stored (internal array or in own field).
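
The two layouts can be sketched side by side. Everything here (`Text`, `SharedText`, `OwnedText`, `TextRepresentations`) is a hypothetical stand-in invented for this example, not JDK code. As far as I know, newer OpenJDK releases moved away from the shared-array layout, which is exactly the kind of change a method allows.

```java
import java.util.Arrays;

// Hypothetical interface standing in for the part of String's API that callers see.
interface Text {
    int length();
    char charAt(int index);
}

// Variant 1: shares a backing array and stores its own offset and count.
final class SharedText implements Text {
    private final char[] value;
    private final int offset;
    private final int count;

    SharedText(char[] value, int offset, int count) {
        if (offset < 0 || count < 0 || offset > value.length - count) {
            throw new IndexOutOfBoundsException();
        }
        this.value = value; // shared on purpose, not copied
        this.offset = offset;
        this.count = count;
    }

    @Override
    public int length() {
        return count; // plain getter on a stored field
    }

    @Override
    public char charAt(int index) {
        if (index < 0 || index >= count) {
            throw new IndexOutOfBoundsException("index: " + index);
        }
        return value[offset + index];
    }
}

// Variant 2: owns an exactly sized copy, so no separate length field is needed.
final class OwnedText implements Text {
    private final char[] value;

    OwnedText(char[] source, int offset, int count) {
        if (offset < 0 || count < 0 || offset > source.length - count) {
            throw new IndexOutOfBoundsException();
        }
        this.value = Arrays.copyOfRange(source, offset, offset + count);
    }

    @Override
    public int length() {
        return value.length; // delegates to the array's own length
    }

    @Override
    public char charAt(int index) {
        return value[index]; // the array access performs the bounds check
    }
}

// Hypothetical demo class: the caller code is identical for both variants.
class TextRepresentations {
    static String describe(Text t) {
        return t.length() + " chars, first = " + t.charAt(0);
    }

    public static void main(String[] args) {
        char[] data = "hello world".toCharArray();
        System.out.println(describe(new SharedText(data, 6, 5))); // 5 chars, first = w
        System.out.println(describe(new OwnedText(data, 6, 5)));  // 5 chars, first = w
    }
}
```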

josefx
  • Out of curiosity, what would happen if an implementation had a `String` type which simply included a `char[]` and an `int` (for `hashCode`), a `TailString` type which derived from `String` and included a `StartIndex`, a `HeadString` type which derived from `String` and included a `PartialLength`, and a `SubString` type which derived from `String` and included both? My guess would be that too much code checks to see if something's type *is* `String` rather than seeing if it's `instanceof String` for that to work, but it would seem to in some ways offer the best of all possible worlds. – supercat Jan 28 '13 at 00:09
  • @supercat code checking for String would be a problem. Then there is a slight performance penalty for virtual calls, the jvm has to find out at runtime if it has to call String.length(), TailString.length(), HeadString.length() or Substring.length() instead of just calling String.length() directly (this makes JIT optimizations harder) - so this can be a performance vs. memory footprint decision. – josefx Jan 28 '13 at 00:35
  • Well, obviously Java is what it is, but I wonder what would have been the consequences of having `string` as a compiler-recognized type, whose contents would be encapsulated in a `HeapString`. Such a thing would have allowed operators like `==` to perform string comparison when used on the `string` type; strings could be converted to `Object`, as with `Integer`, comparison on strings stored as `Object` would test reference equality. – supercat Jan 29 '13 at 05:21