19

Wrapper classes are fine and their purpose is well understood. But why does Java omit a primitive type for String?

Robert Fraser
Ravi Gupta

6 Answers

44

It depends what you mean by "primitive"

"Primitive" in Java is usually taken to mean "value type". C# has a string keyword, but it acts exactly the same as Java's String; it's just highlighted differently by the editor. C#'s string is an alias for the class System.String, and Java's String is java.lang.String. String is not a value type in either language, so in this sense it's not a primitive.

If by "primitive" you mean built into the language, then String is a primitive. It just uses a capital letter. String literals (those things in quotes) are automatically String objects (System.String in C#, java.lang.String in Java), and + is used for concatenation. So by this token, they (and arrays) are as primitive as ints, longs, etc.
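To make the "built into the language" point concrete, here is a minimal Java sketch. The class and method names (`BuiltInStringDemo`, `literalClassName`, `concat`) are made up for illustration.

```java
class BuiltInStringDemo {
    // A literal in quotes is already a java.lang.String object;
    // no constructor call is needed to create it.
    static String literalClassName() {
        return "hello".getClass().getName();
    }

    // The + operator is defined by the language for String operands,
    // and a non-String operand is converted to text automatically.
    static String concat(String label, int value) {
        return label + value;
    }

    public static void main(String[] args) {
        System.out.println(literalClassName());    // java.lang.String
        System.out.println(concat("answer=", 42)); // answer=42
    }
}
```

No other class gets literals or an operator of its own like this, which is the sense in which String is "built in" even though it is not a value type.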

First, what is a String?

String is not a wrapper. String is a reference type, while primitive types are value types. This means that if you have:

int x = 5;
int y = x;

The memory of x and y both contain "5". But with:

String x = "a";
String y = x;

The memory of x and y both contain a pointer to the same String object, which holds the character "a" (along with a length, an offset, a ClassInfo pointer, and a monitor). Strings behave like primitives because they're immutable, so this sharing is usually not an issue. However, if you used reflection to change the contents of the string (don't do this!), both x and y would see the change. You can see the same aliasing directly with a mutable array:

char[] x = "a".toCharArray();
char[] y = x;
x[0] = 'b';
System.out.println(y[0] == 'b'); // prints "true"

So don't just use char[] in place of String (unless this is the behavior you want, or you're really trying to reduce memory usage).

Every Object is a reference type -- that means all classes you write, every class in the framework, and even arrays. The only value types are the eight primitive types (byte, short, int, long, float, double, char, and boolean).
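Here is a small side-by-side sketch of the three cases, assuming nothing beyond the standard language. The class name `CopySemanticsDemo` and its method names are illustrative only.

```java
class CopySemanticsDemo {
    // Copying an int copies the value, so changing x later does not affect y.
    static int copyInt() {
        int x = 5;
        int y = x;
        x = 6;
        return y;
    }

    // Copying an array variable copies the reference; both names refer to one array.
    static int copyArrayReference() {
        int[] x = {5};
        int[] y = x;
        x[0] = 6;
        return y[0];
    }

    // A String variable is also a reference, but String has no mutating methods:
    // "changing" x really points x at a different object, so y is untouched.
    static String copyStringReference() {
        String x = "a";
        String y = x;
        x = x + "b";
        return y;
    }

    public static void main(String[] args) {
        System.out.println(copyInt());             // 5
        System.out.println(copyArrayReference());  // 6
        System.out.println(copyStringReference()); // a
    }
}
```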

Why isn't String mutable like char[]?

There are a couple reasons for this, but it mostly comes down to psychology and implementation details:

  • Imagine the chaos you'd have if you passed a string into another function and that function changed it somehow. Or what if it saved it somewhere and changed it in the future? With most reference types, you accept this as part of the type, but the Java developers decided that, at least for strings, they didn't want users to have to worry about that.
  • A mutable string can't be updated atomically, so sharing one between threads would require synchronization.
  • String literals (the things you put in your code in quotes) might be immutable at the computer's level [1] (for security reasons). This could be gotten around by copying them all into another part of memory when the program starts up or using copy-on-write, but that's slow.
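The first point is easy to demonstrate. In the sketch below, `NameHolder` is a hypothetical class (not from any library) that stores whatever it is handed without copying it; the char[] it keeps can be changed behind its back, while the String cannot.

```java
class SharedStateDemo {
    // Hypothetical holder that keeps the caller's objects without copying them.
    static final class NameHolder {
        private final char[] chars;
        private final String text;

        NameHolder(char[] chars, String text) {
            this.chars = chars; // shares the caller's array
            this.text = text;   // sharing is harmless: a String cannot be modified
        }

        String fromChars() { return new String(chars); }
        String fromText()  { return text; }
    }

    static String[] run() {
        char[] raw = {'b', 'o', 'b'};
        String name = "bob";
        NameHolder holder = new NameHolder(raw, name);

        raw[0] = 'r';                  // the caller mutates the array after handing it over
        name = name.replace('b', 'r'); // creates a new String; the holder's String is unaffected

        return new String[] { holder.fromChars(), holder.fromText() };
    }

    public static void main(String[] args) {
        String[] result = run();
        System.out.println(result[0]); // rob
        System.out.println(result[1]); // bob
    }
}
```

With a mutable type, the holder would have to make a defensive copy in its constructor to get the same guarantee.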

Why don't we have a value-type version of a string?

Basically, performance and implementation details, as well as the complexity of having two different string types. Other value types have a fixed memory footprint: an int is always 32 bits, a long is always 64 bits, a boolean conceptually needs only 1 bit, etc. [2] Among other things, this means that they can be stored on the stack, so that all parameters to a function live in one place. A string, by contrast, can be any length, and making gigantic copies of strings all over the place would kill performance.
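The fixed sizes are visible from Java itself through the `SIZE` constants on the wrapper classes, whereas a String's length is only known per instance. A rough sketch (the class name `FixedSizeDemo` is illustrative):

```java
class FixedSizeDemo {
    // Bit widths of some primitive types, as fixed by the language.
    static int[] primitiveBits() {
        return new int[] { Byte.SIZE, Short.SIZE, Character.SIZE, Integer.SIZE, Long.SIZE };
    }

    // The length of a String depends on the particular value, not on the type.
    static int[] stringLengths() {
        return new int[] { "".length(), "a".length(), "a much longer string".length() };
    }

    public static void main(String[] args) {
        for (int bits : primitiveBits()) {
            System.out.println(bits + " bits");
        }
        for (int length : stringLengths()) {
            System.out.println(length + " chars");
        }
    }
}
```

There is deliberately no `Boolean.SIZE`, since the language does not fix how a boolean is stored.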

See also: "In C#, why is String a reference type that behaves like a value type?" It refers to .NET, but it is just as applicable in Java.

[1] In C/C++ and other natively compiled languages, this is true because string literals are placed in a read-only segment of the process image, which the OS usually stops you from editing. In Java, this is actually usually untrue, since the JVM loads the class files onto the heap, so you could edit a string there. However, there's no reason a Java program couldn't be compiled natively (there are tools which do this), and some architectures (notably some versions of ARM) do directly execute Java bytecode.

[2] In practice, some of these types are a different size at the machine level. E.g., booleans are stored as WORD-size on the stack (32 bits on x86, 64 bits on x64). In classes/arrays they may be treated differently. This is all an implementation detail that's left up to the JVM -- the spec says booleans are either true or false and the machine can figure out how to do it.

Robert Fraser
  • What are other reference types in java ? – Ravi Gupta Jan 21 '10 at 12:12
  • Every class you create and use that isn't one of the primitive types – thecoop Jan 21 '10 at 12:21
  • 2
    @Ravi - All objects are reference types (that is to say ALL classes). The only Java value types are the ones that have keywords (int, char, double, etc.) – Robert Fraser Jan 21 '10 at 12:21
  • 1
    not sure how this answers the question..or am i misunderstanding the question? i thought the question asks, why java does not have a primitive type for string – Aadith Ramia Jan 21 '10 at 12:38
  • It answers my doubt for confusing wrapper with String. But yes, the other part "why we don't have string in java" is still left out. – Ravi Gupta Jan 21 '10 at 12:48
  • In .NET, if `String` had been a value type with a single field of type `char[]`, and there were no special CLR handling for it, it could behave largely like `String` does now except (1) the default value could behave like an empty string, and (2) conversion of `String` to `Object` would require boxing of the wrapper (though not duplication of the character data). Things could be slightly improved if (IMHO the optimal design) there were a `HeapString` type which behaves largely as `String` does now, and a `String` value type which encapsulated a `HeapString` type. Actually, Java could have... – supercat Jan 31 '13 at 16:45
  • ...benefited from such a design as well, since if `string` were a compiler-recognized type it could have had an `==` operator based upon character-sequence equality (if two strings were cast to `Object`, however, the `==` operator would test reference equality). – supercat Jan 31 '13 at 16:47
10

The primitive type for String is char[].

This is true for many languages (C, Java, C#, C++ and many more...).

Oded
  • char[] is not primitive as far as i know – Ahmed Kotb Jan 21 '10 at 11:41
  • 1
    `char` is as primitive as it comes, and the array construct is part of the language. Strings _are_ arrays of `char`. – Oded Jan 21 '10 at 11:45
  • Yup, and String is not just a wrapper for a char[] – Robert Fraser Jan 21 '10 at 11:46
  • So it implies that int[], char[] and all arrays of primitive qualify as primitive type ? – Ravi Gupta Jan 21 '10 at 12:15
  • @Ravi - NO! A char[] is actually an Array class that has some special syntax. Arrays are passed by reference and are only "primitive" in the sense that they have special support in the language (but so do Strings). – Robert Fraser Jan 21 '10 at 12:23
  • 4
    Oded: Strings are not arrays of char. The String class may be implemented with an array of char as internal storage (and mostly is), but that is not required by the language or API specification. – jarnbjo Jan 21 '10 at 12:59
3

Strings can be of arbitrary length. The designers of Java did not want a primitive type to which they could not assign a concrete memory size. This is one of the chief reasons String is not a primitive in Java.

Aadith Ramia
  • hmm..but this sounds more like an implementation issue than a specification one. – Ravi Gupta Jan 21 '10 at 13:20
  • well..depends on how you perceive it..i wouldnt classify it as a total implementation issue..its just one of the considerations that went into the decision. – Aadith Ramia Jan 21 '10 at 17:14
0

String is sort of a special case. All the real primitive types (int, long, etc.) are value types implemented directly in the JVM. String is a reference type, and so is dealt with like any other class (capital letter, accessed through a reference...), except that the compiler has special hooks to treat it like a built-in type (+ for string concatenation, for example).

As it is already a reference type, it does not need a wrapper class like Integer to be able to use it as a class (in collections, for example)
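A quick illustration of the collections point, using only `java.util` (the class name `WrapperDemo` is just for this sketch):

```java
import java.util.ArrayList;
import java.util.List;

class WrapperDemo {
    // Generic collections hold references, so an int needs its wrapper class.
    static List<Integer> boxedInts() {
        List<Integer> numbers = new ArrayList<>(); // List<int> would not compile
        numbers.add(7);                            // the int 7 is autoboxed to an Integer
        return numbers;
    }

    // A String is already an object reference, so there is nothing to box.
    static List<String> strings() {
        List<String> words = new ArrayList<>();
        words.add("seven");
        return words;
    }

    public static void main(String[] args) {
        System.out.println(boxedInts().get(0).getClass().getName()); // java.lang.Integer
        System.out.println(strings().get(0).getClass().getName());   // java.lang.String
    }
}
```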

thecoop
0

Primitive?

In Java there's no primitive for strings. The primitives are int, float, double, boolean, etc... and char.

So strings are represented by an object. You instantiate it, it lives on the heap, you hold a reference to it, etc.

How did they implement it? By storing the value it represents in a char array.

Immutability

But they ensured immutability. When you have a reference to a String object, you can pass it freely to other objects knowing that the value pointed to by that reference will not change. All methods that "modify" a string return a new String instance, so they don't change the value seen through other references to the original.
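For example, in the sketch below (class name `ImmutabilityDemo` is illustrative), `toUpperCase` and `concat` each hand back a new object and leave the original alone:

```java
class ImmutabilityDemo {
    static String[] run() {
        String original = "java";
        String alias = original;               // second reference to the same object
        String upper = original.toUpperCase(); // returns a new String
        String joined = original.concat("!");  // also returns a new String
        return new String[] { original, alias, upper, joined };
    }

    public static void main(String[] args) {
        for (String s : run()) {
            System.out.println(s);
        }
        // Prints:
        // java
        // java
        // JAVA
        // java!
    }
}
```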

Could it be another way (like in .NET)?

Yes. They could have defined a reserved word string and had the compiler do the transformation.

But they didn't...

helios
-1

A String is an array of char. As it is an array, it cannot be a primitive! :-)

Pierre
  • No it's not - it's an array, plus an offset, plus a length. I agree it's not just a primitive, but there's more to it than a char[]. – Jon Skeet Jan 21 '10 at 11:42
  • @Jon: And a monitor and a ClassInfo. Can't forget the ol' synchronized() statement :-). – Robert Fraser Jan 21 '10 at 11:46
  • As Mr. Anderson of SO pointed out char[] stands out from String in many ways. – Ravi Gupta Jan 21 '10 at 12:19
  • @Robert: not sure what you meant by ClassInfo, but a char[] also has a Monitor (can be used for a synchronized block) and a Class, and its superclass is Object: `char[].class.getSuperclass() == Object.class` – user85421 Jan 21 '10 at 13:23