89

I've often seen messages that use [L then a type to denote an array, for instance:

[Ljava.lang.Object; cannot be cast to [Ljava.lang.String;

(The above being an arbitrary example I just pulled out.) I know this signifies an array, but where does the syntax come from? Why the beginning [ but no closing square bracket? And why the L? Is it purely arbitrary or is there some other historical/technical reason behind it?

Michael Berry

6 Answers

88

[ stands for array, and Lsome.type.Here; represents the element type of the array. This mirrors the type descriptors used internally in the bytecode, described in §4.3 of the Java Virtual Machine Specification. The only difference is that the real descriptors use / rather than . to separate package names.

For instance, for primitives the value is [I for an array of ints, and a two-dimensional array would be [[I (strictly speaking, Java doesn't have real two-dimensional arrays, but you can make arrays that consist of arrays).

Since classes may have any name, it would otherwise be hard to tell where a class name starts and ends, so class names are delimited: an L, followed by the class name, finished with a ;
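As a quick check of the naming described above, here is a minimal sketch that prints what Class.getName() reports for a few array types. The class name ArrayNameDemo and its helper methods are made up for this illustration; the replace('.', '/') step simply mimics the class-file form mentioned above.

```java
class ArrayNameDemo {
    // Name as reported by Class.getName(), e.g. "[Ljava.lang.String;".
    static String binaryName(Class<?> type) {
        return type.getName();
    }

    // The same name with '/' instead of '.', as descriptors in class files use.
    static String internalName(Class<?> type) {
        return type.getName().replace('.', '/');
    }

    public static void main(String[] args) {
        System.out.println(binaryName(int[].class));      // [I
        System.out.println(binaryName(int[][].class));    // [[I
        System.out.println(binaryName(String[].class));   // [Ljava.lang.String;
        System.out.println(internalName(String[].class)); // [Ljava/lang/String;
    }
}
```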

Descriptors are also used to represent the types of fields and methods.

For instance:

(IDLjava/lang/Thread;)Ljava/lang/Object;

... corresponds to a method whose parameters are int, double, and Thread, and whose return type is Object.
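If you want to reproduce that descriptor without reading a class file, one option (Java 7 and later) is java.lang.invoke.MethodType, which can render a method type as a descriptor string. This is only a sketch; the class name MethodDescriptorDemo is invented for the example.

```java
import java.lang.invoke.MethodType;

class MethodDescriptorDemo {
    // Builds the method type "Object f(int, double, Thread)" and renders its descriptor.
    static String descriptor() {
        MethodType type = MethodType.methodType(
                Object.class,                            // return type
                int.class, double.class, Thread.class);  // parameter types
        return type.toMethodDescriptorString();
    }

    public static void main(String[] args) {
        System.out.println(descriptor()); // (IDLjava/lang/Thread;)Ljava/lang/Object;
    }
}
```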

edit

You can also see this in .class files using the Java disassembler (javap):

C:>more > S.java
class S {
  Object  hello(int i, double d, long j, Thread t ) {
   return new Object();
  }
}
^C
C:>javac S.java

C:>javap -verbose S
class S extends java.lang.Object
  SourceFile: "S.java"
  minor version: 0
  major version: 50
  Constant pool:
const #1 = Method       #2.#12; //  java/lang/Object."<init>":()V
const #2 = class        #13;    //  java/lang/Object
const #3 = class        #14;    //  S
const #4 = Asciz        <init>;
const #5 = Asciz        ()V;
const #6 = Asciz        Code;
const #7 = Asciz        LineNumberTable;
const #8 = Asciz        hello;
const #9 = Asciz        (IDJLjava/lang/Thread;)Ljava/lang/Object;;
const #10 = Asciz       SourceFile;
const #11 = Asciz       S.java;
const #12 = NameAndType #4:#5;//  "<init>":()V
const #13 = Asciz       java/lang/Object;
const #14 = Asciz       S;

{
S();
  Code:
   Stack=1, Locals=1, Args_size=1
   0:   aload_0
   1:   invokespecial   #1; //Method java/lang/Object."<init>":()V
   4:   return
  LineNumberTable:
   line 1: 0


java.lang.Object hello(int, double, long, java.lang.Thread);
  Code:
   Stack=2, Locals=7, Args_size=5
   0:   new     #2; //class java/lang/Object
   3:   dup
   4:   invokespecial   #1; //Method java/lang/Object."<init>":()V
   7:   areturn
  LineNumberTable:
   line 3: 0


}

And in the raw class file (look at line 5):

[image: raw class file contents]

Reference: Field descriptors in the JVM specification

OscarRyz
  • 5
    Java doesn't have real two-dimensional arrays, but you can make arrays that consist of arrays; `[[I` just means array-of-array-of-int. – Jesper Feb 23 '11 at 03:04
60

JVM array descriptors.

[Z = boolean
[B = byte
[S = short
[I = int
[J = long
[F = float
[D = double
[C = char
[L<class name>; = any non-primitive (object) type, e.g. [Ljava.lang.Object;
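The table can be checked against a running JVM. The sketch below (the class name PrimitiveArrayCodes is hypothetical) maps each element type name to the name Class.getName() reports for the corresponding one-dimensional array.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class PrimitiveArrayCodes {
    // Maps element type name -> array class name, in the order of the table above.
    static Map<String, String> codes() {
        Class<?>[] arrayTypes = {
            boolean[].class, byte[].class, short[].class, int[].class,
            long[].class, float[].class, double[].class, char[].class,
            Object[].class
        };
        Map<String, String> result = new LinkedHashMap<>();
        for (Class<?> arrayType : arrayTypes) {
            result.put(arrayType.getComponentType().getName(), arrayType.getName());
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints one "element type -> array name" pair per line.
        codes().forEach((element, array) -> System.out.println(element + " -> " + array));
    }
}
```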

To get the element type of an array, call:

obj.getClass().getComponentType();

It returns null if obj is not an array. To determine whether it is an array, just call:

obj.getClass().isArray()

or, if you already have the Class object (for example String[].class):

String[].class.isArray();
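Put together, the two calls might be used like this. It is a small sketch rather than anything canonical; ComponentTypeDemo and describe are names chosen for the example.

```java
class ComponentTypeDemo {
    // Describes a value using isArray() and getComponentType().
    static String describe(Object value) {
        Class<?> type = value.getClass();
        if (!type.isArray()) {
            // For non-arrays, getComponentType() would return null.
            return type.getName() + " is not an array";
        }
        return type.getName() + " is an array of " + type.getComponentType().getName();
    }

    public static void main(String[] args) {
        System.out.println(describe(new String[0])); // [Ljava.lang.String; is an array of java.lang.String
        System.out.println(describe(new int[2][3])); // [[I is an array of [I
        System.out.println(describe("text"));        // java.lang.String is not an array
    }
}
```

Note that the component type of int[][] is int[], not int, which matches the "array of arrays" reading of [[I.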
ronalchn
12

This is used in the JNI (and the JVM internally in general) to indicate a type. Primitives are denoted with a single letter (Z for boolean, I for int, etc), [ indicates an array, and L is used for a class (terminated by a ;).

See here: JNI Types

EDIT: To elaborate on why there is no terminating ]: this encoding lets the JNI/JVM quickly identify a method and its signature. It's intended to be as compact as possible (as few characters as possible) to make parsing fast, so [ is used for an array, which is pretty straightforward (what better symbol to use?). I for int is equally obvious.

EboMike
  • 3
    You're answering a different question. In fact, OP has explicitly stated he's not asking "what does it mean". – Nikita Rybak Feb 23 '11 at 00:57
  • @Nikita If you read through that doc, you'll find that the "L" means "Fully qualified class", and "[L" indicates a very specific type of array (an array of FQCs), not just any array. – Travis Webb Feb 23 '11 at 00:57
  • @Nikita: The question is "where does it come from"? Well, it comes from the JNI. – EboMike Feb 23 '11 at 00:58
  • 3
    @EboMike The question is 'why'. And that's a very interesting question, I'd like to know the answer too. While the question "in which chapter of JVM spec it's specified" is not. – Nikita Rybak Feb 23 '11 at 01:00
  • 2
    I think the question is where "L" and "Z" and these other arbitrary-sounding abbreviations come from. – ide Feb 23 '11 at 01:02
  • @EboMike Yeah. That addresses the question (part of it), so I remove downvote. – Nikita Rybak Feb 23 '11 at 01:03
  • 5
    These are not JNI specific but JVM internal representation. – OscarRyz Feb 23 '11 at 01:06
  • 1
    @OscarRyz is right, this was part of the JVM specification *before JNI even existed*. JNI is reusing the representation in the JVM spec, not the other way around. – Stephen C Feb 25 '11 at 23:43
9

[L array notation - where does it come from?

From the JVM spec. This is the representation of type names that is specified in the class file format and other places.

  • The '[' denotes an array. In fact, the array type name is [<typename> where <typename> is the name of the base type of the array.
  • 'L' is actually part of the base type name; e.g. String is "Ljava.lang.String;". Note the trailing ';'!!

And yes, the notation is documented in other places as well.

Why?

There is no doubt that this internal type name representation was chosen because it is:

  • compact,
  • self-delimiting (this is important for representations of method signatures, and it's why the 'L' and the trailing ';' are there), and
  • made up of printable characters (for legibility ... if not readability).
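To illustrate the self-delimiting property, here is a rough sketch of how the parameter list of a method descriptor can be split with no separators at all. DescriptorSplitter is a made-up helper, not part of any JDK API, and it assumes the descriptor is well-formed.

```java
import java.util.ArrayList;
import java.util.List;

class DescriptorSplitter {
    // Splits the parameter part of a method descriptor into single type descriptors.
    // Assumes well-formed input such as "(IDJLjava/lang/Thread;)Ljava/lang/Object;".
    static List<String> parameterTypes(String methodDescriptor) {
        List<String> result = new ArrayList<>();
        int end = methodDescriptor.indexOf(')');
        int i = 1; // skip the opening '('
        while (i < end) {
            int start = i;
            while (methodDescriptor.charAt(i) == '[') {
                i++; // each '[' adds one array dimension
            }
            if (methodDescriptor.charAt(i) == 'L') {
                i = methodDescriptor.indexOf(';', i); // a class name runs up to the ';'
            }
            i++; // step past the primitive letter or the ';'
            result.add(methodDescriptor.substring(start, i));
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints [I, D, J, Ljava/lang/Thread;]
        System.out.println(parameterTypes("(IDJLjava/lang/Thread;)Ljava/lang/Object;"));
    }
}
```

Every primitive is exactly one letter and every class name is bracketed by 'L' and ';', so the parser never needs to look ahead to find where a type ends.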

But it is unclear why they decided to expose the internal type names of array types via the Class.getName() method. I think they could have mapped the internal names to something more "human friendly". My best guess is that it was just one of those things that they didn't get around to fixing until it was too late. (Nobody is perfect ... not even the hypothetical "intelligent designer".)

Stephen C
7

I think it's because C was taken by char, so the next letter in "class", L, was used instead.

Enerccio
3

Another source for this would be the documentation of Class.getName(). Of course, all these specifications are congruent, since they are made to fit each other.

Paŭlo Ebermann