
In Java's instanceof pattern matching, an additional identifier is used to narrow the type:

Object o = "test";
if (o instanceof String s) {
    System.out.println(s.length()); // s is of type String
}

Why is that necessary? The JEP doesn't seem to justify it, but I'm sure there's a reason. What's undesirable about the following?

Object o = "test";
if (o instanceof String) {
    System.out.println(o.length());
}

TypeScript and Kotlin do similar things without another identifier.

I'm guessing it's a backwards-compatibility thing - Java's quirks usually are - but I can't think of an example that would demonstrate that.

The only thing I can think of is that someone may have written something similar to that second example before pattern matching existed and relies on it not compiling, but that seems like a weak justification.

Lino
Michael
  • I can only assume that it's a restriction by java itself, that a variable can only have a single type. I.e. in the example `o` is always of type `Object`. `if (obj instanceof String s) { ...` is just a shorthand for `if (obj instanceof String) { String s = (String) obj; ...`, but I assume you also know that. – Lino May 31 '22 at 11:23
  • It's just a limitation of java. Kotlin does allow that because it seems to "reassign" the type of `o` to `String` after the `instanceof` check. – f1sh May 31 '22 at 11:24
  • @f1sh I inferred that it's a limitation, because they wouldn't make this feature more verbose than it needs to be. The question is basically "what specifically is the limitation?" – Michael May 31 '22 at 11:25
  • FYI: the Kotlin code `if (o is String) println(o.length)` is essentially just converted to `if (o instanceof String) System.out.println(((String) o).length())`. Meaning Kotlin is also working with the restrictions the JVM imposes. I'm assuming that the type of a variable can **never** change, but I did not have a look at the JVM specs. – Lino May 31 '22 at 11:33
  • Related: https://stackoverflow.com/questions/4186320/why-cast-after-an-instanceof (which asks why Java didn't _already_ do this, before pattern matching). – Joe May 31 '22 at 11:40
  • @Joe Perfect. Retrospectively changing the overload method resolution is exactly what I was looking for. Thanks! – Michael May 31 '22 at 11:42
  • I can't think of a way where the *match binding* (`s` in the given example) is actually necessary other than [in relation to *method overload resolution*](https://stackoverflow.com/a/31902768/507738) (the link posted by Joe). For instance, given the code `if (o instanceof Integer i || o instanceof String s)`, neither one of `i` and `s` can be used. – MC Emperor May 31 '22 at 12:09
  • @MCEmperor but what about `if(o instanceof Appendable a && o instanceof CharSequence cs) …` – Holger May 31 '22 at 14:40
  • @Holger That *could* be an option. But then `o` could be an *intersection type* of `Appendable` and `CharSequence`, just like the type of `subject` in the following example: `<T extends Appendable & CharSequence> void someMethod(T subject) { … }`. – MC Emperor May 31 '22 at 15:04
  • @MCEmperor then, let’s raise the bar and use `if(o instanceof String s && o instanceof Number n) { }` which is valid (Java does not disallow every impossible condition). But you can’t have an intersection type of two concrete classes. – Holger May 31 '22 at 15:07
  • Changing the type of the original variable w/in the block also precludes the body of the block from using members of the _original_ type that _aren't_ on the instanceof'd type. I don't know how often it occurs, but I assure you it's not zero. This doesn't happen when the original type is `Object` but may if it's some other abstract type or interface and the code wants to see if it is _also_ another interface type and then use both. I suppose an extra local var outside of the block w/ the original type could solve that, but then we're back to declaring a variable somewhere. – William Price Sep 22 '22 at 17:16

2 Answers


If Java introduced something like a “smart instanceof”, we could argue that this feature could have worked without introducing a new variable.

But that’s not what has been introduced. The new feature is Pattern Matching, a far bigger concept, though only implemented in its simplest form in the first step. This also reflects a new integration approach: instead of working for decades on one big feature, smaller features are continuously added to Java while maintaining the big vision.

JDK-8260244 describes one of the next steps, which would allow something like

(given)

record Point(int x, int y) {}
void printSum(Object o) {
    if (o instanceof Point(int x, int y)) {
        System.out.println(x+y);
    }
}

or even

(given)

record Point(int x, int y) {}
enum Color { RED, GREEN, BLUE }
record ColoredPoint(Point p, Color c) {}
record Rectangle(ColoredPoint upperLeft, ColoredPoint lowerRight) {}
static void printXCoordOfUpperLeftPointWithPatterns(Rectangle r) {
    if (r instanceof Rectangle(ColoredPoint(Point(var x, var y), var c), var lr)) {
        System.out.println("Upper-left corner: " + x);
    }
}

Since Pattern Matching includes the creation of (sometimes multiple) new variables, the case of creating only one new variable containing the same reference as an already existing variable is just a corner case. Even what has been implemented so far covers more cases than that. E.g.

if(foo.getObject() instanceof List<?> l && l.get(0) instanceof Map<?,?> m
   && m.get("foo") instanceof String s) {
       // use s
}

It’s also worth noting that the scope of a pattern variable is complex. It would be even more confusing if the variable also existed outside this scope, but with a different type.
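To illustrate what "complex" means for pattern variable scope, here is a minimal sketch, assuming Java 16 or later. The class and method names (`ScopeDemo`, `lengthOrMinusOne`, `isNonEmptyString`) are made up for this example. It shows that a binding can be in scope after the `if` statement rather than inside it, and that it is available on the right-hand side of `&&`.

```java
final class ScopeDemo {
    static int lengthOrMinusOne(Object o) {
        if (!(o instanceof String s)) {
            // s is NOT in scope here: this branch runs only when the match failed.
            return -1;
        }
        // s IS in scope here, because the branch above cannot complete normally.
        return s.length();
    }

    static boolean isNonEmptyString(Object o) {
        // s is in scope to the right of &&, since that side is only
        // evaluated when the instanceof test succeeded.
        return o instanceof String s && !s.isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(lengthOrMinusOne("test"));
        System.out.println(lengthOrMinusOne(42));
        System.out.println(isNonEmptyString(""));
    }
}
```

If `o` itself changed type along the same flow-dependent rules, the same identifier would have a different static type depending on where in the method it is read.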

So, changing the type of a variable contextually in the presence of instanceof, while tempting, would cause problems (some comments also mentioned method overload selection) and, at the same time, wouldn’t fit into the actual vision of the Java language developers.
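The overload selection problem mentioned in the comments can be sketched as follows, assuming Java 16 or later. The names `OverloadDemo`, `describe`, `viaOriginalVariable` and `viaPatternVariable` are invented for this illustration. Overloads are chosen at compile time from the static type of the argument, so silently narrowing `o` would change which method existing code calls.

```java
final class OverloadDemo {
    static String describe(Object o) { return "Object overload"; }
    static String describe(String s) { return "String overload"; }

    static String viaOriginalVariable(Object o) {
        if (o instanceof String) {
            // o is still statically Object, so describe(Object) is selected.
            // If instanceof narrowed o to String, this pre-existing call
            // would start resolving to describe(String) instead.
            return describe(o);
        }
        return "not a String";
    }

    static String viaPatternVariable(Object o) {
        if (o instanceof String s) {
            // s is statically String, so describe(String) is selected.
            return describe(s);
        }
        return "not a String";
    }

    public static void main(String[] args) {
        System.out.println(viaOriginalVariable("test")); // Object overload
        System.out.println(viaPatternVariable("test"));  // String overload
    }
}
```

With a separate binding, code written before pattern matching keeps its meaning, and the narrowed type is used only where the new identifier is named explicitly.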

Holger

What's "unreasonable" about the code:

Object o = "test";
if (o instanceof String) {
    System.out.println(o.length());
}

is that it does not compile, because the class Object has no length() method.

The proper comparison code is either:

Object o = "test";
if (o instanceof String) {
    System.out.println(((String)o).length());
}

or

Object o = "test";
if (o instanceof String) {
    String s = (String)o;
    System.out.println(s.length());
}

With the new syntax,

if (o instanceof String s) {

is syntactic sugar for

if (o instanceof String) {
    String s = (String)o;
Bohemian
  • This doesn't answer my question. Why can't the compiler narrow the type of `o` to `String` in the scope of that if-statement without going via second identifier called `s`? The point of pattern matching is to make code more expressive. Like I said in the question, other languages manage this. – Michael May 31 '22 at 11:23
  • @Michael because Java is statically typed, and `o` is of type `Object`. – daniu May 31 '22 at 11:24
  • @Michael simply said: Because java does not have that feature. – f1sh May 31 '22 at 11:25
  • @Mich why should it? `o` is declared as `Object`. It's that simple. Remember, this is a *compile* issue, not a `runtime` issue. The compiler is strict and deals **only** with the *declared* type. – Bohemian May 31 '22 at 11:25
  • @daniu Kotlin is statically typed. – Michael May 31 '22 at 11:27
  • @Bohemian Because it would be more expressive. I'm sure at the design phase of this feature, they didn't start with a 2nd identifier and then look for reasons to *not* have it. They would have started without one and looked for reasons *to* have it. I'm interested what those reasons were. – Michael May 31 '22 at 11:28
  • @Mich because `if (obj instanceof String s) {` is [syntactic sugar](https://en.wikipedia.org/wiki/Syntactic_sugar) for `if (obj instanceof String) { String s = (String)obj;`, just like other examples of syntactic sugar found in the language. – Bohemian May 31 '22 at 11:31
  • @Bohemian That doesn't explain why the identifier is necessary either. The compiler could automatically generate `s` under the hood. – Michael May 31 '22 at 11:33
  • It could, but what if 's' was already taken as a variable name? Then you would have 2 local variables with the same name in that context. – dunni May 31 '22 at 11:44
  • @dunni It's completely under the hood... It doesn't need to be anything specific. The compiler can choose a random, unused identifier. Anyway, thank you for trying, but I've now found that the answer is backwards compatibility for overload method resolution, rather than any of the reasons given in this thread. – Michael May 31 '22 at 11:50