Java self referencing generic type

Question

While the core of this question has been asked many times, there is one aspect that hasn't been asked yet (or I haven't found it).

In Java there is no way to declare a generic that refers to the enclosing type itself. You might say "If you end up trying this, there is a flaw in your design", but I disagree. Why? Because Java's designers would have needed it in their own design.

Code of Object.getClass()

    /**
     * ...
     *
     * <p><b>The actual result type is {@code Class<? extends |X|>}
     * where {@code |X|} is the erasure of the static type of the
     * expression on which {@code getClass} is called.</b> For
     * example, no cast is required in this code fragment:</p>
     *
     * <p>
     * {@code Number n = 0;                             }<br>
     * {@code Class<? extends Number> c = n.getClass(); }
     * </p>
     * . . .
     */
    public final native Class<?> getClass();

So to return Class<? extends SELF>, a SELF type is clearly needed. That makes sense: if you write instanceOfCar.getClass(), you expect to get a Class<? extends Car>.

(Edit: The questions assume that unsafe casting is not an option.)

How did the developers make it so that you don't need to cast to Class<? extends XYZ>? (E.g., how can Class<? extends Color> c = BLACK.getClass(); even compile?)
Why didn't they introduce a generics feature that allows self-referencing, when they clearly need it themselves?
How can such a method be simulated or created?

class SuperClass{
    private SELF getThis(){
        return this;
    }
}

Even if it doesn't seem like it, I am aware of the consequences of such a thing (for example, using SELF as a parameter would only work with ? extends SELF).

Voted to reopen: "Why does java not have reified generics" is clearly not the same as this question. — rzwitserloot, Sep 11 '21 at 20:32
You can kind-of cheat by having a type argument that extends your class and ensure it always is the actual type (by having a private constructor and a factory method, for example). I.e. `MyClass<T extends MyClass<T>>`. This kind of thing is sometimes used in builders that have hierarchies (i.e. `BaseBuilder`, `SpecialBuilder1`, `SpecialBuilder2` if you want the methods of `BaseBuilder` to still return a `SpecialBuilder1` if called on that object and not override each one). — Joachim Sauer, Sep 11 '21 at 20:35
@JoachimSauer This sadly doesn't guarantee self-referencing, see: https://onecompiler.com/java/3xb9gdygh — Niton, Sep 12 '21 at 11:23
@Niton: yes, I know. That's why I said "kind-of cheat" and not presented it as an absolute solution. — Joachim Sauer, Sep 12 '21 at 15:03

score 3 · Answer 1 · answered Sep 11 '21 at 23:16

Why don't they introduce a feature with generics that allows self referencing?

Because that's harder than it seems at first glance. For instance, suppose you specify that "The name This refers to the type of this", and somebody writes the following:

abstract class Number {
    public abstract int compareTo(This other); // "This" is the hypothetical self type
}

class Integer extends Number {
    final int value;

    public Integer(int value) {
        this.value = value;
    }

    @Override
    public int compareTo(This other) {
        return value - other.value;
    }
}


public static void main(String[] args) {
    Number n1 = new Integer(42);
    Number n2 = new Double(Math.PI);
    n1.compareTo(n2);
}

Should this compile? Probably not, because the compareTo implementation provided by class Integer only works with Integers, not some other subtype of Number.

The problem is that our specification is ambiguous. When we said that "This is the type of this", did we mean the declared type of this (i.e. the class in whose source code the this appears) or the subclass used to create the this object at runtime?

If we choose the declared type, This would mean different things in a subclass than in its superclass. That would be very confusing. For instance:

class Super {
    This delegate;
}

class Sub extends Super {
    void foo() {
        delegate.foo(); // error: delegate is of type "This", which does not have a method "foo"
    }
}

If we choose the runtime type, the This type is not known to the caller:

Number x = new Integer(42);
Number y = new Integer(43);
x.compareTo(y); // error: the method compareTo takes an argument of unknown type, but was provided a Number

meaning we can not invoke any method that takes a This. We can't even do something as simple as:

class Super {
    This data;
}

void temporarilyRemoveDataFrom(Super s) {
    Super d = s.data;
    s.data = null;
    process(s);
    s.data = d; // error: type Super is not assignable to an unknown subtype of Super
}

As you can see, introducing support for self-referential types raises all the issues of types that refer to arbitrary types. In particular, we need both type variance and a way to capture the value of unknown types.
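For comparison, Java's existing generics already address the "unknown type" half of this problem through wildcard capture. The following is only a minimal sketch; `CaptureDemo`, `swapFirstTwo`, and `swapHelper` are hypothetical names chosen for illustration. A private generic helper gives the unknown element type a name, so a value can be read and written back, which is exactly what `temporarilyRemoveDataFrom` could not do.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class CaptureDemo {
    // The caller only knows "a list of some unknown element type".
    static void swapFirstTwo(List<?> list) {
        swapHelper(list); // wildcard capture: the unknown type is bound to T
    }

    // Inside the helper the element type has a name, so we can store
    // an element in a local variable and write it back safely.
    private static <T> void swapHelper(List<T> list) {
        T first = list.get(0);
        list.set(0, list.get(1));
        list.set(1, first);
    }

    static List<String> demo() {
        List<String> words = new ArrayList<>(Arrays.asList("a", "b", "c"));
        swapFirstTwo(words);
        return words; // [b, a, c]
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```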

Self referential types are therefore not significantly simpler than generics. In contrast, if we have generics, building a self referential type is trivial:

abstract class SelfAware<T extends SelfAware<T>> {
  abstract T getThis();
}

class Sub extends SelfAware<Sub> {
  Sub getThis() {
    return this;
  }
}

SelfAware<Sub> x = new Sub();
x = x.getThis(); // compiles just fine
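A common place where this pattern shows up is the builder hierarchy mentioned in the comments under the question. The sketch below assumes hypothetical classes `BaseBuilder` and `SpecialBuilder1` (the names are borrowed from that comment, the members are invented for illustration). Because each subclass implements `self()` itself, no unchecked cast is needed.

```java
// Hypothetical builder hierarchy; all members are illustrative.
abstract class BaseBuilder<B extends BaseBuilder<B>> {
    protected final StringBuilder parts = new StringBuilder();

    // Each concrete subclass returns itself here.
    protected abstract B self();

    B name(String name) {
        parts.append("name=").append(name).append(';');
        return self();
    }

    String build() {
        return parts.toString();
    }
}

final class SpecialBuilder1 extends BaseBuilder<SpecialBuilder1> {
    @Override
    protected SpecialBuilder1 self() {
        return this;
    }

    SpecialBuilder1 color(String color) {
        parts.append("color=").append(color).append(';');
        return this;
    }
}

class BuilderDemo {
    static String demo() {
        // name() is declared in BaseBuilder but returns SpecialBuilder1,
        // so the subclass-only method color() can be chained after it.
        return new SpecialBuilder1().name("car").color("black").build();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // name=car;color=black;
    }
}
```

Note that nothing stops another class from extending `BaseBuilder<SpecialBuilder1>`, so this communicates intent rather than enforcing it.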

In addition, a case can be made that self-referential types are often overly constrained. Requiring programmers to define type variables and their bounds explicitly nudges them to think about which bounds are appropriate, avoiding accidental over-constraining. For instance, java.sql.Timestamp does not implement Comparable<Timestamp>, but inherits the more general Comparable<java.util.Date> from its superclass.

To conclude, subclassable self-referential types are not significantly easier to use than normal generics, do not make the language more expressive, tempt programmers to over-constrain type arguments, and increase the complexity of the language and its tooling for no clear benefit.

With all that said, let's return to the curious case of getClass():

How did the developers make it so that you don't need to cast the return value of getClass()?

By introducing special treatment for this method in the Java Language Specification, which writes:

The type of a method invocation expression of getClass is Class<? extends |T|>, where T is the class or interface that was searched for getClass (§15.12.1) and |T| denotes the erasure of T (§4.6).

It is worth noting that this method would have required special treatment even if self referential types were supported, because its interaction with the runtime type system exposes the caller to type erasure.
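As a small illustration of the rule quoted above (a sketch only; `GetClassDemo` and its method names are hypothetical), the first method compiles without a cast because the static type of `n` is `Number`. The second shows the erasure part: the static type `List<String>` erases to `List`, so the most specific type the compiler assigns is the raw `Class<? extends List>`.

```java
import java.util.ArrayList;
import java.util.List;

class GetClassDemo {
    // Static type of n is Number, so n.getClass() has type
    // Class<? extends Number>; no cast is required.
    static Class<? extends Number> numberClass(Number n) {
        return n.getClass();
    }

    // The static type List<String> erases to List, so the call has
    // type Class<? extends List>; the type argument String is gone.
    @SuppressWarnings("rawtypes")
    static Class<? extends List> listClass(List<String> list) {
        return list.getClass();
    }

    public static void main(String[] args) {
        System.out.println(numberClass(Integer.valueOf(42)).getSimpleName());   // Integer
        System.out.println(listClass(new ArrayList<String>()).getSimpleName()); // ArrayList
    }
}
```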

I don't get why this is downvoted. There is just one minor thing (https://onecompiler.com/java/3xb9gdygh) which makes using your `selfAware` hack useless — Niton, Sep 12 '21 at 11:32
The bound in `<T extends SelfAware<T>>` is not necessary here. Everything here compiles with just `class SelfAware<T>`. — newacct, Sep 12 '21 at 15:35
@newacct: It guards against a subclass (or direct caller) providing a type argument that is not self referential. Since OP is looking for a way to express a self referential type, I think it is appropriate to constrain the type to be self-referential. — meriton, Sep 12 '21 at 16:21
@meriton: It does not guard against what you probably think it guards against. What it actually guards against is rarely useful, and you have not shown an example where that guard is actually used. — newacct, Sep 23 '21 at 06:35
It does constrain the type `T` to be self-referential. It does not constrain that `this` is of that type, though, only that `getThis()` is. Nevertheless, I think it communicates the intent very well. Have you never written additional code to better convey your intent? — meriton, Sep 23 '21 at 12:49
In addition, it's quite possible that the constraint will be needed later, and since adding a constraint could break subclasses, one might want to add the constraint up front. — meriton, Sep 23 '21 at 13:11

score 0 · Answer 2 · answered Sep 11 '21 at 20:38

Why not?

You'd have to ask the designers of Java 1.5, and they don't hang out on Stack Overflow. As far as I know, they were never asked, or at least never answered, this question.

But let's take an educated guess

Many reasons. Mostly, because generics are already complicated enough. Pragmatically, because you rarely need it, so whilst, yes, it would have been nice if foo.getClass() ended up being an expression of type Class<? extends WhateverFoosTypeIs>, it's not crucial. If you run into this a lot, then that is what you're doing wrong. You should almost never be using Class (if you're over-using it, you probably need to write factories instead), especially if you want the type of .getClass() to be Class<? extends ThatThing>.

Java does not introduce keywords. At least, not lightly. They did once. Admittedly this was epic levels of idiocy: Introduce the single worst keyword imaginable: java 1.4 introduced assert, which meant that the most popular method in the entire ecosystem outside of the core libs (junit, which was literally the most popular library, and assert most likely its most used method) no longer compiled and couldn't be used.

So, whilst I think the correct response would be 'introducing keywords is suboptimal, but if you must, all right, just don't pick a common method name as the keyword', what the OpenJDK team actually learned from it is 'do not. under any circumstance. ever ever ever. introduce. a keyword. ever. seriously.' Even var got introduced as a context-sensitive keyword (you can search the web for it; essentially, if you were using var as a variable, field, or method name, your code still works in Java 10+, because the compiler tries to figure out whether you intended 'var' to be the Java 10 feature or just an identifier).

So, given that new keywords are right out, what do you propose? SELF doesn't work - that's a valid identifier already. #? Java isn't perl and doesn't aspire to turn itself into cartoon swearing. This isn't, by miles, important enough to warrant expending a precious symbol on. That leaves 'this', so you could do something like public class Foo<S extends this> {} which would then mean Foo doesn't really have a typevar, but S can be used inside to mean 'my own type', and this at any point is of type S, and S is bound to be extends Foo, which could be a worthwhile addition to java.

But it would be a bit complicated and hard to follow (would you, apropos of nothing, realize what class Foo<F extends this> means? I bet most Java programmers would have to look it up, and as this is a somewhat exotic need, that's not a good idea). You weigh the pros (which boil down to how often you need a self type) against the cons (the burden on e.g. the learning curve and all the tooling stacks, as well as 'burning' a feature and permanently lengthening the already sizable Java spec). Instead of assuming the Java language designers were a bunch of morons who didn't consider this, let's give them the benefit of the doubt and assume they did realize this is useful, made a list of pros and cons, and decided this wasn't worth doing.

I really want this

You can hack it. All sorts of classes do it, notably including enums:

class Foo<F extends Foo<F>> {

    public F returnSelf() {
        return (F) this; //MARK
    }
}

class Bar extends Foo<Bar> {
}

Bar b = new Bar();
Bar c = b.returnSelf(); // works

The above code works and gets you a 'self-referencing' generic type. It does mean you introduce an actual typevar in the type which you may not want (this would not have been a workable solution for the .getClass() method, for example - changing j.l.Object into class Object<S extends Object<S>> would have been a sizable burden on everybody, not worth it at all). But it does work, and this 'pattern' is somewhat common.

Note that at the line marked with //MARK, you get an unchecked-cast compiler warning, which you can suppress with @SuppressWarnings("unchecked"). The warning guards against a scenario that will not happen as long as every subclass passes itself as the type argument.
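The bound `F extends Foo<F>` does not enforce that convention, though. The sketch below adds a hypothetical class `Baz` (together with a small `UnsafeSelfDemo` driver, also hypothetical) that satisfies the bound while passing a different class as the type argument. The unchecked cast inside `returnSelf()` succeeds because of erasure, and the failure surfaces only at the call site.

```java
class Foo<F extends Foo<F>> {
    @SuppressWarnings("unchecked")
    public F returnSelf() {
        // Unchecked: F erases to Foo, so this cast never fails here.
        return (F) this;
    }
}

class Bar extends Foo<Bar> {
}

// Compiles, because Bar is a valid argument for F, yet a Baz is not a Bar.
class Baz extends Foo<Bar> {
}

class UnsafeSelfDemo {
    static boolean failsAtCallSite() {
        try {
            // Static type of returnSelf() is Bar, so the compiler inserts
            // a checked cast to Bar here; the object is really a Baz.
            Bar b = new Baz().returnSelf();
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(failsAtCallSite()); // true
    }
}
```

If subclasses are not fully under your control, an abstract method that each subclass implements with `return this;` avoids the cast altogether.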

"Note that at the line marked with //MARK, you get a compiler warning which you must suppress with @SuppressWarnings, as the warning is warning against a scenario that's not going to happen." The scenario can very much happen. Have a `class Baz extends Foo<Bar>`. This declaration satisfies the constraints and compiles. Calling `returnSelf()` on a `Baz` instance will cast an instance of `Baz` to `Bar` which is unsafe. "All sorts of classes do it" There are a few legitimate safe uses for having such a bound, but `return (F) this;` is never a safe use for it. — newacct, Sep 12 '21 at 02:54
Why is this answer downvoted? You tried your best to explain and offered valid points. — Niton, Sep 12 '21 at 11:26
"Instead of assuming the java lang designers were a bunch of morons", I did not :) I like Java a lot, as you can judge by my GitHub. It was more curiosity about how they managed to use a feature ("special treatment") that is not available in the language itself. I still do not understand how they managed to implement getClass(). The only thing that I could think of is that they manipulated the bytecode directly or that at runtime each class implements this method (just in bytecode / classloader) — Niton, Dec 17 '21 at 23:08
