A more fundamental reason Java does not include operator overloading (at least for assignment)?

Question

Referring to a ~2-year-old discussion of the fact that there is no operator overloading in Java ( Why doesn't Java offer operator overloading? ), and coming to Java from many intense C++ years myself, I wonder whether there is a more fundamental reason that operator overloading is not part of the Java language, at least in the case of assignment. The highest-rated answer in that link states, near the bottom of the answer, that it was James Gosling's personal choice.

Specifically, consider assignment.

// C++
#include <iostream>

class MyClass
{
public:
    int x;
    MyClass(const int _x) : x(_x) {}
    MyClass & operator=(const MyClass & rhs) {x=rhs.x; return *this;}
};

int main()
{
    MyClass myObj1(1), myObj2(2);
    MyClass & myRef = myObj1;
    myRef = myObj2;

    std::cout << "myObj1.x = " << myObj1.x << std::endl;
    std::cout << "myObj2.x = " << myObj2.x << std::endl;

    return 0;
}

The output is:

myObj1.x = 2
myObj2.x = 2

In Java, however, the line myRef = myObj2 behaves very differently (assuming myRef was declared on the previous line as MyClass myRef = myObj1, as Java requires, since all such variables are automatically Java-style 'references'). It would not cause myObj1.x to change, and the output would be

myObj1.x = 1
myObj2.x = 2
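
For concreteness, the Java behaviour described above can be reproduced with a sketch along these lines. MyClass mirrors the C++ class; the wrapper class AssignmentDemo and the helper run() are made up for illustration.

```java
class AssignmentDemo {
    // Mirrors the C++ MyClass: a single public int field.
    static class MyClass {
        int x;
        MyClass(int x) { this.x = x; }
    }

    // Returns {myObj1.x, myObj2.x} after the reassignment of myRef.
    static int[] run() {
        MyClass myObj1 = new MyClass(1);
        MyClass myObj2 = new MyClass(2);
        MyClass myRef = myObj1;   // myRef and myObj1 now refer to the same object
        myRef = myObj2;           // rebinds myRef only; no object is modified
        return new int[] { myObj1.x, myObj2.x };
    }

    public static void main(String[] args) {
        int[] r = run();
        System.out.println("myObj1.x = " + r[0]);
        System.out.println("myObj2.x = " + r[1]);
    }
}
```

The assignment changes which object myRef refers to; neither object's x field is touched.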

This difference between C++ and Java leads me to think that the absence of operator overloading in Java, at least in the case of assignment, is not a 'matter of personal choice' on the part of James Gosling, but rather a fundamental necessity given Java's syntax that treats all object variables as references (i.e. MyClass myRef = myObj1 defines myRef to be a Java-style reference). I say this because if assignment in Java causes the left-hand side reference to refer to a different object, rather than allowing the possibility that the object itself change its value, then it would seem that there is no possibility of providing an overloaded assignment operator.

In other words - it's not simply a 'choice', and there's not even the possibility of 'holding your breath' with the hope that it will ever be introduced, as the aforementioned high-rated answer also states (near the end). Quoting: "The reasons for not adding them now could be a mix of internal politics, allergy to the feature, distrust of developers (you know, the saboteur ones), compatibility with the previous JVMs, time to write a correct specification, etc.. So don't hold your breath waiting for this feature.". <-- So this isn't correct, at least for the assignment operator: the reason there's no operator overloading (at least for assignment) is instead fundamental to the nature of Java.

Is this a correct assessment on my part?

ADDENDUM

Assuming the assignment operator is a special case, then my follow-up question is: Are there any other operators, or more generally any other language features, that would by necessity be affected in a similar way as the assignment operator? I would like to know how 'deep' the difference goes between Java and C++ regarding variables-as-values/references. i.e., in C++, variable tokens represent values (and note, even if the variable token was declared initially as a reference, it's still treated as a value essentially wherever it's used), whereas in Java, variable tokens represent honest-to-goodness references wherever the token is later used.

C++ had been in effective use for about 5 years before work on Java began. From what I remember, operator overloading was a relatively new feature to anyone who had never programmed Ada, and programmers were overloading operators for the most horrid of reasons, creating a maintainability nightmare. This is consistent with Gosling's assertion that he saw it being abused and decided to avoid it as a feature. — Nathan Ryan, May 16 '11 at 01:00
This is a subjective discussion. People have different reasons for loving or hating operator overloading and you can't really say who is "right". (Except those who share my own position :-)) That said, I think @Nathan's comment about "maintainability nightmare" and "abuse" is probably accurate about what motivated Gosling et al. — asveikau, May 16 '11 at 01:19
But, this doesn't comment upon my question regarding whether, at least in the case of the assignment operator, the absence of operator overloading is a fundamental requirement, rather than a language design choice, for the reason I mentioned. — Dan Nissenbaum, May 16 '11 at 01:22
I don't see anything in Java that couldn't accommodate operator overloading of some form. Several operators are already overloaded (technically, there are four different `%` operators, for example, though all predefined). It might still be different from the subtleties of operator overloading in C++, in particular for assignments and value aliasing. The parenthesis operator would probably also have significant differences, given the rules for overloading it in C++. Other operators, like array access `[]`, have non-intuitive evaluation order that might also result in different behavior. — Nathan Ryan, May 16 '11 at 02:40

score 5 · Answer 1 · answered May 16 '11 at 07:57

There is a big misconception in discussions of the similarities and differences between Java and C++, and it arises in your question: C++ references and Java references are not the same. In Java a reference is a resettable proxy to the real object, while in C++ a reference is an alias to the object. To put it in C++ terms, a Java reference is a garbage-collected pointer, not a reference. Now, going back to your example, to write equivalent code in C++ and Java you would have to use pointers:

int main() {
   type a(1), b(2);
   type *pa = &a, *pb = &b;
   pa = pb;
   // a is still 1, b is still 2, pa == pb == &b
}

Now the examples are the same: the assignment operator is being applied to the pointers to the objects, and in that particular case you cannot overload the operator in C++ either. It is important to note that operator overloading can be easily abused, and that is a good reason to avoid it in the first place. Now if you add the two different types of entities: objects and references, things become more messy to think about.

If you were allowed to overload operator= for a particular object in Java, then you would not be able to have multiple references to the same object, and the language would be crippled:

Type a = new Type(1);
Type b = new Type(2);
a = b;                 // dispatched to Type.operator=( Type )??
a.foo();
a = new Type(3);       // do you want to copy Type(3) into a, or work with a new object?

That in turn would make the type unusable in the language. Containers store references, and they reassign them (even the first time, just when an object is created). Functions don't really use pass-by-reference semantics either; they pass the references by value. That is a completely different issue, but again the difference is void foo( type* ) versus void foo( type& ): the proxy entity is copied, so you cannot modify the reference passed in by the caller.
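
A minimal Java sketch of that last point, assuming a hypothetical Box class and made-up method names: reassigning a parameter only changes the callee's copy of the reference, while mutating through it is visible to the caller.

```java
class PassByValueDemo {
    // Hypothetical mutable type used only for this illustration.
    static class Box {
        int value;
        Box(int value) { this.value = value; }
    }

    // Reassigning the parameter changes only the local copy of the reference.
    static void reseat(Box b) {
        b = new Box(99);
    }

    // Mutating through the parameter changes the object the caller also refers to.
    static void mutate(Box b) {
        b.value = 42;
    }

    static int afterReseat() {
        Box box = new Box(1);
        reseat(box);
        return box.value;   // the caller's reference was not changed
    }

    static int afterMutate() {
        Box box = new Box(1);
        mutate(box);
        return box.value;   // the shared object was changed
    }

    public static void main(String[] args) {
        System.out.println("after reseat: " + afterReseat());
        System.out.println("after mutate: " + afterMutate());
    }
}
```

In C++ terms, reseat behaves like void reseat( Box* b ) and not like void reseat( Box*& b ).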

The problem is that the language is trying really hard to hide the fact that a and the object that a refers to are not the same thing (the same happens in C#). That in turn means that you cannot explicitly state whether an operation is to be applied to the reference or to the referent; that is resolved by the language. The outcome of that design is that any operation that can be applied to references can never be applied to the objects themselves.

As for the rest of the operators, the decision is most probably arbitrary. Because the language hides the reference/object difference, it could have been designed such that a+b was translated into type* operator+( type*, type* ) by the compiler. Since you cannot do arithmetic on references, there would be no problem, as the compiler would recognize that a+b is an operation that must be applied to the objects (it does not make sense with references). But then it could be considered a little awkward that you can overload +, but you cannot overload =, ==, !=...

That is the path that C# took, where assignment cannot be overloaded for reference types. Interestingly, in C# there are value types, and the sets of operators that can be overloaded for reference and value types are different. Not having coded C# in large projects, I cannot really tell whether that potential source of confusion is real or if people are just used to it (but if you search SO, you will find that a few people do ask why X cannot be overloaded in C# for reference types, where X is one of the operations that can be applied to the reference itself).

I wish I could "select" more than one answer; I'd select this one also. — Dan Nissenbaum, May 17 '11 at 03:17
IMHO, a major weakness in Java is that there's no distinction between a variable which encapsulates the *identity* of an object, one which encapsulates its *mutable state*, one which encapsulates *both*, and one which encapsulates *neither* (not encapsulating the object's identity, and only encapsulating immutable aspects of the object's state). In many cases, only one variable at a time should encapsulate any particular object's mutable state; it would be helpful if there were some linguistic means of enforcing this. — supercat, Feb 11 '13 at 22:19

Claudiu · Answer 2 · 2011-05-16T03:41:55.347

That doesn't explain why they couldn't have allowed overloading of other operators like + or -. Considering James Gosling designed the Java language, and he said it was his personal choice, which he explains in more detail at the link provided in the question you linked, I think that's your answer:

There are some things that I kind of feel torn about, like operator overloading. I left out operator overloading as a fairly personal choice because I had seen too many people abuse it in C++. I've spent a lot of time in the past five to six years surveying people about operator overloading and it's really fascinating, because you get the community broken into three pieces: Probably about 20 to 30 percent of the population think of operator overloading as the spawn of the devil; somebody has done something with operator overloading that has just really ticked them off, because they've used like + for list insertion and it makes life really, really confusing. A lot of that problem stems from the fact that there are only about half a dozen operators you can sensibly overload, and yet there are thousands or millions of operators that people would like to define -- so you have to pick, and often the choices conflict with your sense of intuition. Then there's a community of about 10 percent that have actually used operator overloading appropriately and who really care about it, and for whom it's actually really important; this is almost exclusively people who do numerical work, where the notation is very important to appealing to people's intuition, because they come into it with an intuition about what the + means, and the ability to say "a + b" where a and b are complex numbers or matrices or something really does make sense. You get kind of shaky when you get to things like multiply because there are actually multiple kinds of multiplication operators -- there's vector product, and dot product, which are fundamentally very different. And yet there's only one operator, so what do you do? And there's no operator for square-root. Those two camps are the poles, and then there's this mush in the middle of 60-odd percent who really couldn't care much either way. 
The camp of people that think that operator overloading is a bad idea has been, simply from my informal statistical sampling, significantly larger and certainly more vocal than the numerical guys. So, given the way that things have gone today where some features in the language are voted on by the community -- it's not just like some little standards committee, it really is large-scale -- it would be pretty hard to get operator overloading in. And yet it leaves this one community of fairly important folks kind of totally shut out. It's a flavor of the tragedy of the commons problem.

UPDATE: Re: your addendum, the other assignment operators (+=, -=, etc.) would also be affected. You also can't write a swap function in Java like the C++ void swap(int *a, int *b);, along with other things of that kind.
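
The points in the update above can be sketched in Java as follows. The class SwapDemo and its method names are made up for illustration, and swapping array slots is just one common workaround, not the only one.

```java
class SwapDemo {
    // Has no effect on the caller: a and b are copies of the caller's values.
    static void swap(int a, int b) {
        int tmp = a;
        a = b;
        b = tmp;
    }

    // A common workaround: swap two slots of an array both sides can see.
    static void swap(int[] arr, int i, int j) {
        int tmp = arr[i];
        arr[i] = arr[j];
        arr[j] = tmp;
    }

    static int[] tryValueSwap() {
        int a = 1, b = 2;
        swap(a, b);                 // the caller's a and b are untouched
        return new int[] { a, b };
    }

    static int[] arraySwap() {
        int[] v = { 1, 2 };
        swap(v, 0, 1);              // the array contents are exchanged
        return v;
    }

    // Compound assignment on a reference type rebinds the variable as well.
    static String[] compoundAssign() {
        String s = "a";
        String t = s;               // t refers to the same String as s
        s += "b";                   // s now refers to a new String; t is unchanged
        return new String[] { s, t };
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(tryValueSwap()));
        System.out.println(java.util.Arrays.toString(arraySwap()));
        System.out.println(java.util.Arrays.toString(compoundAssign()));
    }
}
```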

Is my assessment correct regarding the assignment operator, in any case? — Dan Nissenbaum, May 16 '11 at 01:00
Also - I think I fit in the 'numerical' crowd - having worked extensively writing mathematical simulations in physics, including for PhD research. Almost all the examples I think of do involve mathematical objects, and I was a bit surprised to find that Java does not include overloading, but came to that 'realization' not so much by reading the rules of the language, but by seeing the assignment operator behave in the (to me) odd way it does. — Dan Nissenbaum, May 16 '11 at 01:26
@Dan: it's comparing apples and oranges, really. in C++, having `MyClass inst(10); inst = x;` basically calls `inst.operator=(x)`. it's just calling a method on it. in Java everything is basically a pointer. `MyClass inst = new MyClass(10); inst = x;` is like doing `MyClass *inst = new MyClass(10); inst = x;`. the pointer will get changed anyway, no over-riding that as far as I'm aware (i might be wrong, i don't know C++). — Claudiu, May 16 '11 at 03:39
the exception in java is primitive types of course, which work like in C++ where they can't be overridden either. — Claudiu, May 16 '11 at 03:40
disclaimer, i'm also not fully brushed up on how references work in C++ (the `&` thing) so i might be mistaking something here — Claudiu, May 16 '11 at 03:42

score 1 · Accepted Answer · answered May 16 '11 at 04:59

Is this a correct assessment on my part?

The lack of operator overloading in general is a "personal choice". C#, which is a very similar language, does allow operator overloading. But you still can't overload assignment. What would that even do in a reference-semantics language?

Are there any other operators, or more generally any other language features, that would by necessity be affected in a similar way as the assignment operator? I would like to know how 'deep' the difference goes between Java and C++ regarding variables-as-values/references.

The most obvious is copying. In a reference-semantics language, clone() isn't that common, and isn't needed at all for immutable types like String. But in C++, where the default assignment semantics are based around copying, copy constructors are very common, and one is automatically generated if you don't define it yourself.
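
A small Java sketch of the copying point, assuming a hypothetical Point class: plain assignment creates an alias, so a copy has to be requested explicitly, here through a hand-written copy constructor.

```java
class CopyDemo {
    // Hypothetical mutable type; Java generates no copy constructor for it.
    static class Point {
        int x;
        Point(int x) { this.x = x; }
        Point(Point other) { this.x = other.x; }   // hand-written copy constructor
    }

    static int aliasThenMutate() {
        Point a = new Point(1);
        Point b = a;              // b is an alias for the same object
        b.x = 5;
        return a.x;               // the change is visible through a
    }

    static int copyThenMutate() {
        Point a = new Point(1);
        Point b = new Point(a);   // explicit copy, a distinct object
        b.x = 5;
        return a.x;               // a is unaffected
    }

    public static void main(String[] args) {
        System.out.println("alias: a.x = " + aliasThenMutate());
        System.out.println("copy:  a.x = " + copyThenMutate());
    }
}
```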

A more subtle difference is that it's a lot harder for a reference-semantics language to support RAII than a value-semantics language, because object lifetime is harder to track. Raymond Chen has a good explanation.
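
As a rough illustration of the RAII point: since an object's lifetime in Java is not tied to a scope, deterministic cleanup has to be requested explicitly. One way to do that, in Java 7 and later, is try-with-resources. The Resource class and the log list below are made up for this sketch.

```java
import java.util.ArrayList;
import java.util.List;

class CleanupDemo {
    // Hypothetical resource that records when it is acquired and released.
    static class Resource implements AutoCloseable {
        private final List<String> log;
        Resource(List<String> log) {
            this.log = log;
            log.add("acquire");
        }
        @Override
        public void close() {
            log.add("release");
        }
    }

    static List<String> run() {
        List<String> log = new ArrayList<>();
        // close() is called when the try block exits, normally or by exception.
        try (Resource r = new Resource(log)) {
            log.add("use");
        }
        log.add("after block");
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Unlike a C++ destructor, the cleanup is attached to the try statement rather than to the object, so a Resource created outside such a statement is not closed automatically.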

tp1 · Answer 4 · 2011-05-16T15:48:53.993

0

The reason operator overloading is abused in C++ is that it's too complex a feature. Here are some aspects of it that make it complex:

1. expressions are a tree
2. operator overloading is the interface/documentation for those expressions
3. interfaces are basically an invisible feature in C++
4. free functions/static functions/friend functions are a big mess in C++
5. function prototypes are already a complex feature
6. the choice of syntax for operator overloading is less than ideal
7. there is no other comparable API in the C++ language
8. user-defined types/function names are handled differently than built-in types/function names in function prototypes
9. it uses advanced math, like the operator<<(ostream&, ostream & (*fptr)(ostream &));
10. even the simplest examples of it use polymorphism
11. it's the only C++ feature that has a 2d array in it
12. the this-pointer is invisible, and whether your operators are member functions or outside the class is an important choice for programmers

Because of this complexity, only a very small number of programmers actually understand how it works. I'm probably missing many important aspects of it, but the list above is a good indication that it is a very complex feature.

Update: some explanation of #4. The argument is pretty much as follows:

class A { friend void f(); }; class B { friend void f(); };
void f() { /* use both A and B members inside this function */ }

With static functions, you can do this:

class A { static void f(); }; void A::f() { /* use only class A here */ }

And with free functions:

class A { }; void f() { /* you have no special access to any classes */ }

Update #2: For #10, the example I was thinking of looks like this in stdlib:

  ostream &operator<<(ostream &o, std::string s) { ... } // inside stdlib
  int main() { std::cout << "Hello World" << std::endl; }

Now the polymorphism in this example happens because you can choose between std::cout, std::ofstream and std::stringstream. This is possible because the first parameter of operator<< takes a reference to ostream. This is normal runtime polymorphism.

Update #3: About the prototypes still. The real interaction between operator overloading and prototypes arises because the overloaded operators become part of the class's interface. This brings us to the 2d array thing, because inside the compiler the class interface is a 2d data structure which has quite complex data in it, including booleans, types and function names. Rule #4 is needed so that you can choose when your operators are inside this 2d data structure and when they're outside of it. Rule #8 deals with the booleans stored in the 2d data structure. Rule #7 is there because the class's interface is used to represent elements of an expression tree.

edited May 16 '11 at 15:48

answered May 16 '11 at 04:42

tp1


As to the numbered list: I don't know where #4 comes from (I've never seen them as a big mess, I've never heard anybody call them a big mess in C, C++, Perl, Python, Lua, etc., and I really don't know what you're complaining about). #5 doesn't make sense, unless you meant to say "template functions." #8 may be true in some corner cases, but one of the fundamental principles in C++ design is that user defined types should have as much support as native types. #9 doesn't make sense, as output isn't math, let alone advanced math (nor is bit fiddling). ... – Max Lybbert May 16 '11 at 05:11
When overloading operators, I haven't yet had to use polymorphism (#10). It's possible, and it would probably come up more in Java where OOP has replaced a lot of critical thought. I don't know what 2d array you're talking about in #11, but I believe that the rules for overload resolution are no more complex for operators as they are for, say, an overloaded function that includes template functions and functions with default arguments. And I'm not sure what the complaint is in #12. But, yes, if you want to write a complex system in C++ you definitely can. It's also possible to do in Java. – Max Lybbert May 16 '11 at 05:16
@Max - thanks for taking the time to respond in detail - regarding the list provided, having worked intensely with C++ for many years, including with templates, I don't agree that this particular list represents, on the whole, a solid argument that operator overloading in C++ is too complex, and a number of specific items in the list would need clarification for me to understand them sufficiently to be certain they're valid criticisms - basically the same ones you've commented on. – Dan Nissenbaum May 16 '11 at 06:22
#4 comes from the fact that almost no programmers know the connection between free functions/static functions and friend functions (the fact that they're all outside the class, but can just access a different _number_ of classes). No, this is why #5 makes sense. Everyone thinks function prototypes are so simple that they miss the complexity of them, and this is critical in operator overloading. For #9, check how that function has been implemented in stdlib. For #10, even the hello world example with std::cout uses polymorphism. #11 is a very advanced feature and very difficult to find. – tp1 May 16 '11 at 06:24
@tp1 - Re. #4: not some of the programmers I've worked with. Re. #5 - this seems to me like it could be more a criticism of Java, where the complexity feels just a bit hidden in contrast to C++ (for example, memory management), so perhaps it's more likely that Java programmers overly perceive function prototypes as simple, than C++ programmers. I'm not saying that this is true - I'm just saying I think the argument could apply equally well in reverse. In any case - this discussion is a bit off-topic from my question. – Dan Nissenbaum May 16 '11 at 06:29
A function prototype is what you put in the header (`int foo(int b);`). You then put the function definition in an implementation file (the "cpp file") unless you put it in the header at the same time. The header then looks like what you get in Eclipse when you're looking at a JAR file without the associated source. There's no complexity with the prototype itself. There's complexity regarding function overloading rules and other questions, but that doesn't stem from the prototype. I'll agree with you if you say "template functions are complex" (templates being C++'s version of generics). – Max Lybbert May 16 '11 at 14:40
(Especially given that Java generics were designed to be less powerful, and less complex, than C++ templates). – Max Lybbert May 16 '11 at 14:41
And I still can't figure out the assertion in #4. An example would help, if you're willing to write one (edit the original answer). Again, there's still no math in output, any more than there is in `System.out.println`. I'll concede that #10 uses _compile time_ polymorphism, which is what threw me off (it used to be runtime polymorphism). – Max Lybbert May 16 '11 at 14:47
I believe the output example supports my point, in fact. I may have a need to output a lot of types (making polymorphism in operator overloading possible), but I've never had a need to add them all together (`Car + Mouse = ???`). Thinking of examples in math, you usually can't add completely unrelated types together (say, adding a linear algebra vector to a complex number). So, again, I've never needed polymorphism when overloading operators; but it's available if necessary. – Max Lybbert May 16 '11 at 14:55
I think I've finally figured out the function prototypes complaint. Basically issues 4, 5, 6 and 10 are interrelated. I'll agree that the syntax for operator overloading in C++ is "less than ideal." However, the Java language designers would be free to use a better syntax. Say, the syntax Perl uses, or the syntax Python uses, or the syntax C# uses. Or they could create their own. I can't consider this an argument for not having operator overloading in Java. – Max Lybbert May 16 '11 at 15:19
I think operator overloading should actually be a library feature instead of language feature, but that's not possible because every class needs to support it. How they made it part of the language makes it interact with very many different features, adding complexity. – tp1 May 16 '11 at 15:37
