[Repost] Overloading considered Harmful

Posted by Julian13 on 2003-06-02 08:47

Author: Alexander (Sascha) Hoher http://www.javaspecialists.co.za.

Overloading considered harmful
What is overloading, once again? Same method name for different methods - sounds harmless enough!

Sure, it's one of the first things Java programmers are confronted with when learning the language. You are told things like: Do not mix it up with overriding - remember, these things may look quite similar, but are altogether different concepts! Then your Java introduction goes on to tell you about unique parameter lists, and after one and a half pages you get the impression that this is something not so terribly hard to understand. [HK: I can vouch for this argument. In my Java courses, students commonly make this mistake.]

What is the value proposition of this seemingly simple feature?

Shorter interfaces, not bogged down by artificial, tiresome discriminators, and a bit of lexical structuring of your class text: Overloading allows you to indicate the conceptual identity of different methods, letting you stress common semantics across methods so that similarities become apparent at first sight. It's supposed to make your code more readable, and as far as server code is concerned (the code where these method siblings are defined), it really does.

There are many who like it. There are tons of code using what overloading has to offer. And of course, you cannot even escape it in Java, where you're simply forced to use it when you want to provide different constructors. It seems overloading rules - a feature not only popular, but tightly integrated into some important programming languages, an open commitment of venerable language designers that surely does not fail to impress the masses. And, what is more: no performance penalty whatsoever...

Now, should we fully embrace overloading in our own code then? Should we use it wherever possible? This discussion shall present an attempt to put the technical facts investigated in-depth by a former edition of this newsletter into a usage perspective - a bit similar in spirit to the popular harping on pointers which you can find in every Java introduction. The seminal idea that overloading clashes with dynamic binding is taken from a discussion of overloading to be found in "Object-Oriented Software Construction" by Bertrand Meyer.

There is no reason to question that naming conventions to indicate conceptual interrelatedness of different methods will benefit the class text where these methods are defined. To adopt the convention of reusing the same method name, however, has unfortunate consequences for client code, which can become quite unintuitive, to say the least.

Overloading with parameter lists of different length poses no problem for client code interpretation, as the lists openly disambiguate client calls at first sight. Things that could irritate you just will not compile. However, when overloaded methods with the same method name have parameter lists of the same length, and when the actual call arguments conform to more than one signature of these overloaded methods, it gets hard to tell which methods are actually executed just by looking at the client calls. In this situation, you experience the strange phenomenon that the methods being called are not independent of the reference types being used for the calls.
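To make the distinction concrete, here is a minimal sketch. The class and method names (ArgumentTypeDemo, describe) are made up for illustration and do not appear in the newsletter's program. It puts three overloads into one class: two of the same length, one longer.

```java
// Hypothetical names, for illustration only.
class ArgumentTypeDemo {
  static String describe(Object o) { return "describe(Object)"; }

  static String describe(String s) { return "describe(String)"; }

  // A different number of parameters can never be confused with the two above.
  static String describe(String s, int times) { return "describe(String, int)"; }

  public static void main(String[] args) {
    String asString = "someString";
    Object asObject = asString; // same object, wider reference type

    System.out.println(describe(asString));    // prints describe(String)
    System.out.println(describe(asObject));    // prints describe(Object)
    System.out.println(describe(asString, 2)); // prints describe(String, int)
  }
}
```

The first two calls pass the very same object, yet a different method runs, because the choice follows the declared type of the argument variable.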

There are several problems related to this, but first let's take another look at the small code example presented in an earlier edition of this newsletter in order to really get a feel for what it's like when methods being called are not independent of the reference types being used for the calls.

A minimal modification allows us to focus on the ugly side of overloading: The program still tells us which method actually gets called, but on top of that also delivers rather strong comments when overloading is caught harming our ability to reason about the client code without knowing the server classes.

Basically, we have two fixed instances, which will always play the same roles: one serving as call target, the other serving as argument. Now we mix and match several calls, always executed on these same instances (always the same target object, always the same argument object), the only difference being the reference types through which these objects are accessed. And behold: Different methods are being called. If you are familiar with this simple setting, you may skip the program part and go directly to the following discussion.


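public class OverloadingTest {
  // Server code: two overloaded methods spread across a class hierarchy.
  public abstract static class Top {
    public String f(Object o) {
      String whoAmI = "Top.f(Object)";
      System.out.println(whoAmI);
      return whoAmI;
    }
  }

  public static class Sub extends Top {
    // Overloads (does not override) Top.f(Object).
    public String f(String s) {
      String whoAmI = "Sub.f(String)";
      System.out.println(whoAmI);
      return whoAmI;
    }
  }

  // Client code.
  public static void main(String[] args) {
    // One target object, seen through two reference types.
    Sub sub = new Sub();
    Top top = sub;

    // One argument object, seen through two reference types.
    String stringAsString = "someString";
    Object stringAsObject = stringAsString;

    // Activate one condition at a time. The == comparison is deliberate:
    // each method returns its own string literal, so both sides are the
    // same reference exactly when the same method was executed.
    if (top.f(stringAsObject) == sub.f(stringAsString))
    //if (top.f(stringAsObject) == sub.f(stringAsObject))
    //if (top.f(stringAsString) == sub.f(stringAsString))
    //if (top.f(stringAsString) == sub.f(stringAsObject))
    //if (sub.f(stringAsString) == sub.f(stringAsObject))
    //if (top.f(stringAsString) == top.f(stringAsObject))
    {
      System.out.println("Hey, life is great!");
    } else {
      System.out.println("Oh no!");
    }
  }
}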

Can you tell what happens when each of the conditions is activated?

Let us carefully go through the code.

There are two overloaded methods spread across a class hierarchy (one class inheriting from another class). This is the server code to be called by the client.
The superclass defines: String f(Object o).
The subclass defines: String f(String s).
The signatures are chosen to make both methods eligible candidates to be executed in the context of calls on the subclass instance with a String argument.
The client provides two objects, reused for all calls and chosen in a way that both overloaded methods are potentially eligible candidates for executing the client calls.
Through polymorphic assignment, the client obtains references of different types for these two instances.
The client makes method calls that differ only in the references used for making the call. In the given setup, there are 4 different call forms possible: Overloading has the method name fixed, so only the target reference type and the parameter reference type are variable. Every reference type for the target can be combined with every reference type for the argument. (Mathematically speaking, there are 4 binary strings of length 2.)
The comparisons then are really just for fun, eliminating detail. They shift the focus of attention from the question of which particular method gets called to the general insight that different methods get called, additionally allowing the program to be explicit about its likes and dislikes: Every case of seeming reference-independence of the calls is instantly interpreted as an example of how things should be, and welcomed with a happy, optimistic "Hey, life is great!" In those dark moments, however, when overloading casts its shadow upon the otherwise so object-oriented Java world, and just nothing seems right, our little program starts to complain... Combinatorics tells us that six 2-combinations of a 4-set (consisting of the 4 call forms) exist, and so you find six comparisons (five of them showing up as comments), but of course, a single predicate returning false (different methods having been called) already suffices to get the point across.
And that's it.
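The four call forms listed above can also be spelled out one by one. The following is a minimal sketch that reuses the class shapes of the program but drops the printing inside the methods; the wrapper class name FourCallForms is an assumption made for this illustration.

```java
// Hypothetical wrapper class; Top and Sub mirror the article's server classes.
class FourCallForms {
  static class Top {
    String f(Object o) { return "Top.f(Object)"; }
  }

  static class Sub extends Top {
    String f(String s) { return "Sub.f(String)"; }
  }

  public static void main(String[] args) {
    Sub sub = new Sub();
    Top top = sub;                          // same target object
    String stringAsString = "someString";
    Object stringAsObject = stringAsString; // same argument object

    System.out.println("top.f(stringAsObject) -> " + top.f(stringAsObject)); // Top.f(Object)
    System.out.println("top.f(stringAsString) -> " + top.f(stringAsString)); // Top.f(Object)
    System.out.println("sub.f(stringAsObject) -> " + sub.f(stringAsObject)); // Top.f(Object)
    System.out.println("sub.f(stringAsString) -> " + sub.f(stringAsString)); // Sub.f(String)
  }
}
```

Only the last form reaches Sub.f(String). Any comparison that pairs it with one of the other three forms is therefore a comparison between two different methods.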

Discussion
The program shows, once again, that one thing to be aware of in connection with overloading is that it's all about reference types. This is as true for target reference types as it is for parameter reference types. For instance, the predicate "sub.f(stringAsObject) == sub.f(stringAsString)" will resolve to false in our setup because two different methods are executed. This dependence on reference types in connection with overloading may or may not be what you expected, but the question remains whether this is a clean approach to object-oriented programming.

No doubt, this may puzzle many a brave programmer, as it is a result absolutely exclusive to overloaded methods. And, as the use of overloaded methods does not identify itself as such in the method call, the intuitive, but unfortunately wrong expectation might be that the predicate returns true, as would be the case with any gentle non-overloaded method.

Honestly, do we like this? No. Object-oriented programming, as we know it, is about objects, not about references. We expect objects to behave the way they are and not the way they are referenced. Objects do their thing regardless of the role the client assigns them. This is how it should be, and we call this thing dynamic binding. It is not cosmetics, it is not just a feature, it is THE feature. It shapes the architecture of our systems, decoupling clients from servers.
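For contrast, here is a small sketch of the behaviour we expect from dynamic binding alone. It assumes a variant of the server classes in which the subclass overrides f(Object) instead of overloading it; the class name DynamicBindingDemo is made up for this illustration, and the @Override annotation requires Java 5 or later.

```java
// Hypothetical variant of the server classes: Sub overrides, it does not overload.
class DynamicBindingDemo {
  static class Top {
    String f(Object o) { return "Top.f(Object)"; }
  }

  static class Sub extends Top {
    @Override
    String f(Object o) { return "Sub.f(Object)"; }
  }

  public static void main(String[] args) {
    Sub sub = new Sub();
    Top top = sub;
    String stringAsString = "someString";
    Object stringAsObject = stringAsString;

    // All four call forms reach the same method: the object decides.
    System.out.println(top.f(stringAsObject)); // Sub.f(Object)
    System.out.println(top.f(stringAsString)); // Sub.f(Object)
    System.out.println(sub.f(stringAsObject)); // Sub.f(Object)
    System.out.println(sub.f(stringAsString)); // Sub.f(Object)
  }
}
```

Whatever references the client chooses, the result is the same, so every one of the six comparisons would come out as "Hey, life is great!".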

Now, with overloading, a second rule, reference type dependence, takes over, breaking the fundamental polymorphic equivalence property described above (that polymorphic assignments do not change the results of method calls as long as the code can be compiled). The choice of references in the client, which should be based on considerations like grouping and low coupling, suddenly has to take the demands of overloading into account. Overloaded server objects affect the design of client code. Cosmetics beat structure. Unlike overriding, overloading cannot just be applied in a server method definition, end of story. It is a feature you have to stay aware of in your clients, whose specific referencing of server objects influences what functionality gets called in the end. While with dynamic binding alone the method to be executed is completely server-defined, overloading proves to be client-sensitive.

Now to the problems. An important issue closely connected to software quality is readability. Our ability to reason about the software text is essential for any kind of maintenance, and, as you might have guessed by the direction this discussion has taken by now, overloading affects readability of client code in a rather negative way. It is all very well to let the program run and after the surprise look at the server code and explain the strange things away (oh, of course, overloaded methods, you know...), but nevertheless it would be preferable by far to predict the behaviour, simple as it is, by simply (i.e. exclusively) examining the client. Show me the client class, tell me no overloading is involved, and I will tell you: "Hey, life is great!" I can reason about the result of the condition solely by looking at the client class.

With overloading being introduced, or even with just the slightest chance of overloading being used (this includes all unknown Java code), this statement is impossible to make, because you cannot tell if the same server method gets called without examining the server sources. In our program, you would have to read three classes instead of one to know what's going on. So, use of overloading weakens the expressive power of client code as the polymorphic equivalence property cannot be relied upon.

Sometimes, of course, you are willing to dig into the server code because you want to find out the exact server method that gets called. But even then overloading significantly complicates things. Without overloading, you just work your way bottom up through the target's class hierarchy, and when you find a match, bingo, you're done! With unknown code or code known to use overloading, this can be only your second step. First you have to examine the class of the reference and find the matching method. Only then can you check the class hierarchy for overriding methods. The bad thing about this is probably not the additional step involved, but that you have to repeat this analysis for every different reference type, because results can vary. Thus, overloading complicates the analysis of client-server interaction.
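The two-step analysis described above can be traced on a small sketch. It assumes a variant of the server classes in which Sub both overrides f(Object) and adds the overload f(String); the class name TwoStepLookup is hypothetical.

```java
// Hypothetical variant: Sub overrides f(Object) and also overloads with f(String).
class TwoStepLookup {
  static class Top {
    String f(Object o) { return "Top.f(Object)"; }
  }

  static class Sub extends Top {
    @Override
    String f(Object o) { return "Sub.f(Object)"; }

    String f(String s) { return "Sub.f(String)"; }
  }

  public static void main(String[] args) {
    Sub sub = new Sub();
    Top top = sub;
    String stringAsString = "someString";

    // Step 1 (reference type Top): only f(Object) matches.
    // Step 2 (object class Sub): f(Object) is overridden there.
    System.out.println(top.f(stringAsString)); // Sub.f(Object)

    // Step 1 (reference type Sub): f(String) is the most specific match.
    // Step 2: no subclass overrides it.
    System.out.println(sub.f(stringAsString)); // Sub.f(String)
  }
}
```

The same object and the same argument give two different answers, and the analysis had to be done once per reference type to find that out.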

There is also a psychological dimension to all this. The following will try to show that overloading is not a gentle, unobtrusive language feature, but, as it stands in conflict with other language features, late binding and inheritance, particularly prone to abuse. In other words, overloading is an open invitation for introducing conceptual errors. Think of novice programmers or programmers in a rush. Overloaded methods, coming with their own method selection rules, present an anomaly in the object-oriented landscape shaped by the presence of dynamic binding, and will surely go on to puzzle people, who will falsely think overloaded methods behave like "normal" methods, or mistake overloading for overriding just because the method signatures involved in overloading look so similar.

In fact, such a mistake may be seen as expressing justified desires regarding object-oriented design. Hell, we'd sure like to see the overloaded methods in our example being handled as an instance of overriding! The parameters of our methods are related through inheritance, so inspired by other programming languages, it does not take great imagination to see the derived class define a method that overrides the inherited method. Of course, this is an additional twist adding a bit of vision to our discussion, and of course, we know that Java does not support such covariant method redefinition (restricting the parameter domain of the method): Most of us have learnt by now that Java allows only specification inheritance (overriding being defined only for methods with the same parameter types and a compatible return type). But still. Do we not think, deep in our heart, that the subclass method with the more specific parameter should, in a better world maybe, be the one in charge, overriding the superclass method? Think about an Integer class inheriting from Number while redefining addition for integers only. Not allowed in Java, but still desirable (and a real feature in other languages such as Eiffel). Sure, overloading is not to be blamed for an incorrect understanding of inheritance in Java, but it clearly invites such fantasies (and the corresponding errors) when used in a context such as the presented one. And even if such interpretations are wrong - shouldn't they be right?
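The Integer and Number idea can be sketched with two stand-in classes. Num and Int are hypothetical names chosen to avoid clashing with java.lang.Number and java.lang.Integer, and the methods only report who they are instead of doing arithmetic.

```java
// Hypothetical stand-ins for a Number/Integer pair.
class CovariantParameterDemo {
  static class Num {
    String add(Num other) { return "Num.add(Num)"; }
  }

  static class Int extends Num {
    // Reads like "addition redefined for integers only", but the parameter
    // type differs, so Java treats it as an overload, not an override.
    String add(Int other) { return "Int.add(Int)"; }
  }

  public static void main(String[] args) {
    Int a = new Int();
    Int b = new Int();
    Num aAsNum = a; // same object, wider reference type

    System.out.println(a.add(b));      // Int.add(Int)
    System.out.println(aAsNum.add(b)); // Num.add(Num)
  }
}
```

Both calls add the same two integer objects, yet the integer-specific method is bypassed as soon as the target is referenced as a Num.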

And then the poor integration of overloading and inheritance in Java, which is very misleading as well. Reference type dependence means that overloading is simply not developed to conceptual consistency in the context of inheritance. Guessing from experience with overloaded methods defined in one and the same class, we might expect the method with the best match in terms of formal parameter type and actual method argument to be called on the object. This does not happen, though. Java does not produce any kind of "flat form" for the object's class with all overloaded methods, inherited or not, appearing side by side in a list in order to allow the runtime to choose the most appropriate.
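If a best match on the actual argument object is what you want, it has to be written by hand. The following sketch shows one possible way, not something the article proposes: the subclass overrides f(Object) and forwards to its own f(String) after checking the argument's runtime class. The class name RuntimeSelectionSketch is hypothetical.

```java
// Hypothetical sketch: emulating a run-time "best match" by hand.
class RuntimeSelectionSketch {
  static class Top {
    String f(Object o) { return "Top.f(Object)"; }
  }

  static class Sub extends Top {
    @Override
    String f(Object o) {
      // Manual selection on the runtime class of the argument.
      if (o instanceof String) {
        return f((String) o);
      }
      return super.f(o);
    }

    String f(String s) { return "Sub.f(String)"; }
  }

  public static void main(String[] args) {
    Sub sub = new Sub();
    Top top = sub;
    Object stringAsObject = "someString";

    System.out.println(top.f(stringAsObject));     // Sub.f(String)
    System.out.println(sub.f(stringAsObject));     // Sub.f(String)
    System.out.println(top.f(Integer.valueOf(1))); // Top.f(Object)
  }
}
```

The price is an explicit type test in the server code, which is exactly the kind of bookkeeping dynamic binding normally spares us.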

No, what technically happens is, in my understanding, that the compiler takes the method symbol plus the parameter reference types of some method call and calculates a position in the method table of the target's reference type. So, choosing between overloaded methods is done at compile time, and it is restricted to the overloaded methods available on one class: the class of the reference type. Overloaded methods defined in subclasses of the reference type are never called: Java ignores the exiled siblings although the whole thing looks so very similar to overriding.

With overloaded methods being defined in superclasses of the reference type, Java exhibits quite strange behaviour: While the server code can still be compiled, client code will break. When you try to make a method call where the compiler would have to choose between them, you get a compilation error complaining that the call is ambiguous. Put the method into the reference type and all is well. Don't ask me why - just remember selection of overloaded methods is limited to the reference type class. I personally believe this further anomaly is more a compiler issue than a language issue. If you find a logical explanation for this, other than that it helps to improve compilation performance, please let me know.
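For reference, the situation just described looks like the sketch below: the more specific overload lives in the superclass, the more general one in the reference type. The class name SuperclassOverloadDemo is hypothetical. The expected-output comments assume a compiler for Java 5 or later, which accepts the first call and picks the method with the most specific parameter type regardless of where it is declared; on the compilers the article refers to, that call is the one reported as ambiguous.

```java
// Hypothetical sketch; output comments assume a Java 5 or later compiler.
class SuperclassOverloadDemo {
  static class Top {
    String f(String s) { return "Top.f(String)"; }
  }

  static class Sub extends Top {
    String f(Object o) { return "Sub.f(Object)"; }
  }

  public static void main(String[] args) {
    Sub sub = new Sub();
    String stringAsString = "someString";
    Object stringAsObject = stringAsString;

    // Both f(String) and f(Object) apply; the compiler has to choose.
    System.out.println(sub.f(stringAsString)); // Top.f(String)

    // Only f(Object) applies to an Object-typed argument.
    System.out.println(sub.f(stringAsObject)); // Sub.f(Object)
  }
}
```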

Once again (the last time): overloaded methods defined in subtypes of the target reference will not be taken into consideration as candidates for execution by the runtime. With the table position given in the bytecode, the runtime will only check if there are overriding methods (which will appear at the same position in the method tables of subclasses if they exist). So, the compiler cannot hunt them down, and the runtime does not want to.
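The division of labour between compiler and runtime can be imitated with reflection. This is only an analogy, not a description of how the virtual machine is implemented: the lookup through Top.class stands in for the compiler fixing the signature f(Object), and Method.invoke then dispatches to whatever overrides that signature. The class and helper names are hypothetical.

```java
import java.lang.reflect.Method;

// Hypothetical sketch: reflection as an analogy for "signature fixed first,
// override lookup second".
class FixedSignatureSketch {
  public static class Top {
    public String f(Object o) { return "Top.f(Object)"; }
  }

  public static class Sub extends Top {
    @Override
    public String f(Object o) { return "Sub.f(Object)"; }

    public String f(String s) { return "Sub.f(String)"; }
  }

  static String callThroughTop(Top target, Object argument) {
    try {
      // Compiler's role: the signature f(Object) is chosen by looking at Top only.
      Method chosen = Top.class.getMethod("f", Object.class);
      // Runtime's role: find an override of exactly that signature.
      return (String) chosen.invoke(target, argument);
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    // Sub.f(String) is never a candidate, although the argument is a String.
    System.out.println(callThroughTop(new Sub(), "someString")); // Sub.f(Object)
  }
}
```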

A consequence of this is, disturbingly, that the place where non-overridden overloaded methods are defined in the class hierarchy is of essential importance for the selection of the method actually being called. To me, this sounds a little scary, or would you really want your class design to be influenced by the crippled demands of overloading? Summing up: Overloading is a static compile-time feature which does not integrate well with our expectations shaped by dynamic method lookup coming along with inheritance.

What else can we do to shoot the dead man? (Who is still alive enough to ruin our programs, of course.) Bertrand Meyer sees overloading as a violation of the "principle of non-deception: differences in semantics should be reflected by differences in the text of the software" (OOSC 94, Bertrand Meyer). But wait a second, isn't late binding another case where there is only one method symbol for different methods?

As I understand it, the difference between late binding and overloading can be pinned to the observation that late binding lets one method name be the pointer to one operation contract (which can then be fulfilled by several different methods whose differences are nevertheless absolutely transparent to the client code), whereas overloading lets one method name be the pointer to several method specifications whose differences can be experienced in the client code. In the scope of the client, there is no difference between polymorphic calls bound to different methods. The polymorphic call specification is all the client has to know about the call. Overloaded methods, on the other hand, need not share common semantics or, to be more precise, a common contract; their pre- and postconditions can vary wildly. This is something the client always has to take into account: Overloaded methods cannot be used interchangeably. Being different methods under the same hood, they have to be treated according to their specific contracts. These contracts, however, are hidden behind the same name, which makes them hard to identify. The same method name does not point to a common denominator in this case, but only serves to disguise differences that have to be laboriously disambiguated later on. The client has to stay aware of the method contract being pointed to by a complicated three-component key for the method which, as we have seen, consists of target reference type, method name, and parameter reference types.

So what are my final words to the programmer who, after having read this article, wonders if he should try to use overloading now wherever possible or not? Keep going... And if you really, really want to use it, go on and do so, but only with different method names - this is a trick stolen from real experts that can improve your overloading a lot!
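A case from the standard library that is not mentioned in the article, but fits the description, is java.util.List with its two methods remove(int index) and remove(Object o). The sketch below assumes Java 7 or later (generics, autoboxing, diamond operator); the class and helper names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class; List.remove(int) and List.remove(Object) are real JDK methods.
class DifferentContractsDemo {
  static List<Integer> sample() {
    return new ArrayList<>(Arrays.asList(10, 20, 30, 1));
  }

  static List<Integer> removeWithInt() {
    List<Integer> list = sample();
    int one = 1;
    list.remove(one); // remove(int index): drops the element at position 1
    return list;
  }

  static List<Integer> removeWithInteger() {
    List<Integer> list = sample();
    Integer one = 1;
    list.remove(one); // remove(Object o): drops the first element equal to 1
    return list;
  }

  public static void main(String[] args) {
    System.out.println(removeWithInt());     // [10, 30, 1]
    System.out.println(removeWithInteger()); // [10, 20, 30]
  }
}
```

Same name, same value, two different contracts: which one applies depends only on the declared type of the argument.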
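Taken literally, the advice could look like the following sketch, in which the two methods of the example program simply get distinct names. The names fObject and fString are made up for this illustration.

```java
// Hypothetical renaming of the article's two methods.
class DistinctNamesDemo {
  static class Top {
    String fObject(Object o) { return "Top.fObject(Object)"; }
  }

  static class Sub extends Top {
    String fString(String s) { return "Sub.fString(String)"; }
  }

  public static void main(String[] args) {
    Sub sub = new Sub();
    Top top = sub;
    String stringAsString = "someString";
    Object stringAsObject = stringAsString;

    // The method name in the client text now says which method runs.
    System.out.println(top.fObject(stringAsString)); // Top.fObject(Object)
    System.out.println(sub.fObject(stringAsObject)); // Top.fObject(Object)
    System.out.println(sub.fString(stringAsString)); // Sub.fString(String)
    // top.fString(stringAsString) does not compile: Top has no fString.
  }
}
```

Calls that would have silently changed meaning with a different reference type now either behave the same or fail to compile.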

Sascha
