Refactoring legacy instanceof switch casing via design patterns

Question

My company's legacy code is suffering from prevalent usage of instanceof switch-casing, in the form of:

if(object instanceof TypeA) {
   TypeA typeA = (TypeA) object;
   ...
   ...
}
else if(object instanceof TypeB) {
   TypeB typeB = (TypeB) object;
   ...
   ...
}
...
...

To make things worse, several of the TypeX classes in questions are actually wrappers of classes found in 3rd party libraries.

The suggested approach of using the visitor design pattern and dedicated visitor design pattern wrappers on 3rd party classes as seen here (instanceof -> Visitor DP) and here (Visitor DP with 3rd party classes) seems like a good approach.

However, during a code review session where this approach was suggested, the question of the additonal overhead of boilerplate code required by each refactoring of the instaceof switch-casing has lead to this mechanism being declined.

I would like to fix this ongoing issue and I am considering a generic approach to the problem:

A utility class which will wrap the visitor design pattern with generic referencing to the visited objects. The idea is to implement the generic core of the visitor utility class once and only once, and provide specific implementations to the TypeX object behaviour where needed - hopefully even reusing some implementations via OO extention of the functionality implementing classes.

My question is - has anyone here done something similiar? If not - can you point out any pros/cons that might be relevant?

EDIT : Too much boilerplate code = implementing the visitor design pattern specifically for each instance of an instanceof switch-case. This is obviously redundent and will cause a lot of code duplication, if the visitor DP is not implemented using generics.

As for the generic visitor DP utility I had in mind :

First of all, usage of reflection with the visitor DP as seen here.

Second, the following usage of generics (based on the reflective visitor):

public interface ReflectiveVisitor<GenericReturn,GenericMetaData>
{
   public GenericReturn visit(Object o, GenericMetaData meta);
}
public interface ReflectiveVisitable<A,B>
{
   public GenericReturn accept(Visitor visitor, GenericMetaData meta);
}

GenericReturn and GenericMetaData are interfaces aimed at providing any additionally required meta data for specific logics to be implemented, and to provide versatility for return types returned by the visitor DP.

Thanks in advance!

EDIT : Boiler plate coding when refactoring from instanceof to the visitor :

A common use case I'd have to handle is instanceof switchcasing in order to perform single API calls of concrete implementations :

public class BoilerPlateExample
...
if(object instanceof TypeA) {
   ((TypeA) object).specificMethodTypeA(...)......;
}
else if(object instanceof TypeB) {
   ((TypeB) object).completeyDifferentTypeBMethod(...)......;
}
...
...

As for the visitor design handling this?

public interface Visitor
{
   // notice how I just binded my interface to a specific set of methods?
   // this interface will have to be generic in order to avoid an influx of
   // of dedicated interfaces
   public void visit(TypeA typeA);
   public void visit(TypeB typeB);
}
public interface Visitable
{
   public void accept(Visitor visitor);
}

public class BoilerPlateExampleVisitable<T> implements Visitable
{
   // This is basically a wrapper on the Types
   private T typeX;
   public BoilerPlateExampleVisitable (T typeX) {
      this.typeX = typeX;
   }
   public void accept(Visitor visitor) {
      visitor.visit(typeX);
   }
}

public class BoilerPlateExampleVisitor implements Visitor
{
   public void visit(TypeA typeA) {
      typeA.specificMethodTypeA(...)......;
   }
   public void visit(TypeB typeB) {
      typeB.completeyDifferentTypeBMethod(...)......;
   }
}

public static final BoilerPlateExampleVisitor BOILER_PLATE_EXAMPLE_VISITOR = new BoilerPlateExampleVisitor();
public static void main(....) {
    TypeA object = .....; // created by factory
    BoilerPlateExampleVisitable boilerPlateVisitable = VisitableFactory.create(object); // created by dedicated factory, warning due to implicit generics
    boilerPlateVisitable.accept(BOILER_PLATE_EXAMPLE_VISITOR);
}

https://stackoverflow.com/questions/103564/the-performance-impact-of-using-instanceof-in-java?utm_medium=organic&utm_source=google_rich_qa&utm_campaign=google_rich_qa — Tal Avissar, Mar 29 '18 at 15:54
@TalAvissar : this is not an issue of performence, rather of using OOP properly instead of just copy-pasting code around. — Rann Lifshitz, Mar 29 '18 at 16:03
what kind of boilerplate code required by each refactoring? I can see so much boilerplate for just implementing accept interface — Tal Avissar, Mar 29 '18 at 16:11
I think you have to be a bit more concrete. Can you show code examples of the "too much boilerplate" and an example of the utility class you are thinking of? How many switch cases are we talking about? How much and what kind of code is in the switch? Is the behavior duplicated? Does the behavior change often, or are new data types frequently added? Are there many datatypes, many switches, or both? Do the types extend from the same base class? It all depends on these kind of details. — NickL, Mar 29 '18 at 17:29
@NickL : The legacy code is the product of 10 years of development by many developers, the majority of whom have left the company. This means - dozens of instanceof switch cases, with a recurring theme of casting a base class in order to get its extended API and perform some logic (in some cases a single API call, and in others a significant chunk of code). Some behaviours are indeed duplicated - calling the same API on different concrete implementations. Not a lot of extension by composition - quite rare, actually. — Rann Lifshitz, Mar 29 '18 at 18:32
See edit to OP regarding how to implement the visitor DP generically. — Rann Lifshitz, Mar 29 '18 at 19:03
Try to first reduce instanceof by using types (bounded types) whereever possible. For example, use Generics in collections, methods arguments, inheritance. — Shailesh Pratapwar, Apr 04 '18 at 04:44
@ShaileshPratapwar : I mentioned using generics in the question. Can you provide an actual implementation, or and idead of a design ? — Rann Lifshitz, Apr 04 '18 at 04:50
That reflective visitor looks like a pretty terrible idea. The whole point of the visitor pattern is that it provides static guarantees such as 1. that you have a case implemented for every possible type and 2. that each implemented case is correct. The code in that article doesn't actually offer any improvement over `instanceof`. The fact that the code in that article requires the methods to be named in a certain way and fails silently if there's a typo is also terrible. — Radiodef, Apr 04 '18 at 17:28
@Radiodef : good points buddy. Any thoughts on how to handle the issues I raised in this question ? — Rann Lifshitz, Apr 04 '18 at 18:38
@Radiodef : Its not a matter of the visitor design not working - its the concern from the tech managers, that refactoring out instanceof into the visitor solution will be very time consuming and will generate a lot of code. That means i havent been grenn-lighted to start PoCing aorund the visitor utility i have been thinking about.... — Rann Lifshitz, Apr 05 '18 at 05:26
How is your goal different than an approach to use abstract classes? I'm probably being naive (Visitor pattern is a new term for me), but it looks to me as though an abstract class with the common functionality, followed by a call to super() in the extending class would take care of your situation, would it not? — Stephan, Apr 09 '18 at 15:51
@Stephan : What you are suggesting would require a major refactoring overhall of all the BL classes in our legacy code, which would not be approved by the tech managers..... — Rann Lifshitz, Apr 09 '18 at 15:53
Approved or not, your question asks for approaches to refactor the code to remove instanceof usage. Business bureaucracy is irrelevant from a technical perspective. — Stephan, Apr 10 '18 at 12:59
@Stefan : Agreed. If it were up to me I'd refactor more then half of the legacy code....... — Rann Lifshitz, Apr 10 '18 at 18:30

jferard · Accepted Answer · 2018-04-07T19:55:12.107

TL;DR: Let's say that you have N classes with M operations each. You need the visitor pattern only if M may grow and N is already big. Else use polymorphism.

Maybe I will push an open door, because you already thought of this, but here are a few thoughts.

Visitor pattern

In the generic case, you will use the visitor pattern only if you want to add new operations without refactoring all classes. That is when M may grow and N is already big.

For every new operation, you create a new visitor. This visitor accepts the N classes and handles the operation for each of them:

public class NewOperationVisitor implements Visitor
{
   public void visit(TypeA typeA) {
        // apply my new operation to typeA
   }
   public void visit(TypeB typeB) {
        // apply my new operation to typeB
   }
   ...
}

Thus, you don't have to add the new operation to all the N classes, but you have to refactor every visitor if you add a class.

Polymorphism

Now, if M is stable, please avoid the visitor pattern: use polymorphism. Every class has a well defined set of methods (roughly one per operation). If you add a class, just define the known operations for that class:

public class TypeX implements Operator 
{
    public void operation1() {
        // pretty simple
    }

    public void operation2() {
        // pretty simple
    }
}

Now, you have to refactor every class if you add an operation, but adding a class is very easy.

This tradeoff is explained in Clean Code by R. C. Martin (6. Objects and Data Structures, Data/Object Anti-Symmetry):

Procedural code [here: the visitor] makes it hard to add new data structures because all the functions must change. OO code makes it hard to add new functions because all the classes must change.

What you should do

As stated in a @radiodef comment, avoid reflection and other tricks. This will be worse than the issue.
Clearly separate where you really need the visitor pattern, and where you don't. Count classes and operations. I bet that, in most of the cases, you don't need the visitor pattern. (Your managers are probably right!). If you need the visitor pattern in 10 % of the cases, maybe the "additonal overhead of boilerplate code" will be acceptable.
Since several of your TypeX classes are already wrappers, you maybe need to wrap better. Sometimes, one wraps from bottom to top: "My 3rd party class has those methods: I will wrap the methods I need and forget the others. And I will keep the same names to keep it simple." Instead, you have to carefully define the service a TypeX class should provide. (A hint: look in your if ... instanceof ... bodies). A then wrap again the 3rd party libs to provide thoses services.
Really: avoid avoid reflection and other tricks.

What would I do?

You asked in a comment for a pseudo-code, but I can't give it to you since I have a method in mind, not a procedure or an algorithm.

Here's a minimal step by of step of what I would do in such a situation.

Isolate every "big `instanceof` switch" in a method

That's almost a medical advice! Before:

public void someMethod() {
    ...
    ...
    if(object instanceof TypeA) {
       TypeA typeA = (TypeA) object;
       ...
       ...
    }
    else if(object instanceof TypeB) {
       TypeB typeB = (TypeB) object;
       ...
       ...
    }
    ...
    ...
}

After:

public void someMethod() {
    ...
    ...
    this.whatYouDoInTheSwitch(object, some args);
    ...
    ...
}

And:

private void whatYouDoInTheSwitch(Object object, some args) {
    if(object instanceof TypeA) {
       TypeA typeA = (TypeA) object;
       ...
       ...
    }
    else if(object instanceof TypeB) {
       TypeB typeB = (TypeB) object;
       ...
       ...
    }
}

Any decent IDE will do it for free.

If you are in a case that would need the Visitor pattern

Leave the code as this, but document it:

/** Needs fix: use Visitor Pattern, because... (growing set of operations, ...) */
private void whatYouDoInTheSwitch(Object object, some args) {
    ...
}

If you want to use polymorphism

The goal is to switch from:

this.whatYouDoInTheSwitch(object, other args);

To:

object.whatYouDoInTheSwitch(this, other args);

You have a little refactoring to do:

A. Create a method for every case in the big switch. All those methods should have the same signature, except for the type of the object:

private void whatYouDoInTheSwitch(Object object, some args) {
    if(object instanceof TypeA) {
        this.doIt((TypeA) object, some args);
    }
    else if(object instanceof TypeB) {
        this.doIt((TypeB) object, some args);
    }
}

Again, any IDE will do it for free.

B. Create an interface with the following method:

doIt(Caller caller, args);

Where Caller is the type of the class you are refactoring (the one that contains the big switch).

C. Make every TypeX implement this interface by converting every doIt(TypeX objX, some args) into a doIt(Caller, some args) method from TypeX. Basically, that is a simple find-replace: replace this by caller and objX by this. But this may be a little more time consuming than the rest.

D. Now, you have:

private void whatYouDoInTheSwitch(Object object, some args) {
    if(object instanceof TypeA) {
        ((TypeA) object).doIt(this, some args);
    }
    else if(object instanceof TypeB) {
        ((TypeB) object).doIt(this, some args);
    }
}

This is strictly equivalent to:

private void whatYouDoInTheSwitch(Object object, some args) {
    if(object instanceof TypeA) {
        object.doIt(this, some args);
    }
    else if(object instanceof TypeB) {
        object.doIt(this, some args);
    }
}

Because at runtime, the JVM will find the right method for the right class (that's polymorphism!). Thus, this is also equivalent to (if object has one of the enumerated types):

private void whatYouDoInTheSwitch(Object object, some args) {
    object.doIt(this, some args);
}

E. Inline the method and you have in your Caller class:

public void someMethod() {
    ...
    ...
    object.doIt(this, some args);
    ...
    ...
}

Actually, this is only a sketch, and many special cases could occur. But it is guaranteed to be relatively fast and clean. It may be done for a selected method only of for all methods.

Be sure to test, if possible, the code after every step. And be sure to choose the right names for the methods.

Really nice answer, I wish you could post a pseudo-code example of the direction you are suggesting, in terms of a more modular solution. I am thinking of using a Utility class/method in order to handle this issue, and am still wondering if generics might be able to help out. Any thoughts on your end ? — Rann Lifshitz, Apr 07 '18 at 05:28
And please let me emphasize: I do not have the privilege of doing an extensive refactoring session. The best I can hope for is creating a utility aimed at replacing the instanceof switch-casing, hence considering the usage of reflection and generics. — Rann Lifshitz, Apr 07 '18 at 05:44

Joop Eggen · Answer 2 · 2018-04-07T20:36:02.467

Seems like polymorphism. Such code could stem from a heterogene set of business object classes, like Excel ReportX, Zip, TableY, and actions like Open, Close, Save and such.

As it is, this kind of programming causes a huge coupling between classes, and has issues with completeness of all cases, extendibility.

In the case of polymorphism the actual wrapper for some bussiness object should provide Actions (Open, Save, Close).

This mechanism is similar to java swing, where an edit field has its list of actions (Cut, Copy, Paste and such), and a tree view an overlapping set of actions. Depending on the focus the actual actions will be installed in the menu actions.

A declarative specification might be in order: say an XML that "holds" beans and their actions.

Should you have some MVC paradigm on the radar, consider the following: every action could have parameters. Use PMVC (my idea), a Parameters class apart from the Model class, as those infos have a different life cycle, and are constant.

The road hereto could be:

a prototype with two business objects and two actions.
refactoring with one polymorph business object containing all (old code).
slowly moving one class after another to their own business object.
the first removal from the polymorph business object being used to clean up the new architecture.

I would refrain from using inheritance (a BaseDocument with open/save) as this probably does not fit a more heterogeneous reality, and could cause parallel class hierarchies (XDoc with XContainer and XObject).

How it is actually done is still your work. I also would be eager to learn whether there exists an established paradigm.

Asked pseudo-code One would need a bit of analysis with some prototype made - a proof of concept. However there is a discovery of (dynamic) capabilities/features.

public interface Capabilities {
    <T> Optional<T> as(Class<T> type);
}

Add this interface to every case class, and you can do:

void f(Capabilities animal) {
    int distance = 45;
    animal.as(Flying.class).ifPresent(bird -> bird.fly(distance));
}

The infrastructure would be: first the registration of capabilities and discovery can be placed in a separate class.

/**
 * Capabilities registration & discovery map, one can delegate to.
 */
public class CapabilityLookup implements Capabilities {

    private final Map<Class<?>, Object> capabilitiesMap = new HashMap<>();

    public final <T> void register(Class<T> type, T instance) {
        capabilitiesMap.put(type, instance);
    }

    @Override
    public <T> Optional<T> as(Class<T> type) {
        Object instance = capabilitiesMap.get(type);
        return instance == null ? Optional.empty()
                                : Optional.of(type.cast(instance));
    }
}

Then the legacy classes can be augmented:

/** Extended legacy class. */
public class Ape implements Capabilities {

    private final CapabilityLookup lookup = new CapabilityLookup();

    public Ape() {
        lookup.register(String.class, "oook");
    }

    @Override
    public <T> Optional<T> as(Class<T> type) {
        return lookup.as(type); // Delegate to the lookup map.
    }
}

/** Extended legacy class. */
public class Bird implements Capabilities {

    private final CapabilityLookup lookup = new CapabilityLookup();

    public Bird() {
        lookup.register(Flying.class, new Flying() {
            ...
        });
        lookup.register(Singing.class, new Singing() {
            ...
        });
    }

    @Override
    public <T> Optional<T> as(Class<T> type) {
        return lookup.as(type); // Delegate to the lookup map.
    }
}

As you can see with Bird the original code would move into the actual class of the interface, here into Bird, as the instance was created in the constructor. But instead of an anonymous class one could make a BirdAsFlying class, a kind of action class in java swing parlance. An inner class has the advantage to access Bird.this.

The refactoring can be done incrementally. Add Capabilities to all "instanceof" legacy classes. An if-sequence would normally be one interface, but could also be two, or one interface with two methods.

I would definitely adopt this solution if it were up to me alone. However, knowing my managers' way of thinking, I suspect the overhead of adding the PMVC architecture, coupled with the intensive refactoring required in order to bind our BL objects with the PMVC wrappers will be unacceptable. Is there a more lightweight approach perhaps? Something that might use the Utility class approach I discussed in my question? — Rann Lifshitz, Apr 05 '18 at 13:58
Forget for the moment about the embellishments like XML/PMVC. Search and list all if-cases and see whether those points are something like actions, configurable elsewhere. For some solution: some **bean container** maybe. The problem without inspecting the code, and _prototyping_ a bit any "solution" will be hard to nail down. — Joop Eggen, Apr 05 '18 at 14:24
I like the concept you are suggesting, however the bean container (lets call it Spring in order to tie it with frameworks previously discussed at my workplace) will not be approved for integration. The only posibillity for me is a solid, light weight design encapsulating this problem, with a minimum of refactoring on our legacy code. — Rann Lifshitz, Apr 05 '18 at 18:41
Present a prototype, proof-of-concept. There are more light-weight frameworks, with CDI, though my experience is mainly with Spring. Look I do not think there is much chance on an other answer here, so good luck. — Joop Eggen, Apr 06 '18 at 07:28
Any chance you might ask some of your colleges to pitch in their 2 cents over here ? — Rann Lifshitz, Apr 06 '18 at 11:48
Fair enough. Can you at least a post pseudo code example of your suggest solution? — Rann Lifshitz, Apr 06 '18 at 19:26
I have added a swiss-knife kind of technique, which has the advantage of being clear in its application to the code. — Joop Eggen, Apr 07 '18 at 13:32
Cool. Looks like the best answer for now. If nothing else shows up - the bounty will go here. — Rann Lifshitz, Apr 07 '18 at 16:08
Thank you very much for your insights. This answer will be more difficult to adopt then the one suggested by : https://stackoverflow.com/users/6914441/jferard , so I will go with the other solution. — Rann Lifshitz, Apr 10 '18 at 00:20
Certainly looks like a good solution. I wish you a nice refactoring. — Joop Eggen, Apr 10 '18 at 07:16
Thank you very much Mr. Eggen, you are a coder and a gentleman :) — Rann Lifshitz, Apr 10 '18 at 07:19