98

There is a lot of talk on decoupling the algorithms from the classes. But, one thing stays aside not explained.

They use visitor like this

abstract class Expr {
  public <T> T accept(Visitor<T> visitor) { return visitor.visit(this); }
}

class ExprVisitor extends Visitor {
  public Integer visit(Num num) {
    return num.value;
  }

  public Integer visit(Sum sum) {
    return sum.getLeft().accept(this) + sum.getRight().accept(this);
  }

  public Integer visit(Prod prod) {
    return prod.getLeft().accept(this) * prod.getRight().accept(this);
  }
}

Instead of calling visit(element) directly, Visitor asks the element to call its visit method. It contradicts the declared idea of class unawareness about visitors.

PS1 Please explain with your own words or point to exact explanation. Because two responses I got refer to something general and uncertain.

PS2 My guess: Since getLeft() returns the basic Expression, calling visit(getLeft()) would result in visit(Expression), whereas getLeft() calling visit(this) will result in another, more appropriate, visit invocation. So, accept() performs the type conversion (aka casting).

PS3 Scala's Pattern Matching = Visitor Pattern on Steroid shows how much simpler the Visitor pattern is without the accept method. Wikipedia adds to this statement: by linking a paper showing "that accept() methods are unnecessary when reflection is available; introduces term 'Walkabout' for the technique."

Xolve
  • 22,298
  • 21
  • 77
  • 125
Val
  • 1
  • 8
  • 40
  • 64
  • http://stackoverflow.com/questions/3262811/what-is-single-and-double-dispatch – Mark Simpson Feb 03 '12 at 16:53
  • It says "when the visitor calls accept, the cal is dispatched based on the type of the callee. Then the callee calls back the visitor's type specific visit method, and this call is dispatched based on the actual type of the visitor." In other words, it states the thing that confuses me. For this reason, can you please be more specific? – Val Feb 03 '12 at 19:41

5 Answers5

173

The visitor pattern's visit/accept constructs are a necessary evil due to C-like languages' (C#, Java, etc.) semantics. The goal of the visitor pattern is to use double-dispatch to route your call as you'd expect from reading the code.

Normally when the visitor pattern is used, an object hierarchy is involved where all the nodes are derived from a base Node type, referred to henceforth as Node. Instinctively, we'd write it like this:

Node root = GetTreeRoot();
new MyVisitor().visit(root);

Herein lies the problem. If our MyVisitor class was defined like the following:

class MyVisitor implements IVisitor {
  void visit(CarNode node);
  void visit(TrainNode node);
  void visit(PlaneNode node);
  void visit(Node node);
}

If, at runtime, regardless of the actual type that root is, our call would go into the overload visit(Node node). This would be true for all variables declared of type Node. Why is this? Because Java and other C-like languages only consider the static type, or the type that the variable is declared as, of the parameter when deciding which overload to call. Java doesn't take the extra step to ask, for every method call, at runtime, "Okay, what is the dynamic type of root? Oh, I see. It's a TrainNode. Let's see if there's any method in MyVisitor which accepts a parameter of type TrainNode...". The compiler, at compile-time, determines which is the method that will be called. (If Java indeed did inspect the arguments' dynamic types, performance would be pretty terrible.)

Java does give us one tool for taking into account the runtime (i.e. dynamic) type of an object when a method is called -- virtual method dispatch. When we call a virtual method, the call actually goes to a table in memory that consists of function pointers. Each type has a table. If a particular method is overridden by a class, that class' function table entry will contain the address of the overridden function. If the class doesn't override a method, it will contain a pointer to the base class' implementation. This still incurs a performance overhead (each method call will basically be dereferencing two pointers: one pointing to the type's function table, and another of function itself), but it's still faster than having to inspect parameter types.

The goal of the visitor pattern is to accomplish double-dispatch -- not only is the type of the call target considered (MyVisitor, via virtual methods), but also the type of the parameter (what type of Node are we looking at)? The Visitor pattern allows us to do this by the visit/accept combination.

By changing our line to this:

root.accept(new MyVisitor());

We can get what we want: via virtual method dispatch, we enter the correct accept() call as implemented by the subclass -- in our example with TrainElement, we'll enter TrainElement's implementation of accept():

class TrainNode extends Node implements IVisitable {
  void accept(IVisitor v) {
    v.visit(this);
  }
}

What does the compiler know at this point, inside the scope of TrainNode's accept? It knows that the static type of this is a TrainNode. This is an important additional shred of information that the compiler was not aware of in our caller's scope: there, all it knew about root was that it was a Node. Now the compiler knows that this (root) is not just a Node, but it's actually a TrainNode. In consequence, the one line found inside accept(): v.visit(this), means something else entirely. The compiler will now look for an overload of visit() that takes a TrainNode. If it can't find one, it'll then compile the call to an overload that takes a Node. If neither exist, you'll get a compilation error (unless you have an overload that takes object). Execution will thus enter what we had intended all along: MyVisitor's implementation of visit(TrainNode e). No casts were needed, and, most importantly, no reflection was needed. Thus, the overhead of this mechanism is rather low: it only consists of pointer references and nothing else.

You're right in your question -- we can use a cast and get the correct behavior. However, often, we don't even know what type Node is. Take the case of the following hierarchy:

abstract class Node { ... }
abstract class BinaryNode extends Node { Node left, right; }
abstract class AdditionNode extends BinaryNode { }
abstract class MultiplicationNode extends BinaryNode { }
abstract class LiteralNode { int value; }

And we were writing a simple compiler which parses a source file and produces a object hierarchy that conforms to the specification above. If we were writing an interpreter for the hierarchy implemented as a Visitor:

class Interpreter implements IVisitor<int> {
  int visit(AdditionNode n) {
    int left = n.left.accept(this);
    int right = n.right.accept(this); 
    return left + right;
  }
  int visit(MultiplicationNode n) {
    int left = n.left.accept(this);
    int right = n.right.accept(this);
    return left * right;
  }
  int visit(LiteralNode n) {
    return n.value;
  }
}

Casting wouldn't get us very far, since we don't know the types of left or right in the visit() methods. Our parser would most likely also just return an object of type Node which pointed at the root of the hierarchy as well, so we can't cast that safely either. So our simple interpreter can look like:

Node program = parse(args[0]);
int result = program.accept(new Interpreter());
System.out.println("Output: " + result);

The visitor pattern allows us to do something very powerful: given an object hierarchy, it allows us to create modular operations that operate over the hierarchy without needing requiring to put the code in the hierarchy's class itself. The visitor pattern is used widely, for example, in compiler construction. Given the syntax tree of a particular program, many visitors are written that operate on that tree: type checking, optimizations, machine code emission are all usually implemented as different visitors. In the case of the optimization visitor, it can even output a new syntax tree given the input tree.

It has its drawbacks, of course: if we add a new type into the hierarchy, we need to also add a visit() method for that new type into the IVisitor interface, and create stub (or full) implementations in all of our visitors. We also need to add the accept() method too, for the reasons described above. If performance doesn't mean that much to you, there are solutions for writing visitors without needing the accept(), but they normally involve reflection and thus can incur quite a large overhead.

BenMorel
  • 34,448
  • 50
  • 182
  • 322
atanamir
  • 4,833
  • 3
  • 24
  • 20
  • 5
    **Effective Java Item #41** includes this warning: "_avoid situations where the same set of parameters can be passed to different overloadings by the addition of casts._" The `accept()` method becomes necessary when this warning is violated in the Visitor. – jaco0646 Aug 04 '15 at 21:41
  • "*Normally when the visitor pattern is used, an object hierarchy is involved where all the nodes are derived from a base Node type*" this is absolutely not necessary in C++. See Boost.Variant, Eggs.Variant – Jean-Michaël Celerier May 02 '16 at 10:22
  • It seems to me that in java we don't really need the accept method because in java we always call the most specific type method – Gilad Baruchian Aug 08 '17 at 11:01
  • 2
    Wow, this was an awesome explanation. Enlightening to see that all the shadows of the pattern are because of compiler limitations, and now reveals clear thanks to you. – Alfonso Nishikawa Sep 28 '17 at 16:10
  • @GiladBaruchian, the compiler generates a call to the most specific type method which the *compiler* can determine. – mmw Dec 19 '19 at 14:09
  • "Java does give us one tool for taking into account the runtime (i.e. dynamic) type of an object when a method is called -- virtual method dispatch." You meant C++? – Matías Santurio Jul 24 '23 at 23:45
15

Of course that would be silly if that was the only way that Accept is implemented.

But it is not.

For example, visitors are really really useful when dealing with hierarchies in which case the implementation of a non-terminal node might be something like this

interface IAcceptVisitor<T> {
  void Accept(IVisit<T> visitor);
}
class HierarchyNode : IAcceptVisitor<HierarchyNode> {
  public void Accept(IVisit<T> visitor) {
    visitor.visit(this);
    foreach(var n in this.children)
      n.Accept(visitor);
  }

  private IEnumerable<HierarchyNode> children;
  ....
}

You see? What you describe as stupid is the solution for traversing hierarchies.

Here is a much longer and in depth article that made me understand visitor.

Edit: To clarify: The visitor's Visit method contains logic to be applied to a node. The node's Accept method contains logic on how to navigate to adjacent nodes. The case where you only double dispatch is a special case where there are simply no adjacent nodes to navigate to.

George Mauer
  • 117,483
  • 131
  • 382
  • 612
  • 7
    Your explanation does not explain why it should be responsibility of the Node rather than Visitor's appropriate visit() method to iterate the children? Do you mean that the main idea is sharing the hierarchy traversal code when we need the same visiting patters for different visitors? I do not see any hints from the recommended paper. – Val Feb 03 '12 at 19:37
  • You see, your answer, with focus on hierarchy traversal, is different from Matt's, who seems stating that walk in circles is indispensable for double dispatch. – Val Feb 03 '12 at 19:50
  • @Val - yes, I am giving you a specific use that I have found it incredibly useful. Honestly, I think it is mostly useful when working with object graphs, fortunately a LOT of (most?) problems in programming can be viewed in terms of networks. With this setup **the visitor does not need to know that it is being applied to a hierarchy** it only needs to be concerned with what it does to an item, not to how to navigate to the next one. In CS terms this seperates action from network topology. I really recommend you make an effort to understand that article - it is one of the ones I recommend most. – George Mauer Feb 03 '12 at 20:52
  • 1
    Saying that accept is good for routine traversal is reasonable and worth for general population. But, I took my example from somebodyelse's "I could not understand Visitor pattern until I read http://andymaleh.blogspot.com/2008/04/scalas-pattern-matching-visitor-pattern.html ". Neither this example nor Wikipedia nor other answers mention the navigation advantage. Nevertheless, they all demand this stupid accept(). That is why ask my question: Why? – Val Feb 03 '12 at 23:32
  • Actually, Wikipedia says that we want to get rid of those accept() methods! – Val Feb 03 '12 at 23:57
  • 1
    @Val - what do you mean? I am not sure what you're asking. I cannot speak for other articles as those people have different outlooks on this stuff but I doubt that we are in disagreement. In general in computation a lot of problems can be mapped to networks, so a usage might have nothing to do with graphs in the surface but is actually a very similar problem. – George Mauer Feb 05 '12 at 20:03
  • 1
    Providing an example of where some method might be useful does not answer the question of why the method is mandatory. Since navigation is not always needed, the accept() method is not always good for visiting. We should be able to accomplish our goals without it, therefore. Nevertheless, it is mandatory. It means that there is a stronger reason for introducing accept() into every visitor patter than "it is sometimes useful". What is not clear in my question? If you do not try to understand why Wikipedia looks for ways to get rid of accept, you are not interested to understand my question. – Val Feb 08 '12 at 16:44
  • 1
    @Val The paper they link to "The Essence of the Visitor Pattern" notes the same separation of navigation and operation in its abstract as I gave. They are simply saying that the GOF implementation (which is what you're asking about) has some limitations and annoyances that can be removed with use of reflection - so they introduce Walkabout pattern. This is certainly useful and can do much of the same stuff that visitor can but it is a lot of fairly sophisticated code and (on a cursory reading) looses some benefits of type safety. It's a tool for the toolbox but a more heavy one than visitor – George Mauer Feb 08 '12 at 18:15
  • @Val that's why I say I'm not sure what you're asking. Are you asking of why would you ever use visitor over Walkabout? – George Mauer Feb 08 '12 at 18:19
  • Using diplomatic language, you hide the essence. Say "our accept method" instead of "some annoyances". They want to get rid of annoying accept method. Removing it, you eliminate the opportunity to exploit accept() for implementing the navigation. This contradicts to your idea that visitor mandates accept() for navigation. – Val Feb 09 '12 at 17:19
0

The classes that require modification must all implement the 'accept' method. Clients call this accept method to perform some new action on that family of classes thereby extending their functionality. Clients are able to use this one accept method to perform a wide range of new actions by passing in a different visitor class for each specific action. A visitor class contains multiple overridden visit methods defining how to achieve that same specific action for every class within the family. These visit methods get passed an instance on which to work.

Visitors are useful if you are frequently adding, altering or removing functionality to a stable family of classes because each item of functionality is defined seperately in each visitor class and the classes themselves do not need changing. If the family of classes is not stable then the visitor pattern may be of less use, because many visitors need changing each time a class is added or removed.

andrew pate
  • 3,833
  • 36
  • 28
0

The purpose of the Visitor pattern is to ensure that objects know when the visitor is finished with them and have departed, so the classes can perform any necessary cleanup afterward. It also allows classes to expose their internals "temporarily" as 'ref' parameters, and know that the internals will no longer be exposed once the visitor is gone. In cases where no cleanup is necessary, the visitor pattern isn't terribly useful. Classes which do neither of these things may not benefit from the visitor pattern, but code which is written to use the visitor pattern will be usable with future classes that may require cleanup after access.

For example, suppose one has a data structure holding many strings that should be updated atomically, but the class holding the data structure doesn't know precisely what types of atomic updates should be performed (e.g. if one thread wants to replace all occurrences of "X", while another thread wants to replace any sequence of digits with a sequence that is numerically one higher, both threads' operations should succeed; if each thread simply read out a string, performed its updates, and wrote it back, the second thread to write back its string would overwrite the first). One way to accomplish this would be to have each thread acquire a lock, perform its operation, and release the lock. Unfortunately, if locks are exposed in that way, the data structure would have no way of preventing someone from acquiring a lock and never releasing it.

The Visitor pattern offers (at least) three approaches to avoid that problem:

  1. It can lock a record, call the supplied function, and then unlock the record; the record could be locked forever if the supplied function falls into an endless loop, but if the supplied function returns or throws an exception, the record will be unlocked (it may be reasonable to mark the record invalid if the function throws an exception; leaving it locked is probably not a good idea). Note that it's important that if the called function attempts to acquire other locks, deadlock could result.
  2. On some platforms, it can pass a storage location holding the string as a 'ref' parameter. That function could then copy the string, compute a new string based upon the copied string, attempt to CompareExchange the old string to the new one, and repeat the whole process if the CompareExchange fails.
  3. It can make a copy of the string, call the supplied function on the string, then use CompareExchange itself to attempt to update the original, and repeat the whole process if the CompareExchange fails.

Without the visitor pattern, performing atomic updates would require exposing locks and risking failure if calling software fails to follow a strict locking/unlocking protocol. With the Visitor pattern, atomic updates can be done relatively safely.

supercat
  • 77,689
  • 9
  • 166
  • 211
  • 2
    1. visiting implies that you have only access to public methods of visited so that need to make internal locks accessable for public to be useful with Visitor. 2/ None of the examples I've seen before imply that Visitor is supposed to be used for changing status of visited. 3. "With the traditional VisitorPattern, one can only determine when we are entering a node. We don't know if we have left the previous node before we entered the current node." How do you unlock with only visit instead of visitEnter and visitLeave? Finally, I asked about applications of accpet() rather than Visitor. – Val Feb 03 '12 at 23:24
  • Perhaps I'm not entirely up to speed with terminology for patterns, but the "visitor pattern" seems to resemble an approach I've used where X passes Y a delegate, to which Y can then pass information which only needs to be valid as long as the delegate is running. Maybe that pattern has some other name? – supercat Feb 03 '12 at 23:59
  • 2
    This is an interesting *application* of the visitor pattern to a specific problem but does not describe the pattern itself or answer the original question. "In cases where no cleanup is necessary, the visitor pattern isn't terribly useful." This claim is definitely false and only relates to your specific problem and not the pattern in general. – Tony O'Hagan Jun 07 '17 at 00:41
-1

A good example is in source code compilation:

interface CompilingVisitor {
   build(SourceFile source);
}

Clients can implement a JavaBuilder, RubyBuilder, XMLValidator, etc. and the implementation for collecting and visiting all the source files in a project does not need to change.

This would be a bad pattern if you have separate classes for each source file type:

interface CompilingVisitor {
   build(JavaSourceFile source);
   build(RubySourceFile source);
   build(XMLSourceFile source);
}

It comes down to context and what parts of the system you want to be extensible.

Garrett Hall
  • 29,524
  • 10
  • 61
  • 76
  • The irony is that VisitorPattern offers us to use the bad pattern. It says that we must define a visit method for every kind of node it is going to visit. Secondly, it is not clear what are your examples are good or bad? How are they related to my question? – Val Feb 03 '12 at 23:28