COMP2100/2500/6442 Software Construction in 2007


Lecture 6: Recursive Data Types 3

Summary

The third lecture on recursive data types. This lecture answers the question posed at the end of the last lecture, creating a more flexible form of expression tree.


Data types with a Visitor Facility

If the class hierarchy representing the recursive data type isn't going to change much but we might want to add more and more operations, it may make sense to define a corresponding Visitor class hierarchy and add appropriate accept() methods to the classes of the original hierarchy. The main idea of the Visitor design pattern is to "rotate" by 90° from the derivatives (subclasses in the original class hierarchy, here it's Expression) to the methods in the parallel Visitor hierarchy. The new functionality can be added by creating a new subclass in the Visitor hierarchy.

Here is the abstract class (or, the interface) that we would need for Expressions. Notice that we need a routine for each of the classes in the original hierarchy, and each method takes one parameter which is an instance of that class.

public interface ExpressionVisitor {

    public void visitConstant(Constant4 c);

    public void visitAddition(Addition4 a);

    public void visitMultiplication(Multiplication4 m);

    public void visitNegation(Negation4 n);
}

Expression Implementation (Version 4)

Class Expression4 is now pretty simple. It too can be declared as an interface if we have nothing to put in there except for method declarations and final (constant) fields (in this case, the child classes will implement this interface; or it can be declared as abstract, in which case he child classes extend this parent class.

public abstract class Expression4 {

    public abstract void accept(ExpressionVisitor visitor);

}

The different subclasses implement accept(), which tells expression visitors what type they are and thus which of their routines to run on them.

public class Constant4 extends Expression4 {

    int value;

    public Constant4(int v) {
	value = v;
    }

    public void accept(ExpressionVisitor visitor) {
	visitor.visitConstant(this);
    }
}

public class Addition4 extends Expression4 {

    Expression4 left, right;
   
    public Addition4(Expression4 l, Expression4 r) {
	left = l;
	right = r;
    }
   
    public void accept(ExpressionVisitor visitor) {
	visitor.visitAddition(this);
    }
}

Classes for Multiplication and Negation are similar.

Note that the fields in these classes (value in Constant and left and right in Addition) have no visibility modifier (public, protected or private). This means that these fields have package visibility. Only instances of classes in the same package can access this field. If we organise our program so that Expression and its subclasses, and ExpressionVisitor and its subclasses are all in the same package (and nothing else is), then this will work correctly. Those fields will be accessible to the classes that need access, but hidden from everything else. Alternatively we would have to do the usual nonsense with making the fields private and having public accessor methods.

Now we can define visitors to evaluate and format Expressions.

public class ExpressionEvaluator implements ExpressionVisitor {

    /** The calculated value */
    private int value;

    /** public view of the value */
    public int getValue() { return value; }

    /** Get the value of a constant expression */
    public void visitConstant(Constant4 c) {
	value = c.value;
    }

    /** Get the value of a sum */
    public void visitAddition(Addition4 a) {
	ExpressionEvaluator leftEvaluator = new ExpressionEvaluator();
	ExpressionEvaluator rightEvaluator = new ExpressionEvaluator();
	a.left.accept(leftEvaluator);
	a.right.accept(rightEvaluator);
	value = leftEvaluator.value + rightEvaluator.value;
    }

    /** Get the value of a product */
    public void visitMultiplication(Multiplication4 a) {
	ExpressionEvaluator leftEvaluator = new ExpressionEvaluator();
	ExpressionEvaluator rightEvaluator = new ExpressionEvaluator();
	a.left.accept(leftEvaluator);
	a.right.accept(rightEvaluator);
	value = leftEvaluator.value * rightEvaluator.value;
    }

    /** Get the value of a negation */
    public void visitNegation(Negation4 n) {
	n.expression.accept(this);
	value = -value;
    }
}

public class ExpressionFormatter implements ExpressionVisitor {

    /** The formatted string being built  */
    private String string;
    
    /** Public access to the string */
    public String getString() { return string; }

    /** Initialise with an empty string */
    public ExpressionFormatter() {
	string = "";
    }

    /** Get the string for a constant */
    public void visitConstant(Constant4 c) {
	string += c.value;
    }

    /** Build the string for a sum */
    public void visitAddition(Addition4 a) {
	string += "(";
	a.left.accept(this);
	string += " + ";
	a.right.accept(this);
	string += ")";
    }

    /** Build the string for a product */
    public void visitMultiplication(Multiplication4 a) {
	string += "(";
	a.left.accept(this);
	string += " * ";
	a.right.accept(this);
	string += ")";
    }

    /** Build the string for a negation */
    public void visitNegation(Negation4 n) {
	string += "-(";
	n.expression.accept(this);
	string += ")";
    }
}

This design leads to a delicate (confusing?) exchange of messages between the objects in an expression tree and the objects performing operations on them.

The reason for this apparent complexity is that this is an example of double (dual) dispatch (called so because it involves two polymorphic dispatches). If I make the routine call ‘expression.accept(visitor)’, the actual operation that is carried out depends on two runtime types: the actual type of ‘expression’ and that of ‘visitor’.

(Usually only the type of the target matters. That is, in a method call a.b(), the version of b() that actually executes is determined at runtime based on the actual type of a. This is called dynamic dispatch.)


Using Expression (Version 4)

Expression4 a, b, c, d, e;
ExpressionFormatter f = new ExpressionFormatter();
ExpressionEvaluator v = new ExpressionEvaluator();

a = new Constant4(1);
b = new Constant4(2);
c = new Addition4(a, b);
a = new Constant4(3);
b = new Constant4(4);
d = new Addition4(a, b);
e = new Multiplication4(c, d);

e.accept(f);
e.accept(v);
System.out.println(f.getString() + " = " + v.getValue());

Assessment of Expression (Version 4)

+ Class Expression is now very simple.

+ Easy to create new visitors for Expression without modifying any existing classes.

+ No subtle manipulation of object references.

+ Each expression type has only relevant attributes.

+ All the code for a particular operation is in the one concrete subclass of ExpressionVisitor, rather than spread across all the different subclasses of Expression.

- Interaction between Expression and ExpressionVisitor is subtle.

- Difficult to add new kinds of expression, e.g., division.

This style of interaction is called the Visitor pattern.


Shortcomings of Visitor

The Visitor DP (Design Pattern) works well if the visited hierarchy (Expression) is stable, ie, does not undergo modification (at least, not too often). The reason is that the described above Visitor design pattern introduces a cyclic dependency which ties together all the visited derivatives (subclasses of Expression). Namely, the Expression interface, and therefore, all the subclasses in the visited hierarchy, depends on the Visitor interface (all the visit methods). If we add one subclass to the visited hierarchy, the Visitor interface must change to include the visit method for the new subclass. Since the Expression depends on the Visitor, every subclass of Expression depends on Visitor, and thus if we add even one small class to Expression hierarchy, then the whole big hierarchy must be recompiled. The latter operation may be costly, and can sometimes be impossible if we do not have access to the original source code.

There are ways to address this problem. One modification of the Visitor DP (known as Acyclic Visitor) involves multiple inheritance and runtime type identification.


Extension work

Java overloading and naming the Visitor methods

Because the methods in the Visitor interface all have different signatures, we could overload the definitions and not invent new names for each one:

public interface ExpressionVisitor {

    public void visit(Constant4 c);

    public void visit(Addition4 a);

    public void visit(Multiplication4 m);

    public void visit(Negation4 n);
}

You can decide for yourself whether this is easier to create, and easier to use (is it any more prone to programming errors?)

Design Patterns

The topic of Design Patterns in general is one of the subjects discussed in the Comp2110/2510/6444 course on Software Design.

____________________________________________________

Copyright © 2006, 2007 Jim Grundy, Ian Barnes, Richard Walker & Chris Johnson, The Australian National University
$Revision: 1.3 $ $Date: 2007/03/09 05:11:50 $ $Author: cwj $
Feedback & Queries to comp2100@cs.anu.edu.au