Jacaranda Cookbook


To Jacaranda Home

Contents

  1. Recipe 1: Boolean DFA
  2. Recipe 2: Character FST


The cookbook of a framework is the essential tool for the application developer who wishes to implement an instance of the framework.
There are two kinds of Jacaranda users: the application user, who needs to refer to the user instructions, and the application developer, who needs to refer to the current document.
This cookbook offers two examples of how to realize an application from the Jacaranda framework. The two examples are fully implemented and they are part of the framework release.
 

Recipe 1: Boolean DFA


Let's start with the realization of the DFA presented in the introductory document. The automaton and its input document are the following:

}01
0
{1}01
1
0
Now we have to decide which classes we can use as they are and which ones we have to override. Notice that the out of the box classes for the Arc and for the Extractor abstractions are exactly the classes developed in this cookbook examples. They can be used by every application developer, because they are part of the release, but, for explanation purposes, I will not use them out of the box, rather I will show how to implement them.

Arc Implementation

Before implementing the Arc concrete class, we need to decide which object we want to use to represent  the alphabet {0,1}. Java offers the precooked class Boolean and we could easily use this. But, because the example is fairly simple, I want to add more to do and decide to implement the alphabet class. This is a useful exercise, because there are some small elements to pay attention to.
I call the class DfaBoolean and I will construct an object passing 0 or 1 as character to the constructor. Let now first see the class DfaArc:
 
class DfaArc extends AbstractArc
{
     public int compareInputTo(Object otherInput) throws FsmNotComparableException{
         if (otherInput instanceof DfaBoolean)
             return ((DfaBoolean)input).compareTo((DfaBoolean)otherInput);
         throw new FsmNotComparableException("Not comparable: " + otherInput);
     }

     public int compareOutputTo(Object otherOutput){
         return 0;   //No output
     }

     public Arc read(InputFileWrapper inputStream){
          char in = inputStream.readChar();
          if (in == '\n')
              return null;
          else{
              input = new DfaBoolean(in);
              return this;
          }
     }
}


Only three methods must be overridden. Obviously the read() method, because it must read the elements of the new defined alphabet, even though the read() structure is very similar to all implementations. Then the two compare methods, which must be able to determine an order among elements. This is important for efficiency reasons, due to the fact that the framework organizes elements with a mix between hash tables and ordered lists. The internal organization can change with new releases, but the framework must always be able to determine the order.
The compare methods must deliver negative values when the invoking element is less than the parameter,  a positive value in the opposite case, and zero in case of equality.
Notice also that the compareOutputTo() must always return 0 when the FSM has no output element (like our DFA). I did not define a default variant of it in AbstractArc, because I find it more dangerous to forget to implement it when it is needed, than to have to add a return 0 when not needed.

To complete the Arc implementation we need now the DfaBoolean class:
 

class DfaBoolean
{
    private boolean internalValue;

    DfaBoolean(char elt){
        if (elt == '0')
            internalValue = false;
        else
            internalValue = true;
     }

     int compareTo(DfaBoolean otherInput){
         if (internalValue == otherInput.getValue())
             return 0;
         else if (internalValue == false)
             return -1;
         else
             return 1;
     }

     public boolean equals(Object obj){
         return (obj instanceof DfaBoolean) && (compareTo((DfaBoolean)obj) == 0);
     }

     public int hashCode(){
         if (internalValue)
             return 1;
         return 0;
     }
}


Here there is no predefined interface. In fact all what is really needed is:

For a discussion on equals() and hashCode() see the document on architecture overview. They are important for the internal use of hash tables. Usually JDK Java classes already override those two methods, so in most cases you only have to check that the predefined class you intend to use (for example Boolean instead of DfaBoolean) has a comparing method.

Extractor Implementation

Due to the fact that the DFA only has to accept sequences, there is no real information to extract. The Extractor object must only flag when the DFA reaches a final state.
 
class DfaExtractor extends AbstractExtractor
{
     protected boolean acceptedSequence;

     DfaExtractor(){
         reset();
     }

     public void reset(){
         acceptedSequence = false;
     }

     public void visit(Object key, Arc arc, int traverse, int level){
          //nothing to do
     }

     public void addInfo(Flag finalStateFlag, int level){
         acceptedSequence = true;
     }

     public boolean hasResult(){
         return acceptedSequence;
     }

     public String iterateResult(Object filter, Object infoTables, Object matchCriteria){
         if (hasResult())
             return new String("Accepted sequence");
         return null;
     }
}

The real job is performed by addInfo(), which is called at each final state and knows therefore that a sequence has been fully accepted and that it has to mark it, and by iterateResult().
Notice however that reset() and hasResult() must be correctly implemented, because they are part of the interface and are used by the framework.

Sequence Implementation

The Sequence implementation is necessary because the out of the box version implements a sequence for elements of type Character. However the new class can extend the existent concrete one basically overwriting only two methods:
 
public Object getElementAt(int pos){
    return new DfaBoolean(sequence.charAt(pos));
}

public Sequence subSequence(int start){
    return new DfaBooleanSequence(sequence.substring(start));
}
 

Factory Implementation

Now we only need to implement the abstract factory implementing the requested methods (and eventually overwriting the default implemented ones). Here is the extension of FsmTool. It is simply a catalog of constructors:
 
public class DfaFsmTool extends FsmTool
{
     public DfaFsmTool(String fsmFileName) throws FsmException{
         super(fsmFileName);
     }

     public Sequence createSequence(Object sequence) throws FsmSequenceException{
         return new DfaBooleanSequence(sequence);
     }

     public Arc createArc(){
         return new DfaArc();
     }

     public Extractor createExtractor(){
         return new DfaExtractor();
     }

     public Traversal createTraversal(){
         return new DetTraversal(createExtractor());
     }
}

Quick Start


And now let run the the example. First we implement the main method in the Run class:
 

class Run
{
     public static void main(String[] args){
         try{
             FsmTool fsm = new DfaFsmTool(args[0]);
             fsm.getShell().interactiveAnalyze();
         }catch(FsmException e){
             e.printMessage();
         }
    }
}
And then we call the program from the prompt with the document Odd1.tra (available in dfa-doc):
java dfa.Run Odd1.tra
Other documents are available in dfa-doc. This implementation can be used to simulate other acceptor having the alphabet {1,0}, providing the right input documents.
 

Recipe 2: Character FST


This is the implementation of the character transducer shown in the Introductory document, storing however integer elements as information, instead of string elements. Read the Advanced Features document to see how to realize an information table.
Again we need to spot which elements we have to implement:

In fact we need to implement only Arc and Extractor, but the latter will be slightly more complex than in the former example, due to the fact that a FST must build a result during the traversal.

Arc Implementation

We have already seen how to implement the input and output elements from scratch, so in this example we will use a JDK delivered class: Character.
Here follows the Arc implementation. Again only read(), and the compare methods have to be implemented.
 
class TransArc extends AbstractArc
{
     public int compareInputTo(Object otherInput){
         return ((Character)input).compareTo(otherInput);
     }

     public int compareOutputTo(Object otherOutput){
         return ((Character)output).compareTo(otherOutput);
     }

     public Arc read(InputFileWrapper inputStream){
         char in = inputStream.readChar();
         if (in == '\n')
             return null;
         else{
             output = new Character(inputStream.readChar());
             input = new Character(in);
             return this;
         }
    }
}

Extractor Implementation

The Extractor implementation is more complex than the previous one, because it must accumulate the result during the traversal. Moreover the FST is non deterministic, and this means for the Extractor that it must be able to accumulate not one, but several results.
Let's look at the method visit(), the main responsible for this operation:
 
public void visit(Object key, Arc arc, int traverse, int level){
    char ch;
    if (traverse == FsmTool.ANALYSIS)
        ch = ((Character)arc.getOutput()).charValue();
    else
        ch = ((Character)arc.getInput()).charValue();
    if (resultList.size() == 0)
        resultList.add(new TransResult());
    else if (resultList.get(resultList.size()-1).isComplete())
        resultList.copyLastPrefix(level);
    if (ch != Flag.DUMMY)
        resultList.get(resultList.size()-1).put(level,ch);
}
The method must first check the direction of the traversal. If the direction is input->output the result must be built with the output information, otherwise the opposite. The variable resultList is of type TransList, a TransExtractor internal class, used as help structure. TransList stores pairs of (String, Flag) elements. If the Flag is not null the element is called "complete" and the called method isComplete() returns true. This means that one result has been accumulated and if new result elements are coming they must be added to a copy of the former element (copyLastPrefix()).
The check for Flag.DUMMY is performed for the cases where the sequence to accept is longer than the sequence to use as result. In such cases the exceeding character are not accumulated.

The other key method is addInfo() called at each final state. It is the responsible of setting the flag:
 

public void addInfo(Flag finalStateFlag, int level){
    resultList.get(resultList.size()-1).append(finalStateFlag);
}

Factory Implementation


The factory implementation is straightforward. We simply need to choose the right classes for the instantiation.
 

public class TransFsmTool extends FsmTool
{
     public TransFsmTool(String fsmFileName) throws FsmException{
         super(fsmFileName);
     }

     public Arc createArc(){
         return new TransArc();
     }

     public Extractor createExtractor(){
         return new TransExtractor();
     }

     public Traversal createTraversal(){
         return new NondetTraversal(createExtractor());
     }
}

Quick Start

And now  call the program from the prompt with one of the documents in trans-doc:
java trans.Run EIrrVerbs.tra
This document contains a list of English irregular verbs. If used in analysis direction, the FST returns the citation form (infinitive form) of each English verb form. If used in generation direction, the FST generates all conjugated form of a given citation form.
 


sp 2000