org.cdlib.xtf.dynaXML.test
Class TreeAnnotater

Object
  extended by TreeAnnotater

public class TreeAnnotater
extends Object

Performs brute-force (that is, stupid but reliable) single-term searching and hit marking on a DOM tree.

Author:
Martin Haye

Field Summary
private StandardAnalyzer analyzer
private Document doc
private String searchTerm
private int totalHitCount
private static String xtfUri

Constructor Summary
TreeAnnotater()
           
 
Method Summary
private boolean isAllWhitespace(String s)
          Determine whether a string contains only whitespace characters.
void processDocument(Document doc, String term)
          Process an entire document, marking hits and hit counts as we go.
int processElement(Element parent, int level)
          Traverse an element of the tree.
private int processText(Text node)
          Recursively scans a text node for hits.
 
Methods inherited from class Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

analyzer

private StandardAnalyzer analyzer

searchTerm

private String searchTerm

doc

private Document doc

xtfUri

private static final String xtfUri
See Also:
Constant Field Values

totalHitCount

private int totalHitCount

Constructor Detail

TreeAnnotater

public TreeAnnotater()

Method Detail

processDocument

public void processDocument(Document doc,
                            String term)
Process an entire document, marking hits and hit counts as we go.


isAllWhitespace

private boolean isAllWhitespace(String s)
Determine whether a string contains only whitespace characters.
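The method is private, so its body is not shown here. The following is a minimal sketch of such a check, not the XTF source. The class name WhitespaceCheck is hypothetical, and treating the empty string as all-whitespace is an assumption.

```java
final class WhitespaceCheck {
    /**
     * Returns true when every character of s is whitespace.
     * Assumption: an empty string counts as all-whitespace.
     */
    static boolean isAllWhitespace(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (!Character.isWhitespace(s.charAt(i))) {
                return false; // found a non-whitespace character
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isAllWhitespace(" \t\n"));
        System.out.println(isAllWhitespace(" a "));
    }
}
```

A check like this lets a tree walker skip text nodes that hold only indentation or line breaks between elements.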


processElement

public int processElement(Element parent,
                          int level)
Traverse an element of the tree. Process all its children, and if any has hits, record the hit count on this element.

Parameters:
parent - The element to traverse
Returns:
The number of hits found within it.
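
The traversal described above can be sketched as follows. This is an illustration under stated assumptions, not the XTF implementation: the class ElementHitCounter, the attribute name hitCount, and the helpers countOccurrences and parse are all hypothetical. The sketch matches with a plain case-insensitive substring search rather than the StandardAnalyzer the class holds, and it omits the level parameter, whose meaning the page does not document.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.Text;
import org.xml.sax.InputSource;

final class ElementHitCounter {
    /** Hypothetical attribute name; the page does not say what XTF records. */
    static final String HIT_COUNT_ATTR = "hitCount";

    /** Sums hits over all children and records a non-zero total on the element. */
    static int countInElement(Element parent, String term) {
        int hits = 0;
        for (Node child = parent.getFirstChild(); child != null; child = child.getNextSibling()) {
            if (child instanceof Element) {
                hits += countInElement((Element) child, term);
            } else if (child instanceof Text) {
                hits += countOccurrences(((Text) child).getData(), term);
            }
        }
        if (hits > 0) {
            parent.setAttribute(HIT_COUNT_ATTR, Integer.toString(hits));
        }
        return hits;
    }

    /** Counts non-overlapping, case-insensitive occurrences of term in text. */
    static int countOccurrences(String text, String term) {
        if (term.isEmpty()) {
            return 0;
        }
        int count = 0;
        int i = 0;
        while (i + term.length() <= text.length()) {
            if (text.regionMatches(true, i, term, 0, term.length())) {
                count++;
                i += term.length();
            } else {
                i++;
            }
        }
        return count;
    }

    /** Parses an XML string, wrapping checked exceptions for brevity. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse sample XML", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<doc><a>cat cat</a><b>dog</b><c><d>Cat</d></c></doc>");
        System.out.println(countInElement(doc.getDocumentElement(), "cat")); // prints 3
    }
}
```

Because each element's count is the sum over its subtree, an ancestor with a recorded count always contains at least one hit, so a consumer can skip subtrees that carry no count.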

processText

private int processText(Text node)
Recursively scans a text node for hits. All hits are marked with special elements.

Parameters:
node - Node to scan
Returns:
How many hits were found
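
One way to mark hits recursively in a text node is to split the node around the first match, wrap the match in a marker element, and recurse on the remainder. The sketch below assumes that approach; it is not taken from the XTF source. The class TextHitMarker, the element name hit, and the helpers are hypothetical, matching is a plain case-insensitive substring search rather than analyzer-based tokenization, and the node is assumed to be attached to a parent.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Text;

final class TextHitMarker {
    /** Hypothetical marker element name; the page does not say what XTF uses. */
    static final String HIT_ELEMENT = "hit";

    /** Wraps each occurrence of term in a marker element; returns the hit count. */
    static int markHits(Text node, String term) {
        if (term.isEmpty()) {
            return 0;
        }
        int start = indexOfIgnoreCase(node.getData(), term);
        if (start < 0) {
            return 0;
        }
        // Split off the text before the hit (if any); hitText now starts at the hit.
        Text hitText = (start == 0) ? node : node.splitText(start);
        if (hitText.getLength() == term.length()) {
            wrap(hitText); // the hit ends the node, so nothing is left to scan
            return 1;
        }
        // Split off the text after the hit, wrap the hit, and scan the remainder.
        Text rest = hitText.splitText(term.length());
        wrap(hitText);
        return 1 + markHits(rest, term);
    }

    /** Replaces the text node with a marker element that contains it. */
    private static void wrap(Text hitText) {
        Element marker = hitText.getOwnerDocument().createElement(HIT_ELEMENT);
        hitText.getParentNode().replaceChild(marker, hitText);
        marker.appendChild(hitText);
    }

    /** Case-insensitive indexOf that keeps offsets valid for the original string. */
    static int indexOfIgnoreCase(String text, String term) {
        for (int i = 0; i + term.length() <= text.length(); i++) {
            if (text.regionMatches(true, i, term, 0, term.length())) {
                return i;
            }
        }
        return -1;
    }

    /** Creates an empty DOM document, wrapping the checked exception. */
    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Applied to a paragraph containing "A cat saw a Cat" with the term "cat", this sketch leaves the paragraph's text content unchanged while wrapping both occurrences in hit elements.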