org.cdlib.xtf.textIndexer
Class FacetTokenizer

Object
  extended by TokenStream
      extended by FacetTokenizer

public class FacetTokenizer
extends TokenStream

Performs special tokenization for facet fields. Looks for the hierarchy marker "::" between hierarchy levels and emits one token per level, each holding the full path down to that level. For instance, the string "US::California::Alameda County::Berkeley" would be made into four tokens:

  US
  US::California
  US::California::Alameda County
  US::California::Alameda County::Berkeley
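
The expansion described above can be sketched in plain Java. This is an illustrative stand-in, not the XTF source: the class name FacetPrefixSketch and the method prefixes are hypothetical, and the sketch returns plain strings rather than Lucene Token objects.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of XTF: shows the prefix expansion only.
class FacetPrefixSketch {
    /** Hierarchy marker named in the class description. */
    static final String MARKER = "::";

    /** Returns one cumulative path per hierarchy level of the given value. */
    static List<String> prefixes(String value) {
        List<String> out = new ArrayList<>();
        int from = 0;
        while (true) {
            int idx = value.indexOf(MARKER, from);
            if (idx < 0) {
                // No further marker: the whole value is the deepest level.
                out.add(value);
                return out;
            }
            // Everything before this marker is the path to the current level.
            out.add(value.substring(0, idx));
            from = idx + MARKER.length();
        }
    }

    public static void main(String[] args) {
        for (String p : prefixes("US::California::Alameda County::Berkeley")) {
            System.out.println(p);
        }
    }
}
```

Emitting every cumulative path, rather than only the leaf, is presumably what allows a document to be matched at any level of the hierarchy.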

Author:
Martin Haye

Field Summary
(package private)  Token nextToken
           
(package private)  int pos
           
(package private)  String str
           
 
Constructor Summary
FacetTokenizer(String str)
          Construct a token stream that splits a facet field value into one token per hierarchy level.
 
Method Summary
 Token next()
          Retrieve the next token in the stream.
 
Methods inherited from class TokenStream
close
 
Methods inherited from class Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

str

String str

pos

int pos

nextToken

Token nextToken

Constructor Detail

FacetTokenizer

public FacetTokenizer(String str)
Construct a token stream that splits a facet field value into one token per hierarchy level.

Parameters:
str - The string to tokenize

Method Detail

next

public Token next()
           throws IOException
Retrieve the next token in the stream.

Specified by:
next in class TokenStream
Throws:
IOException
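
A pull-style stream like this one is typically drained by calling next() until it signals the end. The sketch below imitates that contract without depending on Lucene or XTF. It assumes, following the Lucene TokenStream API of that era, that next() returns null once the input is exhausted. The class FacetCursorSketch is hypothetical; its str and pos fields only mirror the field names listed above, and it returns strings instead of Token objects.

```java
// Hypothetical stand-in for the pull-style contract; not the XTF class.
class FacetCursorSketch {
    private final String str; // the facet value being tokenized
    private int pos = 0;      // where the next search for "::" starts; -1 once exhausted

    FacetCursorSketch(String str) {
        this.str = str;
    }

    /** Returns the next cumulative path, or null when none remain (assumed convention). */
    String next() {
        if (pos < 0) {
            return null;
        }
        int idx = str.indexOf("::", pos);
        if (idx < 0) {
            // Last level: emit the full value, then mark the stream as finished.
            pos = -1;
            return str;
        }
        pos = idx + 2; // skip past the two-character marker
        return str.substring(0, idx);
    }

    public static void main(String[] args) {
        FacetCursorSketch cursor = new FacetCursorSketch("US::California::Alameda County::Berkeley");
        // Drain the stream until the end-of-stream signal.
        for (String t = cursor.next(); t != null; t = cursor.next()) {
            System.out.println(t);
        }
    }
}
```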