org.cdlib.xtf.util
Class WordMap

Object
  extended by WordMap

public class WordMap
extends Object

Maintains an in-memory, one-to-one mapping from words in one set to words in another. The mapping is read from a disk file or input stream, which may be sorted or unsorted. Each line of the input holds one pair, separated by a bar ("|") character: the first word is the "key" and the second is the "value". For speed, an in-memory cache of recently mapped words is maintained.
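To make the entry format concrete, the following is a minimal sketch of reading "key|value" lines into a map. It does not reproduce the WordMap internals: the class name WordPairFormatSketch, the parse method, the sample words, and the decision to trim whitespace and skip malformed lines are all assumptions made for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of org.cdlib.xtf.util.
class WordPairFormatSketch {

    /** Reads one "key|value" pair per line and returns them in file order. */
    static Map<String, String> parse(BufferedReader reader) throws IOException {
        Map<String, String> pairs = new LinkedHashMap<>();
        String line;
        while ((line = reader.readLine()) != null) {
            int bar = line.indexOf('|');
            if (bar < 0) {
                continue; // assumption: lines without a bar are ignored
            }
            String key = line.substring(0, bar).trim();
            String value = line.substring(bar + 1).trim();
            if (key.isEmpty() || value.isEmpty()) {
                continue; // assumption: incomplete pairs are ignored
            }
            pairs.put(key, value);
        }
        return pairs;
    }

    public static void main(String[] args) throws IOException {
        // The input need not be sorted.
        String sample = "colour|color\ncentre|center\n";
        Map<String, String> pairs =
            parse(new BufferedReader(new StringReader(sample)));
        System.out.println(pairs.get("colour")); // prints "color"
    }
}
```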


Field Summary
private  ArrayList blockHeads
          Sorted list of the block keys, for fast binary searching
private  HashMap blockMap
          Map of blocks, keyed by the first word in each block
private  FastCache cache
          Keep a cache of lookups performed to date
private static int CACHE_SIZE
          How many recent mappings to maintain
 
Constructor Summary
WordMap(File f, CharMap charMap)
          Construct a word map by reading in a file.
WordMap(InputStream s, CharMap charMap)
          Construct a word map by reading from an InputStream.
 
Method Summary
 String lookup(String word)
          Look up a word and return the corresponding value, or null if the word has no mapping.
private  void readFile(BufferedReader reader, CharMap charMap)
          Read in the contents of a word file, grouping the entries into blocks of 128.
 
Methods inherited from class Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

CACHE_SIZE

private static final int CACHE_SIZE
How many recent mappings to maintain

See Also:
Constant Field Values

cache

private FastCache cache
Keep a cache of lookups performed to date


blockMap

private HashMap blockMap
Map of blocks, keyed by the first word in each block


blockHeads

private ArrayList blockHeads
Sorted list of the block keys, for fast binary searching

Constructor Detail

WordMap

public WordMap(File f,
               CharMap charMap)
        throws IOException
Construct a word map by reading in a file.

Throws:
IOException

WordMap

public WordMap(InputStream s,
               CharMap charMap)
        throws IOException
Construct a word map by reading from an InputStream. If a non-null character map is specified, all entries are filtered through it.

Throws:
IOException

Method Detail

lookup

public String lookup(String word)
Look up a word and return the corresponding value, or null if the word has no mapping.
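The lookup contract (return the mapped value, or null when there is none, with recent results cached) can be sketched as follows. This is an illustration under stated assumptions, not the WordMap source: a LinkedHashMap in access order stands in for FastCache, the cache size of 3 is arbitrary, and only successful lookups are cached. The class CachedLookupSketch and its members are hypothetical.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in; FastCache is replaced by an access-ordered LinkedHashMap.
class CachedLookupSketch {
    // Assumption: the real CACHE_SIZE value is not stated in this page.
    private static final int CACHE_SIZE = 3;

    private final Map<String, String> backing;

    // Least-recently-used cache: the eldest entry is evicted once the limit is exceeded.
    private final Map<String, String> cache =
        new LinkedHashMap<String, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                return size() > CACHE_SIZE;
            }
        };

    CachedLookupSketch(Map<String, String> backing) {
        this.backing = backing;
    }

    /** Returns the value mapped to word, or null if there is none. */
    String lookup(String word) {
        String cached = cache.get(word);
        if (cached != null) {
            return cached;
        }
        String value = backing.get(word);
        if (value != null) {
            cache.put(word, value); // assumption: misses are not cached
        }
        return value;
    }

    int cacheSize() {
        return cache.size();
    }

    public static void main(String[] args) {
        Map<String, String> pairs = new HashMap<>();
        pairs.put("colour", "color");
        CachedLookupSketch map = new CachedLookupSketch(pairs);
        System.out.println(map.lookup("colour"));  // prints "color"
        System.out.println(map.lookup("unknown")); // prints "null"
    }
}
```

Callers should check for null before using the result, since a word with no mapping is not an error.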


readFile

private void readFile(BufferedReader reader,
                      CharMap charMap)
               throws IOException
Read in the contents of a word file, grouping the entries into blocks of 128. The file need not be in sorted order.

Parameters:
reader - Reader to get the data from
charMap - Accent map to filter entries with, or null for none.
Throws:
IOException
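
The field descriptions suggest a two-level structure: a sorted list of block heads that is binary searched, and a map from each head to its block of up to 128 entries. The following is a minimal sketch of that idea, not the actual readFile implementation. The class BlockIndexSketch, the use of TreeMap for sorting, and the in-block linear scan are assumptions; character-map filtering is omitted.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration of a block index over sorted key/value pairs.
class BlockIndexSketch {
    static final int BLOCK_SIZE = 128; // block size named in the readFile description

    // First key of every block, in sorted order.
    private final List<String> blockHeads = new ArrayList<>();
    // Each block's entries, keyed by the block's first key.
    private final Map<String, List<String[]>> blockMap = new HashMap<>();

    BlockIndexSketch(Map<String, String> pairs) {
        // Sorting here is what allows the input to be unsorted.
        TreeMap<String, String> sorted = new TreeMap<>(pairs);
        List<String[]> block = null;
        for (Map.Entry<String, String> e : sorted.entrySet()) {
            if (block == null || block.size() == BLOCK_SIZE) {
                block = new ArrayList<>(BLOCK_SIZE);
                blockHeads.add(e.getKey());
                blockMap.put(e.getKey(), block);
            }
            block.add(new String[] { e.getKey(), e.getValue() });
        }
    }

    /** Returns the value for word, or null if it is not present. */
    String lookup(String word) {
        int pos = Collections.binarySearch(blockHeads, word);
        if (pos < 0) {
            pos = -pos - 2; // index of the last block head that sorts before word
        }
        if (pos < 0) {
            return null; // word sorts before every block head
        }
        for (String[] pair : blockMap.get(blockHeads.get(pos))) {
            if (pair[0].equals(word)) {
                return pair[1];
            }
        }
        return null;
    }

    int blockCount() {
        return blockHeads.size();
    }

    public static void main(String[] args) {
        Map<String, String> pairs = new HashMap<>();
        for (int i = 0; i < 300; i++) {
            pairs.put(String.format("w%03d", i), String.format("v%03d", i));
        }
        BlockIndexSketch index = new BlockIndexSketch(pairs);
        System.out.println(index.blockCount());    // prints "3"
        System.out.println(index.lookup("w200"));  // prints "v200"
    }
}
```

With 300 entries the sketch forms blocks of 128, 128, and 44 entries, so a lookup needs one binary search over three heads followed by a scan of at most 128 pairs.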