org.apache.lucene.spelt
Class QuerySpeller

Object
  extended by SimpleQueryRewriter
      extended by QuerySpeller

public class QuerySpeller
extends SimpleQueryRewriter

Handles spelling correction for simple queries produced by the Lucene QueryParser. A custom QueryParser may be supplied, but it must preserve the case of the input tokens so that spelling corrections can be returned in the same case the user typed.

Author:
Martin Haye

Field Summary
private  HashSet<String> fieldSet
          Set of fields we're allowed to collect terms for
private  QueryParser queryParser
          Used to parse queries
private  SpellReader spellReader
          Used to get spelling suggestions
private  HashMap<String,String> suggestMap
          Mapping of terms to replace
private  LinkedHashSet<String> terms
          Set of terms collected, in the order they were encountered
 
Constructor Summary
QuerySpeller(SpellReader spellReader)
          Construct a new speller using a given dictionary reader.
QuerySpeller(SpellReader spellReader, QueryParser queryParser)
          Construct a new speller using a given dictionary reader and query parser (note that the parser's analyzer should do MINIMAL token filtering, without any case conversion).
 
Method Summary
protected  Term rewrite(Term t)
          Hook through which this class collects or rewrites terms
 String suggest(String inQuery)
          Suggest alternate spellings for terms in a Lucene query.
 String suggest(String inQuery, String[] fields)
          Suggest alternate spellings for terms in a Lucene query, limiting suggestions to the specified fields only.
private  void validateAnalyzer()
          Make sure the analyzer preserves the case of input tokens.
 
Methods inherited from class SimpleQueryRewriter
rewrite, rewrite, rewrite, rewriteQuery
 
Methods inherited from class Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

spellReader

private SpellReader spellReader
Used to get spelling suggestions


fieldSet

private HashSet<String> fieldSet
Set of fields we're allowed to collect terms for


terms

private LinkedHashSet<String> terms
Set of terms collected, in the order they were encountered


suggestMap

private HashMap<String,String> suggestMap
Mapping of terms to replace


queryParser

private QueryParser queryParser
Used to parse queries

Constructor Detail

QuerySpeller

public QuerySpeller(SpellReader spellReader)
Construct a new speller using a given dictionary reader. The queries will be parsed with a MinimalAnalyzer, and the default field name will be "text".

Parameters:
spellReader - source for spelling suggestions -- see SpellReader.open(File).

QuerySpeller

public QuerySpeller(SpellReader spellReader,
                    QueryParser queryParser)
Construct a new speller using a given dictionary reader and query parser (note that the parser's analyzer should do MINIMAL token filtering, without any case conversion).

Parameters:
spellReader - source for spelling suggestions -- see SpellReader.open(File).
queryParser - used to parse queries; note that the analyzer it uses should do only MINIMAL token filtering, not even conversion to lower case, so that suggestions can be made in the same case the user typed them. In particular, StandardAnalyzer should not be used.

Method Detail

validateAnalyzer

private void validateAnalyzer()
Make sure the analyzer preserves the case of input tokens. If it didn't, we would be unable to make spelling suggestions that match the case of user queries.
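
One way such a check could work is to push a mixed-case probe token through the analysis step and confirm that it comes back unchanged. The sketch below is illustrative only: CasePreservationCheck, preservesCase, and the Function used in place of a Lucene analyzer are hypothetical stand-ins, not part of this class or of Lucene, and the actual implementation may differ.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.function.Function;

/** Hypothetical helper, not part of the spelt API. */
final class CasePreservationCheck {

    /** Mixed-case probe: any case folding would alter it. */
    private static final String PROBE = "MiXeD";

    /**
     * Returns true if the tokenizer emits the probe as a single, unchanged token.
     * The Function is an assumed stand-in for an analyzer's token stream.
     */
    static boolean preservesCase(Function<String, List<String>> tokenizer) {
        List<String> tokens = tokenizer.apply(PROBE);
        return tokens.size() == 1 && PROBE.equals(tokens.get(0));
    }

    public static void main(String[] args) {
        // Splits on whitespace only, so case is kept.
        Function<String, List<String>> minimal =
                s -> Arrays.asList(s.split("\\s+"));
        // Lower-cases before splitting, which is what this check should reject.
        Function<String, List<String>> lowerCasing =
                s -> Arrays.asList(s.toLowerCase(Locale.ROOT).split("\\s+"));

        System.out.println(preservesCase(minimal));     // prints "true"
        System.out.println(preservesCase(lowerCasing)); // prints "false"
    }
}
```

A check of this kind fails fast at construction time, rather than producing lower-cased suggestions later.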


suggest

public String suggest(String inQuery)
               throws ParseException,
                      IOException
Suggest alternate spellings for terms in a Lucene query. By default, terms in any field are considered. To restrict suggestions to a subset of fields, use suggest(String, String[]) instead.

Parameters:
inQuery - the original query to scan
Returns:
a query with some suggested spelling corrections, or null if no suggestions could be found.
Throws:
ParseException
IOException

suggest

public String suggest(String inQuery,
                      String[] fields)
               throws ParseException,
                      IOException
Suggest alternate spellings for terms in a Lucene query, limiting suggestions to the specified fields only.

Parameters:
inQuery - the original query to scan
fields - the fields to consider for correction, or null to consider all fields
Returns:
a query with some suggested spelling corrections, or null if no suggestions could be found.
Throws:
ParseException
IOException
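
The contract described above (a corrected query string, or null when nothing can be suggested, with an optional field filter) can be illustrated without Lucene. In this sketch everything is a simplifying assumption: SuggestSketch is hypothetical, the query is treated as whitespace-separated "field:term" clauses, a plain Map replaces the SpellReader dictionary, and matching the case of the first letter is just one possible reading of "same case the user typed".

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.StringJoiner;

/** Hypothetical illustration of the suggest() contract; not the real implementation. */
final class SuggestSketch {

    /** Field assumed for clauses without an explicit "field:" prefix. */
    static final String DEFAULT_FIELD = "text";

    /** Returns a corrected query, or null if no term could be corrected. */
    static String suggest(String inQuery, String[] fields, Map<String, String> corrections) {
        // A null field list means every field is eligible.
        Set<String> allowed = (fields == null) ? null : new HashSet<>(Arrays.asList(fields));
        StringJoiner out = new StringJoiner(" ");
        boolean changed = false;

        for (String clause : inQuery.trim().split("\\s+")) {
            int colon = clause.indexOf(':');
            String field = (colon < 0) ? DEFAULT_FIELD : clause.substring(0, colon);
            String term = clause.substring(colon + 1);
            String fix = corrections.get(term.toLowerCase(Locale.ROOT));

            if (fix != null && (allowed == null || allowed.contains(field))) {
                // Keep the "field:" prefix and swap in the case-matched correction.
                out.add(clause.substring(0, colon + 1) + matchCase(term, fix));
                changed = true;
            } else {
                out.add(clause);
            }
        }
        return changed ? out.toString() : null;
    }

    /** Capitalizes the correction when the original term was capitalized. */
    static String matchCase(String original, String fix) {
        if (!original.isEmpty() && !fix.isEmpty()
                && Character.isUpperCase(original.charAt(0))) {
            return Character.toUpperCase(fix.charAt(0)) + fix.substring(1);
        }
        return fix;
    }

    public static void main(String[] args) {
        Map<String, String> dict = Collections.singletonMap("speling", "spelling");
        // All fields eligible: the capitalized term is corrected in place.
        System.out.println(suggest("title:Speling mistake", null, dict));
        // Only "text" eligible: the misspelling is in "title", so the result is null.
        System.out.println(suggest("title:Speling mistake", new String[] {"text"}, dict));
    }
}
```

Callers should treat a null return as "no correction to offer" rather than as an error.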

rewrite

protected Term rewrite(Term t)
Hook through which this class collects or rewrites terms.

Overrides:
rewrite in class SimpleQueryRewriter
Parameters:
t - The term to rewrite
Returns:
Rewritten version, or 't' unchanged if no change needed.
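
The field list (terms, suggestMap) suggests that a single overridden hook may serve two purposes: recording terms on one pass, then substituting replacements on another. The sketch below shows that general pattern under stated assumptions. TermWalker and CollectingRewriter are hypothetical stand-ins that operate on plain strings rather than Lucene Term objects, and the two-pass behavior is a guess at the design, not a description of the actual source.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical stand-in for a base rewriter that visits every term. */
class TermWalker {
    List<String> rewriteAll(List<String> terms) {
        List<String> out = new ArrayList<>();
        for (String t : terms) {
            out.add(rewrite(t));
        }
        return out;
    }

    /** Default hook: return the term unchanged. */
    protected String rewrite(String t) {
        return t;
    }
}

/** Hypothetical subclass: collects terms first, replaces them once a map is set. */
final class CollectingRewriter extends TermWalker {
    /** Terms seen so far, de-duplicated, in encounter order. */
    final Set<String> terms = new LinkedHashSet<>();

    /** Replacement map; null while in the collection pass (an assumed convention). */
    Map<String, String> suggestMap;

    @Override
    protected String rewrite(String t) {
        if (suggestMap == null) {
            terms.add(t);   // collection pass: record and leave unchanged
            return t;
        }
        String replacement = suggestMap.get(t);
        return (replacement != null) ? replacement : t;  // unchanged if no entry
    }

    public static void main(String[] args) {
        CollectingRewriter r = new CollectingRewriter();
        List<String> query = Arrays.asList("lucene", "speling", "lucene");

        r.rewriteAll(query);                 // pass 1: collect
        System.out.println(r.terms);         // prints "[lucene, speling]"

        r.suggestMap = new HashMap<>();
        r.suggestMap.put("speling", "spelling");
        System.out.println(r.rewriteAll(query)); // prints "[lucene, spelling, lucene]"
    }
}
```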