public class MoreLikeThisQuery
extends Query
Modifier and Type | Class and Description |
---|---|
private static class | MoreLikeThisQuery.Flt - Used for scores and to avoid renewing Floats. |
private static class | MoreLikeThisQuery.Int - Used for frequencies and to avoid renewing Integers. |
class | MoreLikeThisQuery.MoreLikeWrapper - Exclude the target document from the set. |
private static class | MoreLikeThisQuery.QueryWord |
private static class | MoreLikeThisQuery.QueryWordQueue - PriorityQueue that orders query words by score. |
Modifier and Type | Field and Description |
---|---|
private CharMap | accentMap |
private boolean | boost - Should we apply a boost to the Query based on the scores? |
private Map | boostMap - Boost values for the fields. |
private float[] | fieldBoosts - Boost value per field. |
private String[] | fieldNames - Field name(s) we'll analyze. |
private int | maxDocFreq - Ignore words which occur in at least this many docs. |
private int | maxNumTokensParsed - The maximum number of tokens to parse in each example doc field that is not stored with TermVector support. |
private int | maxQueryTerms - Don't return a query longer than this. |
private int | maxWordLen - Ignore words if greater than this length. |
private int | minDocFreq - Ignore words which do not occur in at least this many docs. |
private int | minTermFreq - Ignore words less frequent than this. |
private int | minWordLen - Ignore words if less than this length. |
private WordMap | pluralMap |
private Similarity | similarity - For idf() calculations. |
private Set | stopSet |
private Query | subQuery |
private int | targetDoc |
Constructor and Description |
---|
MoreLikeThisQuery(Query subQuery) - Constructs a query that finds documents similar to the first document matched by subQuery. |
Modifier and Type | Method and Description |
---|---|
private void | addTermFrequencies(TokenStream tokens, String field, Map termFreqMap) - Adds term frequencies found in the token stream tokens to the Map termFreqMap. |
private Map | condenseTerms(IndexReader indexReader, Map words) - Condense the same term in multiple fields into a single term with a total score. |
private Query | createQuery(IndexReader indexReader, PriorityQueue q) - Create the "more like" query from a PriorityQueue. |
private PriorityQueue | createQueue(IndexReader indexReader, Map words) - Create a PriorityQueue from a word->tf map. |
float[] | getFieldBoosts() |
String[] | getFieldNames() |
Query | getSubQuery() - Retrieve the sub-query. |
protected boolean | isNoiseWord(String term) - Determines if the passed term is likely to be of interest in "more like" comparisons. |
private PriorityQueue | retrieveTerms(IndexReader indexReader, int docNum, Analyzer analyzer) - Find words for a more-like-this query former. |
Query | rewrite(IndexReader reader) - Generate a query that will produce "more documents like" the first document in the sub-query. |
void | setAccentMap(CharMap map) - Establish the accent map in use. |
void | setBoost(boolean boost) - Should we apply a boost to the Query based on the scores? |
void | setFieldBoosts(float[] fieldBoosts) - Boost value per field. |
void | setFieldNames(String[] fieldNames) - Field name(s) we'll analyze. |
void | setMaxDocFreq(int maxDocFreq) - Ignore words which occur in at least this many docs. |
void | setMaxNumTokensParsed(int maxNumTokensParsed) - The maximum number of tokens to parse in each example doc field that is not stored with TermVector support. |
void | setMaxQueryTerms(int maxQueryTerms) - Don't return a query longer than this. |
void | setMaxWordLen(int maxWordLen) - Ignore words if greater than this length. |
void | setMinDocFreq(int minDocFreq) - Ignore words which do not occur in at least this many docs. |
void | setMinTermFreq(int minTermFreq) - Ignore words less frequent than this. |
void | setMinWordLen(int minWordLen) - Ignore words if less than this length. |
void | setPluralMap(WordMap map) - Establish the plural map in use. |
void | setStopWords(Set set) - Establish the set of stop words to ignore. |
void | setSubQuery(Query subQuery) - Set the sub-query. |
String | toString(String field) - Prints a user-readable version of this query. |
private Query subQuery
private int targetDoc
private Set stopSet
private WordMap pluralMap
private CharMap accentMap
private int minTermFreq
private int minDocFreq
private int maxDocFreq
private boolean boost
private String[] fieldNames
private float[] fieldBoosts
private Map boostMap
private int maxNumTokensParsed
private int minWordLen
private int maxWordLen
private int maxQueryTerms
private Similarity similarity
public MoreLikeThisQuery(Query subQuery)
Constructs a query that finds documents similar to the first document matched by subQuery.
public Query getSubQuery()
public void setSubQuery(Query subQuery)
public void setStopWords(Set set)
public void setPluralMap(WordMap map)
public void setAccentMap(CharMap map)
public void setMaxDocFreq(int maxDocFreq)
public void setFieldNames(String[] fieldNames)
public String[] getFieldNames()
public void setFieldBoosts(float[] fieldBoosts)
public float[] getFieldBoosts()
public void setMaxNumTokensParsed(int maxNumTokensParsed)
public void setMaxQueryTerms(int maxQueryTerms)
public void setMaxWordLen(int maxWordLen)
public void setMinDocFreq(int minDocFreq)
public void setMinTermFreq(int minTermFreq)
public void setMinWordLen(int minWordLen)
public void setBoost(boolean boost)
public Query rewrite(IndexReader reader) throws IOException
Overrides:
rewrite in class Query
Throws:
IOException
private Query createQuery(IndexReader indexReader, PriorityQueue q) throws IOException
Throws:
IOException
private PriorityQueue createQueue(IndexReader indexReader, Map words) throws IOException
Parameters:
words - a map of words keyed on the word (String) with Int objects as the values.
Throws:
IOException
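As a rough illustration of how a word->tf map could be turned into a score-ordered queue, here is a minimal sketch. It is not the class's actual implementation: `ScoredWord`, the `docFreqs` map (standing in for document-frequency lookups on an IndexReader), the `numDocs` argument, and the classic `ln(numDocs / (docFreq + 1)) + 1` idf formula are all assumptions made for the example. The thresholds follow the field descriptions above, read literally.

```java
import java.util.Map;
import java.util.PriorityQueue;

class QueryWordQueueSketch {
    /** A candidate query word and its score (hypothetical stand-in for QueryWord). */
    static final class ScoredWord {
        final String word;
        final float score;
        ScoredWord(String word, float score) { this.word = word; this.score = score; }
    }

    /** Classic tf-idf style idf; an assumption, since the real Similarity is pluggable. */
    static float idf(int docFreq, int numDocs) {
        return (float) (Math.log(numDocs / (double) (docFreq + 1)) + 1.0);
    }

    /** Builds a queue of words ordered by descending tf * idf score. */
    static PriorityQueue<ScoredWord> createQueue(Map<String, Integer> termFreqs,
                                                 Map<String, Integer> docFreqs,
                                                 int numDocs,
                                                 int minTermFreq,
                                                 int minDocFreq,
                                                 int maxDocFreq) {
        // Comparator is reversed so that poll() returns the highest score first.
        PriorityQueue<ScoredWord> queue =
            new PriorityQueue<>((a, b) -> Float.compare(b.score, a.score));
        for (Map.Entry<String, Integer> e : termFreqs.entrySet()) {
            int tf = e.getValue();
            if (tf < minTermFreq) continue;                 // less frequent than minTermFreq
            int df = docFreqs.getOrDefault(e.getKey(), 0);
            if (df < minDocFreq) continue;                  // not in at least minDocFreq docs
            if (df >= maxDocFreq) continue;                 // in at least maxDocFreq docs
            queue.add(new ScoredWord(e.getKey(), tf * idf(df, numDocs)));
        }
        return queue;
    }
}
```

Polling the queue up to maxQueryTerms times would then yield the highest-scoring words for the final query.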
private Map condenseTerms(IndexReader indexReader, Map words) throws IOException
Parameters:
words - a map of words keyed on the word (String) with Int objects as the values.
Throws:
IOException
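The idea of condensing the same term across several fields into one total score can be sketched as follows. This is only an illustration under assumed data shapes: the nested `field -> (word -> score)` input map and the class name `CondenseTermsSketch` are hypothetical, and the real method works from an IndexReader and the word map described above.

```java
import java.util.HashMap;
import java.util.Map;

class CondenseTermsSketch {
    /**
     * Sums the per-field scores of each word into a single total per word.
     * Input shape (field -> word -> score) is an assumption for this example.
     */
    static Map<String, Float> condense(Map<String, Map<String, Float>> scoresByField) {
        Map<String, Float> totals = new HashMap<>();
        for (Map<String, Float> fieldScores : scoresByField.values()) {
            for (Map.Entry<String, Float> e : fieldScores.entrySet()) {
                // merge() inserts the score, or adds it to the running total.
                totals.merge(e.getKey(), e.getValue(), Float::sum);
            }
        }
        return totals;
    }
}
```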
private PriorityQueue retrieveTerms(IndexReader indexReader, int docNum, Analyzer analyzer) throws IOException
Parameters:
docNum - the id of the Lucene document from which to find terms
Throws:
IOException
private void addTermFrequencies(TokenStream tokens, String field, Map termFreqMap) throws IOException
Parameters:
tokens - a source of tokens
field - Specifies the field being tokenized
termFreqMap - a Map of terms and their frequencies
Throws:
IOException
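A minimal sketch of the frequency-counting step, assuming a plain list of strings in place of a Lucene TokenStream and omitting the field argument for brevity. The mutable `Int` holder mirrors the stated purpose of MoreLikeThisQuery.Int (avoiding new Integer objects on every increment), but its layout here is a guess, as is the way the maxNumTokensParsed cap is applied.

```java
import java.util.List;
import java.util.Map;

class TermFrequencySketch {
    /** Mutable counter, so that incrementing does not allocate a new Integer. */
    static final class Int {
        int x = 1; // a word is first recorded with a count of one
    }

    /**
     * Counts occurrences of each token, reading at most maxNumTokensParsed tokens.
     * The List<String> input is a stand-in for a real token stream.
     */
    static void addTermFrequencies(List<String> tokens,
                                   Map<String, Int> termFreqMap,
                                   int maxNumTokensParsed) {
        int parsed = 0;
        for (String token : tokens) {
            if (parsed >= maxNumTokensParsed) break; // stop once the cap is reached
            parsed++;
            Int count = termFreqMap.get(token);
            if (count == null) {
                termFreqMap.put(token, new Int());
            } else {
                count.x++;
            }
        }
    }
}
```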
protected boolean isNoiseWord(String term)
Parameters:
term - The word being considered
public String toString(String field)
Overrides:
toString in class Query
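For reference, the kind of check that isNoiseWord(String term) suggests, using the minWordLen, maxWordLen, and stop-word settings, might look like the sketch below. It assumes that the method returns true when a term should be ignored and that a length limit of 0 means "no limit"; neither is confirmed by this page. The limits are passed as arguments here only to keep the example self-contained.

```java
import java.util.Set;

class NoiseWordSketch {
    /**
     * Returns true if the term should be ignored: too short, too long,
     * or present in the stop-word set. A limit of 0 disables that check (assumed).
     */
    static boolean isNoiseWord(String term, int minWordLen, int maxWordLen,
                               Set<String> stopSet) {
        int len = term.length();
        if (minWordLen > 0 && len < minWordLen) return true;
        if (maxWordLen > 0 && len > maxWordLen) return true;
        return stopSet != null && stopSet.contains(term);
    }
}
```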