org.cdlib.xtf.textEngine
Class DocHitImpl

Object
  extended by ScoreDoc
      extended by FieldDoc
          extended by DocHit
              extended by DocHitImpl
All Implemented Interfaces:
Serializable

public class DocHitImpl
extends DocHit

Represents a query hit at the document level. May contain Snippets if those were requested.

Author:
Martin Haye
See Also:
Serialized Form

Field Summary
private  int chunkCount
          Total number of chunks for this document
private  String docKey
          Index key for this document
private  Explanation explanation
          Explanation of this document's score
private  FieldSpans fieldSpans
          Spans per field
private  FieldSpanSource fieldSpanSource
          Source of spans.
private  long fileDate
          Date the original source XML document was last modified
private  AttribList metaData
          Document's meta-data fields (copied from the docInfo chunk)
private  int recordNum
          Record number of this document within the main file
private  SnippetMaker snippetMaker
          Used to load and format snippets
private  Snippet[] snippets
          Array of pre-built snippets
private  String subDocument
          Name of the subdocument within the main file, if any
 
Fields inherited from class FieldDoc
fields
 
Fields inherited from class ScoreDoc
doc, score
 
Constructor Summary
DocHitImpl(int docNum, float score)
          Construct a document hit.
 
Method Summary
 Explanation explanation()
          Retrieve an explanation of this document's score
 String filePath()
          Retrieve the original file path as recorded in the index (if any).
(package private)  void finish(SnippetMaker snippetMaker, float docScoreNorm)
          Called after all hits have been gathered to normalize the scores and associate a snippetMaker for later use.
(package private)  void finishWithExplain(SnippetMaker snippetMaker, float docScoreNorm, Weight weight, BoostSet boostSet, BoostSetParams boostParams)
          Called after all hits have been gathered to normalize the scores and associate a snippetMaker for later use.
private  void load()
          Read in the document info chunk and record the path, date, etc. that we find there.
private  void loadMetaField(String name, String value, Document docContents, AttribList metaData, boolean isTokenized)
          Performs all the manipulations and marking for a meta-data field.
 AttribList metaData()
          Retrieve a list of all meta-data name/value pairs associated with this document.
 int nSnippets()
          Return the number of snippets available (limited by the maximum number specified in the original query).
 int recordNum()
          Retrieve the record number of this document within the main file, or zero if this is the only record.
(package private)  void setSpanSource(FieldSpanSource src)
          Sets the source for spans (to perform deduplication)
 Snippet snippet(int hitNum, boolean getText)
          Retrieve the specified snippet.
 String subDocument()
          Retrieve the subdocument name of this section within the main file, if any.
 Set textTerms()
          Fetch a set that can be used to check whether a given term is present in the original query that produced this hit.
 int totalSnippets()
          Return the total number of snippets found for this document (not the number actually returned, which is limited by the maximum number of snippets specified in the query).
 
Methods inherited from class Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

snippetMaker

private SnippetMaker snippetMaker
Used to load and format snippets


fieldSpanSource

private FieldSpanSource fieldSpanSource
Source of spans. Only valid during collection.


fieldSpans

private FieldSpans fieldSpans
Spans per field


snippets

private Snippet[] snippets
Array of pre-built snippets


docKey

private String docKey
Index key for this document


fileDate

private long fileDate
Date the original source XML document was last modified


recordNum

private int recordNum
Record number of this document within the main file


subDocument

private String subDocument
Name of the subdocument within the main file, if any


chunkCount

private int chunkCount
Total number of chunks for this document


metaData

private AttribList metaData
Document's meta-data fields (copied from the docInfo chunk)


explanation

private Explanation explanation
Explanation of this document's score

Constructor Detail

DocHitImpl

DocHitImpl(int docNum,
           float score)
Construct a document hit. Package-private because these should only be constructed inside the text engine.

Parameters:
docNum - Lucene ID for the document info chunk
score - Score for this hit
Method Detail

setSpanSource

void setSpanSource(FieldSpanSource src)
Sets the source for spans (to perform deduplication)


finish

void finish(SnippetMaker snippetMaker,
            float docScoreNorm)
Called after all hits have been gathered to normalize the scores and associate a snippetMaker for later use.

Parameters:
snippetMaker - Will be used later by snippet() to actually create the snippets.
docScoreNorm - Multiplied into the document's score
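
As a rough illustration of the normalization step described above, the following sketch multiplies a normalization factor into each hit's score. The Hit class, the finishAll helper, and the choice of 1 / (highest score) as the factor are assumptions made for illustration; they are not taken from the XTF source.

```java
import java.util.Arrays;
import java.util.List;

public class ScoreNormSketch {
    /** Hypothetical stand-in for a document hit; it only carries a score. */
    static class Hit {
        float score;

        Hit(float score) {
            this.score = score;
        }

        /** Mirrors the documented contract: the norm is multiplied into the score. */
        void finish(float docScoreNorm) {
            score *= docScoreNorm;
        }
    }

    /** Assumed policy: scale so that the best hit ends up with a score of 1.0. */
    static float normFor(List<Hit> hits) {
        float max = 0.0f;
        for (Hit h : hits) {
            max = Math.max(max, h.score);
        }
        return max > 0.0f ? 1.0f / max : 1.0f;
    }

    /** Applies the same norm to every hit once all hits have been gathered. */
    static void finishAll(List<Hit> hits) {
        float norm = normFor(hits);
        for (Hit h : hits) {
            h.finish(norm);
        }
    }

    public static void main(String[] args) {
        List<Hit> hits = Arrays.asList(new Hit(4.0f), new Hit(2.0f), new Hit(1.0f));
        finishAll(hits);
        for (Hit h : hits) {
            System.out.println(h.score);
        }
    }
}
```

The norm is computed only after every hit is known, which is why the documented method is called after all hits have been gathered.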

finishWithExplain

void finishWithExplain(SnippetMaker snippetMaker,
                       float docScoreNorm,
                       Weight weight,
                       BoostSet boostSet,
                       BoostSetParams boostParams)
                 throws IOException
Called after all hits have been gathered to normalize the scores and associate a snippetMaker for later use. Also calculates an explanation of the score.

Parameters:
snippetMaker - Will be used later by snippet() to actually create the snippets.
docScoreNorm - Multiplied into the document's score
weight - The query weight that will be used to calculate an explanation.
boostSet - The boost set used, or null if none
boostParams - Other boost set parameters (e.g. exponent)
Throws:
IOException

load

private void load()
Read in the document info chunk and record the path, date, etc. that we find there.


loadMetaField

private void loadMetaField(String name,
                           String value,
                           Document docContents,
                           AttribList metaData,
                           boolean isTokenized)
Performs all the manipulations and marking for a meta-data field.

Parameters:
name - Name of the field
value - Raw string value of the field
docContents - Where to get spans from
metaData - Where to put the resulting data
isTokenized - true if the field was tokenized and should be marked.

textTerms

public Set textTerms()
Fetch a set that can be used to check whether a given term is present in the original query that produced this hit.
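
Because the declared return type is Set, a caller would typically test membership with contains(). The sketch below uses a plain java.util.Set of strings. The element type, the exact-match comparison, and the helper names demoTerms and isQueryTerm are all assumptions, since this page does not state what the real set contains.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class TextTermsSketch {
    /** Hypothetical stand-in for the set returned by textTerms(). */
    static Set<String> demoTerms() {
        return new HashSet<>(Arrays.asList("apple", "pie"));
    }

    /** Returns true when the word is one of the query terms (exact match assumed). */
    static boolean isQueryTerm(Set<String> terms, String word) {
        return terms.contains(word);
    }

    public static void main(String[] args) {
        Set<String> terms = demoTerms();
        System.out.println(isQueryTerm(terms, "apple"));
        System.out.println(isQueryTerm(terms, "crust"));
    }
}
```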


filePath

public final String filePath()
Retrieve the original file path as recorded in the index (if any).

Specified by:
filePath in class DocHit

recordNum

public final int recordNum()
Retrieve the record number of this document within the main file, or zero if this is the only record.

Specified by:
recordNum in class DocHit

subDocument

public final String subDocument()
Retrieve the subdocument name of this section within the main file, if any.

Specified by:
subDocument in class DocHit

metaData

public final AttribList metaData()
Retrieve a list of all meta-data name/value pairs associated with this document.

Specified by:
metaData in class DocHit

totalSnippets

public final int totalSnippets()
Return the total number of snippets found for this document (not the number actually returned, which is limited by the maximum number of snippets specified in the query).

Specified by:
totalSnippets in class DocHit

nSnippets

public final int nSnippets()
Return the number of snippets available (limited by the maximum number specified in the original query).

Specified by:
nSnippets in class DocHit
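
The relationship between totalSnippets() and nSnippets() can be sketched as a simple cap, assuming the available count is the total found limited by the query's maximum. The name maxSnippets is hypothetical, and any special values the real query may accept for "no limit" are not modeled here.

```java
public class SnippetCountSketch {
    /**
     * Assumed relationship: the number of snippets available is the total
     * number found, capped by the maximum requested in the query.
     */
    static int available(int totalFound, int maxSnippets) {
        return Math.min(totalFound, maxSnippets);
    }

    public static void main(String[] args) {
        System.out.println(available(10, 3)); // more found than requested
        System.out.println(available(2, 3));  // fewer found than requested
    }
}
```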

snippet

public final Snippet snippet(int hitNum,
                             boolean getText)
Retrieve the specified snippet.

Specified by:
snippet in class DocHit
Parameters:
hitNum - Index of the snippet to retrieve, from 0 to nSnippets() - 1
getText - true to fetch the snippet text in context, false to only fetch the rank, score, etc.
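
A caller would usually walk the available snippets with an index loop. The sketch below assumes valid indices run from 0 up to, but not including, nSnippets(). HitView, snippetText, and demoHit are hypothetical stand-ins so that the example runs without the XTF classes; snippetText(i) plays the role of snippet(i, true).

```java
import java.util.ArrayList;
import java.util.List;

public class SnippetLoopSketch {
    /** Hypothetical, simplified view of the documented accessors. */
    interface HitView {
        int nSnippets();

        /** Stands in for snippet(hitNum, true), returning only the text. */
        String snippetText(int hitNum);
    }

    /** Collects the text of every available snippet, in index order. */
    static List<String> collectSnippets(HitView hit) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < hit.nSnippets(); i++) {
            out.add(hit.snippetText(i));
        }
        return out;
    }

    /** Builds a small in-memory hit with two made-up snippets. */
    static HitView demoHit() {
        final String[] texts = { "first match", "second match" };
        return new HitView() {
            public int nSnippets() {
                return texts.length;
            }

            public String snippetText(int hitNum) {
                return texts[hitNum];
            }
        };
    }

    public static void main(String[] args) {
        for (String text : collectSnippets(demoHit())) {
            System.out.println(text);
        }
    }
}
```

Passing false for getText in the real method would be the cheaper choice when only the rank and score are needed.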

explanation

public Explanation explanation()
Retrieve an explanation of this document's score

Overrides:
explanation in class DocHit