org.cdlib.xtf.textIndexer
Class MARCIndexSource

Object
  extended by IndexSource
      extended by MARCIndexSource

public class MARCIndexSource
extends IndexSource

Supplies MARC data to an XTF index, breaking it up into individual MARCXML records.

Author:
Martin Haye

Nested Class Summary
private  class MARCIndexSource.RecordHandler
          Handles running blocks of records through the stylesheet
 
Field Summary
private  Templates displayStyle
          Stylesheet from which to gather XSLT key definitions to be computed and cached on disk.
private  long fileSize
          Size of the whole input file
private  boolean isDone
          Whether processing of this source has finished
private  String key
          Key used to identify this file in the index
private  File path
          Path to the file, or null if it's not a local file.
private  Templates[] preFilters
          XSLT pre-filters used to massage the XML document (null for none)
private  CountedInputStream rawStream
          Input stream for the raw data
private  MARCIndexSource.RecordHandler recordHandler
          Record handling thread
private  int recordNum
           
 
Constructor Summary
MARCIndexSource(File path, String key, Templates[] preFilters, Templates displayStyle)
          Constructor -- initializes all the fields
 
Method Summary
 Templates displayStyle()
          Stylesheet from which to gather XSLT key definitions to be computed and cached on disk.
 String key()
          Obtain a unique key for this input file
 IndexRecord nextRecord()
          Obtain the next record from the file, or null if no more.
private  void openFile()
           
 File path()
          Obtain the path to the file (or null if it's not a local file)
 Templates[] preFilters()
          Obtain set of prefilters to be run, serially in order, on each input record.
 long totalSize()
          Obtain the total size of the source file (used to calculate overall % done).
 
Methods inherited from class Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

path

private File path
Path to the file, or null if it's not a local file.


key

private String key
Key used to identify this file in the index


preFilters

private Templates[] preFilters
XSLT pre-filters used to massage the XML document (null for none)


displayStyle

private Templates displayStyle
Stylesheet from which to gather XSLT key definitions to be computed and cached on disk. Typically, one would use the actual display stylesheet for this purpose, guaranteeing that all of its keys will be pre-cached.

Background: stylesheet processing can be optimized by using XSLT 'keys', which are declared with an <xsl:key> tag. The first time a key is used in a given source document, it must be calculated and its values stored on disk. The text indexer can optionally pre-compute the keys so they need not be calculated later during the display process.


fileSize

private long fileSize
Size of the whole input file


rawStream

private CountedInputStream rawStream
Input stream for the raw data


recordHandler

private MARCIndexSource.RecordHandler recordHandler
Record handling thread


isDone

private boolean isDone
Whether processing of this source has finished


recordNum

private int recordNum


Constructor Detail

MARCIndexSource

public MARCIndexSource(File path,
                       String key,
                       Templates[] preFilters,
                       Templates displayStyle)
Constructor -- initializes all the fields

Method Detail

path

public File path()
Description copied from class: IndexSource
Obtain the path to the file (or null if it's not a local file)

Specified by:
path in class IndexSource

key

public String key()
Description copied from class: IndexSource
Obtain a unique key for this input file

Specified by:
key in class IndexSource

preFilters

public Templates[] preFilters()
Description copied from class: IndexSource
Obtain set of prefilters to be run, serially in order, on each input record.

Specified by:
preFilters in class IndexSource
Returns:
Prefilter stylesheet(s) to run, or null for none.
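
Running prefilters "serially in order" means each stylesheet receives the output of the one before it. The following is a minimal sketch of that chaining using only the standard javax.xml.transform API. It is not XTF's own code: the class PreFilterChainDemo and the helpers renameFilter and applyInOrder are hypothetical names introduced for illustration, and the intermediate results are passed along as strings purely for simplicity.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Templates;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical demo class; not part of XTF.
class PreFilterChainDemo {

    // Compiles a tiny stylesheet that rewrites <from> elements as <to> elements.
    static Templates renameFilter(String from, String to) {
        String xsl =
            "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
          + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
          + "<xsl:template match='" + from + "'>"
          + "<" + to + "><xsl:apply-templates/></" + to + ">"
          + "</xsl:template>"
          + "</xsl:stylesheet>";
        try {
            return TransformerFactory.newInstance()
                .newTemplates(new StreamSource(new StringReader(xsl)));
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    // Applies each prefilter to the output of the previous one.
    // A null array means "no prefilters", so the input is returned unchanged.
    static String applyInOrder(String xml, Templates[] preFilters) {
        if (preFilters == null) {
            return xml;
        }
        String current = xml;
        try {
            for (Templates filter : preFilters) {
                StringWriter out = new StringWriter();
                filter.newTransformer().transform(
                    new StreamSource(new StringReader(current)),
                    new StreamResult(out));
                current = out.toString();
            }
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
        return current;
    }

    public static void main(String[] args) {
        Templates[] filters = {
            renameFilter("a", "b"),   // first pass: <a> becomes <b>
            renameFilter("b", "c")    // second pass sees <b>, produces <c>
        };
        System.out.println(applyInOrder("<a>x</a>", filters));
    }
}
```

The second filter only ever sees the first filter's output, which is why the order of the array matters.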

displayStyle

public Templates displayStyle()
Description copied from class: IndexSource
Stylesheet from which to gather XSLT key definitions to be computed and cached on disk. Typically, one would use the actual display stylesheet for this purpose, guaranteeing that all of its keys will be pre-cached.

Background: stylesheet processing can be optimized by using XSLT 'keys', which are declared with an <xsl:key> tag. The first time a key is used in a given source document, it must be calculated and its values stored on disk. The text indexer can optionally pre-compute the keys so they need not be calculated later during the display process.

Specified by:
displayStyle in class IndexSource
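
For readers unfamiliar with XSLT keys, here is a minimal sketch of a stylesheet that declares a key with <xsl:key> and looks a node up through key(), compiled into a Templates object with the standard javax.xml.transform API. The class XslKeyDemo, the key name rec-by-id, and the sample XML are all made up for illustration. The sketch shows only what a key is; it does not show XTF's on-disk caching of key values, which this page describes but does not detail.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Templates;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical demo class; not part of XTF.
class XslKeyDemo {

    // Declares a key that indexes <record> elements by their id attribute,
    // then uses key() to fetch the title of the record whose id is 'b'.
    static final String STYLESHEET =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:key name='rec-by-id' match='record' use='@id'/>"
      + "<xsl:template match='/'>"
      + "<xsl:value-of select=\"key('rec-by-id', 'b')/title\"/>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Compiles the stylesheet and runs it against the given XML document.
    static String lookupTitle(String xml) {
        try {
            Templates compiled = TransformerFactory.newInstance()
                .newTemplates(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            compiled.newTransformer().transform(
                new StreamSource(new StringReader(xml)),
                new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<records>"
            + "<record id='a'><title>First</title></record>"
            + "<record id='b'><title>Second</title></record>"
            + "</records>";
        System.out.println(lookupTitle(xml)); // prints "Second"
    }
}
```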

totalSize

public long totalSize()
Description copied from class: IndexSource
Obtain the total size of the source file (used to calculate overall % done). If the size is unknown, return 1.

Specified by:
totalSize in class IndexSource
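
As an illustration of how a caller might turn these sizes into an overall percentage, here is a small sketch. The class ProgressDemo and the method overallPercent are hypothetical, and the arithmetic is an assumption about how such a figure could be computed, not a description of XTF's actual progress reporting. It also shows why a placeholder size of 1 is convenient: the total is never zero, so the division is always safe.

```java
// Hypothetical demo class; not part of XTF.
class ProgressDemo {

    // Computes an overall percentage from the totalSize() value of each
    // source and the number of size units processed so far across all of them.
    static int overallPercent(long[] totalSizes, long processed) {
        long total = 0;
        for (long size : totalSizes) {
            total += size;
        }
        if (total <= 0) {
            return 0; // nothing to measure against
        }
        long pct = processed * 100 / total;
        // Clamp, since a source reporting the placeholder size 1 may
        // deliver far more data than its reported size.
        return (int) Math.max(0, Math.min(100, pct));
    }

    public static void main(String[] args) {
        long[] sizes = {1000, 3000};
        System.out.println(overallPercent(sizes, 1000) + "%"); // prints "25%"
    }
}
```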

nextRecord

public IndexRecord nextRecord()
                       throws SAXException,
                              IOException
Description copied from class: IndexSource
Obtain the next record from the file, or null if no more.

Specified by:
nextRecord in class IndexSource
Throws:
SAXException
IOException
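
The "return null when there are no more records" contract leads to a simple consumer loop. The sketch below shows that loop against a hypothetical stand-in interface, RecordSource, so that it can run without XTF on the classpath. The real nextRecord() returns IndexRecord and can also throw SAXException; the stand-in, the helper fromItems, and the string records are illustrative assumptions only.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Queue;

// Hypothetical demo class; not part of XTF.
class NextRecordLoopDemo {

    // Stand-in for the nextRecord() contract: the next record, or null if no more.
    interface RecordSource<R> {
        R nextRecord() throws IOException;
    }

    // Builds a toy source that hands out the given items one at a time.
    static RecordSource<String> fromItems(String... items) {
        Queue<String> queue = new ArrayDeque<>(Arrays.asList(items));
        return queue::poll; // poll() returns null once the queue is empty
    }

    // Pulls records until the source signals the end with null,
    // returning how many records were seen.
    static <R> int countRecords(RecordSource<R> source) {
        int count = 0;
        try {
            R record;
            while ((record = source.nextRecord()) != null) {
                count++; // a real caller would index the record here
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countRecords(fromItems("rec1", "rec2", "rec3"))); // prints 3
    }
}
```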

openFile

private void openFile()
               throws IOException
Throws:
IOException