org.cdlib.xtf.textEngine.facet
Class StaticGroupData

Object
  extended by GroupData
      extended by StaticGroupData

public class StaticGroupData
extends GroupData

This class holds, for a given field, the mapping from each document to the one or more term values of that field found in the document.

Author:
Martin Haye

Field Summary
private static WeakHashMap cache
          Cached data.
private  int[] docs
          Array of document IDs
private  String field
          The particular field we have data from
private  int[] groupChildren
          The first child of each group, or -1 for none.
private  int[] groupParents
          The parent of each group, or -1 for none
private  String[] groups
          Array of group names
private  int[] groupSiblings
          The next sibling of each group, or -1 for none.
private  int[] links
          Array of links. Entries 0..docs.length, one per entry in docs, are either positive, indicating the single group for that document, or negative, indicating a link to a list of groups later in the array. Entries docs.length..links.length hold the extra groups: each entry is a group number, made negative to mark the end of the groups for a single document.
 
Constructor Summary
StaticGroupData(IndexReader reader, String field)
          Read in the term data for a given field, and build the arrays that map documents to groups and record the hierarchical relationships between the groups.
 
Method Summary
private  Integer addTermKey(String termText, Vector groupVec, HashMap groupMap, HashMap childMap)
          Add the given term to the group vector and map.
private  void buildHierarchy(HashMap childMap)
          Based on a hierarchy data map, build the parent, child, and sibling relationship arrays that make all this info easy to find and fast to traverse.
private  void buildLinks(HashMap docMap)
          Perform the final build step, forming the 'docs' and 'links' arrays.
 int child(int groupId)
          Get the first child of the given group, or -1 if it has no children
 int compare(int group1, int group2)
          Compare two groups for sort order
 String field()
          Get the name of the grouping field
 int findGroup(String name)
          Locate a group by name and return its index, or -1 if not found
 int firstLink(int docId)
          Return the ID of the first link for the given document, or -1 if there are no links for that document.
static StaticGroupData getCachedData(IndexReader reader, String field)
          Retrieves GroupData for a given field from a given reader.
 int linkGroup(int linkId)
          Returns the group number of the specified link
 String name(int groupId)
          Get the name of a group given its number
 int nChildren(int groupId)
          Get the number of children a group has
 int nextLink(int linkId)
          Return the ID of the link after the specified one, or -1 if there are no more links
 int nGroups()
          Get the total number of groups
 int parent(int groupId)
          Get the parent of the given group, or -1 if the group is the root
 int sibling(int groupId)
          Get the next sibling of the given group, or -1 if there are no more siblings
 
Methods inherited from class GroupData
debugGroups, isDynamic, nDocHits, score
 
Methods inherited from class Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

field

private String field
The particular field we have data from


docs

private int[] docs
Array of document IDs


links

private int[] links
Array of links. Entries 0..docs.length, one per entry in docs, are either positive, indicating the single group for that document, or negative, indicating a link to a list of groups later in the array. Entries docs.length..links.length hold the extra groups: each entry is a group number, made negative to mark the end of the groups for a single document.


groups

private String[] groups
Array of group names


groupParents

private int[] groupParents
The parent of each group, or -1 for none


groupChildren

private int[] groupChildren
The first child of each group, or -1 for none.


groupSiblings

private int[] groupSiblings
The next sibling of each group, or -1 for none.


cache

private static WeakHashMap cache
Cached data. If the reader goes away, our cache will too.

Constructor Detail

StaticGroupData

public StaticGroupData(IndexReader reader,
                       String field)
                throws IOException
Read in the term data for a given field, and build the arrays that map documents to groups and record the hierarchical relationships between the groups.

Parameters:
reader - Where to read the term data from
field - Which field to read
Throws:
IOException

Method Detail

getCachedData

public static StaticGroupData getCachedData(IndexReader reader,
                                            String field)
                                     throws IOException
Retrieves GroupData for a given field from a given reader. A cache is maintained so that if the same field is requested again for the same reader, the group data does not have to be re-read. The method is synchronized so that when many threads request the same data at once, it is loaded only once instead of wasting time and memory on repeated loads.

Parameters:
reader - Where to read the data from
field - Which field to read
Returns:
Group data for the specified field
Throws:
IOException
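
The caching behavior described above can be sketched roughly as follows. The source states only that the cache is a WeakHashMap and that the method is synchronized; the two-level map (reader, then field name), the class name FieldDataCache, and the Supplier-based loader are assumptions made for illustration, not the actual XTF code.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.WeakHashMap;
import java.util.function.Supplier;

/** Hypothetical per-reader, per-field cache; not the actual XTF implementation. */
final class FieldDataCache<V> {
    // Weak on the reader key: once the reader is garbage collected,
    // its whole per-field map becomes eligible for removal as well.
    private final Map<Object, Map<String, V>> cache = new WeakHashMap<>();

    /** Returns the value cached for (reader, field), running the loader at most once. */
    synchronized V get(Object reader, String field, Supplier<V> loader) {
        Map<String, V> perReader = cache.computeIfAbsent(reader, r -> new HashMap<>());
        V data = perReader.get(field);
        if (data == null) {
            // ASSUMPTION: loading happens inside the lock, so concurrent
            // callers wait for the first load rather than repeating it.
            data = loader.get();
            perReader.put(field, data);
        }
        return data;
    }
}
```

One caveat with this design: a cached value must not hold a strong reference back to its reader, or the weak key can never be collected. The real method also declares IOException, which a plain Supplier cannot propagate; a production version would need a loader type that can throw.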

addTermKey

private Integer addTermKey(String termText,
                           Vector groupVec,
                           HashMap groupMap,
                           HashMap childMap)
Add the given term to the group vector and map. If it's hierarchical, add relationships for the parent and all ancestors as well.

Parameters:
termText - Term to add
groupVec - Vector of groups in sort order
groupMap - Mapping of terms to group numbers
childMap - Mapping of parent key to child vector
Returns:
New key for the term
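
To illustrate how adding one hierarchical term can also register its ancestors, here is a minimal sketch. The "::" level separator, the use of generic List and Map types in place of Vector and HashMap, and the class name TermKeySketch are assumptions; the source does not state how hierarchy levels are delimited or how keys are assigned.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of term registration; not the actual XTF implementation. */
final class TermKeySketch {
    // ASSUMPTION: "::" separates hierarchy levels within a term.
    static final String SEP = "::";

    /** Registers the term (and any missing ancestors) and returns the term's key. */
    static Integer addTermKey(String term, List<String> groupVec,
                              Map<String, Integer> groupMap,
                              Map<Integer, List<Integer>> childMap) {
        Integer existing = groupMap.get(term);
        if (existing != null)
            return existing;               // already registered, reuse its key

        // Register the parent first (recursively), so ancestors get lower keys.
        int cut = term.lastIndexOf(SEP);
        Integer parentKey = (cut < 0)
            ? null
            : addTermKey(term.substring(0, cut), groupVec, groupMap, childMap);

        // ASSUMPTION: a key is simply the term's position in the group vector.
        Integer key = groupVec.size();
        groupVec.add(term);
        groupMap.put(term, key);
        if (parentKey != null)
            childMap.computeIfAbsent(parentKey, k -> new ArrayList<>()).add(key);
        return key;
    }
}
```

Under these assumptions, adding "US::California::Berkeley" to empty collections registers three groups ("US", "US::California", and the full term) and records each as a child of the one before it.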

buildHierarchy

private void buildHierarchy(HashMap childMap)
Based on a hierarchy data map, build the parent, child, and sibling relationship arrays that make all this info easy to find and fast to traverse.

Parameters:
childMap - Map of parent key to vector of child keys
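
As a rough illustration of this step, the sketch below turns a parent-to-children map into three flat arrays (parent, first child, next sibling), using -1 for "none" as the field descriptions do. The class name HierarchySketch and the generic map type are assumptions; the actual method works on the class's private arrays.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of building relationship arrays; not the actual XTF code. */
final class HierarchySketch {
    final int[] parent;       // parent of each group, or -1 for none
    final int[] firstChild;   // first child of each group, or -1 for none
    final int[] nextSibling;  // next sibling of each group, or -1 for none

    HierarchySketch(int nGroups, Map<Integer, List<Integer>> childMap) {
        parent = new int[nGroups];
        firstChild = new int[nGroups];
        nextSibling = new int[nGroups];
        Arrays.fill(parent, -1);
        Arrays.fill(firstChild, -1);
        Arrays.fill(nextSibling, -1);

        for (Map.Entry<Integer, List<Integer>> e : childMap.entrySet()) {
            int p = e.getKey();
            int prev = -1;
            for (int c : e.getValue()) {
                parent[c] = p;
                if (prev < 0)
                    firstChild[p] = c;     // head of this parent's child chain
                else
                    nextSibling[prev] = c; // chain the child after the previous one
                prev = c;
            }
        }
    }
}
```

Storing only the first child and the next sibling keeps every relationship in a fixed-size int array, so walking a group's children needs no per-group lists.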

buildLinks

private void buildLinks(HashMap docMap)
Perform the final build step, forming the 'docs' and 'links' arrays.

Parameters:
docMap - Map of document ID to vector of group IDs
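
The 'docs' and 'links' layout can be made concrete with a small sketch. The source describes the layout only in outline, so the exact encoding here is assumed: a multi-group document stores the negated index of its first extra entry, the last group in a run is stored as its bitwise complement (so that group 0 can also be marked), and docs is kept sorted for binary search. The class name LinksEncodingSketch is hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical encoding of the docs/links arrays; the real layout may differ. */
final class LinksEncodingSketch {
    final int[] docs;   // sorted document IDs
    final int[] links;  // head entries (one per doc), then the extra group runs

    LinksEncodingSketch(TreeMap<Integer, List<Integer>> docMap) {
        int n = docMap.size();
        docs = new int[n];
        int[] head = new int[n];
        List<Integer> extra = new ArrayList<>();
        int i = 0;
        for (Map.Entry<Integer, List<Integer>> e : docMap.entrySet()) {
            List<Integer> groups = e.getValue();
            if (groups.isEmpty())
                throw new IllegalArgumentException("document without groups: " + e.getKey());
            docs[i] = e.getKey();
            if (groups.size() == 1) {
                head[i] = groups.get(0);           // single group stored directly
            } else {
                head[i] = -(n + extra.size());     // negated index of the first extra entry
                for (int k = 0; k < groups.size(); k++) {
                    int g = groups.get(k);
                    // ASSUMPTION: the final group of a run is stored as ~g (always negative).
                    extra.add(k == groups.size() - 1 ? ~g : g);
                }
            }
            i++;
        }
        links = new int[n + extra.size()];
        System.arraycopy(head, 0, links, 0, n);
        for (int k = 0; k < extra.size(); k++)
            links[n + k] = extra.get(k);
    }

    /** Decodes the groups of one document, or an empty list if the doc is unknown. */
    List<Integer> groupsOf(int docId) {
        List<Integer> out = new ArrayList<>();
        int slot = Arrays.binarySearch(docs, docId);
        if (slot < 0)
            return out;
        int v = links[slot];
        if (v >= 0) {
            out.add(v);
            return out;
        }
        for (int pos = -v; ; pos++) {
            int g = links[pos];
            if (g < 0) {                           // end-of-run marker
                out.add(~g);
                return out;
            }
            out.add(g);
        }
    }
}
```

The appeal of a layout like this is that the common case, one group per document, costs a single int per document, while only multi-group documents pay for extra entries.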

firstLink

public final int firstLink(int docId)
Return the ID of the first link for the given document, or -1 if there are no links for that document.

Specified by:
firstLink in class GroupData
Parameters:
docId - document to look for
Returns:
the first link ID, or -1 if none

nextLink

public final int nextLink(int linkId)
Return the ID of the link after the specified one, or -1 if there are no more links

Specified by:
nextLink in class GroupData

linkGroup

public final int linkGroup(int linkId)
Returns the group number of the specified link

Specified by:
linkGroup in class GroupData
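
Taken together, firstLink, nextLink, and linkGroup suggest a simple iteration idiom for listing every group a document belongs to. The sketch below shows that loop against a small stand-in interface; the Links interface and the groupsOf helper are hypothetical names introduced so the example compiles on its own.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper showing the link iteration idiom. */
final class LinkWalkSketch {
    /** Stand-in for the three link methods documented above. */
    interface Links {
        int firstLink(int docId);
        int nextLink(int linkId);
        int linkGroup(int linkId);
    }

    /** Collects the group numbers of every link attached to the document. */
    static List<Integer> groupsOf(Links data, int docId) {
        List<Integer> out = new ArrayList<>();
        // Both firstLink and nextLink return -1 when there is nothing further.
        for (int link = data.firstLink(docId); link >= 0; link = data.nextLink(link))
            out.add(data.linkGroup(link));
        return out;
    }
}
```

With a real StaticGroupData instance the same for-loop should apply directly to its firstLink, nextLink, and linkGroup methods, without the stand-in interface.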

field

public final String field()
Get the name of the grouping field

Specified by:
field in class GroupData

nGroups

public final int nGroups()
Get the total number of groups

Specified by:
nGroups in class GroupData

name

public final String name(int groupId)
Get the name of a group given its number

Specified by:
name in class GroupData

parent

public final int parent(int groupId)
Get the parent of the given group, or -1 if the group is the root

Specified by:
parent in class GroupData

nChildren

public final int nChildren(int groupId)
Get the number of children a group has

Specified by:
nChildren in class GroupData

child

public final int child(int groupId)
Get the first child of the given group, or -1 if it has no children

Specified by:
child in class GroupData

sibling

public final int sibling(int groupId)
Get the next sibling of the given group, or -1 if there are no more siblings

Specified by:
sibling in class GroupData
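
The parent, child, and sibling accessors are enough to walk the group hierarchy without any further data structures. As a minimal sketch, the helpers below list a group's direct children and compute its depth; they take the accessors as IntUnaryOperator arguments so the example stands alone, and the names GroupTreeSketch, children, and depth are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntUnaryOperator;

/** Hypothetical traversal helpers built on child/sibling/parent style accessors. */
final class GroupTreeSketch {
    /** Lists the direct children of a group by following the sibling chain. */
    static List<Integer> children(int groupId, IntUnaryOperator child, IntUnaryOperator sibling) {
        List<Integer> out = new ArrayList<>();
        for (int c = child.applyAsInt(groupId); c >= 0; c = sibling.applyAsInt(c))
            out.add(c);
        return out;
    }

    /** Counts how many ancestors a group has (0 for a group with no parent). */
    static int depth(int groupId, IntUnaryOperator parent) {
        int d = 0;
        for (int p = parent.applyAsInt(groupId); p >= 0; p = parent.applyAsInt(p))
            d++;
        return d;
    }
}
```

Given a StaticGroupData instance named data (an assumed variable), a call would look like `GroupTreeSketch.children(g, data::child, data::sibling)`.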

compare

public final int compare(int group1,
                         int group2)
Compare two groups for sort order

Specified by:
compare in class GroupData

findGroup

public final int findGroup(String name)
Locate a group by name and return its index, or -1 if not found

Specified by:
findGroup in class GroupData