This package handles very large documents that have been indexed in overlapping chunks.
Lucene readily handles documents ranging from a few bytes to perhaps a hundred kilobytes. But throw a 10-megabyte document at it, and things start to break down. For one thing, generating snippets for a number of these documents becomes very slow.
One technique for dealing with these large documents is to index them in small "chunks", for instance, breaking a document into 200-word chunks. In order for proximity queries to still be effective, these chunks need to overlap. For instance, if one queried for "bat man", one would expect to get a hit even if "bat" appears at the end of one chunk and "man" appears at the start of the next.
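To make the overlap idea concrete, here is a minimal sketch of splitting a word list into overlapping chunks. It is not part of this package: the class name OverlapChunker, the method chunk, and the parameters chunkSize and overlap are all hypothetical, chosen only for illustration. A real indexer would more likely work with token positions from an analyzer than with plain strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, for illustration only; not a class in this package.
public class OverlapChunker {

    // Splits words into chunks of at most chunkSize words, where each chunk
    // repeats the last `overlap` words of the chunk before it.
    static List<List<String>> chunk(List<String> words, int chunkSize, int overlap) {
        if (chunkSize <= 0 || overlap < 0 || overlap >= chunkSize) {
            throw new IllegalArgumentException("need 0 <= overlap < chunkSize");
        }
        List<List<String>> chunks = new ArrayList<>();
        int step = chunkSize - overlap; // how far each new chunk advances
        for (int start = 0; start < words.size(); start += step) {
            int end = Math.min(start + chunkSize, words.size());
            chunks.add(new ArrayList<>(words.subList(start, end)));
            if (end == words.size()) {
                break; // the tail is already covered; avoid a redundant chunk
            }
        }
        return chunks;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("the bat man flew over the city".split(" "));
        // Tiny sizes so the overlap is visible; the text above suggests ~200 words.
        for (List<String> c : chunk(words, 4, 2)) {
            System.out.println(String.join(" ", c));
        }
    }
}
```

With this scheme, any run of at most overlap + 1 consecutive words falls entirely inside at least one chunk, so the overlap should be sized to the widest proximity query you expect to support.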
Breaking documents into chunks isn't addressed by this package, but once you've indexed them that way, the classes in this package will help you query the chunked index. Follow these steps: