Result Formatter Programming

The last stage in the crossQuery data flow is formatting the results. Recall that the Query Parser stylesheet parsed the URL parameters into an XTF-compatible query, and the Text Engine then ran that query against the indexed data, producing a list of matching documents. The final task is to put a pretty face on things, and that's where the Result Formatter stylesheet comes in. It transforms the XML list of documents into an easy-to-use HTML result page.

How does XTF know which stylesheet to use? Simple: the Query Parser tells it. The <query> tag that the Query Parser outputs carries a style attribute, which points at the Result Formatter stylesheet that you want XTF to run. Thus, it is quite possible -- and often useful -- to have multiple result formatters for different purposes or display modes, and to program the Query Parser to decide which formatter to run based on a URL parameter. For simplicity, though, we'll assume for now that you have only one formatter.
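
As a rough illustration of choosing a formatter from a URL parameter, a Query Parser could be sketched along the following lines. This is a minimal sketch, not the stylesheet shipped with XTF: the "mode" parameter and the "brief" formatter path are hypothetical names invented for this example, it assumes the Query Parser's input is rooted at the <parameters> block, and the <query> attributes are simply copied from the sample later in this section. It uses XSLT 1.0 features only.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="parameters">

    <!-- Pick the formatter. "mode" and the "brief" path are hypothetical. -->
    <xsl:variable name="style">
      <xsl:choose>
        <xsl:when test="param[@name='mode']/@value = 'brief'">style/crossQuery/resultFormatter/brief/resultFormatter.xsl</xsl:when>
        <xsl:otherwise>style/crossQuery/resultFormatter/default/resultFormatter.xsl</xsl:otherwise>
      </xsl:choose>
    </xsl:variable>

    <!-- The style attribute tells XTF which Result Formatter to run. -->
    <query indexPath="index" termLimit="1000" workLimit="1000000"
           style="{$style}" startDoc="1" maxDocs="10">
      <and field="text" maxSnippets="3" maxContext="100">
        <!-- One <term> per word token of the "text" URL parameter -->
        <xsl:for-each select="param[@name='text']/token[@isWord='yes']">
          <term><xsl:value-of select="@value"/></term>
        </xsl:for-each>
      </and>
    </query>
  </xsl:template>

</xsl:stylesheet>
```

With a sketch like this, a request carrying the hypothetical mode=brief parameter would be routed to the second formatter, and every other request to the default one.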

To accomplish its work, the Result Formatter receives three pieces of data:
  1. First, it receives the same <parameters> block that was passed to the Query Parser. This contains parsed versions of all the URL parameters, in case the Result Formatter wants to act on these as well.
  2. Next, it also receives a copy of the full <query> element that was produced by the Query Parser.
  3. Finally, and most importantly, it receives a list of documents that matched the query. Each <docHit> element contains the document's meta-data, along with snippets of matching text from the main body of that document.
It's easy to view the XML that crossQuery sends to the Result Formatter. Simply append ;raw=1 to the URL, and the servlet will bypass the formatter completely and display the raw XML directly in your browser. A great way to plan your stylesheet is to run some sample queries and look at the raw XML, then try to envision how you want it to look in HTML.
Here's a real-life sample of Result Formatter input, coming from a query for the words "man" and "war". Much of the repetitive information has been snipped out so you can get a quick idea of the structure without getting bogged down in details.
<crossQueryResult queryTime="0.32" totalDocs="8" startDoc="1" endDoc="8">
 
  <parameters>
    <param name="text" value="man war">
       <token value="man" isWord="yes"/>
       <token value="war" isWord="yes"/>
    </param>
    <!-- ...additional URL parameters here... -->
  </parameters>
 
  <query indexPath="index" termLimit="1000" workLimit="1000000"
         style="style/crossQuery/resultFormatter/default/resultFormatter.xsl"
         startDoc="1" maxDocs="10">
     <and field="text" maxSnippets="3" maxContext="100">
       <term>man</term>
       <term>war</term>
     </and>
  </query>
 
  <docHit rank="1" path="default:r1/ft2s2004r1/ft2s2004r1.xml" score="100" totalHits="3">
    <meta>
       <title>Asylia: Territorial Inviolability in the Hellenistic World</title>
       <creator>Kent J. Rigsby</creator>
       <!-- ...more meta-data here... -->
    </meta>
    <snippet rank="1" score="100">inspoliatus : [Sall. ] Resp . 1.2.7, in the civil <hit>
          <term>war</term>
          <term>men</term>
       </hit> fled to Pompey "as debtors use a sacred</snippet>
    <snippet rank="2" score="53">he explains, will win the favor of gods and <hit>
          <term>men</term>, and just <term>wars</term>
       </hit> are defensive. The locus classicus is</snippet>
    <snippet rank="3" score="53">the Roman peace, which ended the state of <hit>
          <term>war</term> among <term>men</term>
       </hit>). More generally, legend told of various</snippet>
  </docHit>
 
  <docHit rank="2" path="default:7d/ft7w10087d/ft7w10087d.xml" score="76" totalHits="6">
    <meta>  ...meta-data here... </meta>
    <snippet rank="1" score="100">the mother of shields." Kunu refers to the <hit>
          <term>war</term> shields <term>men</term>
       </hit> used to fashion from lighter bark, some</snippet>
    <!-- ...more snippets here... -->
  </docHit>
 
  <docHit rank="3" path="default:pf/ft7r29p1pf/ft7r29p1pf.xml" score="76" totalHits="2">
    <!-- ...meta-data and snippets here... -->
  </docHit>
 
  <!-- ...additional document hits here... -->
</crossQueryResult>
Essentially, each matching document will have a corresponding <docHit> tag, and these will be sorted in some order, generally by descending score (relevance). Each document hit contains corresponding meta-data within a <meta> sub-tag. Hits on the full text of the document will have <snippet> tags, each with its own <hit> tag inside it.
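
To make that structure concrete, here is a minimal sketch of a Result Formatter that walks the <docHit> tags and prints one line per document. It is not the default stylesheet from the distribution. It assumes that the meta-data contains <title> and <creator> tags, as in the sample above (your index may define different fields), and it uses XSLT 1.0 features only.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="html" indent="yes"/>

  <!-- One HTML page per result set -->
  <xsl:template match="/crossQueryResult">
    <html>
      <head><title>Search results</title></head>
      <body>
        <p>
          <xsl:value-of select="@totalDocs"/>
          <xsl:text> matching document(s)</xsl:text>
        </p>
        <!-- docHit tags arrive already sorted, so keep document order -->
        <xsl:apply-templates select="docHit"/>
      </body>
    </html>
  </xsl:template>

  <!-- One line per matching document: rank, title, creator, score -->
  <xsl:template match="docHit">
    <div>
      <xsl:value-of select="@rank"/>
      <xsl:text>. </xsl:text>
      <xsl:value-of select="meta/title"/>
      <xsl:if test="meta/creator">
        <xsl:text>, by </xsl:text>
        <xsl:value-of select="meta/creator"/>
      </xsl:if>
      <xsl:text> (score </xsl:text>
      <xsl:value-of select="@score"/>
      <xsl:text>)</xsl:text>
    </div>
  </xsl:template>

</xsl:stylesheet>
```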
A little more formally, the result formatter receives a <crossQueryResult> tag that looks like this:
<crossQueryResult queryTime = "TimeInSeconds"
                  totalDocs = "NumberOfDocs"
                  startDoc  = "FirstDocNumber"
                  endDoc    = "LastDocNumber">
    Parameters
    Query
 
    DocumentHit
    DocumentHit
        …
</crossQueryResult>
Note that, depending on the query and the size of the document repository, there might be thousands of matching documents, and thus thousands of <docHit> tags. Suppose you only want to display the first page of hits, say ten of them. It would be easy to write a Result Formatter that picked the first ten and ignored the rest, but that would be very inefficient, because the XSLT processor would still have to parse and process all of the document hits. A much more efficient way to handle paging is to modify the Query Parser to specify maxDocs="10" in the <query> element; then only the first ten document hits are passed to the Result Formatter, and the user interface is much more responsive.
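
When paging is handled this way, the startDoc, endDoc, and totalDocs attributes of <crossQueryResult> are enough for the formatter to tell the user where they are in the result set. The following is a minimal sketch under that assumption, using XSLT 1.0 features only. It only reports where the next page would start; building an actual "next page" link depends on how your Query Parser reads its URL parameters, so that part is left out.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="html"/>

  <xsl:template match="/crossQueryResult">
    <div>
      <!-- Position of this page within the whole result set -->
      <p>
        <xsl:text>Documents </xsl:text>
        <xsl:value-of select="@startDoc"/>
        <xsl:text> to </xsl:text>
        <xsl:value-of select="@endDoc"/>
        <xsl:text> of </xsl:text>
        <xsl:value-of select="@totalDocs"/>
      </p>

      <!-- Only mention a further page if hits remain beyond this one -->
      <xsl:if test="number(@endDoc) &lt; number(@totalDocs)">
        <p>
          <xsl:text>More results start at document </xsl:text>
          <xsl:value-of select="@endDoc + 1"/>
        </p>
      </xsl:if>
    </div>
  </xsl:template>

</xsl:stylesheet>
```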

Each Document Hit looks like this:
<docHit rank="DocRelevanceRank" path="DocumentLocation" score="DocRelevanceScore">
  <meta>
    Meta-data defined by index Pre-Filter stylesheet
  </meta>
 
  Snippet
  Snippet
    …
</docHit>
The meta-data is copied directly from the tags in the input document marked by the index Pre-Filter stylesheet using the xtf:meta="yes" attribute. If the query targets meta-data fields, these may have <snippet> and/or <hit> tags embedded within them, marking the exact location of the matching terms.
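
One practical consequence is that a formatter which prints a meta-data field with xsl:value-of flattens any embedded <snippet> and <hit> markup into plain text. The sketch below shows one way to keep the matching terms visible instead, by using xsl:apply-templates on the field. It assumes a <title> meta-data field as in the sample above, and the choice of <b> for highlighting is arbitrary. It uses XSLT 1.0 features only.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="html"/>

  <xsl:template match="/crossQueryResult">
    <ul>
      <xsl:apply-templates select="docHit"/>
    </ul>
  </xsl:template>

  <xsl:template match="docHit">
    <li>
      <!-- apply-templates (not value-of) so embedded tags are processed -->
      <xsl:apply-templates select="meta/title"/>
    </li>
  </xsl:template>

  <!-- Highlight matching terms found anywhere inside the meta-data.
       Any <snippet> and <hit> wrappers fall through to the built-in
       template rules, which simply output their text content. -->
  <xsl:template match="meta//term">
    <b><xsl:apply-templates/></b>
  </xsl:template>

</xsl:stylesheet>
```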

If the query targets the "text" field -- that is, the full document text -- then the <docHit> tag will have one or more <snippet> tags containing the matching text and some surrounding context:
<snippet rank="MatchRelevanceRank" score="MatchRelevanceScore">
 
    Hit Text (and context text, if any)
 
</snippet>
Within each snippet will appear a <hit> tag with one or more <term> tags marking the exact matching terms.
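
A minimal sketch of how a formatter might render these tags follows: one template per tag, each handing its content on to the next level. The HTML chosen here (a list item per snippet, a <span> for the hit, <b> for each term) and the class name "hit" are assumptions made for illustration, not what the sample stylesheet does. It uses XSLT 1.0 features only.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="html"/>

  <xsl:template match="/crossQueryResult">
    <html>
      <body>
        <xsl:apply-templates select="docHit"/>
      </body>
    </html>
  </xsl:template>

  <xsl:template match="docHit">
    <h3><xsl:value-of select="meta/title"/></h3>
    <!-- Only full-text snippets, i.e. direct children of docHit -->
    <xsl:if test="snippet">
      <ul>
        <xsl:apply-templates select="snippet"/>
      </ul>
    </xsl:if>
  </xsl:template>

  <!-- Context text is copied through; the hit is handled below -->
  <xsl:template match="snippet">
    <li>
      <xsl:text>... </xsl:text>
      <xsl:apply-templates/>
      <xsl:text> ...</xsl:text>
    </li>
  </xsl:template>

  <!-- The whole matched span, including words between the terms -->
  <xsl:template match="hit">
    <span class="hit"><xsl:apply-templates/></span>
  </xsl:template>

  <!-- Each exact matching term -->
  <xsl:template match="term">
    <b><xsl:apply-templates/></b>
  </xsl:template>

</xsl:stylesheet>
```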
The bulk of the Result Formatter's work will be in transforming all these <docHit>, <meta>, <snippet>, <hit>, and <term> XML tags into meaningful HTML output. Writing XSLT is beyond the scope of this document, but a good way to learn is to begin modifying the sample Result Formatter stylesheet. The stylesheet is included with the XTF distribution in the style/crossQuery/resultFormatter directory.

It should also be noted that the various input tags have bells and whistles not mentioned in this short tutorial. For a full specification, please refer to the Result Formatter Tag Reference.