Exercise 8: Hierarchical facets

Create a hierarchical facet

Introduction:

Our goal for this task is to add a new kind of metadata to the XTF instance: a hierarchical facet. A facet is a category of descriptive data for a given set of content. The default interface has a “subject” facet and a “date” facet. A hierarchical facet is one in which the values have different levels, from general to specific. For instance, in the date facet the top level is year, the next level down in the hierarchy is month, and the level below that is day. In this task, we will add a new hierarchical facet for article number that allows users to browse up and down the hierarchy.
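To make the idea of levels concrete, here is a small Python sketch that expands a delimited value into its path from general to specific. It is only an illustration and is not part of XTF; the function name facet_levels is hypothetical, and the double-colon separator is the one used for articleNum in step 1 of this exercise. The year::month::day value is a made-up example.

```python
def facet_levels(value, sep="::"):
    """Return the cumulative hierarchy paths for a delimited facet value."""
    parts = [p.strip() for p in value.split(sep)]
    # Each level is identified by its full path from the top of the hierarchy.
    return [sep.join(parts[: i + 1]) for i in range(len(parts))]


print(facet_levels("56::3"))         # ['56', '56::3']
print(facet_levels("2002::07::15"))  # ['2002', '2002::07', '2002::07::15']
```

Browsing "up" the hierarchy corresponds to moving to an earlier entry in the returned list, and browsing "down" to a later one.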

Steps:

  1. Add some metadata to the sample collection that could be displayed in a hierarchical facet. The sample data set has several PDF articles from the California Agriculture Journal. Edit each of the “.dc.xml” files in data\pdf\calag* to add a new element indicating the issue and article number. A double colon (::) must separate the levels of the hierarchy. For instance, in calag56-3.dc.xml you would add:

    <articleNum>56::3</articleNum>

  2. XTF needs to be told where to find the new metadata and that this metadata is organized as a hierarchical facet. Metadata is declared and defined in:

    %XTF_HOME%\style\textIndexer\common\preFilterCommon.xsl

  3. Open this file in your XML editor.
  4. Find <xsl:template name="add-fields">, and within that template locate the section commented “Create facets”.
  5. Add this line (which is based on the lines for date and subject):

    <xsl:apply-templates
        select="$meta/*[matches(local-name(),'^articleNum$')]" mode="facet"/>

  6. Then copy the template <xsl:template match="*[matches(local-name(),'^subject$')]" mode="facet">, and change “subject” to “articleNum” throughout the copy.
  7. Now we need the Query Parser to recognize the new facet in queries that it processes. Open:

    %XTF_HOME%\style\crossQuery\queryParser\default\queryParser.xsl

  8. Search for the string “hierarchical date facet” to get to the section of call-template statements for the facets. Copy the entire <xsl:call-template> element for the date facet, and substitute “articleNum” for “date” wherever it appears.
  9. Because only metadata and stylesheets have changed, and no documents have been added or altered, the index must be rebuilt from scratch before the new facet becomes available. Shut down Tomcat and then create a clean index.
  10. Start up Tomcat and check that the new facet appears by running a query with debugStep on, which displays the status of each step in processing a query:

    http://localhost:8080/xtf/search?keyword=agriculture;debugStep=4

    Each <docHit> should contain <articleNum> and <facet-articleNum> within its <meta> element. This confirms that the facet has been indexed correctly.

  11. In order to display the new article number hierarchical facet, the resultFormatter.xsl stylesheet must be modified. Open:

    %XTF_HOME%\style\crossQuery\resultFormatter\default\resultFormatter.xsl

  12. Search for “facet-date”. You’ll find only one line:

    <xsl:apply-templates select="facet[@field='facet-date']"/>

    Copy that line and, in the copy, change facet-date to facet-articleNum.

  13. In your browser, query again for “agriculture” and look for the new facet on the left below Date in the formatted results.
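
If the collection has many files, the hand edit in step 1 could be scripted. The following Python sketch is one possible way to do it, under several assumptions that the exercise itself does not confirm: that file names follow the calag56-3.dc.xml pattern, that the two numbers in the name are the two hierarchy levels, and that the new element belongs directly under the root element of each file. The function names and the <dc> root in the demo are hypothetical stand-ins.

```python
import re
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path

# Assumption: file names look like calag56-3.dc.xml, and the two numbers
# are the two hierarchy levels used in step 1 (56::3).
NAME_PATTERN = re.compile(r"^calag(\d+)-(\d+)\.dc\.xml$")


def article_num_for(filename):
    """Derive the hierarchical value from a file name, or None if no match."""
    match = NAME_PATTERN.match(filename)
    return "::".join(match.groups()) if match else None


def add_article_num(path):
    """Append <articleNum> to the root element of one .dc.xml file.

    Returns True if the file was changed. Assumes the new element belongs
    directly under the root element, alongside the other metadata fields.
    """
    value = article_num_for(path.name)
    if value is None:
        return False
    tree = ET.parse(str(path))
    root = tree.getroot()
    if root.find("articleNum") is not None:
        return False  # already present, so the script is safe to re-run
    ET.SubElement(root, "articleNum").text = value
    tree.write(str(path), encoding="utf-8", xml_declaration=True)
    return True


def add_to_all(directory):
    """Process calag*.dc.xml files under a directory; return changed names."""
    return sorted(p.name for p in Path(directory).rglob("calag*.dc.xml")
                  if add_article_num(p))


if __name__ == "__main__":
    # Demo against a throwaway file rather than the real data directory.
    with tempfile.TemporaryDirectory() as tmp:
        sample = Path(tmp) / "calag56-3.dc.xml"
        # Stand-in content; the real files' root element may differ.
        sample.write_text("<dc><title>Sample</title></dc>", encoding="utf-8")
        print(add_to_all(tmp))
        print(sample.read_text(encoding="utf-8"))
```

To use it on the sample collection, call add_to_all with the path to the data\pdf directory. Work on a backup copy first: ElementTree rewrites each file, discards XML comments, and may change whitespace and the XML declaration.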

Next tutorial step:

Exercise 9: Change footnote behavior