Increase significance of titles in ranking hits
Introduction:
The preFilters for the various content types extract or produce a minimal set of Dublin Core metadata. preFilterCommon (1) then processes this metadata further to create the sort and browse fields in the index. The goal here is to boost the weight of the “title” field for all document types, so that a document with a search term in its title ranks much higher than a document that has the term only in its body text.
Demonstration:
Steps:
- Do a search (http://localhost:8080/xtf/search) in the workshop content for a common term (e.g., grape). Take note of the document order in the search results, or take a screenshot if you prefer.
- Using your XML editor:
a) Open: %XTF_HOME%\style\textIndexer\common\preFilterCommon.xsl
b) Look for the xsl:choose preceded by the comment “Copy all metadata fields”. This conditional is the mechanism by which the DC elements are copied to the index.
c) Within this xsl:choose, add an xsl:when for the ‘title’ element, e.g.:
<xsl:when test="matches(name(),'title')">
   <title xtf:meta="true" xtf:tokenize="yes">
      <xsl:copy-of select="@*"/>
      <xsl:value-of select="string()"/>
   </title>
</xsl:when>
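d) Now add an xtf:wordBoost (2) attribute to the title field and give it a value of “1000000”. Place it inside the title element, before the xsl:value-of, because XSLT requires attributes to be written before any element content, e.g.: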
<xsl:attribute name="xtf:wordBoost" select="1000000"/>
e) Save the file.
- Shut down Tomcat.
- Do a clean re-index.
- Start up Tomcat.
- Run the same search again and compare the order of the results with what you noted in the first step.
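For reference, steps c) and d) can be combined into a single block. This is only a sketch of how the finished xsl:when might look. It assumes the xtf namespace prefix is already declared in preFilterCommon.xsl (the stylesheet's existing fields use it) and that the block is placed inside the xsl:choose under the “Copy all metadata fields” comment.

```xml
<!-- Sketch: belongs inside the xsl:choose under "Copy all metadata fields" -->
<xsl:when test="matches(name(),'title')">
   <title xtf:meta="true" xtf:tokenize="yes">
      <!-- The boost attribute is written before any text content is added -->
      <xsl:attribute name="xtf:wordBoost" select="1000000"/>
      <!-- Carry over any attributes already present on the source element -->
      <xsl:copy-of select="@*"/>
      <!-- The title text itself -->
      <xsl:value-of select="string()"/>
   </title>
</xsl:when>
```

Note that matches() performs a regular-expression match, so this test is also true for any element whose name merely contains “title”. If that is too broad for your metadata, a stricter test may be worth considering, but check it against the element names your preFilters actually produce.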
Footnotes:
(1) Generally you want to make your changes in the most narrowly focused preFilter stylesheet possible. In this case, however, the change applies to all document types, so we are working in preFilterCommon.
(2) xtf:wordBoost is only one of many special XTF attributes that can be applied to an index field (e.g., xtf:meta, xtf:tokenize, xtf:facet). For a full list and explanation of each, please refer to the XTF Tag Reference.
Next tutorial step:
Exercise 5: Customize the advanced search form, and make it the default