Solr, finally

It took me about three weeks from the Solr preconference event at code4lib, but I finally have Solr running semi-smoothly with my web application using Cocoon. I didn’t expect it to take so long, but most of that time was spent learning how to use Cocoon (and trying to learn Java). Ideally I would like to have my XQueries send POST and GET requests to Solr, which can be done using Java. However, the Java solution has a much steeper learning curve than the Cocoon solution that I currently have in place. Because the release is only two weeks away, I’m sticking with Cocoon for now, with an eventual move to a Java/XQuery solution.

Here is what my setup currently looks like:

1) A Solr instance on port 8983, with my website running on port 80 on the same machine. Port 8983 is firewalled so no one can come along and wipe out my index with a delete request.

2) An XQuery that pulls data from my METS records for indexing, either a single record or multiple records, depending on the parameters. Using an XSL stylesheet, I generate an XForm with the XQuery results as the instance data section of the form. This form then uses POST to send the data to the Solr index. A second button on the form sends a commit command to Solr.
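For reference, the messages that a form like this POSTs to Solr's XML update handler look roughly like the sketch below. The field names title, creator, subject and text are the ones queried later in this post; the id field, the sample values and the exact update URL are assumptions for illustration, and the real field list depends on the schema.xml in use.

```xml
<!-- Each top-level element below is a separate POST body, sent as text/xml
     to the update handler (assumed here: http://localhost:8983/solr/update). -->

<!-- 1. Add (or replace) one record; repeat <doc> to index several at once. -->
<add>
  <doc>
    <field name="id">mets-0001</field>            <!-- hypothetical unique key -->
    <field name="title">An example title</field>
    <field name="creator">Example, Author</field>
    <field name="subject">Example subject</field>
    <field name="text">Abstract or full text goes here.</field>
  </doc>
</add>

<!-- 2. Commit, so the newly added records become searchable. -->
<commit/>

<!-- 3. The kind of request the firewall on port 8983 guards against:
        this deletes every document in the index. -->
<delete>
  <query>*:*</query>
</delete>
```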

3) A Cocoon pipeline that sends GET requests to Solr and transforms the response using XSL. This feature took me a depressingly long time to figure out, even though I found this thread pretty early on.

One of the problems that I was running into was that I had changed my XSLT transformer from Xalan to Saxon (so I could use XSL 2.0). With Saxon as the transformer I could not get daisy chaining to work (pulling results from one pipeline through another pipeline, or applying multiple transformations). I adjusted my cocoon.xconf and sitemap.xmap to use Xalan as an additional transformer and only call it when using the pipeline below.
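Registering a second XSLT transformer in Cocoon 2.1 generally takes two pieces: a processor component in cocoon.xconf and a transformer entry in the sitemap. The fragments below are a sketch of that pattern rather than a copy of my files. The transformer name xslt-xsltc matches the pipeline in this post, but the role name xsltc, the logger names and the choice of Xalan's XSLTC factory class are assumptions that should be checked against your own Cocoon version.

```xml
<!-- cocoon.xconf: an extra XSLT processor backed by Xalan (XSLTC) -->
<component logger="core.xslt-processor"
   role="org.apache.excalibur.xml.xslt.XSLTProcessor/xsltc"
   class="org.apache.excalibur.xml.xslt.XSLTProcessorImpl">
   <parameter name="use-store" value="true"/>
   <parameter name="transformer-factory"
      value="org.apache.xalan.xsltc.trax.TransformerFactoryImpl"/>
</component>

<!-- sitemap.xmap, inside <map:transformers>: expose that processor
     under the type name used by the search pipeline -->
<map:transformer name="xslt-xsltc"
   logger="sitemap.transformer.xsltc"
   src="org.apache.cocoon.transformation.TraxTransformer">
   <use-request-parameters>false</use-request-parameters>
   <xslt-processor-role>xsltc</xslt-processor-role>
</map:transformer>
```

The default transformer (Saxon in this setup) stays in place for every other pipeline; only steps that say type="xslt-xsltc" go through Xalan.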

The pipeline for handling search requests looks like this:

<map:match pattern="search">
   <map:generate type="request">
      <map:parameter name="generate-attributes" value="true"/>
   </map:generate>
   <map:transform type="xslt-xsltc" src="solr.xsl">
      <map:parameter name="use-request-parameters" value="true"/>
   </map:transform>
   <map:transform type="cinclude" />
   <map:transform type="xslt-xsltc" src="searchResults.xsl" />
   <map:serialize type="xml"/>
</map:match>

solr.xsl transforms the parameters sent from the search form into Solr-style parameters. The cinclude element produced by solr.xsl is then sent to Solr as a GET request (you can also use cincludes to POST data, but I found it more difficult than posting from the XForm). The final XSL stylesheet transforms the results into something attractive for the user.
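To make the last step concrete, here is a minimal sketch of what a searchResults.xsl could look like. It is not my actual stylesheet. It assumes Solr's standard XML response (a result element with numFound and start attributes, containing doc elements whose children carry a name attribute) and it assumes the index stores title and creator fields.

```xml
<xsl:stylesheet version="1.0"
   xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
   xmlns="http://www.w3.org/1999/xhtml">

<xsl:output method="xml" indent="yes"/>

<!-- Wrap the result list in a bare-bones XHTML page. -->
<xsl:template match="/">
   <html>
      <head><title>Search results</title></head>
      <body>
         <xsl:apply-templates select="//result[@name='response']"/>
      </body>
   </html>
</xsl:template>

<!-- Show the hit count, then number the hits from the current offset. -->
<xsl:template match="result">
   <p><xsl:value-of select="@numFound"/> records found</p>
   <ol start="{@start + 1}">
      <xsl:apply-templates select="doc"/>
   </ol>
</xsl:template>

<!-- One list item per hit. Matching on @name works whether Solr returns
     the field as a single-valued <str> or a multi-valued <arr>. -->
<xsl:template match="doc">
   <li>
      <xsl:value-of select="*[@name='title']"/>
      <xsl:if test="*[@name='creator']">
         <xsl:text> / </xsl:text>
         <xsl:value-of select="*[@name='creator']"/>
      </xsl:if>
   </li>
</xsl:template>

</xsl:stylesheet>
```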

Here is what my solr.xsl looks like:

<xsl:stylesheet xmlns:h="http://apache.org/cocoon/request/2.0"
   xmlns:cinclude="http://apache.org/cocoon/include/1.0"
   xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
   xmlns:xs="http://www.w3.org/2001/XMLSchema"
   xmlns="http://www.w3.org/1999/xhtml" version="1.0">

<xsl:strip-space elements="*"/>
<xsl:output media-type="text/xml" method="xml"/>
<xsl:param name="term1"/>
<xsl:param name="field1"/>
<xsl:param name="term2"/>
<xsl:param name="field2"/>
<xsl:param name="term3"/>
<xsl:param name="field3"/>
<xsl:param name="bool1"/>
<xsl:param name="bool2"/>
<xsl:param name="start"/>
<xsl:param name="rows"/>
<xsl:param name="indent"/>
<xsl:template match="/">
   <xsl:variable name="param1">
      <xsl:choose>
         <xsl:when test="string-length(normalize-space($term1)) > 1">
            <xsl:choose>
               <xsl:when test="$field1 = 'kw'">
                  <xsl:value-of select="$term1"/>
               </xsl:when>
               <xsl:when test="$field1 = 'ti'">
                  <xsl:value-of select="concat('title:','(',$term1,')')"/>
               </xsl:when>
               <xsl:when test="$field1 = 'au'">
                  <xsl:value-of select="concat('creator:','(',$term1,')')"/>
               </xsl:when>
               <xsl:when test="$field1 = 'su'">
                  <xsl:value-of select="concat('subject:','(',$term1,')')"/>
               </xsl:when>
               <xsl:when test="$field1 = 'ab'">
                  <xsl:value-of select="concat('text:','(',$term1,')')"/>
               </xsl:when>
               <xsl:otherwise>
                  <xsl:value-of select="$term1"/>
               </xsl:otherwise>
            </xsl:choose>
         </xsl:when>
      </xsl:choose>
   </xsl:variable>
   <xsl:variable name="param2">
      <!-- same as param1, using field2 and term2 -->
   </xsl:variable>
   <xsl:variable name="param3">
      <!-- same as param1, using field3 and term3 -->
   </xsl:variable>
   <xsl:variable name="boolean1">
      <xsl:choose>
         <xsl:when test="string-length(normalize-space($term2)) > 1">
            <xsl:choose>
               <xsl:when test="$bool1 = 'and'"> AND </xsl:when>
               <xsl:when test="$bool1 = 'or'"> OR </xsl:when>
               <xsl:when test="$bool1 = 'not'"> NOT </xsl:when>
               <xsl:otherwise> AND </xsl:otherwise>
            </xsl:choose>
         </xsl:when>
         <xsl:otherwise> </xsl:otherwise>
      </xsl:choose>
   </xsl:variable>
   <xsl:variable name="boolean2">
      <!-- same as boolean1, using term3 and bool2 -->
   </xsl:variable>
   <!-- pulling all the params together -->
   <xsl:variable name="params">
      <xsl:value-of select="concat($param1,' ',$boolean1,' ',$param2,' ',$boolean2,' ',$param3)"/>
   </xsl:variable>
   <!-- Curly braces are attribute value templates, so the variable values
        are substituted into the URL; ampersands must be written as &amp; -->
   <ci:include
      xmlns:ci="http://apache.org/cocoon/include/1.0"
      src="http://localhost:8983/solr/select/?q={$params}&amp;version=2.2&amp;start={$start}&amp;rows={$rows}&amp;indent={$indent}"/>
</xsl:template>
</xsl:stylesheet>
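The three param variables repeat the same field-to-prefix logic, so one way to shorten the stylesheet would be a named template that is called once per term. The version below is a sketch under assumptions, not what I am running: the template name clause and the variable names p1, p2 and q are invented, it handles only two terms, and it always joins them with AND. Like the stylesheet above, it does no URL escaping of the query.

```xml
<xsl:stylesheet version="1.0"
   xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
   xmlns:ci="http://apache.org/cocoon/include/1.0">

<xsl:param name="term1"/>
<xsl:param name="field1"/>
<xsl:param name="term2"/>
<xsl:param name="field2"/>

<!-- Hypothetical helper: turns one field/term pair into a Solr clause,
     or into nothing when the term is too short. -->
<xsl:template name="clause">
   <xsl:param name="field"/>
   <xsl:param name="term"/>
   <xsl:if test="string-length(normalize-space($term)) &gt; 1">
      <xsl:choose>
         <xsl:when test="$field = 'ti'">
            <xsl:value-of select="concat('title:(',$term,')')"/>
         </xsl:when>
         <xsl:when test="$field = 'au'">
            <xsl:value-of select="concat('creator:(',$term,')')"/>
         </xsl:when>
         <xsl:when test="$field = 'su'">
            <xsl:value-of select="concat('subject:(',$term,')')"/>
         </xsl:when>
         <xsl:when test="$field = 'ab'">
            <xsl:value-of select="concat('text:(',$term,')')"/>
         </xsl:when>
         <!-- 'kw' and anything unrecognised: search the default field -->
         <xsl:otherwise>
            <xsl:value-of select="$term"/>
         </xsl:otherwise>
      </xsl:choose>
   </xsl:if>
</xsl:template>

<xsl:template match="/">
   <xsl:variable name="p1">
      <xsl:call-template name="clause">
         <xsl:with-param name="field" select="$field1"/>
         <xsl:with-param name="term" select="$term1"/>
      </xsl:call-template>
   </xsl:variable>
   <xsl:variable name="p2">
      <xsl:call-template name="clause">
         <xsl:with-param name="field" select="$field2"/>
         <xsl:with-param name="term" select="$term2"/>
      </xsl:call-template>
   </xsl:variable>
   <!-- Insert the operator only when both clauses are non-empty. -->
   <xsl:variable name="q">
      <xsl:value-of select="$p1"/>
      <xsl:if test="string($p1) and string($p2)"> AND </xsl:if>
      <xsl:value-of select="$p2"/>
   </xsl:variable>
   <ci:include src="http://localhost:8983/solr/select/?q={$q}&amp;version=2.2"/>
</xsl:template>

</xsl:stylesheet>
```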

For other approaches using Cocoon, check out SolrForrest, flowscripts, or try using the webdav module to talk to REST interfaces.

Resources:

Solr

Cocoon

2 Responses to “Solr, finally”

  1. Solr revisited « the DIL Says:

    […] Pretty much everything I wrote in my previous post about Solr is now obsolete. Up until last Sunday evening I had Solr running with Cocoon. However I […]
