Solr revisited

Pretty much everything I wrote in my previous post about Solr is now obsolete. Up until last Sunday evening I had Solr running with Cocoon. However, I had all sorts of problems with Cocoon, some stemming from my complete inability to go back to using XSLT 1.0 (which I needed to do in order to take advantage of daisy-chaining), and some stemming from bad (non-HTML) characters in our metadata, most likely pasted in from Word documents.

While I was struggling with Cocoon, this conversation was happening on the eXist listserv, which reminded me that I could use the eXist doc() function to send Solr requests and transform the resulting response. I blame overwork for the time I wasted on Cocoon, since I already use this function to retrieve XSL stylesheets for transformations in nearly every XQuery that I write.

So now my requests to Solr are sent via an XQuery that looks like this:

xquery version "1.0";

declare namespace util="http://exist-db.org/xquery/util";
declare namespace request="http://exist-db.org/xquery/request";
declare namespace x="http://exist.sourceforge.net/dc-ext";
declare namespace xlink="http://www.w3.org/1999/xlink";
declare namespace xslt="http://exist-db.org/xquery/transform";
declare namespace bh="http://cdi.uvm.edu/cdi/ns";

(: Fields for limiting search :)
declare variable $field1 {request:request-parameter('field1', 'ft')};
declare variable $field2 {request:request-parameter('field2', 'ft')};
declare variable $field3 {request:request-parameter('field3', 'ft')};

(: Search terms, with single quotes replaced by double quotes :)
declare variable $term1 {replace(request:request-parameter('term1', ''), "'", '"')};
declare variable $term2 {replace(request:request-parameter('term2', ''), "'", '"')};
declare variable $term3 {replace(request:request-parameter('term3', ''), "'", '"')};

(: Boolean operators :)
declare variable $bool1 {request:request-parameter('bool1', 'and')};
declare variable $bool2 {request:request-parameter('bool2', 'and')};

(: Variables for paging through results :)
declare variable $start {request:request-parameter('start', 0) cast as xs:integer};
declare variable $rows {request:request-parameter('rows', 25) cast as xs:integer};

(: Filters applied to search results :)
declare variable $filter {request:request-parameter('filter', '')};

(: Applies correct Solr field for fielded searching :)
declare function bh:field($field as xs:string) as xs:string {
   if ($field = "au") then
      "creator:"
   else if ($field = "ti") then
      "title:"
   else if ($field = "ab") then
      "abstract_text:"
   else if ($field = "su") then
      "topic_text:"
    else ''
};
(: Builds query parameters as a string :)
declare function bh:build-query() as xs:string {
let $queryString :=
   concat(
        if ($term1 != '') then
           concat(bh:field($field1), $term1)
        else '',
        if ($term2 != '') then
           concat(
             if ($bool1 = 'and' and $term1 != '') then ' AND '
             else if ($bool1 = 'or' and $term1 != '') then ' OR '
             else if ($bool1 = 'not' and $term1 != '') then ' NOT '
             else ' ',
             bh:field($field2), $term2)
        else '',
        if ($term3 != '') then
           concat(
             if ($bool2 = 'and' and ($term1 != '' or $term2 != '')) then ' AND '
             else if ($bool2 = 'or' and ($term1 != '' or $term2 != '')) then ' OR '
             else if ($bool2 = 'not' and ($term1 != '' or $term2 != '')) then ' NOT '
             else ' ',
             bh:field($field3), $term3)
        else '',
        (: Fallback value when no search terms were supplied :)
        if ($term1 = '' and $term2 = '' and $term3 = '') then
           concat('/no-search-terms', ' ')
        else '' )
  return encode-for-uri($queryString)
};
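(: Turns the ';'-separated filter parameter into extra, space-separated query clauses :)
declare function bh:filter() as xs:string {
 if ($filter != '') then
    encode-for-uri(concat(' ', translate($filter, ';', ' ')))
 else ''
};
(: Builds the full Solr request URL. The parameter string is split into separate
   literals so that no line breaks or spaces end up in the URL, and ampersands
   are written as &amp; as XQuery string literals require. :)
declare function bh:fullQuery() as xs:string {
let $searchPath :=
    concat('http://pathtoSolr/solr/select/?q=', bh:build-query(), bh:filter(),
    '&amp;version=2.2&amp;start=', $start, '&amp;rows=', $rows,
    '&amp;facet=true&amp;facet.limit=-1',
    '&amp;facet.sort=true&amp;facet.zeros=false&amp;facet.field=parent_facet&amp;facet.mincount=1',
    '&amp;facet.field=creator_facet&amp;facet.mincount=1&amp;facet.field=coverage_facet&amp;facet.mincount=1',
    '&amp;facet.field=genre_facet&amp;facet.mincount=1&amp;facet.field=topic_facet&amp;facet.mincount=2')
return $searchPath
};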
(: Stylesheet used for display :)
let $xsl := doc('/path/search.xsl')
(: Format results :)
let $results :=
<query-results term1="{$term1}" field1="{$field1}" bool1="{$bool1}"
   term2="{$term2}" field2="{$field2}" bool2="{$bool2}"
   term3="{$term3}" field3="{$field3}" filter="{bh:filter()}">
      {
         if ((exists($term1) and $term1 = '') and (exists($term2) and $term2 = '')
           and (exists($term3) and $term3 = '')) then
             <response hits="0">Your search returned 0 results</response>
         else doc(bh:fullQuery())/child::*
      }
</query-results>
return xslt:stream-transform($results, $xsl, ())

The results are transformed using XSLT (2.0).
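To make the query-building step a little more concrete, here is a minimal stand-alone sketch that can be run without a request context. It is not the code I actually use: local:field() and local:clause() are hypothetical helpers written for illustration, the search terms are made up, and the sketch simply upper-cases the boolean operator rather than checking it against and/or/not the way bh:build-query() does.

```xquery
xquery version "1.0";

(: Hypothetical helper mirroring bh:field(): maps a form field code
   to a Solr field prefix. :)
declare function local:field($field as xs:string) as xs:string {
    if ($field = "au") then "creator:"
    else if ($field = "ti") then "title:"
    else if ($field = "ab") then "abstract_text:"
    else if ($field = "su") then "topic_text:"
    else ""
};

(: Hypothetical helper: appends one fielded clause to the query built so far.
   The operator is only inserted when there is an earlier clause to join to. :)
declare function local:clause($sofar as xs:string, $bool as xs:string,
                              $field as xs:string, $term as xs:string) as xs:string {
    if ($term = "") then $sofar
    else concat(
        $sofar,
        if ($sofar = "") then ""
        else concat(" ", upper-case($bool), " "),
        local:field($field), $term)
};

(: Sample input, made up for this example :)
let $q := local:clause(local:clause("", "and", "au", "darwin"), "not", "ti", "origin")
return ($q, encode-for-uri($q))
```

This returns the raw query, creator:darwin NOT title:origin, followed by the escaped form that goes after q= in the Solr URL, creator%3Adarwin%20NOT%20title%3Aorigin.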

I find this works pretty well, but I’m also very interested in exploring this new HTTP extension model which is pretty much what I was hoping for back when I started exploring the Solr/eXist combination. (Which just demonstrates once again what a great community of developers eXist has.)

Documents are still added to the index using a combination of XQuery and XForms. Next week I’ll be refining our editor to make submitting completed records to the index a one (maybe two) button process. I’m pretty pleased with Solr and have gotten a very positive response to the browsing and limiting features. I still have some features to work on. For example, while my users can add filters to search results, they cannot remove them. This seems like a pretty easy JavaScript fix, but I haven’t really had the time to implement it yet.
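Filter removal could also be handled on the server instead of in JavaScript. The following is only a sketch of that alternative, assuming the filter parameter is a ';'-separated list (which is how bh:filter() treats it). The function local:remove-filter() and the sample filter values are hypothetical.

```xquery
xquery version "1.0";

(: Hypothetical helper: returns the ';'-separated filter list
   with every entry equal to $drop removed. :)
declare function local:remove-filter($filter as xs:string, $drop as xs:string) as xs:string {
    string-join(tokenize($filter, ';')[. != $drop], ';')
};

(: Sample filter values, made up for this example :)
local:remove-filter('genre_facet:Letters;creator_facet:Smith', 'genre_facet:Letters')
```

This returns creator_facet:Smith. Each active filter in the results page could then link back to the search with the shortened list as its filter parameter.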


2 Responses to “Solr revisited”

  1. Peter Binkley Says:

    This is very cool stuff. Do you think you’ll have time at some point to package up a working example of your Solr/eXist framework? You’re way ahead of the pack on XQuery and XForms, and I’m sure a lot of us would love the chance to learn from your code.

  2. wsalesky Says:

    Hi Peter,
    Thanks for the kind words. I’m actually working rather half-heartedly on getting Trac up and running so that I can publish my code in a more systematic way than what I have been doing on the blog. I hadn’t thought of putting together an example of the whole framework (eXist/Solr/XForms) but I suppose I could once I get things working a bit more smoothly here.
