Archive for October, 2006

Firefox 2.0 != XForms

October 24, 2006

I downloaded Firefox 2.0 yesterday, and it has some great features; my current favorite is the spell check. However, a word of warning to implementers of XForms who rely on the Firefox XForms extension: the extension needs Firefox 1.5.0.2 and will not work with 2.0 yet. This may not be news to anyone else, as it is stated on the extension website, but I guess I didn’t really pay that much attention until I saw what happened to my XForms after installing 2.0.

I’m holding our metadata processing environment as is for now (using Firefox 1.5.0.2), but may also be investigating yet more options for XForms, including this one: AJAXForms (mentioned by a commenter on the last post).

Update: Firefox 2.0 compatible extension for XForms has been released. See this post for more info.

Getting xforms and eXist talking

October 23, 2006

I used three different methods for implementing XForms with eXist (I am currently only using #3).

  1. Orbeon Presentation Server
  2. An XQuery passed through a Cocoon pipeline, which transformed the XML results using an XSL stylesheet. The edited documents were saved to eXist using REST PUT.
  3. An XQuery that first authenticates the user and then outputs an XForm using the retrieved document as the XForms instance and using REST PUT for submission.

I looked at Orbeon about 6 months ago. It seems like a great product, especially in light of the poor browser support for XForms. If I needed to make a commercially viable XForms application I would probably spend some more time playing with Orbeon. However, I felt that Orbeon added yet another layer of complication to my application, and it was not necessary in our situation because our XForms are being used in a controlled environment (and I don’t have more time). I can be assured that everyone using the editing interface will be using Firefox. Not a great solution, but so far so good.

I generally prefer XSL to XQuery, so my XQueries have, in the past, been pretty simple, returning very raw results to an XSL stylesheet. My first attempt at XForms used XQuery to call the XML document to be edited (or to create a new document). Then I had a Cocoon pipeline to specify the XSL stylesheet for transformation and the serialization option.

<map:match pattern="getDC.xq">
  <map:generate src="getDC.xq" type="xquery"/>
  <map:transform src="stylesheets/dcForms.xsl"/>
  <map:serialize type="xhtml11"/>
</map:match>

In the main sitemap.xmap file (included with eXist), xhtml11 is the serializer that outputs the mime type application/xhtml+xml.

The XSL stylesheet then selects the node I’m interested in and uses xsl:copy-of to populate the XForms instance. The submission element uses PUT to replace the existing document with the edited document.

<xf:model>
  <xf:instance id="metadata">
    <xsl:copy-of select="/child::*"/>
  </xf:instance>

  <xf:submission id="submit" method="put" replace="all">
    <xsl:attribute name="action">
      <xsl:value-of select="concat('/exist/servlet/db/mets/collections/',
          $filename, '.mets.xml')"/>
    </xsl:attribute>
  </xf:submission>

</xf:model>

One of the features that I found the most useful in XForms is the ability to repeat elements as needed. Here is an example from a very simple form that we used to edit Dublin Core records that uses XForms repeat. This field populates the dc:subject element and adds a type attribute selected from a drop-down menu. New subject elements are added or deleted when the users click the add or delete buttons created by xf:trigger. I use xf:setvalue to create each new element as a blank element; otherwise the element will simply copy the data from the first instance of the element.

<xf:repeat id="repeat.subject" nodeset="//dc:subject">
  <xf:input ref=".">
   <xf:label>Subject Headings:</xf:label>
  </xf:input>

  <xf:select1 ref="@type">
   <xf:label>Type: </xf:label>
   <xf:item>
    <xf:label>topic</xf:label>
    <xf:value>topic</xf:value>
   </xf:item>
   <xf:item>
    <xf:label>name</xf:label>
    <xf:value>name</xf:value>
   </xf:item>
  </xf:select1>
  <xf:trigger class="delete" appearance="minimal">
    <xf:label>Remove</xf:label>
    <xf:action ev:event="DOMActivate">
     <xf:delete nodeset="instance('metadata')//dc:subject"
         at="index('repeat.subject')"/>
    </xf:action>
  </xf:trigger>
  <xf:trigger class="add" appearance="minimal">
    <xf:label>Add a subject field</xf:label>
    <xf:action ev:event="DOMActivate">
      <xf:insert nodeset="instance('metadata')//dc:subject"
           at="index('repeat.subject')" position="after"/>
      <xf:setvalue ref="instance('metadata')//dc:subject[last()]" value=""/>
    </xf:action>
  </xf:trigger>
</xf:repeat>

This method works fine, but as I stated in an earlier post, I wanted to route all my metadata processing XQueries through a password-authenticating XQuery. This query calls a series of metadata administrative tasks, including the XForms.

I added the following namespaces for XForms to my xquery:

declare namespace xf="http://www.w3.org/2002/xforms";
declare namespace ev="http://www.w3.org/2001/xml-events";

And the exist:serialize option:

declare option exist:serialize "method=xhtml media-type=application/xhtml+xml";

The XForm is contained in a function called by an authenticating function. I made very few changes to the XForm code I had been using in my XSL stylesheets to get it working when called by the XQuery.
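A minimal sketch of what such a function might look like, assuming standard XQuery 1.0 only. The function name local:edit-form, the sample record, and the single dc:title input are hypothetical and only illustrate the shape; the namespace and serialize declarations shown above would sit in the prolog of the real query. Note that the xf and dc prefixes are declared on the html element itself, so that the XPath in the ref attribute can be resolved by the XForms processor.

```xquery
xquery version "1.0";

declare namespace xf = "http://www.w3.org/2002/xforms";
declare namespace dc = "http://purl.org/dc/elements/1.1/";

(: Hypothetical helper: wraps the document to be edited in an XForms page.
   $record is the root element of the retrieved document; $filename is an
   assumed base name used to build the REST URL for the PUT submission. :)
declare function local:edit-form($record as element(), $filename as xs:string)
    as element()
{
    <html xmlns="http://www.w3.org/1999/xhtml"
          xmlns:xf="http://www.w3.org/2002/xforms"
          xmlns:dc="http://purl.org/dc/elements/1.1/">
        <head>
            <title>Edit {$filename}</title>
            <xf:model>
                <!-- the retrieved document becomes the instance data -->
                <xf:instance id="metadata">{$record}</xf:instance>
                <xf:submission id="submit" method="put" replace="all"
                    action="{concat('/exist/servlet/db/mets/collections/',
                                    $filename, '.mets.xml')}"/>
            </xf:model>
        </head>
        <body>
            <xf:input ref="//dc:title">
                <xf:label>Title:</xf:label>
            </xf:input>
            <xf:submit submission="submit">
                <xf:label>Save</xf:label>
            </xf:submit>
        </body>
    </html>
};

(: Example call with a made-up record; a real query would pass a stored document. :)
local:edit-form(
    <dc:dc><dc:title>Sample record</dc:title></dc:dc>,
    'sample')
```

In a setup like the one described here, the authenticating function would call something like local:edit-form only after the session check has passed.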

Here is a round-up of some of the resources I have found to be most useful in building these forms:

  1. eXist mailing list – search the archives for XForms
  2. eXist – XQuery examples
  3. Cocoon website (pipelines)
  4. XForms tutorial – Adrian de Jonge’s blog
  5. XForms – Tutorials and Cookbook – Wikibooks
  6. XForms for HTML Authors – W3C
  7. O’Reilly XForms Essentials by Micah Dubinko

Function(ing)

October 17, 2006

Well, it turns out that I can no longer get by without understanding how to write my own xquery functions. I finished my simple search xquery, which searches items across collections within the database. I ended up following these tips pretty closely, except I do not store the results in an HTTP session. I plan on updating the search so that it does do so, but I had a little trouble writing this part of the query. I also added a function so that I could page through the results.

For the curious, my version of the simple search looks something like this:

(:calculates the end value for each page of results:)
declare function bh:getEnd($max as xs:integer, $start as xs:integer) as xs:integer{
  let $newEnd := $start + $max
  return $newEnd
};
let $max := 50
(:external parameters:)
let $query := request:request-parameter("query", "")
(:cast to an integer, defaulting to the first hit, so it can be used in arithmetic:)
let $start := xs:integer(request:request-parameter("start", "1"))
(:the search:)
let $results :=
for $hits in collection('/collection')/mets:mets/mets:dmdSec[@ID='dmdDC']
      //descendant::dc:dc[. &= $query]
    let $title := $hits/dc:title[1]
    let $id := $hits/dc:identifier
    let $author := $hits/dc:creator[1]
    let $description := $hits/dc:description
    let $type := $hits/ancestor::mets:mets/@TYPE
    let $result :=
         <dc:dc type="{string($type)}">
          {$id, $title, $author, $description}
         </dc:dc>
    return $result,
     $totalResults := count($results),
     $end := if($totalResults >= $max) then bh:getEnd($max, $start)
             else $totalResults + 1
(:variables used for paging through the results:)
let $prevPg :=
  if ((($start cast as xs:integer) - $max) lt 1) then ''
  else ($start cast as xs:integer) - $max
let $nextPg :=
  if ($end gt $totalResults) then ''
  else $end
(:putting it all together:)
let $searchResults:=
  <results query="{if (empty($query)) then '' else $query}"
  prevPg="{ if (empty($prevPg)) then '' else $prevPg}"
  nextPg="{ if (empty($nextPg)) then '' else $nextPg}"
  total="{$totalResults}" count="{$max}">
    {
     for $i in $start to $end
     let $current := $results[$i]
     return
      <result number="{$i}">{$current}</result>
    }
  </results>

return  $searchResults
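The paging arithmetic can also be pulled out into a single function. The following is a sketch in standard XQuery 1.0, not the query above: local:page and its parameters are hypothetical names, and it uses subsequence() so that each page holds exactly $max hits and consecutive pages do not overlap.

```xquery
xquery version "1.0";

(: Hypothetical helper: returns one page of $results plus paging attributes.
   $start is the 1-based position of the first hit on the page,
   $max is the page size. :)
declare function local:page($results as item()*,
                            $start as xs:integer,
                            $max as xs:integer) as element(results)
{
    let $total := count($results)
    (: last position shown on this page, never past the final hit :)
    let $end := min(($start + $max - 1, $total))
    (: empty sequence (serialized as an empty attribute) when there is no such page :)
    let $prevPg := if ($start le 1) then () else max((1, $start - $max))
    let $nextPg := if ($end ge $total) then () else $end + 1
    return
        <results total="{$total}" count="{$max}"
                 prevPg="{$prevPg}" nextPg="{$nextPg}">
        {
            for $hit at $i in subsequence($results, $start, $max)
            return <result number="{$start + $i - 1}">{$hit}</result>
        }
        </results>
};

(: Example call with made-up data: five hits, page size 2, starting at hit 3. :)
local:page(('a', 'b', 'c', 'd', 'e'), 3, 2)
```

With the five-item test sequence this returns results 3 and 4, with prevPg="1" and nextPg="5".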

I’m moving on to the advanced search tomorrow. I also have several interface design issues outstanding that need to be addressed, some content to create and some sort of news feed to implement, and November is only two weeks away.

One step forward, one step back

October 13, 2006

eXist crashed on Wednesday. Actually, crash is probably the wrong word: it seemed to be running fine but then failed to restart when I restarted Tomcat. We have backups, run early every morning, but that doesn’t help for the data that was entered during the day on Wednesday. More worrisome is that it is still unclear to me what caused the corruption in the database. I found a few discussions on the eXist mailing list that seemed to be about similar problems, but without any satisfactory answers as to why the corruption occurred.

http://thread.gmane.org/gmane.text.xml.exist/5254/focus=5254

http://thread.gmane.org/gmane.text.xml.exist/7161/focus=7248

After a day and a half of trying to figure out what went wrong I caved and wrote to the list. I try to put that off for as long as possible, because while the list is very active, and generally helpful, I hate asking a question and then figuring out the answer myself later (or, I’ll be honest, getting an answer back that makes me feel stupid). I’ve been unable to reproduce the error after replacing the corrupt instance with the one from the backup. I have a feeling it was something I was working on during the morning on Wednesday, which means either my search xquery (which was outputting some Java exceptions), or perhaps some of the xupdates I was using to add new elements to a few hundred documents at once.

Now that we are back up and running I’m returning to the question of my search xqueries, which I think need to be a little more sophisticated. The heart of the current search looks like this:

let $results :=
  for $hits in collection('/db/mets')/mets:mets/mets:dmdSec[@ID='dmdDC']
      //descendant::dc:dc/child::*[self::* |= $_query]
  let $type := $hits/ancestor::mets:mets/@TYPE
  let $title := $hits/parent::*/dc:title[1]
  let $id := $hits/parent::*/dc:identifier
  return
    <item id="{string($id)}" type="{string($type)}">
      <title>{string($title)}</title>
      {$hits}
    </item>

It is problematic because it returns multiple hits for a single document. This is a pretty easy fix to make, but I also ran into a problem with this query when I had over 1000 hits: I encountered a Java error (as noted above), so I will need to rework this. I can limit the number of results returned, or I could use this search to only search collection-level records, not item-level records. Most likely I will go with the first option. I have also had a request to include the author/creator field in the results, which is a minor fix.
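One way to get a single result per document is to loop over the records rather than over the matching fields. The sketch below is an assumption-laden illustration in standard XQuery 1.0: local:search is a hypothetical name, the sample records are made up, and plain contains() stands in for eXist’s full-text operators. It also caps the number of items returned with subsequence() and includes the creator field.

```xquery
xquery version "1.0";

declare namespace dc = "http://purl.org/dc/elements/1.1/";

(: Hypothetical helper: one <item> per matching record, capped at $max.
   contains() is a stand-in for eXist's full-text operators. :)
declare function local:search($records as element(dc:dc)*,
                              $query as xs:string,
                              $max as xs:integer) as element(item)*
{
    let $needle := lower-case($query)
    let $matches :=
        for $record in $records
        (: all fields of this record that contain the search term :)
        let $hits := $record/*[contains(lower-case(.), $needle)]
        where exists($hits)
        return
            <item id="{string($record/dc:identifier[1])}">
                <title>{string($record/dc:title[1])}</title>
                <creator>{string($record/dc:creator[1])}</creator>
                {$hits}
            </item>
    (: limit the number of results handed back :)
    return subsequence($matches, 1, $max)
};

(: Example call with made-up records. :)
let $records := (
    <dc:dc>
        <dc:identifier>rec1</dc:identifier>
        <dc:title>River surveys</dc:title>
        <dc:creator>Smith, A.</dc:creator>
        <dc:subject>Rivers</dc:subject>
    </dc:dc>,
    <dc:dc>
        <dc:identifier>rec2</dc:identifier>
        <dc:title>Mountain maps</dc:title>
        <dc:creator>Jones, B.</dc:creator>
    </dc:dc>
)
return local:search($records, 'river', 10)
```

Here the first record matches in both its title and its subject but still produces only one item, with both matching fields listed inside it. Whether capping the output this way also avoids the Java error is not something this sketch can confirm.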

Update: My answer from the eXist list about the database corruption:

I fixed your issue. It wasn’t a “real” corruption, just removing the .lck files would have helped. As the exception shows, the lock files were damaged.

> org.exist.storage.lock.FileLock.read(FileLock.java:208)
> at org.exist.storage.lock.FileLock.tryLock(FileLock.java:108)
> at org.exist.storage.BrokerPool.canReadDataDir(BrokerPool.java:596)

Anyway, the startup process should handle this. After a database crash, the file locks might be incomplete. eXist will now check this.

So, that is good to know. Also, eXist 1.0 and 1.1 final have just been released, and I may take some time this week to upgrade to 1.1 final.

Interface design, progress notes

October 10, 2006

We have a demo of the prototype scheduled for mid-November. I’m hoping that I will have full functionality by that time, and I’m getting a lot closer. I spend almost as much time designing the interface as I do implementing the design. My original design (the one that got approval) was only for the home page, so I have been designing the internal pages (browse, collection and item level pages) and writing the code for them at the same time.

Here is what I have so far:

  • Home page – The home page is populated with images from (and links to) the 4 most recently added collections. Beneath these “featured collections” is a large browse box with several different avenues for browsing the site. This is currently static information but will be dynamic in the future. There is then the obligatory “about” blurb and the latest news from our non-existent news feed.
  • Collection pages – The collection pages were a challenge. I wanted the pages to contain a brief overview of the collection, and then the full list of items in the collection. I also wanted to make different “filters” for browsing the collection available. So I designed them with a smaller version of the browse box from the homepage that allows the user to browse all items in the collection or limit by genre, topic, people, place, or time. These categories are dynamically generated from the items in each collection. There is also an option to search within the collection.
  • Item pages – I was originally planning on using the METS Navigator from Indiana University as a page turning application, but it is a bit difficult to use with databases because the navigator caches the pages and does not refresh when changes are made to the item. However, the way our METS records are formatted has made it very easy to implement my own version of this application using xquery. The item pages have two parts, the page turning side and the description/metadata. I have also included a “find related materials” box that links to other items tagged with similar places, genres, people, and topics, as well as links back to all of the parent collections.

I’m still working on:

  • The advanced search
  • The browse collections page – I’m still contemplating the xquery I will need to write for this, and haven’t quite figured it out yet.
  • About – This is mostly a content issue. Some of this content will need to be written by committee (our mission statement) and some of it I just haven’t had a chance to write yet.
  • News – There is no news (but I still need to put together the feed, and the query that will call it).

Recovered and moving on

October 5, 2006

Well, the weekend was enough time for me to recover from the extreme frustration I experienced on Friday, and I’m happy to say that my xforms are now being called by a password protected xquery.

As mentioned in my last post, the big hang-up was the mime type (declared in the HTTP header) for the xforms: it has to be application/xhtml+xml (which, by the way, IE 6 does not support). I had already written a very simple xquery and then transformed the data with an XSL stylesheet. However, when I tried transform:stream-transform using this xquery and stylesheet I had no luck forcing the HTTP header to output the correct mime type. So, much to my disappointment, I ended up having to write the entire xform in xquery. There really isn’t any drawback to this other than that I’m much more comfortable coding with XSL than xquery. The forms are working and are now password protected, so I’m feeling pretty good about that. I can continue research into the problematic mime types at a later date if I really want to go back to using XSL.

So what I have ended up with is one long xquery that authenticates the user and then initiates the various pieces of the metadata processing interface. All portions of the interface are called by different functions in this xquery, which has allowed me to learn user-defined xquery functions without too much pain, as I had originally written all of these as separate xqueries. For this xquery I defined each action (for example: view metadata queue, create new collection and edit records) as a separate function and then call them from a main function, which first tests for session authentication and, if it doesn’t find it, presents the user with a log-on screen.
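The structure described above might be sketched roughly like this, in standard XQuery 1.0. All function names, the action values, and the admin.xq form target are hypothetical; in a real query the $authenticated flag would come from the session and $action from a request parameter, rather than being passed in directly as they are here.

```xquery
xquery version "1.0";

(: Hypothetical stand-ins for the real pieces of the interface. :)
declare function local:login-form() as element(form) {
    <form action="admin.xq" method="post">
        <input type="text" name="user"/>
        <input type="password" name="pass"/>
        <input type="submit" value="Log in"/>
    </form>
};

declare function local:view-queue() as element(div) {
    <div>metadata queue</div>
};

declare function local:new-collection() as element(div) {
    <div>new collection form</div>
};

declare function local:edit-records() as element(div) {
    <div>edit records</div>
};

(: Main entry point: check authentication first, then dispatch on the action.
   Unknown actions fall through to the metadata queue. :)
declare function local:main($authenticated as xs:boolean,
                            $action as xs:string) as element()
{
    if (not($authenticated)) then local:login-form()
    else if ($action eq 'new') then local:new-collection()
    else if ($action eq 'edit') then local:edit-records()
    else local:view-queue()
};

(: Example call: an authenticated user asking for the edit screen. :)
local:main(true(), 'edit')
```

Keeping the authentication test in one main function means none of the action functions can be reached without passing through it, which is the point of routing everything through a single query.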

I’m feeling pretty good about my new understanding of user created functions, and may be ready to try something a little more complex, like recursive functions.

Here is the short list of resources I found most useful for learning functions, and also for writing and troubleshooting my xforms:

Xquery Functions (http://www.stylusstudio.com/xquery/xquery_functions.html)

A “how to” for xforms (http://adriaandej.blogspot.com/)

W3C intro to xforms (http://www.w3.org/MarkUp/Forms/2003/xforms-for-html-authors)

And the eXist listserv archives (actually there seems to be a lot of talk on the listserv just in the past few weeks about eXist and xforms).

I also have a rudimentary page turning application up and running, more about that later.

Update: I just wanted to clarify where the problem with the mime type was cropping up. As one commenter noted, it is possible to change the mime type in eXist with declare option exist:serialize "method=xhtml media-type=application/xhtml+xml"; which is what I ended up using for my xquery-based xform. I had originally tried to transform the output from this xquery with an XSL stylesheet, and this is where I “lost” the application/xhtml+xml mime type. I also had an xsl:output statement in my XSL stylesheet that looked like this: <xsl:output omit-xml-declaration="yes" encoding="UTF-8" method="xhtml" indent="yes" media-type="application/xhtml+xml"/>. I got two different results. When I used transform:transform, which goes through eXist, I got the correct mime type but could not get my namespace declarations into the html tag; when I used transform:stream-transform, which bypasses eXist, I had all the correct namespaces in my html tags but the mime type was text/html. I’m guessing the problem lies in a default output setting for eXist and perhaps Tomcat.

eXist and mime types

October 1, 2006

Friday was a bang-your-head-on-the-desk kind of day. Unfortunately the banging didn’t help much. Most of the head banging involved me trying to get xforms to work without using Cocoon.

We are moving beyond the university firewall because I want the opportunity to have former colleagues take a look at what I’ve been doing and to help me troubleshoot if I need it. But I also want to make sure the data processing side of the application is secure, which it is not currently. To do this I am routing all my data processing xqueries through a password authenticating xquery.

I had a lot of trouble getting the xforms functioning in the first place. I have been using cocoon to serve the forms, and while I know I can use cocoon to do session authentication I don’t know how, and would rather just cut cocoon out of the mix entirely if possible. One less thing to keep track of and learn.

So the problem that was arising was that, in spite of the xsl:output declaration, my forms were coming back with a mime type of html, not the application/xhtml+xml that is required for the forms to work. I had also noticed that none of the pages I was creating using XSL were outputting a doctype declaration, although they should have according to my stylesheets (and did when I tested the stylesheets on my machine using oXygen).

I found an answer after much wild goose chasing and searching the eXist listserv. I had been transforming my results with the transform:transform function in my xqueries, which passes the information to the XSL processor (Saxon) and then passes the result back to eXist. This would be handy if I needed to do additional processing, but for most of what I’m doing it seems to be the wrong function to use, because eXist then stripped the doctype declaration and serialized all the results as html. The function I should have been using was transform:stream-transform, which outputs directly to the, ermmm, the web server I guess.

Problem solved. Or mostly. I can now return to the original problem that I started with on Friday, which was learning how to write my own xquery functions.

