Archive for February, 2007

Georgia bound

February 26, 2007

I’m leaving for Georgia today for the code4lib conference. Although I don’t have everything done that I had hoped to, I can admit that some of my goals were a little unrealistic (learning Java in one short month, for example). The big thing I did not get done was a sample MODS XForms editor for the XForms BOF, mostly because I feel stymied by XForms 1.0 and don’t want to develop the form without the benefit of the 1.1 additions to the spec. I do have some forms in progress, though, and lots of ideas about how the form should or could work.

I did get through the Solr tutorial(s) and have Solr running in both Jetty (for the pre-conference session) and Tomcat (because I’ll have to do it this way for my data eventually). I also wrote a simple XQuery to output my data from eXist in Solr’s XML format. Now I just need a way to post my XQuery output to Solr.
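One possible way to close that last gap, sketched in Python rather than XQuery: Solr's XML update handler accepts an HTTP POST of an `<add>` message with a `text/xml` content type, followed by a `<commit/>`. The URL below is the usual default for the Solr example under Jetty and is an assumption here, as are the function names and the sample record; the field names would have to match whatever is defined in the Solr schema.

```python
import urllib.request

# Assumed default update URL for the Solr example under Jetty; adjust for Tomcat.
SOLR_UPDATE_URL = "http://localhost:8983/solr/update"


def build_update_request(xml_text, url=SOLR_UPDATE_URL):
    """Build (but do not send) a POST request carrying a Solr XML message."""
    return urllib.request.Request(
        url,
        data=xml_text.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )


def post_to_solr(add_xml, url=SOLR_UPDATE_URL):
    """Send an <add> message, then a <commit/> so the documents become searchable."""
    for message in (add_xml, "<commit/>"):
        with urllib.request.urlopen(build_update_request(message, url)) as response:
            response.read()  # Solr replies with a small XML status document


# Hypothetical stand-in for the XQuery output: one record in <add><doc> format.
SAMPLE_ADD = (
    "<add><doc>"
    '<field name="id">example-1</field>'
    '<field name="title">Example title</field>'
    "</doc></add>"
)
```

With Solr running, `post_to_solr(SAMPLE_ADD)` would index the sample record; in practice the string would be the serialized result of the XQuery.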

I can’t promise that I’ll live blog the conference, but I will be posting highlights. It should be a great conference.


XForm Repeats

February 18, 2007

In this entry I talk about one method for dealing with empty repeats. Here is the method I ended up using:

Within the xforms:repeat add the following two xforms:triggers:

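 <!-- Bound only when there is more than one dc:title: delete the current one -->
 <xforms:trigger class="delete" appearance="minimal"
   ref="self::node()[count(//dc:title) > 1]">
	   <xforms:action ev:event="DOMActivate">
		<xforms:delete nodeset="instance('metadata')//dc:title"
		   at="index('repeat.title')"/>
	   </xforms:action>
 </xforms:trigger>

 <!-- Bound only when a single dc:title remains: clear it instead of deleting it -->
 <xforms:trigger class="delete" appearance="minimal"
   ref="self::node()[count(//dc:title) &lt; 2]">
	     <xforms:action ev:event="DOMActivate">
		<xforms:setvalue ref="instance('metadata')//dc:title"
		   at="index('repeat.title')" value=""/>
	     </xforms:action>
 </xforms:trigger>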

The code above essentially creates two if statements, swapping in the appropriate action based on the number of dc:title elements in the document. The ref attribute on the first xforms:trigger only binds the trigger, and so only offers the delete action, if there is more than one title element in the repeat. The second xforms:trigger comes into play if there is only one title element in the repeat; it uses a setvalue action to clear the element without deleting it. This prevents the last dc:title element from being deleted from the document, but there still needs to be at least one dc:title element in the initial document.
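For readers who think more easily in procedural code, the same rule can be modelled outside XForms. This is only an illustration in Python, not part of the form: the function name and the sample record are made up, the titles are assumed to be direct children of one parent element, and the index is 0-based where the XForms index() function is 1-based.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
TITLE = "{%s}title" % DC_NS


def delete_or_clear(parent, index):
    """Remove the title at `index` (0-based), or blank it if it is the only one."""
    titles = parent.findall(TITLE)
    if len(titles) > 1:
        parent.remove(titles[index])  # first trigger: a real delete
    else:
        titles[0].text = ""           # second trigger: keep the node, clear its value
    return parent


# Hypothetical record with two titles.
record = ET.fromstring(
    '<record xmlns:dc="%s"><dc:title>One</dc:title><dc:title>Two</dc:title></record>'
    % DC_NS
)
delete_or_clear(record, 0)  # two titles: the first is removed
delete_or_clear(record, 0)  # one title left: it is emptied, not removed
```

Like the form itself, the sketch fails if there is no title at all to begin with, which is why the initial document still needs at least one dc:title element.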

I prefer this method because I didn’t like inserting an empty “template” element for every repeated element, as in the previous example. I also ran into problems with the delete removing the wrong element in the repeat and, although I’m sure it was just buggy code on my part, I find this to be a simpler solution. Of course, I still have to insert an empty element into the initial data if the selected document doesn’t have the element.

I think both of these solutions are just patches until XForms 1.1 is supported by Firefox. XForms 1.1 has a much more elegant solution to the empty repeat problem which allows the use of a template instance that can be used to insert new elements/nodes. This would eliminate the need to add empty elements and it could also be very handy for simplifying potentially complex forms into manageable interfaces. (I have some thoughts on how to use this with a MODS XForm.)

For those who don’t want to wait for the Firefox extension to catch up (it sounds like the developers have enough to keep them busy and are not looking at implementing the 1.1 spec until it is completed), Orbeon has implemented some of XForms 1.1, including the context and origin attributes that allow inserting elements from a template into an empty node.

A Good Year… for XML

February 16, 2007

Elliotte Rusty Harold over at IBM developerWorks is predicting an exciting year for XML (Ten predictions for XML in 2007), including a nod at XQuery, native XML databases, and XForms, all of which I’m rather heavily invested in. I’ve been pretty comfortable with the choices I made for the CDI. Still, Harold’s predictions are reassuring; it is nice to hear someone else predicting a bright future for the limb I’ve climbed out onto.

OAI data provider

February 7, 2007

I’ve pretty much finished writing my XQuery OAI data provider. The process has taken longer than I expected (particularly since the original XQuery I was using was mostly complete). However, I ended up rewriting most of it, partly to ensure that I had a thorough understanding of the code, and partly to add some features. For example, I wanted to be able to provide unqualified Dublin Core records for all the metadata types we hold in the repository, with the flexibility to easily add more types. Currently the query supports Qualified Dublin Core, MODS, and EAD records, and adding other types should be trivial.

Implementing the data provider is also bringing up some organizational questions. For example, how do I want to support deleted records? How about sets? For our collections I think defining sets as collections of records (rather than an item and its component parts, which is what the METS records do) makes the most sense. For deleted records I’m setting the RECORDSTATUS attribute in the mets:metsHdr to “deleted” and deleting the actual content, metadata, and full text. I haven’t decided how to implement deleted records for the EADs yet, though I think it is unlikely they will be deleted. I will probably use the revisiondesc tag, with the value of the item tag set to “deleted.”
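On the OAI-PMH side, a deleted record is returned as a header carrying status="deleted" with no metadata part, and the Identify response declares the repository's deletedRecord policy. A rough sketch of that mapping from the METS flag, in Python rather than XQuery: the function name, the identifier, and the datestamp are placeholders, and only the RECORDSTATUS convention comes from the description above.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
OAI_NS = "http://www.openarchives.org/OAI/2.0/"


def oai_header(mets_root, identifier, datestamp):
    """Build an OAI-PMH <header>, flagged as deleted when the METS header says so."""
    mets_hdr = mets_root.find("{%s}metsHdr" % METS_NS)
    status = mets_hdr.get("RECORDSTATUS") if mets_hdr is not None else None

    header = ET.Element("{%s}header" % OAI_NS)
    if status == "deleted":
        header.set("status", "deleted")  # harvesters then expect no <metadata> part
    ET.SubElement(header, "{%s}identifier" % OAI_NS).text = identifier
    ET.SubElement(header, "{%s}datestamp" % OAI_NS).text = datestamp
    return header


# Hypothetical METS stub that has been marked as deleted.
sample = ET.fromstring(
    '<mets:mets xmlns:mets="%s"><mets:metsHdr RECORDSTATUS="deleted"/></mets:mets>'
    % METS_NS
)
header = oai_header(sample, "oai:example.org:item-1", "2007-02-07")
```

The same check could key off the EAD revisiondesc instead, once that convention is settled.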

I’m also getting a little hung up on resumptionTokens. I have simple paging in place, but have started to wonder if there might be a better method; the guidelines are a little vague on this.
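One common alternative to simple paging is a stateless token: everything needed to resume the list (cursor position plus the original request arguments) is packed into the token itself, so the provider keeps nothing between requests. The sketch below is in Python and is only one possible design; the function names and the JSON-in-base64 encoding are assumptions, not anything the protocol prescribes. The empty token on the last page and the badResumptionToken error code do come from OAI-PMH.

```python
import base64
import json


def encode_token(cursor, metadata_prefix, set_spec=None, from_date=None, until_date=None):
    """Pack the paging state of a list request into an opaque, URL-safe token."""
    state = {"cursor": cursor, "prefix": metadata_prefix,
             "set": set_spec, "from": from_date, "until": until_date}
    raw = json.dumps(state, sort_keys=True).encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii")


def decode_token(token):
    """Recover the paging state; anything malformed raises ValueError."""
    try:
        state = json.loads(base64.urlsafe_b64decode(token.encode("ascii")))
    except ValueError as exc:  # covers bad base64, bad JSON, and bad encoding
        raise ValueError("badResumptionToken") from exc
    if not isinstance(state, dict) or "cursor" not in state:
        raise ValueError("badResumptionToken")
    return state


def next_token(state, page_size, total):
    """Return the token for the following page, or "" when the list is complete."""
    next_cursor = state["cursor"] + page_size
    if next_cursor >= total:
        return ""  # OAI-PMH: an empty resumptionToken marks the last page
    return encode_token(next_cursor, state["prefix"], state["set"],
                        state["from"], state["until"])
```

The trade-off is that pages can shift if records are added or deleted between requests, which is one reason the protocol lets a token carry an expirationDate.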

Here are a few of the most helpful resources that I used while putting together the data provider.

  1. The Open Archives Initiative Protocol for Metadata Harvesting – I found this resource to be the most helpful in actual implementation; there are lots of examples.
  2. OAI Best Practices [NSDL]
  3. Open Archives Forum Online Tutorial
  4. Exposing and Harvesting Metadata Using the OAI Metadata Harvesting Protocol: A Tutorial
  5. Proai 1.0 – “Proai is a repository-neutral, Java web application supporting the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) version 2.0.” This may be a reasonable alternative to the XQuery that I’m working on, though I’d like to finish it anyway.

There have also been some recent discussions in my department about being an aggregator for Vermont-based digital collections, as there are several institutions that have expressed interest in collaborative projects, or that have content that would mesh really well with the Vermont-centric nature of our current content. So there may be more fun with OAI in my future.