Back in the saddle … sort of

July 4, 2008 by wsalesky

In my last post I said I’d be back in about six weeks; any of my readers who are parents probably had a good chuckle over that one. Hannah turns 3 months today (pics here if you are interested), and I’m finally feeling like I can juggle more than just life with a new baby.

My work status remains a little up in the air. I took a year leave of absence from UVM, with the understanding that I would be doing about 20 hrs worth of consulting work for them while on leave. I haven’t started doing work for UVM yet, but I have started doing some consulting. Currently I’m working on revamping the XSL stylesheets for HTML and PDF displays for the Archivists’ ToolKit. This has been a great project to ease back into working, because while EAD is a bit of a headache, I really enjoy working with XSL. It feels so uncomplicated after XQuery and XForms. Both stylesheets should be done by the beginning of August for the next AT release.

Gone Fishing…

April 14, 2008 by wsalesky

I’ve been a little distracted lately:

My daughter Hannah was born last Friday, and as a result, I’m taking a little break from work and blogging. I’ll be back in oh… six weeks or so.

FireFox 2.0.0.13

March 27, 2008 by wsalesky

I pretty much dread every Firefox upgrade because of the havoc it could wreak on my XForms. The past few upgrades have been problem free, but yesterday’s automatic update has put a halt to metadata production once again. I’m having trouble explaining myself to the disgruntled catalogers who had just gotten back to work.

I’ve e-mailed the list, and an updated version of the extension should be out in a few days. In the meantime I’ve been poking about unsuccessfully, looking for a nightly build that works with FF 2.0.0.13. I’ve also finally installed Orbeon on my development server and ported over the MODS XForm. It was surprisingly painless to get it working. However, the working form is just the first step, since I have a whole metadata management system that has to be integrated. Most of the system is built using xquery, but it could possibly be done entirely in XForms; it will just require some thinking through.

Unfortunately I’m on a bit of a tight schedule: I need to have the forms working reliably as soon as possible because I’m going on maternity leave some time in the next 3 weeks. I’ve got a lot of wrapping up and documenting to do before that time, so I’m not sure how far I’ll get with Orbeon.

**This version (for Windows) of the XForms extension does work with FF 2.0.0.13, just be sure to uninstall your earlier version of the extension before you install the new one. I didn’t do an uninstall, and was having all sorts of problems with submitting data. Everything seems to work just fine now.

Session Solutions

March 12, 2008 by wsalesky

So I came up with two possible solutions to my sessions problem, I’m sure there are others as well.

1) I could store all the xqueries that build my metadata admin interface in eXist with restricted read permissions, so that users would be presented with a login from the server when they arrive at the page. This would initiate the REST session, which would presumably persist as long as the window was active (and thus hopefully bypass the random logout issue).

I haven’t actually tried this one, I just assume it would work. I don’t particularly like the solution because I don’t store most of my xqueries in eXist. However, I haven’t ruled it out.

2) Solution 2 is the one I worked on all day yesterday. First, I’ve gone back to submitting the data via a POST, which sends the data to an xquery; this query then uses the xmldb:store() function to save the data to eXist. The xquery also checks whether there is a current session using session:exists(). If there is not, it sends back “Not logged in!” If there is a current session, the xquery attempts to save the record to the database: if successful, the xmldb:store() function returns the path to the resource; if it fails, it returns an empty string.
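To make the three possible responses easier to see, here is a minimal sketch of that decision flow. It is written in Python purely for illustration; the real handler is an xquery running in eXist. The function name handle_save and the store callback are hypothetical stand-ins for the query and for xmldb:store(), not eXist APIs.

```python
from typing import Callable, Optional

# Message text taken from the post; the form matches on this exact string.
NOT_LOGGED_IN = "Not logged in!"


def handle_save(session_exists: bool,
                store: Callable[[], Optional[str]]) -> str:
    """Return the text the save query would send back to the XForm.

    session_exists stands in for session:exists(); store stands in for
    the xmldb:store() call and is assumed to return the resource path on
    success and nothing useful on failure.
    """
    if not session_exists:
        # No session: do not attempt the save at all.
        return NOT_LOGGED_IN
    path = store()
    # A path means the record was saved; an empty string means the
    # store failed (for example, insufficient privileges).
    return path or ""
```

The form only has to tell these three responses apart: a path (saved), an empty string (no permission), or the “Not logged in!” text (expired session).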

My XForm now uses an event observer to check whether the submit action was successful. If not, it warns the user that they are either not logged in or have insufficient privileges to edit the resource, and toggles them to a login form located at the top of their current page. Users can then log in; a separate event is fired that changes the session or creates a new one, and the user can save the data in the form, all without refreshing or leaving the form.

After much tweaking this seems to work. The nice thing is that having the login on the form saves the user from losing the data that they have been working on.

Here is what the additions to the form look like:

Two new instances, the first to check what sort of error the xquery sends back if it is not successful:

<xforms:instance id="submitError">
  <dummy xmlns="">
    <submit/>
  </dummy>
</xforms:instance>

The second to hold the login information:

<xforms:instance id="login">
  <dummy xmlns="">
    <user/>
    <pass/>
  </dummy>
</xforms:instance>

Additions to the submission action:

<xforms:submission id="submit" method="post" ref="instance('metadata')"
 replace="text" instance="submitError" action="xqueryGoesHere.xql">
   <xforms:action ev:event="xforms-submit-done" ev:observer="submit">
     <xforms:message level="modal">Item Saved!</xforms:message>
     <xforms:refresh/>
   </xforms:action>
   <xforms:action ev:event="xforms-submit-error" ev:observer="submit">
     <xforms:message level="modal"
       ref="instance('submitError')/child::*[. = '']">You do not have permission
       to edit this resource! Please log in.</xforms:message>
     <xforms:message level="modal"
       ref="instance('submitError')/child::*[. = 'Not logged in!']">You are no longer
       logged in! Please log in.</xforms:message>
     <xforms:toggle case="case-login"/>
     <xforms:refresh/>
   </xforms:action>
</xforms:submission>

I’ve then added an xforms:switch to the top of the MODS form that toggles between a login form and summary information about the resource. As you can see in the submission, if there is an error, the top part of the form will toggle to display a login form for the users, case-login. Once the user has entered a name and password, they click login, which fires this submission action:

<xforms:submission id="do-login" method="get" replace="instance"
 instance="login" separator="&amp;" ref="instance('login')" action="login.xql">
  <xforms:action ev:event="xforms-submit-done" ev:observer="do-login">
    <xforms:message level="modal">You are now logged in. Please re-save your data.
    </xforms:message>
    <xforms:toggle case="ready"/>
    <xforms:refresh/>
  </xforms:action>
  <xforms:action ev:event="xforms-submit-error" ev:observer="do-login">
    <xforms:message level="modal">Incorrect login information</xforms:message>
    <xforms:toggle case="case-login"/>
    <xforms:refresh/>
  </xforms:action>
</xforms:submission>

The login.xql looks very similar to this example from the eXist website. If the login is successful the form toggles from the login info to the MODS form info. The user is then prompted to re-save their data as the new user.

I’m still a little nervous putting the catalogers back to work with these forms because I haven’t been able to replicate some of the problems they had reported, but I’m hoping that even if they do get logged out, this new form will allow them to log back in without losing any data. I’ve also finally made it possible to page from one record to the next in the queue; maybe that will buy me a little extra goodwill.

Struggling with sessions

March 10, 2008 by wsalesky

Right before I left for Code4lib 2008 we had a serious db crash. This was an eXist issue, which I’m hoping was partially solved by upgrading to 1.2 as well as by rewriting some xqueries to eliminate most of our temporary fragment creation. We haven’t had any problems with crashing since the upgrade, but a by-product of the crash was some odd behavior reported to me by one of the catalogers: she was getting logged out in the middle of editing a record. Because of the way my forms had been set up, she wasn’t getting any reliable messages about the status of the record (whether it had been successfully saved or not), so she assumed a record had been saved when it hadn’t. Several records were lost and, worse, cataloger goodwill is now a bit more shaky than it was.

I had been submitting the data from our XForms to an xquery via POST; the xquery then did some post-processing and submitted the data to eXist using xupdate. Thinking it would be more straightforward, I switched to using the RESTservlet, which allowed me to do a simple PUT from my XForm. However, I can’t seem to persist my session through the PUT request. My metadata administrative interface is a series of xqueries. The user must log in on the first page using the eXist xmldb:login function; they can then browse through records in process, pull up completed records to edit, and create new collections. However, now when the user presses the save button on a record they are asked to log in again. Once they have logged in from an XForm, they can save as many records as they like, so essentially it creates another session.

After some back and forth on the eXist mailing list it appears that sessions persist across xqueries, but that the RESTserver does not reuse this session information. The suggested solution was to post from my XForm to an xquery… which is what I had been doing. The real problem I’m having is how to get reliable information back to the user on whether or not the save was successful, and if not (due to an expired session) allow them to log back in without navigating away from the form they are on.

I’m wondering if I can use an XForms alert that is tied to an xforms-submit-error response. I haven’t found any examples of this so far, but I’m still looking. Also, I seem to be getting an error message back from my form even when the data has submitted successfully.

I’m racking my brains for a workaround solution, because I’d like to get metadata work back underway, so I have a few weeks to monitor it. Any suggestions, or examples you know of would be greatly appreciated. Otherwise I’ll just carry on testing various solutions until I run across one that works.

XForms for Metadata Creation

February 29, 2008 by wsalesky

I just got back from Code4lib 2008 in Portland, Oregon. Last year I did a lightning talk on XForms; this year I did a full 20-minute presentation. Well, actually I did half a 20-minute presentation. I split the slot with Michael Park from Brown. We had both been working on MODS editors and had been in touch over the past few months about our work. Since our needs and solutions were so different, it made sense to demo both forms and talk about the different possible approaches. Mike is serving a much more diffuse audience than I am, and a server-side solution was absolutely necessary for him. Also, the users targeted for his editor are faculty as opposed to librarians. In his case the full complexity of MODS is less important than ensuring he gets the right minimum data submitted with each record.

Here are the slides, video will be coming eventually, although I need to redo the audio portion of my half of the talk due to a technical snafu.


Doing a joint 20-minute presentation was quite a challenge. Both Mike and I could easily have used the entire 20 minutes, but I think it was a good thing for people to see the two different editors and to be able to talk to us both after the presentation about whichever approach might be most appropriate for their institution.

Both Mike and I have released the code for our editors as open source (Mike’s is quite well documented; I’m still working on documentation). You can see examples of the forms and get the code at the locations below.

UVM

Brown

MODS for everyone

February 18, 2008 by wsalesky

When I started working on the CDI project almost 2 years ago there were no easy solutions for integrating the creation and editing of complex metadata into a project workflow. Most DAMS (Digital Asset Management Systems) came with the ability to create Dublin Core, or some form of modified Dublin Core, but the creation of MODS was not supported.

Today there are a plethora of people working on MODS editor solutions. Here are just a few:

  1. The University of Tennessee Libraries - The University of Tennessee has released their MODS workbook as open source. It is a web-based form that I believe is JavaScript based.
  2. Peter Binkley at the University of Alberta - Peter has just announced that his MODS editor is complete, or very nearly so. It is built using the Cocoon Forms Framework, and you can try it out here.
  3. Michael Park at Brown University - Mike has been working on a MODS editor using Orbeon Forms.
    **edited to add links from our recent presentation:
    Code and documentation: http://dl.lib.brown.edu/its/software/metadata/
    Example: http://riker.services.brown.edu:8080/repo/mods/demo.html
  4. Parmit Chilana formerly at Princeton University - Wrote a MODS editor using Orbeon, I’m not sure who has taken over now that she has moved on.
  5. Clay Redding at the Library of Congress
  6. Me - I just got permission to release our metadata editor as open source from the UVM powers-that-be. The forms need some cleaning up, but I’m posting my demo forms here, and will be slowly adding the code for the entire metadata editing interface, including our forms for posting data to Solr. I’ve also had some time to refine the MODS editor we are using in production, there are two versions, the simple version - used by the copy catalogers - and the full version for editing the full record.
    **links from my recent presentation:
    Code: http://code.google.com/p/xforms4lib/
    Examples:

It’s great to see so much development happening in this area. It always seemed crazy to me that we had so many library metadata standards but no way for our users to create these records. I still prefer the XForms solution to most of the other solutions I’ve seen, just because XForms seems to me the most logical and simple (in spite of all the trouble I’ve had with it) method for creating and editing XML data.

Now, if only someone would build a decent METS editor.

Fixing temporary fragments

February 12, 2008 by wsalesky

Surprise! Or at least to me… My XForms are not the source of the majority of my temporary fragments, at least as far as I can tell. This is good news because the XForms constitute the most complicated part of my web app, and would take the longest amount of time to troubleshoot and fix.

I’ve followed the trail of temporary fragments on my development machine by tracking the eXist logs. After every query/page request I check the logs to see if a temporary fragment has been created. In addition, I’ve set up a little xquery that pulls the results out of /db/system/temp so I can see exactly what the temporary fragment is, allowing me to pinpoint the problems in my xqueries.

The results surprised me, although after a little more reading on eXist and temporary fragments, they make sense. In particular this thread on the eXist mailing list was enlightening. I use the doc() function to return the xml response from Solr and then transform it with XSL stored in eXist. For some reason I was calling the results like this: doc('solrResults')/child::* As noted in the above thread, applying an xpath to the returned fragment causes it to be stored as a temporary fragment. The fix has been easy enough: I simply removed the child::* step and adjusted my XSL.

I have a few other queries that were also creating temporary fragments, and for the most part the issues are very similar: the use of xpath on a returned fragment, rather than using xsl, or even creating variables in the xquery to get the values. For the most part it has been pretty simple to rewrite these queries, and it has also given me a chance to clean up some of my code. In addition, I’ve allocated more memory to Tomcat. Hopefully these two adjustments will alleviate the issues we have been having the past few weeks.

Troubleshooting eXist

January 28, 2008 by wsalesky

For the past few months I’ve been experiencing some mysterious (to me at any rate) problems with eXist crashing. Generally speaking the problems seemed to be due to corrupted index files and/or issues with temporary fragments. The problems seem to intensify the more the metadata editor is used, and I noticed that when I deleted a collection the database would invariably crash, usually the following night. I’ve been limping along by surreptitiously restarting and re-indexing the database, but I have had several instances where the database could not be restored in this way and I had to restore from backup. This meant that the site was down and completely non-functional while I restored the files.

I think the issue is pressing enough that I’m going to have to put a lot of other things on hold to sort it out. I’ve found a few threads (1,2,3) on the mailing list that seem to address the issue, and it looks like some of it can be solved by an improvement of my xqueries, and that the issue of temporary fragments is on the minds of the developers.

I think I’ll start going over my xqueries to eliminate the creation of temporary fragments where I can. This will be a useful exercise anyway, as many of the queries were written before I had a thorough understanding of xquery. Hopefully this will significantly improve performance. Another option could be to move the metadata processing to the development server, and have a master/slave configuration to update the live site every night. Since the biggest problems with crashing/corrupted indexes seem to happen when metadata creation work is highest, this could be a way to cut down interruptions in service on the public side.

I like that UVM has given me the freedom to experiment and use technologies that are not being used elsewhere in the library; however, with the freedom have come some headaches. I do not have colleagues to fall back on for troubleshooting. I understand why small digital projects go with systems like ContentDM: the trade-off in flexibility may be well worth the time saved in other areas.

For every solution a new problem

January 15, 2008 by wsalesky

It seems like every time I solve a problem I create a new one.

As I’ve mentioned several times before, we are using MADS (largely downloaded from OCLC as MARC and transformed to MADS) for authority control. This is great for ensuring consistency during cataloging, particularly as we have moved to a slightly more distributed cataloging workflow. We now have 6 part-time people rather than 2 part-time people working on cataloging. Having more consistent cataloging in terms of authorities has in turn helped with standardizing the faceted browsing of the site.

However I find that I’m now facing two new issues:
1) I now need a MADS form that allows the catalogers to create local records and make changes to existing records. It would also be useful to have an upload feature that allows someone other than me to add new records from OCLC to the database. I knew this would be a need, but for every feature I add to the admin side of the application, time is taken away from the other parts of the project that I should also be working on, including project management, collection development, metadata workflow management, interface refinement and new feature implementation. So it is a balancing act of what needs to be done most urgently, and since it only takes me a minute or two to add or change a record, this may fall pretty far down on my list even though the catalogers are requesting it.

2) I’m also struggling with how to take advantage of the data in these records. Currently we store them in eXist; the catalogers can search the terms, select the appropriate term, and have it added to their record via the nifty subject suggest utility. It is great for the catalogers, but doesn’t really do much for the users of the website.

MADS records are interesting: they have related terms, variants, broader and narrower terms, etc. I’m wondering, should all or at least part of these records be indexed with the items? For example, if I have a record cataloged with “Automobiles”, for the best search results, shouldn’t I also index this record with the variants of this term like “auto”, “car” and “Motorcars”? Do OPACs generally do this? I’m unsure, as I really don’t spend much quality time with our OPAC. To me adding the variants and related terms makes a certain amount of sense, but broader and narrower terms probably don’t.
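As a rough illustration of what indexing the variants alongside the authorized heading might involve, here is a small Python sketch. The sample record is made up for the example (it is not one of our records), it omits the MADS namespace the way the excerpts in this post do, and the function names are hypothetical. Related terms are deliberately left out here, since broader and narrower terms probably don’t belong in the item index.

```python
import xml.etree.ElementTree as ET

# Hypothetical record, written without a namespace for brevity.
RECORD = """
<mads>
  <authority><topic>Automobiles</topic></authority>
  <variant type="other"><topic>Cars</topic></variant>
  <variant type="other"><topic>Motorcars</topic></variant>
  <related type="broader"><topic>Motor vehicles</topic></related>
</mads>
"""


def heading_text(wrapper):
    """Join all text inside an authority/variant element into one string."""
    parts = (text.strip() for text in wrapper.itertext())
    return " ".join(part for part in parts if part)


def index_terms(mads_xml):
    """Return the authorized heading followed by its variant forms."""
    root = ET.fromstring(mads_xml)
    terms = []
    for wrapper in root:
        # Skip <related> and anything else that is not a heading form.
        if wrapper.tag not in ("authority", "variant"):
            continue
        text = heading_text(wrapper)
        if text and text not in terms:
            terms.append(text)
    return terms
```

The resulting list could then be sent to the search index as extra values on the item, so a keyword search on a variant still finds records cataloged under the authorized term.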

Are there other ways I can take advantage of these broader and narrower terms? Perhaps I could have a little icon next to each term on the browse collections pages indicating that variations/broader/narrower terms are available?

While thinking about the above, I’ve noticed some eccentricities in the MADS data that might make taking full advantage of the MADS records a little tricky. For example:

<mads version="beta">
<authority>
<name type="corporate" authority="naf">
<namePart>Burlington (Vt.)</namePart>
<namePart>Dept. of Streets.</namePart>
</name>
</authority>
<variant type="other">
<name type="corporate">
<namePart>Burlington (Vt.)</namePart>
<namePart>Streets, Dept. of</namePart>
</name>
</variant>
<variant type="other">
<name type="corporate">
<namePart>Burlington (Vt.)</namePart>
<namePart>Engineering Division.</namePart>
<namePart>Dept. of Streets</namePart>
</name>
</variant>
</mads>

Nowhere in the above record is Department spelled out. I would think that would be a useful variant to have for a keyword search. Also, I find a lot of corporate names that include geographic subdivisions as part of the name:

<name type="corporate" authority="local">
<namePart>Edmunds Elementary School (Burlington, Vt.)</namePart>
</name>

Wouldn’t it make more sense like this?

<name type="corporate" authority="local">
<namePart>Edmunds Elementary School</namePart>
</name>
<geographic>Vermont</geographic>
<geographic>Burlington</geographic>

Then you could potentially break out the geographic subdivisions for faceting. I guess that gets back to the argument of pre- or post-coordinated headings. All in all I find authority records baffling, but I’m still hoping to make some use of them. I’d be interested in any work that others are doing utilizing authority records to improve access.
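One way to experiment with this without changing the stored records would be to split the trailing parenthetical off at index time. The Python sketch below is only that, a sketch: the function name is hypothetical, it assumes the qualifier is always the last thing in the namePart, and it makes no attempt to decide whether a qualifier is really geographic or to expand abbreviations like “Vt.”.

```python
import re

# A name followed by one trailing parenthetical, e.g. "School (Burlington, Vt.)".
QUALIFIER = re.compile(r"^(?P<name>.+?)\s*\((?P<places>[^()]+)\)$")


def split_qualifier(name_part):
    """Split 'Name (Place, Place)' into the bare name and a list of places.

    Returns the input unchanged with an empty list when there is no
    trailing parenthetical.
    """
    cleaned = name_part.strip()
    match = QUALIFIER.match(cleaned)
    if not match:
        return cleaned, []
    places = [place.strip() for place in match.group("places").split(",")]
    return match.group("name"), places
```

For the heading above this gives the bare school name plus “Burlington” and “Vt.” as separate values that could feed a geographic facet.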