Miscellany that has been keeping me busy

We are starting to creep up on the end of my initial grant period (in September) and are in the process of spending the remaining money. The library has offered to purchase the CDI’s development server and a machine to be used as storage for our master tiff files. These will be shared resources, at least in that the library will benefit from CDI expertise and server space for other library digital content (this came up because the library is starting to manage a collection of digital theses). This leaves us with enough money to purchase a book scanner and to buy some additional workstations, a laptop for the conference room, or other equipment.

I now have a development machine (just a co-opted desktop running Linux) on my desk, and I spent part of the past week learning how to install Linux and setting up the development environment. I work closely with someone in systems who takes care of our production machine, but he let me do the Linux installation on my own (while watching over my shoulder). It was pretty easy, and I’ve now moved on to more interesting problems, such as solving my URL issues (involving some convoluted issue with mod_jk) and reading up on Subversion. I would like to have the development environment on a branch in Subversion that I can merge to the trunk when it is ready to go live. I assume this is possible, but so far I have been making all of my edits to the trunk, so I will need to read up on this.

In addition, I have been working on leftover details for the web site. I now have the “remove filters” option working (go ahead, try it) and will be looking into zooming for images, continuing to improve the interface, and adding new features.

I’m also still working on the new finding aids site (found here, but still a work in progress). I have a Solr instance set up for the EADs, but I am having trouble indexing documents: namely, outputting all the text nodes in the document only once and with the correct spacing (without writing a hugely complex stylesheet). My brain insists that there is a simple way to do it, but I haven’t managed it yet. Everything I try either outputs some elements multiple times (i.e., a parent and all its children, and then the children again, as it works through the document tree) or does not insert spaces between elements, or both, which makes the output fairly useless for searching.
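To make the goal concrete, here is a minimal sketch in Python of the behavior described above: every text node emitted exactly once, in document order, with a single space between pieces. It is not the stylesheet approach used on the site, only an illustration of the target output. The function name `flatten_text` and the sample XML are made up for the example.

```python
import xml.etree.ElementTree as ET


def flatten_text(xml_string):
    """Return all text in the document once, in document order, space-separated."""
    root = ET.fromstring(xml_string)
    # itertext() walks the tree a single time and yields each text node once,
    # so a parent's children are never emitted a second time.
    joined = " ".join(root.itertext())
    # split() with no arguments collapses runs of whitespace (including the
    # indentation between elements) into single spaces.
    return " ".join(joined.split())


# Hypothetical, heavily simplified EAD-like fragment for illustration only.
sample = """
<c>
  <unittitle>Letters</unittitle>
  <unitdate>1912</unitdate>
</c>
"""

print(flatten_text(sample))
```

Joining with a space before normalizing is the design choice that matters here: adjacent elements with no whitespace between them in the source still end up as separate search terms.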

The CDI is also involved in our first collaborative project. We are collaborating with the Landscape Change project here at UVM to digitize several hundred lantern slides from the Long Trail. I have only seen a few of the images, but it looks like a great collection. For this project the CDI will keep the master tiff files, and then each website will host jpg copies and copies of the metadata. I hate the duplication but am not sure how best to coordinate shared metadata at this point, and I didn’t want to stall the project while we figure that out.

I also have a paper to write and a presentation to come up with. So far I have a title for the presentation (“Innovative Interfaces: making the most of the data we have”) and nothing for the paper.


  1. Kevin S. Clarke Says:

    I’m in awe of the amount of work that you get done!

    For the EAD indexing in solr, I’ve done it without too much bending over backwards (though there is always room for improvement). You can see my attempts at:


    The two main XQs to look at are solr.xq and ead2solr.xq

  2. wsalesky Says:

    Hi Kevin,
    I was actually composing an e-mail to you about your solr/ead implementation, but then I went on vacation and got sidetracked. I guess I can toss that draft e-mail…

    I think I see how you solved my problem (util:string-padded) to add in spaces between elements. Thanks for the code help!

