Archive for September, 2006

progress notes

September 28, 2006

Short list of progress made on interface functionality:

  • Simple searches implemented, both across collections and within a single collection.
    • Searches do not yet de-duplicate results or apply any sort of relevancy ranking.
    • After the prototype demo I will probably investigate using Lucene for searches. There has been some discussion on the eXist website about how to integrate Lucene with eXist.
  • Browse-by-title page created and formatted. Several other browsing options still need to be implemented, including a main browse page that lists all of them. I’m partial to the LOC’s implementation of this, à la the Making of America website.
  • Collection page created. Some decisions remain about how users will browse within a collection, and how to pull that information as efficiently as possible. The CSS for this page is still a little buggy.
  • Item pages started.
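
To make the de-duplication and ranking gap noted above concrete, here is a minimal sketch in Python rather than XQuery. It is not the code running on the site: the `COLLECTIONS` data, the document ids, and the `search` function are all made up for illustration, and the ranking is a plain term count, not anything Lucene would do.

```python
import re

# Hypothetical stand-in for documents stored in two collections.
# "doc1" deliberately appears in both to show the duplicate problem.
COLLECTIONS = {
    "letters": [
        {"id": "doc1", "title": "Letter to the Editor",
         "text": "A letter about the library and the library budget."},
        {"id": "doc2", "title": "Travel Diary",
         "text": "Notes from a journey by rail."},
    ],
    "pamphlets": [
        {"id": "doc1", "title": "Letter to the Editor",
         "text": "A letter about the library and the library budget."},
        {"id": "doc3", "title": "Reading Room Rules",
         "text": "Rules for the reading room of the library."},
    ],
}


def search(term, collection=None):
    """Search one collection, or all of them when collection is None.

    Returns (id, score) pairs, de-duplicated by id and sorted so the
    documents that mention the term most often come first.
    """
    term = term.lower()
    names = [collection] if collection else sorted(COLLECTIONS)
    scores = {}
    for name in names:
        for doc in COLLECTIONS[name]:
            words = re.findall(r"\w+", (doc["title"] + " " + doc["text"]).lower())
            score = words.count(term)
            # Skip non-matches, and keep only the first copy of a duplicate id.
            if score > 0 and doc["id"] not in scores:
                scores[doc["id"]] = score
    # Highest score first; ties broken by id so the order is stable.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


print(search("library"))
print(search("library", collection="letters"))
```

Calling `search("library")` covers every collection and returns the shared document only once, while passing `collection="letters"` restricts the search to that one collection.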

A little intro to the blog

September 28, 2006

I have been working as a digital initiative librarian for about 6 months, but have worked in various roles on digital library projects for the past 5 years. In my new position I find myself faced with a lot of questions and decisions that in the past were made by others: the metadata librarian, the programmer, the project director, etc. Now I am largely responsible for making these decisions myself. I hope to use this blog to chronicle the process of starting a digital project, including both theory and the actual code that I’m working with.

The first question that I struggled to answer on this project was which of the many available solutions would be right for my institution. We have certain limitations, the biggest being limited staff. In particular, I think not having a programmer is a challenge for a project like this. My main goal in selecting a system was to find something that would allow graceful transitions as the various digital library solutions mature.

These are a few that I looked at (in no particular order):

They all have their strengths and weaknesses. My institution had already been using ContentDM for various projects, but I had some concerns about its limitations on the types of metadata and the range of data types. I think ContentDM’s strength lies in images, not text-based resources, and our first project is heavily text-based. We are also looking for a new home for our EAD finding aids, which were problematic in ContentDM.

The most exciting of the possible solutions were probably Fedora and XTF, or even a combination of the two, which I know several projects are investigating. However, Fedora was simply too much of a bear in terms of the programming needed to get it started, and I had until November to get a prototype up and running. I installed Fedora (Fedora itself, plus Elated and Fez), XTF, Greenstone, ContentDM, and eXist. I tried to work with each system but had a limited amount of time to devote to learning the intricacies of each.

In the end I chose eXist because it is open source, it stores my data as XML, outputs it as XML, etc. I had also had some experience with XQuery in my previous position, and I liked the idea of being able to run the whole web application using only XQuery, XPath and XSL, all of which I was at least familiar with. This alone would speed up development quite a bit. The other advantage is that I can easily export the objects from eXist into another system at a later date. In particular I’m keeping my eye on XTF, with half an eye on Fedora (only half, because I think Fedora is a long way from being an out-of-the-box solution, and I do not have the programming knowledge I think I would need; however, if we hired a programmer…)

The rest of this blog will most likely deal with how I’m building my systems, including the data processing side and the user interface. Currently I’m dealing with XQuery, learning how to use XForms and integrate them into our data processing procedures, information architecture, interface design, and, oh yes, metadata issues.