Archive for the ‘General’ Category

Gone Fishing…

April 14, 2008

I’ve been a little distracted lately:

My daughter Hannah was born last Friday, and as a result, I’m taking a little break from work and blogging. I’ll be back in oh… six weeks or so.


Quick update

January 8, 2008

I have a steadily growing list of things I need to get finished, but I finally got around to adding RSS (actually Atom) feeds for tracking search results and collections. This will be particularly handy for a collection like the McAllister Photographs, which will be a work in progress for quite some time. (It is currently growing at a rate of about 50-100 photos a week, but I imagine this will slow down once the cataloging staff has caught up to our part-time scanning tech.)
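For the curious, here is a rough idea of what a search-results feed boils down to. This is only a minimal sketch in Python, not the code running on the site; the function name, the record keys (`id`, `title`, `url`, `updated`) and the example URLs are all made up for illustration.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"


def build_atom_feed(feed_id, title, updated, author, entries):
    """Return an Atom feed (as a string) for a list of search results.

    `entries` is a list of dicts with the hypothetical keys
    "id", "title", "url" and "updated" (an RFC 3339 timestamp).
    """
    # Serialize Atom as the default namespace (no "ns0:" prefixes).
    ET.register_namespace("", ATOM_NS)

    def tag(name):
        return f"{{{ATOM_NS}}}{name}"

    feed = ET.Element(tag("feed"))
    ET.SubElement(feed, tag("id")).text = feed_id
    ET.SubElement(feed, tag("title")).text = title
    ET.SubElement(feed, tag("updated")).text = updated
    author_el = ET.SubElement(feed, tag("author"))
    ET.SubElement(author_el, tag("name")).text = author

    for item in entries:
        entry = ET.SubElement(feed, tag("entry"))
        ET.SubElement(entry, tag("id")).text = item["id"]
        ET.SubElement(entry, tag("title")).text = item["title"]
        ET.SubElement(entry, tag("updated")).text = item["updated"]
        ET.SubElement(entry, tag("link"), {"href": item["url"]})

    return ET.tostring(feed, encoding="unicode")
```

Feeding the function the newest records that match a saved search (or that belong to one collection) gives a feed that a reader can poll for new items.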

In the process of adding the feeds I also took the opportunity to upgrade to Solr 1.2, which was a pretty painless upgrade with some nice additional functionality. I hope to get a chance to install 1.3 on my development machine next week to explore the MoreLikeThis functionality. I’d like to use this feature on the item pages, allowing users to get some immediate related items, in addition to using the subject and geographic headings to get related items.
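I haven't tried it yet, but from the Solr documentation a MoreLikeThis request looks like an ordinary search with a few extra `mlt.*` parameters. Here is a small Python sketch of building such a request for an item page. The Solr address, the `id` field and the `subject`/`geographic` field names are assumptions for illustration, not our actual schema.

```python
from urllib.parse import urlencode


def more_like_this_url(solr_base, item_id, fields, count=5):
    """Build a Solr select URL that asks for items similar to one record.

    solr_base: base URL of the Solr instance (assumed, e.g. a local dev box)
    item_id:   value of the unique key field, assumed here to be "id"
    fields:    fields Solr should compare when looking for similar items
    """
    escaped_id = item_id.replace('"', '\\"')
    params = {
        "q": f'id:"{escaped_id}"',      # find the item being displayed
        "mlt": "true",                   # turn on MoreLikeThis
        "mlt.fl": ",".join(fields),      # fields used to judge similarity
        "mlt.count": count,              # number of related items to return
        "mlt.mintf": 1,                  # minimum term frequency in the item
        "mlt.mindf": 1,                  # minimum number of docs with the term
    }
    return f"{solr_base.rstrip('/')}/select?{urlencode(params)}"


url = more_like_this_url(
    "http://localhost:8983/solr", "cdi-42", ["subject", "geographic"]
)
```

The low `mlt.mintf` and `mlt.mindf` values are a guess that suits a small collection with short records; they would need tuning against real data.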

Wrapping up

November 16, 2007

As of November 1st the CDI has been flying sans grant money. I was hired onto the project 18 months ago to start up a digital initiatives program and launch a pilot project with a website, preliminary collections and the infrastructure for future growth. It has been a busy 18 months, and I think we accomplished a lot, although perhaps not everything that was in the original proposal. This past week has been filled with administrative wrap-up details, including writing a final report on our activities during the grant period. I tend to think more in lists, so here is the wrap-up of what the CDI has gotten done in the past 18 months.

  • CDI office space and digital photography studio designed and built
    • Equipment selected, purchased and installed
  • Backend –
    • Evaluation of available Digital Asset Management Systems (DAMS)
      • Selection of eXist
    • Built metadata administrative interface
      • Dublin Core XForm
      • MODS XForm
      • Solr XForm
      • Authority Control with MADS
    • Creation of metadata workflow (still a bit of a work in progress)
    • Data dictionaries for Dublin Core, MODS and TEI template for transcribed letters.
    • OAI-PMH data provider (with some help)
  • Built web interface
    • Initial design of web interface
    • Site designed and implemented for the EAD collection
    • User testing and design adjustments
    • Implementation of Solr for faceted searching and browsing
    • RSS for news, and search results (still in progress)
  • Steering committee formed
  • Metadata working group formed
  • Content selection committee formed
    • Creation of content development policy and evaluation matrix
    • 6 live collections with around 670 completed records and more than 8000 pages scanned
    • Scanning work on two additional high use photograph collections
  • Several presentations, including an upcoming presentation at the New England Archivists meeting in March 2008
  • A paper in the fall 2007 issue (not out yet) of Microform & Imaging Review

In addition to the final report I’ve been working on getting the MODS editor into production for our next collection. We are trying a new approach to metadata, and bringing in more staff from cataloging to work on metadata creation, so it will be a good test of the forms. I’ve also been exploring what it would take to make the MODS XForm available as open source. It seems to involve some paperwork, some waiting, and assurance to the University that the code is not commercially viable. It has been a busy week.

Book Scanner

August 9, 2007

Our i2s CopiBook has arrived:




Pictures courtesy of Toni Fortini and her fancy new cell phone.

We are still experimenting with it, but so far my favorite feature is the semi-automation: set a book on the scanner, position it, get all your settings ready, and do your first scan. When the scan is done the glass pops open, allowing you to turn the page, and the next scan is taken as soon as the glass is pressed back into place. Set up and training took a full day, but most of that was spent on the lights, which necessitated a trip to Home Depot, a few hours of fiddling, and some tech support to get the lighting profile calibrated correctly. The scans look great though, and the scanner is generally very easy to use. Hopefully we will be getting the machine into production next week.

Presentation woes

July 9, 2007

I have a presentation to put together for July 20th. Originally I was asked to present something about the changing face of the catalog (kind of like this post, where I pointed to some efforts to un-suck the OPAC by creating new interfaces, mash-ups and more). I demurred, because I don’t actually know a whole lot about the OPAC, my everyday activities rarely necessitate any interactions with the OPAC, and I think I have more of an end user’s view of the library website than a librarian’s. I proposed the following presentation as an alternative: “Innovative Interfaces: making the most of the data we have.”

The presentation should be about 30 to 45 minutes long (gulp), and I will be following ALA mover and shaker Meredith Farkas, who is presenting “Social Software in Libraries.” I was just going to whip through a bunch of examples of libraries that are making the most of their data, such as NCSU, Penn Tags, Ann Arbor Public Library, Villanova’s myResearch Portal, BibApp, LibraryThing, maybe Evergreen, etc. However, I went through some of Meredith’s slides from former presentations on social software in libraries and, I hate to say it, it looks like she covers quite a bit of what I was going to talk about, and much more.

I was actually asked to speak because the other presenter, who was going to talk about LibraryThing, was unable to make it and I work with the woman who is organizing the event. I wonder if I can manipulate my presentation to be about XForms instead? Or Solr, or anything else I actually deal with on a day-to-day basis. However, I was told I should probably keep my presentation non-technical, so I have my doubts that any of those topics would be a good match.

I suppose I could talk about the importance of interface design and information architecture. Although information architecture sounds almost anti library 2.0, I think good design, both graphic design and information architecture, is key to a successful interface. Here is an example I worked on recently:

This is the same information, but with a new layout that separates the information into task-based groups and uses basic graphic design elements such as color, icons and lots of white space to make important information stand out. This information is very non library 2.0, but the same principles apply to most interface design issues, even for interfaces that can be manipulated by the users. This doesn’t exactly qualify as an “innovative interface” though.

So, I’m stuck for a topic. In the meantime, my to-do list is steadily growing.

Project management with Basecamp

July 5, 2007

We have started using Basecamp for project management. Currently there are only two of us using it and we are mostly using it for the to-do lists. So far I like it. It is a helpful way to keep an eye on the CDI as a whole project; sometimes I get caught up in the backend stuff and don’t give the metadata or collection development the attention it needs. Basecamp also allows us to assign tasks to each other. If, for example, Chris runs across a broken feature, or I find some bad metadata, we can just create a new task and assign it to the appropriate person. Basecamp will then send them an e-mail notification that the task has been added to their to-do list.

Project management has been a bit of an issue for us lately. There has been a lack of clarity in who is doing what and how it is being overseen. I can see how Basecamp can help with some of these issues, particularly in the “who is doing what” arena. It may also become very useful as we get more people working on the project. For example, it might be helpful in assigning metadata on a collection basis to our cataloger(s) and also as a way of tracking collections as they move from collection development, through selection and scanning, to the metadata entry phase. We could use it to keep track of what projects we have scheduled and who is working on each phase of the project.

If nothing else the to-do lists have been motivational. There is something about seeing all those items with little check boxes next to them that makes me want to get things done.

Getting back on track

May 25, 2007

The combination of the launch craziness (I know, it was over a month ago), a week of vacation, and the R2 recommendations has left me a little disorganized. I still have lots to do, but now that the launch is over I’m having some trouble prioritizing it all.

Here’s what my list looks like so far (in no particular order):

  • Get back to the metadata processing side of the CDI
    • Create a METS editor so we can finally get rid of that Access DB
    • Continue work on the MODS XForm in order to liberate our descriptive metadata from Dublin Core. (Check out the xforms@code4lib wiki to see some forms in process from UVM (Firefox extension) and Princeton (Orbeon))
    • Create a one/two button method for sending completed records to Solr, from the descriptive metadata form. Currently I send either a collection at a time, or the entire database at once. This was fine for getting started, and I could have the script run every night to collect newly added items but I would rather have the records get added to the index when their status is changed to “complete.” I would really like to be able to submit two instances simultaneously in my XForm, so that when an item is saved, and marked as complete it saves the record to eXist and also sends it to Solr. Unfortunately I haven’t found any examples of this, and am not sure it can actually be done (with XForms), so we may end up with a two button approach.
    • Create some interface for indexing and managing EADs
    • Test the XQuery OAI data provider, and register the CDI collections with OAI harvesters
    • Solve the pesky URL issue. I’d like to set the eXist webapp as my root directory, thus eliminating it from the URL altogether. I have no problem doing this with Tomcat, but once you add Apache into the mix bad things happen. In general my URLs are not very user friendly, and I’m wondering about fixing that. I’m not sure what I would need to do, but I should at least look into it. Obviously this needs to be done sooner rather than later (and should have been resolved before the launch).
  • Finish the new Finding Aids site (which had a stealth release a few weeks ago)
    • Finish configuring Solr for the EADs
    • Add a FOP processor to the server so we can use the XSL-FO stylesheets that I spent so much time on at PU.
    • Fix the problem of really large EAD files causing out of memory errors (maybe by breaking up the files, or by increasing the memory allocation in the eXist config file)
    • Work with the Curator of Manuscripts to create additional browsing/searching options for the new finding aids site.
  • Work on new additions to the front end
    • Allow users to remove filters from their “narrowed” Solr searches (kind of like this)
    • Add faceted browsing to all the browse pages, including the browse collections page
    • Add a news feed
    • Investigate image zooming options: JPEG2000, Zoomify, etc.
    • Add user generated tags
    • Add commenting
  • Find a web stats program that I like, and figure out how I want it configured (happily I don’t actually have to do the configuring)
  • Workflow management – This is a big one, but not something that I can do alone.
  • Look at integrating JHOVE into our workflow
  • Get a development server up and running (high priority, but heavily dependent on the next point)
  • Get a budget quote, and work on purchasing a server (for image storage) and a book scanner. This is a group project and may require some field trips. Fun!
  • Start working on partnerships with interested departments/faculty/organizations
  • Start working on migrating legacy projects into the CDI
  • Clean up the mess I made in developing the CDI so that it would be possible to pack up the whole system for other people to take a look at. CTL and Academic Computing here at UVM have expressed an interest in using the eXist XForms combo for some of their projects.
  • Continue working on information architecture for the library wide redesign
  • Eat chocolate, lots of chocolate
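To make the "send completed records to Solr" item above a little more concrete, here is a minimal Python sketch of the Solr half of the problem: turning records marked “complete” into the `<add>` message that Solr's XML update handler accepts. It is not what the XForm does today. The record layout (a `status` key plus a `fields` dict) and the field names are hypothetical.

```python
import xml.etree.ElementTree as ET


def solr_add_message(records):
    """Build a Solr XML <add> message from records marked "complete".

    Each record is assumed (for illustration only) to look like:
        {"status": "complete", "fields": {"id": "...", "subject": [...]}}
    Records with any other status are skipped.
    """
    add = ET.Element("add")
    for record in records:
        if record.get("status") != "complete":
            continue  # drafts and in-progress records stay out of the index
        doc = ET.SubElement(add, "doc")
        for name, value in record["fields"].items():
            # Multi-valued fields become repeated <field> elements.
            values = value if isinstance(value, list) else [value]
            for v in values:
                ET.SubElement(doc, "field", {"name": name}).text = str(v)
    return ET.tostring(add, encoding="unicode")
```

The resulting string would be POSTed to Solr's update handler and followed by a `<commit/>` message. Both steps are left out here, since the open question in the list is how to trigger the send from the form, not what gets sent.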

I was also strongly encouraged by my supervisor to take a day a week to work on “scholarship and creative activities” (trying to get published). I’m kind of ambivalent about publishing, but there is no ambivalence about it at my library; if you want a promotion you will need to publish. Preferably you will publish (in peer-reviewed journals), give talks, and be on several regional/national service committees. So in addition to the list above, I guess I’ll be trying to put together a paper or two.

Library Realignment

May 11, 2007

The library recently hired a consulting group to help us evaluate our workflow and think about restructuring and repurposing existing staff openings to best fit the changing library model. We got the report back yesterday.

I’m intrigued by the recommendations, and by what seem to me to be a few curiously large holes in them. In particular there was a heavy emphasis on the library’s move towards more digital content, but no mention of a library webmaster. There was a significant amount of discussion about digital access, and there is a recommendation for a “Discovery and Delivery” group, which would investigate additional ways of meeting virtual information needs, but I fail to see how these can be implemented with the current staffing structure. Currently the library website is maintained by committee, and while the committee manages to keep the library website mostly up to date, there is no one dedicated to implementing new features or staying on top of web technologies. I think that it is very important for the library to have a full-time professional dedicated to the library’s virtual presence, and by virtual presence I mean more than just updating the website. There are a lot of possibilities for getting content to users in different ways, remixing current library content to be more context relevant, and improving existing interfaces and tools. (Check out this post to see some interesting developments in libraries.) This kind of work cannot be done by part-time members of a committee who all have other jobs and professional interests to keep up with. Although this hole in the recommendations doesn’t really affect the CDI, it does have a huge impact on my workload, as I’m on the web team and part of the current redesign efforts.

The report was generally very positive for the CDI. The CDI is listed as a “strategic initiative,” which indicates a continued commitment from the library. They recommended that my position be made permanent (yay, because funding runs out soon), with the addition of two new positions: a metadata librarian and a programmer. They also suggested the possibility of repurposing a copy cataloger to do metadata work, which means I need to get those XForms polished and ready for primetime. (We have also had some interest in our architecture, eXist, XForms, and Solr from the Center for Teaching and Learning.)

But here is where things get a little wonky. Currently the CDI is situated under Special Collections (actually, I believe it is called Research Collections). I had mentioned in my interview that I thought this may not be the best place for the CDI, as it could give the impression that the CDI was a Special Collections project rather than a university wide resource. I suggested that the CDI should be its own department. The reason I thought, and still think, the CDI could be its own department is that as a digital library, the CDI has many of the same operations that a physical library has (although we don’t do much in the way of reference service). We have cataloging (metadata), collection development, systems, and some unique CDI functions as well. I also mentioned in my interview that the CDI in its current incarnation has some organizational issues. Because there isn’t a clear (in my mind) head of the CDI, there are a lot of loose ends and some unsupervised workflows.

In the report the consultants recommended the CDI be gradually moved under Collection Development. They went on to qualify that this would only be the collection part of the CDI, the rest could live… elsewhere.

“As the grant funding that enabled CDI development wanes, it will be important for UVM to decide how it wants to use these new capabilities. At bottom, decisions related to content and priorities are collection development decisions, and we believe the CDI program should be driven by Collection Development. (We’re referring specifically to content decisions; the actual operation and technical infrastructure of the CDI could reside elsewhere.)”

Huh? I don’t understand how this solves the organizational problems that I mentioned in my interview with the consultants. As a matter of fact, I think it confuses rather than clarifies organization, essentially further diversifying CDI functions and farming them out all over the library. I suppose this is one way of running the center, but I think a diversified model will only exacerbate our organizational/management issues.

The more I think about the issue of where the CDI should live in the organizational workflow chart the more agnostic I become. I’m not sure it matters so much. We will still need to interface with systems, collection development, technical services and reference. What we actually need is internal clarity in our management structure: someone who is in charge and can oversee all the different aspects of the CDI operations (scanning, metadata, collection development, relationships with faculty, policy and procedure creation and management). This is kind of a touchy subject, who is managing the project, and I don’t really care who is doing it (well maybe I do, a little). I just think there needs to be someone who can devote the necessary time and has the right skills to oversee the project. A lot of this I have been doing myself, with the metadata portion handled part time by the Curator of Manuscripts, but if it is my job (unclear) then I think I need to have more of a mandate, and also more time to devote to project management.

One other issue that the report raises is that of an institutional repository. The report assumes a natural evolution of the CDI from special grant funded project to “something more like an institutional repository.” I’m not sure if the consultants understand the implications of an institutional repository, but I’ve been very careful about not throwing around the phrase “institutional repository” in relation to the CDI. The CDI was built as a digital library project, and while it is pretty flexible, I’m fairly confident it is not heavy hitting enough to function as an institutional repository. Nor do we have the mandate from the university administration to ensure we get participation in an institutional repository. Not to mention all of the other issues associated with an IR. (And I follow Dorothea’s blog, so I have at least a vague idea of the craziness we could be getting into.) I have a feeling this is more a misunderstanding of what an IR is than of what the CDI is. We have talked a lot about the CDI being a place for faculty research collections, and creating long-term classroom use collections, all of which I think the CDI is poised to accomplish, but I think that is a far cry from an IR.

I guess the recommendations are generally very positive for the CDI, and I look forward to seeing where the discussion in the library goes from here.

Other discussions on the R2 recommendations can be found here, and here.

Center for Digital Initiatives: Virtual Tour

April 19, 2007

Check out our new office space (to see how far the space has come, here are some earlier pictures), visit our website, and sign our virtual guest book.

The new home for the Center for Digital Initiatives at UVM. This is room 313 in the Bailey/Howe Library, three floors up and tucked away in the stacks. I believe we are in the horticulture section. We are still waiting for a sign for the door, and hopefully a few signs elsewhere in the library to help people find us. But if you manage to make it to the third floor, just go all the way to the corner farthest from the front door and you should find this:

The seating area includes a data port for visitors with laptops.

CDI seating area

Looking down the hallway you can see the doorway to the scanning room on the right, my office on the left and the new conference room all the way at the back. Right above the door to the conference room is a wireless router, which means I have a really great (strong, and consistent) signal in my office.

Looking into the scanning room, you can also see some of the photographs we are using. The one on the right is from the Tennie Toussaint collection, and is available on the website.

The scanning room is a light-controlled environment designed to maximize color accuracy. Color-neutral, daylight-balanced lights are provided on dimmer switches, allowing the technicians the low-level light environment needed for evaluating color accuracy.

The conference room:

Some collection highlights:

We have six collections. Most of them are related to congressional papers, with items ranging in date from 1818-2004 and on topics such as milk, slavery, and the maple sugar industry. In addition to the congressional papers we have a collection of Vermont historical photographs that has some real gems.

Also of interest are some of the new features, such as browsing within a collection and the ability to do faceted searching (using the “Narrow your search” options).

Don’t forget to sign our virtual guest book.


April 17, 2007

UVM Libraries Center for Digital Initiatives:

We are officially live. There was a press conference yesterday to launch the site; you can read the press release here. There will be additional events during the week, including open houses on Thursday, April 19th and Friday, April 20th from 1 to 3PM in Bailey/Howe’s Room 313. We will be providing some refreshments and raffling off an iPod nano at the open house, so stop by.

I will also be hosting a virtual open house here on Thursday with pictures of the center, and some highlights from the online collections.