Archive for January, 2008

Troubleshooting eXist

January 28, 2008

For the past few months I’ve been experiencing some mysterious (to me, at any rate) problems with eXist crashing. Generally speaking, the problems seemed to be due to corrupted index files and/or issues with temporary fragments. The problems seem to intensify the more the metadata editor is used, and I noticed that when I deleted a collection the database would invariably crash, usually the following night. I’ve been limping along by surreptitiously restarting and re-indexing the database, but there have been several instances where the database could not be recovered in this way and I had to restore from backup. This meant that the site was down and completely non-functional while I restored the files.

I think the issue is pressing enough that I’m going to have to put a lot of other things on hold to sort it out. I’ve found a few threads (1, 2, 3) on the mailing list that seem to address the issue. It looks like some of it can be solved by improving my xqueries, and the issue of temporary fragments is on the minds of the developers.

I think I’ll start by going over my xqueries to eliminate the creation of temporary fragments where I can. This will be a useful exercise anyway, as many of the queries were written before I had a thorough understanding of xquery. Hopefully this will significantly improve performance. Another option could be to move the metadata processing to the development server, and have a master/slave configuration to update the live site every night. Since the biggest problems with crashing and corrupted indexes seem to happen when metadata creation work is highest, this could be a way to cut down on interruptions in service on the public side.

I like that UVM has given me the freedom to experiment and use technologies that are not being used elsewhere in the library; however, with that freedom have come some headaches. I do not have colleagues to fall back on for troubleshooting. I understand why small digital projects go with systems like ContentDM: the trade-off in flexibility may be well worth the time saved in other areas.

Quick update

January 8, 2008

I have a steadily growing list of things I need to get finished, but I finally managed to get around to adding RSS (actually Atom) feeds for tracking search results and also for tracking collections. This will be particularly handy for a collection like the McAllister Photographs, which will be a work in progress for quite some time. (It is currently growing at a rate of about 50-100 photos a week, but I imagine this will slow down once the cataloging staff has caught up to our part-time scanning tech.)
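For anyone curious what a feed like this boils down to, here is a minimal sketch in Python. It is not the code running on the site; the function name, the dictionary keys, and the example URLs are all made up for illustration. It only shows the basic shape of an Atom document: a feed element with an id, title, and updated timestamp, followed by one entry per item. A fully valid feed would also need author information, which is left out here.

```python
import xml.etree.ElementTree as ET

# The Atom namespace defined by RFC 4287.
ATOM_NS = "http://www.w3.org/2005/Atom"


def build_atom_feed(feed_id, title, updated, entries):
    """Return an Atom feed as a string.

    `entries` is a list of dicts with the (hypothetical) keys
    "id", "title", "updated", and "link".
    """
    # Serialize Atom as the default namespace rather than "ns0:".
    ET.register_namespace("", ATOM_NS)

    def tag(name):
        return "{%s}%s" % (ATOM_NS, name)

    feed = ET.Element(tag("feed"))
    ET.SubElement(feed, tag("id")).text = feed_id
    ET.SubElement(feed, tag("title")).text = title
    ET.SubElement(feed, tag("updated")).text = updated

    for item in entries:
        entry = ET.SubElement(feed, tag("entry"))
        ET.SubElement(entry, tag("id")).text = item["id"]
        ET.SubElement(entry, tag("title")).text = item["title"]
        ET.SubElement(entry, tag("updated")).text = item["updated"]
        # Atom links carry the target in an href attribute.
        ET.SubElement(entry, tag("link"), href=item["link"])

    return ET.tostring(feed, encoding="unicode")


if __name__ == "__main__":
    # Placeholder data, not real records from the collection.
    print(build_atom_feed(
        "urn:example:collection-feed",
        "New items in a collection",
        "2008-01-08T00:00:00Z",
        [{
            "id": "urn:example:item-1",
            "title": "Example photograph",
            "updated": "2008-01-08T00:00:00Z",
            "link": "http://example.org/items/1",
        }],
    ))
```

The same function could serve both kinds of feed: for a search feed the entries would come from the search results, and for a collection feed from the most recently added items.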

In the process of adding the feeds I also took the opportunity to upgrade to Solr 1.2, which was a pretty painless upgrade, with some nice additional functionality. I hope to get a chance to install 1.3 on my development machine next week to explore the MoreLikeThis functionality. I’d like to use this feature on the item pages, allowing users to get some immediate related items, in addition to using the subject and geographic headings to get related items.
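As a rough idea of what a MoreLikeThis request for an item page might look like, here is a small Python sketch that only builds the query URL. I haven't tried this against 1.3 yet, so treat it as a guess at the shape of the request: the `mlt`, `mlt.fl`, `mlt.count`, `mlt.mintf`, and `mlt.mindf` parameters are Solr's MoreLikeThis parameters, but the base URL, the `/select` path, and the field names (`id`, `subject`, `geographic`) are assumptions about a setup like mine, not something taken from a working configuration.

```python
from urllib.parse import urlencode


def more_like_this_url(base_url, item_id, fields, count=5):
    """Build a Solr query URL asking for items similar to `item_id`.

    `fields` lists the index fields used to judge similarity;
    the names passed in are assumptions about the schema.
    """
    params = {
        "q": "id:%s" % item_id,      # look up the item being viewed
        "mlt": "true",               # turn on MoreLikeThis
        "mlt.fl": ",".join(fields),  # fields to compare on
        "mlt.count": count,          # how many similar items to return
        "mlt.mintf": 1,              # minimum term frequency in the source item
        "mlt.mindf": 1,              # minimum number of documents containing a term
    }
    return "%s/select?%s" % (base_url.rstrip("/"), urlencode(params))


if __name__ == "__main__":
    # Hypothetical local Solr instance and item identifier.
    print(more_like_this_url(
        "http://localhost:8983/solr", "item42", ["subject", "geographic"]
    ))
```

Setting `mlt.mintf` and `mlt.mindf` to 1 is a deliberate choice for short metadata records, where a heading usually appears only once per item; the right values would need testing against real data.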