Dan Cohen

Archive for the ‘Web Services’ Category

Project Bamboo Launches

Monday, March 24th, 2008

If you’re interested in the present and future of the digital humanities, you’ll be hearing a lot about Project Bamboo over the next two years, including in this space. I was lucky enough to read and comment upon the Bamboo proposal a few months ago and was excited by its promise to begin to understand how technology, especially technology connected by web services, might be able to transform scholarship and academia. Bamboo is somewhat (and intentionally) amorphous right now (this doesn’t do it justice, but you can think of its initial phase as a listening tour), but I expect big things from the project in the not-so-distant future. From the brief description on the project website:

Bamboo is a multi-institutional, interdisciplinary, and inter-organizational effort that brings together researchers in arts and humanities, computer scientists, information scientists, librarians, and campus information technologists to tackle the question:

How can we advance arts and humanities research through the development of shared technology services?

A good question, and the right time to ask it. And the overall goal?

If we move toward a shared services model, any faculty member, scholar, or researcher can use and reuse content, resources, and applications no matter where they reside, what their particular field of interest is, or what support may be available to them. Our goal is to better enable and foster academic innovation through sharing and collaboration.

Project Bamboo was funded by the Andrew W. Mellon Foundation.

Zotero News, Big and Small

Tuesday, September 19th, 2006

So much for a modest, stealthy launch of Zotero. I promised a couple of weeks ago that I would return to my blog soon with a few updates about user feedback, some hints about new features, and perhaps some additional news items. With a modest private beta test and a few pages explaining the software on our new site, I assumed that Zotero would quietly and slowly enter into public consciousness. Little did I know that within two weeks I would get over 400 emails asking to join the beta test, help develop and extend Zotero, make it work better with resources on the web, and evangelize it on campuses and in offices around the globe. (Sorry to those I haven’t responded to yet; I’m still working on my email backlog.) Better yet, we received some fantastic news about support for the project, which is where I’ll begin this update.

The big news is that the Center for History and New Media has received an incredibly generous grant from the Andrew W. Mellon Foundation to help build major new features into the 2.0 release of Zotero (coming in 2007). Included in this substantial upgrade are great capabilities that beta testers are already clamoring for (as I’ll describe below). I’m deeply grateful to the Mellon Foundation and especially Ira Fuchs and Chris Mackie for their support of the project, and we’re delighted to join the stable of other Mellon-funded, open-source projects that are trying to revolutionize higher education and the scholarly enterprise through the use of innovative information technology. We have a very ambitious set of goals we would like to accomplish in the next two years under Mellon funding, and we’re really excited to get started and push these advances out to an eager audience.

My thanks also to the beta testers who have reported bugs and sent in suggestions. (For a few early reviews and thoughts about Zotero, see posts on the blogs of Bill Turkel, Bruce D’Arcus (1, 2), Adrian Cooke, Jeanne Kramer-Smyth, and Mark Phillipson.) We’re planning on rolling all of the bug fixes and a few of the suggestions that we’ve already implemented into the public beta that will be released shortly. The most requested new features were auto-completion/suggestions for tags, better support for non-Western and institutional authors, full-text searches of articles that are saved into one’s Zotero collection, more import/export options, support for other online collections and resources, and the detection of duplicate records. The developers are working feverishly on all of these fronts, and I think the Beta 2 release (our public beta) will be considerably better because of all of this helpful feedback.

I have intentionally left out perhaps the most wanted feature: tools for collaboration. Some of those who have started to hack the software have noticed what we at the Center for History and New Media have been thinking about from the start—that it seems very easy to add ways to send and receive information to and from Zotero (it does reside in the web browser, after all). What if you could share a folder of references and notes with a colleague across the country? What if you could receive a feed of new resources in your area of interest? What if you could synchronize your Zotero library with a server and access it from anywhere? What if you could send your personal collection to other web services, e.g., a mapping service or text analyzer or translation engine?

I’m glad so many of us are thinking alike. Those are the issues we’ve just started to work on, thanks to the Mellon Foundation. Stay tuned for the Zotero server and additional exciting extensions to the Zotero platform.

And despite my email backlog, please do contact me if you would like to join the Zotero movement.

Where Are the Noncommercial APIs?

Friday, March 10th, 2006

Readers of this blog know that one of my pet peeves as someone trying to develop software tools for scholars, teachers, and students is the lack of application programming interfaces (APIs) for educational resources. APIs greatly facilitate the use of these resources and allow third parties to create new services on top of them, such as the Google Maps “mashups” that have become a phenomenon in the last year. (Please see my post “Do APIs Have a Place in the Digital Humanities?” as well as the Hurricane Digital Memory Bank for more on APIs and to see what a historical mashup looks like.) Now a clearing house for APIs shows the extent to which noncommercial resources—and especially those in the humanities—have been left out in the cold in this promising new phase of the web. Count with me the total number of noncommercial, educationally-oriented APIs out of the nearly 200 listed on Programmable Web.

That’s right, for the humanities the answer is one: the Library of Congress’s somewhat clunky SRU (Search/Retrieve via URL). Maybe in a broader definition you could count the API from the BBC archive, though it seems to be more about current events. The Internet Archive’s API is currently focused on facilitating uploads into its system rather than, say, historical data mining of the web. ISBNdb.com offers a potentially rich API for finding book information, but shouldn’t a noncommercial entity be offering this service? (I assume ISBNdb.com will eventually charge for or limit this important service.)
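For readers who haven’t met SRU, the idea is that a search is just a URL carrying a handful of standard parameters (operation, version, query, maximumRecords), with the query written in CQL. Here is a minimal sketch in Python of what building such a request might look like. The endpoint sru.example.org is a placeholder I made up, not the Library of Congress’s actual address, and build_sru_url is a hypothetical helper name; the sketch only assembles the URL and does not contact any server.

```python
from urllib.parse import urlencode

# Placeholder endpoint for illustration only; substitute a real SRU server's base URL.
SRU_BASE_URL = "https://sru.example.org/catalog"


def build_sru_url(cql_query, max_records=10, base_url=SRU_BASE_URL):
    """Assemble an SRU 1.1 searchRetrieve URL (hypothetical helper)."""
    params = {
        "operation": "searchRetrieve",       # the SRU search operation
        "version": "1.1",                    # SRU protocol version
        "query": cql_query,                  # search expressed in CQL
        "maximumRecords": str(max_records),  # cap on records returned
    }
    return base_url + "?" + urlencode(params)


# Example: a title search, phrased as a CQL query.
url = build_sru_url('dc.title = "berlin wall"')
print(url)
```

Fetching that URL from a real SRU server should return an XML response containing the matching records, which is exactly the kind of thing a third party can build a service on top of.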

By my count the only other noncommercial APIs are from large U.S. government scientific institutions such as NASA, NIH, and NOAA. Surely this long list is missing some other APIs out there, such as one for OAI-PMH. If so, let Programmable Web know; most “Web 2.0” developers are looking here first to get ideas for services, and we don’t need more mashups focusing on the real estate market.

Wikipedia vs. Encyclopaedia Britannica for Digital Research

Monday, January 30th, 2006

In a prior post I argued that the recent coverage of Wikipedia has focused too much on one aspect of the online reference source’s openness—the ability of anyone to edit any article—and not enough on another aspect of Wikipedia’s openness—the ability of anyone to download or copy the entire contents of its database and use it in virtually any way they want (with some commercial exceptions). I speculated that, as I discovered in my data-mining work with H-Bot, which uses Wikipedia in its algorithms, having an open and free resource such as this could be very important for future digital research—e.g., finding all of the documents about the first President Bush in a giant, untagged corpus on the American presidency. For a piece I’m writing for D-Lib Magazine, I decided to test this theory by pulling out significant keywords and phrases from matching articles in Wikipedia and the Encyclopaedia Britannica on George H. W. Bush to see if one was better than the other for this purpose. Which resource is better? Here are the unedited term lists, derived by running plain text versions of each article through Yahoo’s Term Extraction web service. Vote on which one you think is a better profile, and I’ll reveal which list belongs to which reference work later this week.

Article #1
president bush
saddam hussein
fall of the berlin wall
tiananmen square
thanksgiving day
american troops
manuel noriega
halabja
invasion of panama
gulf war
help
saudi arabia
united nations
berlin wall

Article #2
president george bush
george bush
mikhail gorbachev
soviet union
collapse
reunification of germany
thurgood marshall
union
clarence thomas
joint chiefs of staff
cold war
manuel antonio noriega
iraq
george
nonaggression pact
david h souter
antonio noriega
president george
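
To make the intended use of a term list like these a bit more concrete, here is a rough sketch in Python of how a profile might be used to flag relevant documents in an untagged corpus: count how many of the profile’s terms appear in each document. This is only an illustration under my own assumptions, not the algorithm H-Bot uses; the function names are hypothetical and the two sample documents are invented. The term list is the one from Article #1 above.

```python
import re

# Term list copied from "Article #1" above.
ARTICLE_1_TERMS = [
    "president bush", "saddam hussein", "fall of the berlin wall",
    "tiananmen square", "thanksgiving day", "american troops",
    "manuel noriega", "halabja", "invasion of panama", "gulf war",
    "help", "saudi arabia", "united nations", "berlin wall",
]


def normalize(text):
    """Lowercase the text and reduce it to words separated by single spaces."""
    return " ".join(re.findall(r"[a-z0-9]+", text.lower()))


def matching_terms(document, terms):
    """Return the profile terms that occur as whole words or phrases in the document."""
    padded = " " + normalize(document) + " "
    return [t for t in terms if " " + normalize(t) + " " in padded]


# Invented sample documents, purely for illustration.
documents = {
    "doc_a": "After the invasion of Panama, American troops captured Manuel Noriega.",
    "doc_b": "The recipe calls for two cups of flour and a pinch of salt.",
}

for name, text in documents.items():
    hits = matching_terms(text, ARTICLE_1_TERMS)
    print(name, len(hits), hits)
```

A document that matches several terms is a candidate for being about the subject; a generic term such as “help” shows why a noisy profile could produce false positives, which is part of what makes the comparison between the two lists interesting.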