Dan Cohen

Programming for Poets
Explaining programming concepts to a nonprogrammer audience, and tracking the development of software for scholars.



Creating a Blog from Scratch, Part 7: Tags, What Are They Good For?
Posted to Programming for Poets on 5 January 2007, 3:17 PM EST

Evidently quite a few things. In the past few years, tags have been attached to virtually everything, from web links to photos to bars. The University of Pennsylvania has recently introduced a way for those on campus to tag items in its online catalog, Franklin. With the arrival of the Zotero server this year, it will be possible for the community of Zotero users to collaboratively tag almost any object of research, from books to sculptures to letters. For their promoters, tags are a low-cost, democratic advance over traditional systems of cataloging. Detractors disparage tags as lacking the rigor of those tried-and-true methods. As I started to think about the composition of this blog, all I wanted to know was, why do so many blogs have tags all over them and what function or functions do they serve? Do I need them? What are they good for?... [Read on...]


Creating a Blog from Scratch, Part 6: One Year Later
Posted to Programming for Poets on 11 December 2006, 6:05 PM EST

Well, it's been over a year since I started this blog with a mix of trepidation, ambivalence, and faint praise for the genre—not exactly promising stuff—and so it's with a mixture of relief and a smidgen of smug self-satisfaction that I'm writing this post. I'm extremely glad that I started this blog last fall and have kept it going. (Evidently the half-life of blogs is about three months, so an active year-old blog is, I suppose, some kind of accomplishment in our attention-deficit age.) I thought it would be a good idea (and several correspondents have prodded me in this direction) to return to my series of posts about starting this blog, "Creating a Blog from Scratch." (For latecomers, this blog is not powered by Blogger, TypePad, or WordPress, but rather by my own feeble concoction of programming and design.) Over the next few posts I'll be revisiting some of the decisions I made, highlighting some good things that have happened and some regrets. And at the end of the series I'll be introducing some adjustments to my blog that I hope will make it better. But first, in something of a sequel to my call to my colleagues to join me in this endeavor, "Professors, Start Your Blogs," some of the triumphs and tribulations I've encountered over the last year... [Read on...]


Mapping What Americans Did on September 11
Posted to Programming for Poets on 8 August 2006, 4:32 PM EDT

I gave a talk a couple of days ago at the annual meeting of the Society of American Archivists (to a great audience—many thanks to those who were there and asked such terrific questions) in which I showed how researchers in the future will be able to intelligently search, data mine, and map digital collections. As an example, I presented some preliminary work I've done on our September 11 Digital Archive combining text analysis with geocoding to produce overlays on Google Earth that show what people were thinking or doing on 9/11 in different parts of the United States. I promised a follow-up article in this space for those who wanted to learn how I was able to do this. The method provides an overarching view of patterns in a large collection (in the case of the September 11 Digital Archive, tens of thousands of stories), which can then be prospected further to answer research questions. Let's start with the end product: two maps (a wide view and a detail) of those who were watching CNN on 9/11 (based on a text analysis of our stories database, and colored blue) and those who prayed on 9/11 (colored red)... [Read on...]
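
For readers who want a feel for the general idea before reading on, here is a minimal sketch in Python. It is not the method used for the September 11 Digital Archive; the sample stories, the coordinates, the simple substring matching, and the function names are all assumptions made for illustration. It tags each story that mentions a keyword and writes a colored KML placemark that Google Earth could display.

```python
from xml.sax.saxutils import escape

# Hypothetical sample data: (story text, latitude, longitude).
# In a real collection the coordinates would come from geocoding each story.
STORIES = [
    ("I was watching CNN when the second plane hit.", 38.83, -77.31),
    ("We prayed together at church that evening.", 35.15, -90.05),
    ("I watched CNN all day and prayed for the victims.", 40.71, -74.01),
    ("I was stuck in traffic and heard it on the radio.", 34.05, -118.24),
]

# label -> (keyword to look for, KML color).
# KML colors are written aabbggrr, so opaque blue is ffff0000 and red is ff0000ff.
CATEGORIES = {
    "watched CNN": ("cnn", "ffff0000"),
    "prayed": ("prayed", "ff0000ff"),
}


def matches(text, keyword):
    """Crude text analysis: case-insensitive substring match."""
    return keyword.lower() in text.lower()


def placemark(label, color, lat, lon):
    """Build one KML placemark. KML lists longitude before latitude."""
    return (
        "<Placemark>"
        f"<name>{escape(label)}</name>"
        f"<Style><IconStyle><color>{color}</color></IconStyle></Style>"
        f"<Point><coordinates>{lon},{lat}</coordinates></Point>"
        "</Placemark>"
    )


def build_kml(stories, categories):
    """Return a KML document with one placemark per (story, matching category)."""
    marks = []
    for text, lat, lon in stories:
        for label, (keyword, color) in categories.items():
            if matches(text, keyword):
                marks.append(placemark(label, color, lat, lon))
    return (
        '<?xml version="1.0" encoding="UTF-8"?>'
        '<kml xmlns="http://www.opengis.net/kml/2.2"><Document>'
        + "".join(marks)
        + "</Document></kml>"
    )


if __name__ == "__main__":
    print(build_kml(STORIES, CATEGORIES))
```

Saving the printed output to a file with a .kml extension should give something Google Earth can open. A story that matches both keywords produces two placemarks at the same spot, one in each color.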


Using AJAX Wisely
Posted to Programming for Poets on 2 May 2006, 11:49 AM EDT

Since its name was coined on February 18, 2005, AJAX (for Asynchronous JavaScript and XML) has been a much-discussed new web technology. For those not involved in web production, essentially AJAX is a method for dynamically changing parts of a web page without reloading the entire thing; like other dynamic technologies such as Flash, it makes the web browser seem more like a desktop application than a passive window for reading documents. Unlike Flash, however, AJAX applications have generally focused less on interactive graphics (and the often cartoony elements that are now associated with Flash) and more on advanced presentation of text and data, which makes AJAX attractive to those in academia, libraries, and museums. It's easy to imagine, for instance, an AJAX-based online library catalog that would allow for an easy refinement of a book search (reordering or adding new possibilities) without a new query submission for each iteration. Despite such promise, or perhaps because of the natural lag between commercial and noncommercial implementations of web technologies, AJAX has not been widely used in academia. That's fine. Unlike the dot-coms, we should first be asking: What are appropriate uses for AJAX?... [Read on...]


An Actual Use for Windows on the Mac
Posted to Programming for Poets on 20 April 2006, 10:10 AM EDT

OK, so you can now run Windows on a Mac. So what? For most of us in the humanities, all we need is already on the Mac, which (in addition to intangibles such as the Mac's design) is why so many of us remain stubbornly attached to Apple's computers while over the last twenty years almost everyone else has moved to the more generic platform of the PC. Most educational, graphics, and web development software is available for the Mac. (For those in the social and natural sciences, on the other hand, many important software packages are either not available for the Mac or come out later than they do for the PC.) But perhaps there's the rub. Since many of us only use Macs—especially those who build academic or museum websites—we often don't see how most people view our sites. Since websites often render differently on different operating systems and web browsers, not checking how your site will look (and perform, if you are using dynamic web technologies) on a PC with IE (still used by 85% of web surfers) is unwise. Now with Parallels Workstation—the Windows-on-Mac solution that doesn't require rebooting your computer to switch OSes—you can literally have a window into the world of Windows sitting on your desktop in parallel with your Mac applications. For instance, here's a screenshot of my Mac desktop with Firefox for the Mac running on the left, and IE for Windows running on the right... [Read on...]


Where Are the Noncommercial APIs?
Posted to Programming for Poets on 10 March 2006, 10:50 AM EST

Readers of this blog know that one of my pet peeves as someone trying to develop software tools for scholars, teachers, and students is the lack of application programming interfaces (APIs) for educational resources. APIs greatly facilitate the use of these resources and allow third parties to create new services on top of them, such as the Google Maps "mashups" that have become a phenomenon in the last year. (Please see my post "Do APIs Have a Place in the Digital Humanities?" as well as the Hurricane Digital Memory Bank for more on APIs and to see what a historical mashup looks like.) Now a clearinghouse for APIs shows the extent to which noncommercial resources—and especially those in the humanities—have been left out in the cold in this promising new phase of the web. Count with me the total number of noncommercial, educationally oriented APIs out of the nearly 200 listed on Programmable Web... [Read on...]


Creating a Blog from Scratch, Part 5: What is XHTML, and Why Should I Care?
Posted to Programming for Poets on 5 January 2006, 10:13 PM EST

In prior posts in this series (1, 2, 3, and 4), I described with some glee my rash abandonment of common blogging software in favor of writing my own. For my purposes there seemed to be some key disadvantages to these popular packages, including an overemphasis on the calendar (I just saw the definition of a blog at the South by Southwest Interactive Festival—"a page with dated entries"—which, to paraphrase Woody Allen, is like calling War and Peace "a book about Russia"), a sameness to their designs, and comments that are rarely helpful and often filled with spam. But one of the greatest advantages of recent blog software packages is that they generally write standards-compliant code. More specifically, blog software like WordPress automatically produces XHTML. Some of you might be asking, what is XHTML, and who cares? And why would I want to spend a great deal of effort ensuring that this blog complied strictly with this language?.. [Read on...]


Creating a Blog from Scratch, Part 4: Searching for a Good Search
Posted to Programming for Poets on 26 December 2005, 9:15 PM EST

It often surprises those who have never looked at server logs (the detailed statistics about a website) that a tremendous percentage of site visitors come from searches. In the case of the Center for History and New Media, this is a staggering 400,000 unique visitors a month out of about one million. Furthermore, many of these visitors ignore a website's navigation and go right to the site search box to complete their quest for information. While I'm not a big fan of consultants who tell webmasters to sacrifice virtually everything for usability, I do feel that searching has been undervalued by digital humanities projects, in part because so much effort goes into digitization, markup, interpretation, and other time-consuming tasks. But there's another, technical reason too: it's actually very hard to create an effective search—one, for instance, that finds phrases as well as single words, that is able to rank matches well, and that is easy to maintain through software and server upgrades. In this installment of "Creating a Blog from Scratch" (for those who missed them, here are parts 1, 2, and 3) I'll take you behind the scenes to explain the pluses and minuses of the various options for adding a search feature to a blog, or any database-driven website for that matter... [Read on...]
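
To make the difficulty concrete, here is a toy sketch in Python of a search that finds phrases as well as single words and ranks the matches. It is not the search used on this blog; the sample posts, the function names, and the scoring weights (one point per matching word, ten per exact phrase) are assumptions chosen only to illustrate the idea.

```python
import re

# Hypothetical posts, keyed by a short slug.
POSTS = {
    "part-4": "Searching for a good search is harder than it looks",
    "part-3": "A blog leads a double life as a website and a feed",
    "ajax": "A good catalog would refine a book search without reloading",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(query, text):
    """One point per occurrence of a query word, plus ten per exact phrase match."""
    query_words = tokenize(query)
    words = tokenize(text)
    if not query_words:
        return 0
    word_hits = sum(words.count(w) for w in set(query_words))
    n = len(query_words)
    phrase_hits = sum(
        1 for i in range(len(words) - n + 1) if words[i:i + n] == query_words
    )
    return word_hits + 10 * phrase_hits


def search(query, posts):
    """Return slugs of matching posts, best match first (ties broken by slug)."""
    scored = [(score(query, text), slug) for slug, text in posts.items()]
    ranked = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [slug for points, slug in ranked if points > 0]


if __name__ == "__main__":
    print(search("good search", POSTS))
```

Both "part-4" and "ajax" contain the words "good" and "search", but only "part-4" contains them as a phrase, so it ranks first. Even this toy ignores stemming, common words, and speed on a large database, which hints at why a good search takes real effort.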


Creating a Blog from Scratch, Part 3: The Double Life of Blogs
Posted to Programming for Poets on 22 December 2005, 2:44 PM EST

In the first two posts in this series, I discussed the origins of blogs and how they led to certain elements in popular blog software that were in some cases good and in others bad for my own purposes—to start a blog that consisted of short articles on the intersection of digital technology, the humanities, and related topics (rather than my personal life or links with commentary). What I didn't realize as I set about writing my own blog software from scratch for this project was that in truth a blog leads two lives: one as a website and another as a feed, or syndicated digest of the more complete website. Understanding this double life and the role and details of RSS feeds led to further thoughts about how to design a blog, and how certain choices are encoded into blogging software. Those choices, as I'll explain in this post, determine to a large extent what kind of blog you are writing... [Read on...]


Creating a Blog from Scratch, Part 2: Advantages and Disadvantages of Popular Blog Software
Posted to Programming for Poets on 18 December 2005, 9:59 PM EST

In the first post in this series I briefly recounted the early history of blogs (all of five years ago) and noted how many of their current uses have diverged from two early incarnations (as a place to store interesting web links and as the online equivalent of a diary). Unfortunately, these early, dominant forms gave rise to existing blog software that, at least in my mind, is problematic. This "encoding" of original purposes into the basic structure of software is common in software development, and it often leads to features and configurations in later releases that are undesirable to a large number of users. In this post, I discuss the advantages and disadvantages of common blog packages—often deeply encoded into the software... [Read on...]


Creating a Blog from Scratch, Part 1: What is a Blog, Anyway?
Posted to Programming for Poets on 16 December 2005, 11:19 AM EST

If you look at the bottom of this page, you won't see any of the telltale signs that it is generated by a blog software package like Blogger, Movable Type, or WordPress. When I was redesigning this site and wanted to add a blog to it, I made the perhaps foolhardy decision to write my own blogging software. Why, you might ask, would I recreate the proverbial wheel? As I'll explain in several other columns in this space, writing your own software is one of the best ways to learn—not only about how to write software, but also about genres. It also pushes you to think about (and rethink) some of the assumptions that go into the construction of software written for specific genres. The first question I therefore asked myself was, What is a blog, anyway?... [Read on...]


Q and A on Firefox Scholar
Posted to Programming for Poets on 11 December 2005, 1:44 PM EST

Thanks so much to everyone who emailed me over the past week in response to hearing about Firefox Scholar. It's great to get a sense that a wide range of people (from a number of countries) feel that the time has come for this kind of enhanced scholarly web browser, and it gives our team at the Center for History and New Media a great deal of confidence as we move forward. I've received a lot of questions about the project, so I thought I would answer some of the common ones here... [Read on...]


Introduction to Firefox Scholar
Posted to Programming for Poets on 7 December 2005, 9:27 PM EST

This week in the electronic version, and next week in the print version, the Chronicle of Higher Education is running an article (subscription required) on a new software project I'm co-directing, Firefox Scholar, which will be a set of extensions to the popular open source web browser that will help researchers, teachers, and students. My thanks to the many people who have emailed to express interest in the project. For them and for others who would like to know more, here's a brief summary of Firefox Scholar from our grant proposal to the Institute of Museum and Library Services, which has generously provided $250,000 to initiate the project. Please contact me if you would like occasional updates on the project or would like a beta release of the browser when it is available in the late summer of 2006... [Read on...]