Category Archives: History

Mapping Recent History

As the saying goes, imitation is the sincerest form of flattery. So at the Center for History and New Media, we’re currently feeling extremely flattered that our initiatives in collecting and presenting recent history—the Echo Project (covering the history of science, technology, and industry), the September 11 Digital Archive, and the Hurricane Digital Memory Bank—are being imitated by people using a wave of new websites to place recollections, images, and other digital objects on a map. Here’s an example from the mapping site Platial:

And a similar map from our 9/11 project:

Of course, we’re delighted to have imitators (and indeed, we have in turn imitated others), since we are trying to disseminate, as widely as possible, methods for saving the digital record of the present for future generations. It’s great to see new sites like Platial, CommunityWalk, and Wayfaring providing easy-to-use, collaborative maps that scattered groups of people can use to store photos, memories, and other artifacts.

No Computer Left Behind

In this week’s issue of the Chronicle of Higher Education, Roy Rosenzweig and I elaborate on the implications of my H-Bot software, and of similar data-mining services and the web in general. “No Computer Left Behind” (cover story in the Chronicle Review; alas, subscription required, though here’s a copy at CHNM) is somewhat more polemical than our recent article in First Monday (“Web of Lies? Historical Knowledge on the Internet”). In short, we argue that just as the calculator—an unavoidable modern technology—muscled its way into the mathematics exam room, devices (such as PDAs and smart phones) that can access and quickly scan the vast store of historical knowledge on the Internet will inevitably disrupt the testing—and thus instruction—of humanities subjects. As the editors of the Chronicle put it in their headline: “The multiple-choice test is on its deathbed.” This development is to be praised; just as the teaching of mathematics should be about higher principles rather than the rote memorization of multiplication tables, the teaching of subjects like history should be freed by new technologies to focus once again (as it was before a century of multiple-choice exams) on more important principles such as the analysis and synthesis of primary sources. Here are some excerpts from the article.

“What if students will have in their pockets a device that can rapidly and accurately answer, say, multiple-choice questions about history? Would teachers start to face a revolt from (already restive) students, who would wonder why they were being tested on their ability to answer something that they could quickly find out about on that magical device?

“It turns out that most students already have such a device in their pockets, and to them it’s less magical than mundane. It’s called a cellphone. That pocket communicator is rapidly becoming a portal to other simultaneously remarkable and commonplace modern technologies that, at least in our field of history, will enable the devices to answer, with a surprisingly high degree of accuracy, the kinds of multiple-choice questions used in thousands of high-school and college history classes, as well as a good portion of the standardized tests that are used to assess whether the schools are properly “educating” our students. Those technological developments are likely to bring the multiple-choice test to the brink of obsolescence, mounting a substantial challenge to the presentation of history—and other disciplines—as a set of facts or one-sentence interpretations and to the rote learning that inevitably goes along with such an approach…

“At the same time that the Web’s openness allows anyone access, it also allows any machine connected to it to scan those billions of documents, which leads to the second development that puts multiple-choice tests in peril: the means to process and manipulate the Web to produce meaningful information or answer questions. Computer scientists have long dreamed of an adequately large corpus of text to subject to a variety of algorithms that could reveal underlying meaning and linkages. They now have that corpus, more than large enough to perform remarkable new feats through information theory.

“For instance, Google researchers have demonstrated (but not yet released to the general public) a powerful method for creating ‘good enough’ translations—not by understanding the grammar of each passage, but by rapidly scanning and comparing similar phrases on countless electronic documents in the original and second languages. Given large enough volumes of words in a variety of languages, machine processing can find parallel phrases and reduce any document into a series of word swaps. Where once it seemed necessary to have a human being aid in a computer’s translating skills, or to teach that machine the basics of language, swift algorithms functioning on unimaginably large amounts of text suffice. Are such new computer translations as good as a skilled, bilingual human being? Of course not. Are they good enough to get the gist of a text? Absolutely. So good the National Security Agency and the Central Intelligence Agency increasingly rely on that kind of technology to scan, sort, and mine gargantuan amounts of text and communications (whether or not the rest of us like it).

“As it turns out, ‘good enough’ is precisely what multiple-choice exams are all about. Easy, mechanical grading is made possible by restricting possible answers, akin to a translator’s receiving four possible translations for a sentence. Not only would those four possibilities make the work of the translator much easier, but a smart translator—even one with a novice understanding of the translated language—could home in on the correct answer by recognizing awkward (or proper) sounding pieces in each possible answer. By restricting the answers to certain possibilities, multiple-choice questions provide a circumscribed realm of information, where subtle clues in both the question and the few answers allow shrewd test takers to make helpful associations and rule out certain answers (for decades, test-preparation companies like Kaplan Inc. have made a good living teaching students that trick). The ‘gaming’ of a question can occur even when the test taker doesn’t know the correct answer and is not entirely familiar with the subject matter…

“By the time today’s elementary-school students enter college, it will probably seem as odd to them to be forbidden to use digital devices like cellphones, connected to an Internet service like H-Bot, to find out when Nelson Mandela was born as it would be to tell students now that they can’t use a calculator to do the routine arithmetic in an algebra equation. By providing much more than just an open-ended question, multiple-choice tests give students—and, perhaps more important in the future, their digital assistants—more than enough information to retrieve even a fairly sophisticated answer from the Web. The genie will be out of the bottle, and we will have to start thinking of more meaningful ways to assess historical knowledge or ‘ignorance.'”
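To make that scenario concrete, here is a minimal sketch in Python of the co-occurrence trick the excerpt describes. It is emphatically not CHNM’s actual H-Bot code: a tiny in-memory corpus stands in for a search engine’s billions of documents, and the scoring simply favors the answer whose terms appear alongside the question’s keywords most often.

```python
STOPWORDS = {"the", "a", "an", "of", "in", "was", "what", "which", "year"}

# Toy stand-in for the web; a real version would query a search
# engine's hit counts instead of scanning a local list of strings.
CORPUS = [
    "Nelson Mandela was born in 1918 in the village of Mvezo.",
    "Mandela, born in 1918, became South Africa's president in 1994.",
    "The year 1918 also saw the end of the First World War.",
]

def keywords(text):
    """Strip punctuation, lowercase, and drop common stopwords."""
    return {w.strip(".,?!'\"").lower() for w in text.split()} - STOPWORDS

def hits(terms):
    """Count corpus documents containing every term (a crude AND query)."""
    return sum(all(t in doc.lower() for t in terms) for doc in CORPUS)

def answer(question, choices):
    """Pick the choice whose terms co-occur most with the question's."""
    q = keywords(question)
    return max(choices, key=lambda c: hits(q | keywords(c)))

print(answer("In what year was Nelson Mandela born?",
             ["1918", "1945", "1964", "1990"]))  # -> 1918
```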

Doing Digital History June 2006 Workshop

If your work deals in some way with the history of science, technology, or industry, and you would like to learn how to create online history projects, the Echo Project at the Center for History and New Media is running another of our free, week-long workshops. The workshop covers the theory and practice of digital history; the ways that digital technologies can facilitate the research, teaching, writing, and presentation of history; genres of online history; website infrastructure and design; document digitization; the process of identifying and building online history audiences; and issues of copyright and preservation.

As one of the teachers for this workshop, I can say somewhat immodestly that it’s really a great way to get up to speed on the many (sometimes complicated) elements necessary for website development. Unfortunately, space is limited, so be sure to apply online by March 10, 2006. The workshop will take place June 12-16, 2006, at George Mason University’s Arlington campus, right outside of Washington, DC. It is co-sponsored by the American Historical Association and the National History Center, and funded by the Alfred P. Sloan Foundation. There is no registration fee, and a limited number of fellowships are available to defray the costs of travel and lodging for graduate students and young scholars. Hope to see you there!

Digital History on Focus 580

From the shameless plug dept.: If you missed Roy Rosenzweig’s and my appearance on the Kojo Nnamdi Show, I’ll be on Focus 580 this Friday, February 3, 2006, at 11 AM ET/10 AM CT on the Illinois NPR station WILL. (If you don’t live in the listening area for WILL, their website also has a live stream of the audio.) I’ll be discussing Digital History: A Guide to Gathering, Preserving, and Presenting the Past on the Web and answering questions from the audience. If you’re reading this message after February 3, you can download the MP3 file of the show.

10 Most Popular History Syllabi

My Syllabus Finder search engine has been in use for three years now, and I thought it would be interesting to look back at the nearly half a million searches and 640,000 syllabi it has handled to see which syllabi have been the most popular. The following list was compiled by calculating how often Syllabus Finder users glanced at a syllabus (had it turn up in a search), how often they read a syllabus (actually clicked through from the Syllabus Finder to the syllabus’s own website for further reading), and each syllabus’s “attractiveness” (defined as the ratio of full reads to mere glances). Here are the most popular history syllabi on the web.

#1 – U.S. History to 1870 (Eric Mayer, Victor Valley College, 6104 points)

#2 – America in the Progressive Era (Robert Bannister, Swarthmore College, 6000 points)

#3 – The American Colonies (Bruce Dorsey, Swarthmore College, 5589 points)

#4 – The American Civil War (Sheila Culbert, Dartmouth College, 5521 points)

#5 – Early Modern Europe (Andrew Plaa, Columbia University, 5485 points)

#6 – The United States since 1945 (Robert Griffith, American University, 5109 points)

#7 – American Political and Social History II (Robert Dykstra, University at Albany, State University of New York, 5048 points)

#8 – The World Since 1500 (Sarah Watts, Wake Forest University, 4760 points)

#9 – The Military and War in America (Nicholas Pappas, Sam Houston State University, 4740 points)

#10 – World Civilization I (Jim Jones, West Chester University of Pennsylvania, 4636 points)

This is, of course, a completely unscientific study. It obviously gives an advantage to older syllabi, since those courses have been online longer and thus could show up in search results for several years. On the other hand, the ten syllabi listed here are spread fairly evenly across the years 1998 to 2005.

Whatever its faults, the study does provide a good sense of the most visible and viewed syllabi on the web (high Google rankings help these syllabi get into a lot of Syllabus Finder search results), and I hope it provides a sense of the kinds of syllabi people frequently want to consult (or crib)—mostly introductory courses in American history. The variety of institutions represented is also notable (and holds true beyond the top ten; no domination by, e.g., Ivy League schools). I’ll probably do some more sophisticated analyses when I have the time; if there’s interest from this blog’s audience I’ll calculate the most popular history syllabi from 2005 courses, or the top ten for other topics. If you would like to read a far more elaborate (and scientific) data-mining study I did using the Syllabus Finder, please take a look at “By the Book: Assessing the Place of Textbooks in U.S. Survey Courses.”

[How the rankings were determined: 1 point was awarded for each time a syllabus showed up in a Syllabus Finder search result; 10 points were awarded for each time a Syllabus Finder user clicked through to view the entire syllabus; 100 points were awarded for each percent of “attractiveness,” where 100% attractive meant that every time a syllabus made an appearance in a search result it was clicked on for further information. For instance, the top syllabus appeared in 1211 searches and was clicked on 268 times (22.13% of the searches), for a point total of 1211 + (268 × 10) + (22.13 × 100) = 6104.]
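For readers who want the arithmetic in executable form, here is the ranking formula above restated as a short Python function (the function and variable names are my own):

```python
def syllabus_score(glances, reads):
    """1 point per search appearance, 10 per click-through,
    and 100 per percentage point of "attractiveness"."""
    attractiveness = reads / glances * 100  # percent of glances clicked on
    return round(glances + 10 * reads + 100 * attractiveness)

# The #1 syllabus above: 1211 search appearances, 268 click-throughs.
print(syllabus_score(1211, 268))  # 1211 + 2680 + 2213 = 6104
```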

Kojo Nnamdi Show Questions

Roy Rosenzweig and I had a terrific time on The Kojo Nnamdi Show today. If you missed the radio broadcast you can listen to it online on the WAMU website. There were a number of interesting calls from the audience, and we promised several callers that we would answer a couple of questions off the air; here they are.

Barbara from Potomac, MD asks, “I’m wondering whether new products that claim to help compress and organize data (I think one is called “C-Gate” [Kathy, an alert reader of this blog, has pointed out that Barbara probably means the giant disk-drive company Seagate]) help out [to solve the problem of storing digital data for the long run]? The ads claim that you can store all sorts of data—from PowerPoint presentations and music to digital files—in a two-ounce standalone disk or other device.”

As we say in the book, we’re skeptical of using rare and/or proprietary formats to store digital materials for the long run. Despite the claims of many companies about new and novel storage devices, it’s unclear whether these specialized devices will be accessible in ten or a hundred years. We recommend sticking with common, popular formats and devices (at this point, probably standard hard drives and CD- or DVD-ROMs) if you want to have the best odds of preserving your materials for the long run. The National Institute of Standards and Technology (NIST) provides a good summary of how to store optical media such as CDs and DVDs for long periods of time.

Several callers asked where they could go if they have materials on old media, such as reel-to-reel or 8-track tapes, that they want to convert to a digital format.

You can easily find online some of the companies we mentioned that will (for a fee) transfer your old media onto new formats and devices. Google for the media you have (e.g., “8-track tape”) along with the words “conversion services” or “transfer services.” I probably overestimated the cost of these services; most conversions will cost less than $100 per tape. However, the older the media, the more expensive the transfer will be. I’ll continue to look into places in the Washington area that might provide these services for free, such as libraries and archives.

Digital History on The Kojo Nnamdi Show

From the shameless plug dept.: Roy Rosenzweig and I will be discussing our book Digital History: A Guide to Gathering, Preserving, and Presenting the Past on the Web this Tuesday, January 10, on The Kojo Nnamdi Show. The show is produced at Washington’s NPR station, WAMU. We’re on live from noon to 1 PM EST, and you’ll be able to ask us questions by phone (1-800-433-8850), via email (kojo@wamu.org), or through the web. The show will be replayed from 8-9 PM EST on Tuesday night, and syndicated via iTunes and other outlets as part of NPR’s terrific podcast series (look for The Kojo Nnamdi Show/Tech Tuesday). You’ll also be able to get the audio stream directly from the show’s website. I’ll probably answer some additional questions from the audience in this space.

Hurricane Digital Memory Bank Featured on CNN

I was interviewed yesterday by CNN about a new project at the Center for History and New Media, the Hurricane Digital Memory Bank, which uses digital technology to record memories, photographs, and other media related to Hurricanes Katrina, Rita, and Wilma. (CNN is going to feature the project sometime this week on its program The Situation Room.) The HDMB is a democratic historical project similar to our September 11 Digital Archive, which saved the recollections and digital files of tens of thousands of contributors from around the world; this time we’re trying to save thousands of perspectives on what occurred on the Gulf Coast in the fall of 2005. What amazes me is how quickly interest in online historical projects and collections has exploded. Several of the web projects I’ve co-directed over the last five years have engaged in collecting history online, but even a project with as prominent a topic as September 11 took a long time to be picked up by the mass media. This time CNN called us just a few weeks after we launched the website, and before we’ve done any real publicity. Here are three developments from the last two years that I think account for this sharply increased interest.

Technologies enabling popular writing (blogs) and image sharing (e.g., Flickr) have moved into the mainstream, creating an unprecedented wave of self-documentation and historicizing. Blogs, of course, have given millions of people a taste for daily or weekly self-documentation unseen since the height of diary keeping in the late nineteenth century. And it used to be fairly complicated to set up an online gallery of one’s photos. Now you can do it with no technical know-how whatsoever, and it has become much easier for others to find these photos (partly due to tagging/folksonomies). The result is that millions of photographs are being shared daily and the general public is getting used to the instantaneous documentation of events. Look at what happened in the hours after the London subway bombings: the photographic documentation that appeared on photo-sharing sites within two days would formerly have taken archivists months or even years to compile.

New web services are making combinations of these democratic efforts at documentation feasible and compelling. Our big innovation for the HDMB is to locate each contribution on an interactive map (using the Google Maps API), which allows one to compare the experiences and images from one place (e.g., an impoverished parish in New Orleans) with another (e.g., a wealthier suburb of Baton Rouge). (Can someone please come up with a better word for these combinations than the current “mashups”?) Through the savvy use of unique Technorati or Flickr tags, a scattered group of friends or colleagues can now automatically associate a group of documents or photographs to create an instant collection on an event or issue; a sketch of this tag mechanism appears after the third development below.

The mass media has almost completely reversed its formerly antagonistic posture toward new media. CNN now has at least two dedicated “Internet reporters” who look for new websites and scan blogs (once disparaged as the last refuge of unpublishable amateurs) for news and commentary. In the last year the blogosphere has actually broken several stories (e.g., the Dan Rather document scandal), and many journalists have started their own blogs. The Washington Post has just hired its first full-time blogger. Technorati now tracks over 24 million blogs; even if 99% of those are discussing the latest on TomKat (the celebrity marriage) or Tomcat (the Apache servlet container for Java), there are still a lot of new, interesting perspectives out there to be recorded for posterity.
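As promised above, here is a minimal sketch in Python of the tag mechanism from the second development; the items, URLs, and shared tag are hypothetical examples, not data from the HDMB or any real feed.

```python
# Entries harvested from different contributors' blog and photo feeds;
# the URLs and the agreed-upon tag "hdmb-katrina" are made up.
items = [
    {"url": "http://example.org/photos/flood1.jpg", "tags": {"hdmb-katrina", "neworleans"}},
    {"url": "http://example.org/blog/evacuation",   "tags": {"hdmb-katrina"}},
    {"url": "http://example.org/photos/vacation",   "tags": {"travel"}},
]

# One shared tag pulls scattered contributions into an instant collection.
collection = [item["url"] for item in items if "hdmb-katrina" in item["tags"]]
print(collection)  # the two tagged contributions
```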

Reliability of Information on the Web

Given the current obsession with the reliability (or, more often in media coverage, the unreliability) of information on the web—the New York Times weighed in on the matter yesterday, and USA Today carried a scathing op-ed last week—I feel lucky that an article Roy Rosenzweig and I wrote entitled “Web of Lies? Historical Knowledge on the Internet” happens to appear today in First Monday. If you’re interested in the subject, it’s probably best to read the full article, but I’ll provide a quick summary of our argument here.

Using my H-Bot software tool, Roy and I scanned the Internet to assess the quality of online information about history. In short, we found that while critics are correct that there are many error-riddled web pages, on the whole the web presents a relatively sound portrayal of historical facts through a process of consensus. With the right tools, these facts can be extracted from the web, leaving the more problematic web pages aside.

Moreover, this process of historical data mining on the web should prompt further discussion about the significance of all of this historical information online. To do some of our own prompting, we had a special multiple-choice test-taking version of H-Bot take the National Assessment of Educational Progress U.S. History exam using nothing but the web and some fancy algorithms borrowed from computer science. [Spoiler alert: it passed.] This raises new questions that move far beyond simple debates over the reliability of information on the web and into the very nature of teaching, learning, and research in our digital age.
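To give a rough sense of how such consensus-based extraction can work (this is a sketch, not the actual H-Bot algorithm, and the documents below are toy stand-ins for search-engine results), the idea is to let every page that mentions a year near the queried name cast a vote, so that scattered errors are simply outvoted:

```python
import re
from collections import Counter

DOCUMENTS = [  # toy stand-ins for pages returned by a web search
    "Nelson Mandela was born on 18 July 1918.",
    "Born in 1918, Mandela led the anti-apartheid movement.",
    "One error-riddled page claims Mandela was born in 1915.",
    "Mandela (b. 1918) was released from prison in 1990.",
]

def consensus_year(name, docs, window=60):
    """Return the most frequent four-digit year near `name` across docs."""
    votes = Counter()
    for doc in docs:
        for match in re.finditer(re.escape(name), doc):
            nearby = doc[max(0, match.start() - window):match.end() + window]
            votes.update(re.findall(r"\b(?:1[0-9]{3}|20[0-9]{2})\b", nearby))
    return votes.most_common(1)[0][0] if votes else None

print(consensus_year("Mandela", DOCUMENTS))  # -> '1918', outvoting the error
```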