Category Archives: Crowdsourcing

Roy’s World

In one of his characteristically humorous and self-effacing autobiographical stories, Roy Rosenzweig recounted the uneasy feeling he had when he was working on an interactive CD-ROM about American history in the 1990s. The medium was brand new, and to many in academia, superficial and cartoonish compared to a serious scholarly monograph.

Roy worried about how his colleagues and others in the profession would view the shiny disc on the social history of the U.S., and his role in creating it. After a hard day at work on this earliest of digital histories, he went to the gym, and above his treadmill was a television tuned to Entertainment Tonight. Mary Hart was interviewing Fabio, fresh off the great success of his “I Can’t Believe It’s Not Butter” ad campaign. “What’s next for Fabio?” Hart asked him. He replied: “Well, Mary, I’m working on an interactive CD-ROM.”

Roy Rosenzweig

Ten years ago today Roy Rosenzweig passed away. Somehow it has now been longer since he died than the period of time I was fortunate enough to know him. It feels like the opposite, given the way the mind sustains so powerfully the memory of those who have had a big impact on you.

The field that Roy founded, digital history, has also aged. So many more historians now use digital media and technology to advance their discipline that it no longer seems new or odd like an interactive CD-ROM.

But what hasn’t changed is Roy’s more profound vision for digital history. If anything, more than ever we live in Roy’s imagined world. Roy’s passion for open access to historical documents has come to fruition in countless online archives and the Digital Public Library of America. His drive to democratize not only access to history but also the historical record itself, especially its inclusion of marginalized voices, can be seen in the recent emphasis on community archive-building. His belief that history should be a broad-based shared enterprise, rather than the province of the ivory tower, can be found in crowdsourcing efforts and tools that allow for widespread community curation, digital preservation, and self-documentation.

It still hurts that Roy is no longer with us. Thankfully his mission and ideas and sensibilities are as vibrant as ever.

Some Thoughts on the Hacking the Academy Process and Model

I’m delighted that the edited version of Hacking the Academy is now available on the University of Michigan’s DigitalCultureBooks site. Here are some of my quick thoughts on the process of putting the book together. (For more, please read the preface Tom Scheinfeldt and I wrote.)

1) Be careful what you wish for. Although we heavily promoted the submission process for HTA, Tom and I had no idea we would receive over 300 contributions from nearly 200 authors. This put an enormous, unexpected burden on us; it obviously takes a long time to read through that many submissions. Tom and I had to set up a collaborative spreadsheet for assessing the contributions, and it took several months to slog through the mass. We also had to make tough decisions about what kind of work to include, since we were not overly prescriptive about what we were looking for. A large number of well-written, compelling pieces (including many from friends of ours) had to be left out of the volume, unfortunately, because they didn’t quite match our evolving criteria, or didn’t fit with other pieces in the same chapter.

2) Set aside dedicated time and people. Other projects that have crowdsourced volumes, such as Longshot Magazine, have well-defined crunch times for putting everything together, using an expanded staff and a lot of coffee. I think it’s fair to say (and I hope not haughty to say) that Tom and I are incredibly busy people and we had to do the assembly and editing in bits and pieces. I wish we could have gotten it done much sooner to sustain the energy of the initial week. We probably could have included others in the editing process, although I think we have good editorial consistency and smooth transitions because of the more limited control.

3) Get the permissions set from the beginning. One of the delays on the edited volume was making sure we had the rights to all of the materials. HTA has made us appreciate even more the importance of pushing for Creative Commons licenses (especially the simple CC-BY) in academia; many of our contributors are dedicated to open access and had already licensed their materials under a permissive reproduction license, but we had to annoy everyone else (and by “we,” I mean the extraordinarily helpful and capable Shana Kimball at MPublishing). This made the HTA process a little more like a standard publication, where the press has to hound contributors for sign-offs, adding friction along the way.

4) Let the writing dictate the form, not vice versa. I think one of the real breakthroughs that Tom and I had in this process is realizing that we didn’t need to adhere to a standard edited-volume format of same-size chapters. After reading through odd-sized submissions and thinking about form, we came up with an array of “short, medium, long” genres that could fit together on a particular theme. Yes, some of the good longer pieces could stand as more-or-less standard essays, but others could be paired together or set into dialogues. It was liberating to borrow some conventions from, e.g., magazines and the way they handle shorter pieces. In some cases we also got rather aggressive about editing down articles so that they would fit into useful spaces.

5) This is a model that can be repeated. Sure, it’s not ideal for some academic cases, and speed is not necessarily of the essence. But for “state of the field” volumes, vibrant debates about new ideas, and books that would benefit from blended genres, it seems like an improvement upon the staid “you have two years to get me 8,000 words for a chapter” model of the edited book.

What Should Scholarly Society Meetings Look Like in the 2010s?

Unlike some of my blog post titles, this one really is a question. What do you think they should look like? I ask because I am now on the program committee for the American Historical Association and this Saturday we begin planning for the January 2012 meeting. Committee members are encouraged to bring five “panel ideas” with them to the initial planning meeting; I, of course, plan to agitate for non-panel forms as well (think: THATCamp), and I suspect that the audience for this blog has even more creative ideas.

So: What would you propose? Let me know in the comments.

The Maddening Crowd

[In July 2010, The Chronicle of Higher Education asked twenty-three scholars and illustrators to answer this question: What will be the defining idea of the coming decade, and why? As an intellectual historian I’m skeptical of my ability to predict the future, but I have to say I think my crystal ball functioned well this time, especially since unbeknownst to me Jaron Lanier was also asked to answer the question and proved my point; the new movie about Facebook has this tension as one of its themes; and since I wrote this the number of Facebook users has gone up by 100 million. Here’s my take on the “big idea of 2010-20.”]

Friedrich Nietzsche would have hated Twitter and Wikipedia even more than organized religion. The great champion of the individual will rising above the sheepish masses would have shuddered at what the Internet has given us in the last decade, when the Web became exponentially more social and collaborative. One can only imagine Nietzsche’s fury at a method called “crowdsourcing” and a Web browser called Flock.

I suppose every age has its debate about the individual versus the collective, with associated concerns about the place of genius and expertise, but I suspect we are heading into a decade of especially heightened sensitivity over this tension.

A new romanticism that reveres personal drive and uniqueness is dawning. The spate of books critical of the frenetic social Web, from Andrew Keen’s The Cult of the Amateur: How Today’s Internet Is Killing Our Culture (Crown Business) to Jaron Lanier’s You Are Not a Gadget (Knopf) and Nicholas Carr’s The Shallows: What the Internet Is Doing to Our Brains (W.W. Norton), are leading indicators. Just as the global expansion of fast food begat the slow-food movement, the next decade will see a “slow information” counterrevolution focused on restoring individual thought and creativity. The neo-Nietzscheans will advocate turning off (your computer) and dropping out (of Facebook).

On the other side will be those who assert, like Aristotle, that human beings are social animals and that the Internet is simply enabling the kind of interaction and collaboration we have desired since the first polis. Facebook’s Mark Zuckerberg was a classics major, after all. This big idea will reach its apex if Facebook, current population 500 million, surpasses China and India sometime in the coming decade to become the largest collective in history.

On a smaller scale, the tension between the individual and the collective will result in hand-wringing about the value of expertise and that elusive element, genius. What good is a professional restaurant reviewer when the crowd can provide wider (if not necessarily deeper) coverage? Will there be any more Newtons and Einsteins now that discoveries at the Large Hadron Collider have hundreds of co-authors? What is the effect on our psyches after we repeatedly find, via Google, that our supposedly original ideas have been previously and precisely explicated by a dozen other people?

And in 2020, will The Chronicle of Higher Education ask a handful of intellectuals to come up with the big idea of the 2020s, or instead aggregate the answers from thousands of readers?

Crowdsourcing the Title of My Next Book

Already put this out on Twitter but will reblog here:

I’m crowdsourcing the title of my next book, which is about the way in which common web tech/methods should influence academia, rather than academia thinking it can impose its methods and genres on the web. The title should be a couplet like “The X and the Y” where X can be “Highbrow Humanities,” “Elite Academia,” “The Ivory Tower,” “Deep/High Thought,” [insert your idea] and Y can be “Lowbrow Web,” “Common Web,” “Vernacular Technology/Web,” “Public Web,” [insert your idea]. So possible titles are “The Highbrow Humanities and the Lowbrow Web” or “The Ivory Tower and the Wild Web,” etc. What’s your choice? Thanks in advance for the help and suggestions.

Introducing Digital Humanities Now

Do the digital humanities need journals? Although I’m very supportive of the new journals that have launched in the last year, and although I plan to write for them from time to time, there’s something discordant about a nascent field—one so steeped in new technology and new methods of scholarly communication—adopting a format that is struggling in the face of digital media.

I often say to non-digital humanists that every Friday at 5 I know all of the most important books, articles, projects, and news of the week—without the benefit of a journal, a newsletter, or indeed any kind of formal publication by a scholarly society. I pick up this knowledge by osmosis from the people I follow online.

I subscribe to the blogs of everyone working centrally or tangentially in digital humanities. As I have argued from the start, and against the skeptics and traditionalists who think blogs can only be narcissistic, half-baked diaries, these outlets are just publishing platforms by another name, and in my area there are an incredible number of substantive ones.

More recently, social media such as Twitter has provided a surprisingly good set of pointers toward worthy materials I should be reading or exploring. (And as happened with blogs five years ago, the critics are now dismissing Twitter as unscholarly, missing the filtering function it somehow generates among so many unfiltered tweets.) I follow as many digital humanists as I can on Twitter, and created a comprehensive list of people in digital humanities. (You can follow me @dancohen.)

For a while I’ve been trying to figure out a way to show this distilled “Friday at 5” view of digital humanities to those new to the field, or those who don’t have time to read many blogs or tweets. This week I saw a tweet from Tom Scheinfeldt (who in turn saw a tweet from James Neal) about a new service called Twittertim.es, which creates a real-time publication consisting of articles highlighted by people you follow on Twitter. I had a thought: what if I combined the activities of several hundred digital humanities scholars with Twittertim.es?

Digital Humanities Now is a new web publication that is the experimental result of this thought. It aggregates thousands of tweets and the hundreds of articles and projects those tweets point to, and boils everything down to the most-discussed items, with commentary from Twitter. A slightly longer discussion of how the publication was created can be found on the DHN “About” page.

Digital Humanities Now home page

Does the process behind DHN work? From the early returns, the algorithms have done fairly well, putting on the front page articles on grading in a digital age, bringing high-speed networking to liberal arts colleges, Google’s law archive search, and (appropriately enough) a talk on how to deal with streams of content given limited attention. Perhaps Digital Humanities Now will show a need for the light touch of a discerning editor. This could certainly be added on top of the raw feed of all interest items (about 50 a day, out of which only 2 or 3 make it into DHN), but I like the automated simplicity of DHN 1.0.
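For readers who want a concrete picture of what "boiling everything down to the most-discussed items" could look like, here is a minimal sketch in Python. It is not the actual DHN or Twittertim.es code, whose internals are not described in this post; the function name `most_discussed`, the `(author, url)` input format, and the sample data are all hypothetical, and the ranking rule (count distinct people who linked to each item, keep the top few) is simply one plausible assumption.

```python
from collections import defaultdict


def most_discussed(tweets, top_n=3):
    """Rank linked items by how many distinct people tweeted them.

    `tweets` is an iterable of (author, url) pairs. This input shape is an
    assumption made for illustration, not the format of any real service.
    """
    sharers = defaultdict(set)
    for author, url in tweets:
        # A set means one person tweeting the same link twice counts once.
        sharers[url].add(author)

    # Sort by number of distinct sharers (descending), then by URL so that
    # ties are broken in a stable, predictable way.
    ranked = sorted(sharers.items(), key=lambda item: (-len(item[1]), item[0]))
    return [(url, len(authors)) for url, authors in ranked[:top_n]]


# Hypothetical sample data: the names and URLs are invented for this example.
sample = [
    ("alice", "http://example.org/grading"),
    ("bob", "http://example.org/grading"),
    ("carol", "http://example.org/networking"),
    ("alice", "http://example.org/grading"),  # repeat by the same person
    ("dave", "http://example.org/networking"),
    ("erin", "http://example.org/grading"),
    ("bob", "http://example.org/law-search"),
]

print(most_discussed(sample, top_n=2))
# [('http://example.org/grading', 3), ('http://example.org/networking', 2)]
```

Counting distinct people rather than raw tweets is a deliberate choice in this sketch: it keeps one enthusiastic account from pushing an item onto the front page by itself. A discerning editor, as mentioned above, could still be layered on top of a ranked list like this.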

Despite what I’m sure will be some early hiccups, my gut is that some version of this idea could serve as a rather decent new form of publication that focuses the attention of those in a particular field on important new developments and scholarly products. I’m not holding my breath that someday scholars will put an appearance in DHN on their CVs. But as I recently told an audience of executive directors of scholarly societies at an American Council of Learned Societies meeting, if you don’t do something like this, someone else will.

I suppose DHN is a prod to them and others to think about new forms of scholarly validation and attention, beyond the journal. Ultimately, journals will need the digital humanities more than we need them.

The Spider and the Web: Results

A couple of weeks ago at the Digital Dilemmas Symposium in New York I tried something new: using Twitter to replicate digitally the traditional “author’s query,” where a scholar asks readers of a journal for assistance with a research project. I believe the results of this experiment are instructive about the significant advantages—and some disadvantages—for academia of what has come to be known as crowdsourcing.

For those who didn’t follow this experiment live via Twitter, you should first read the two initial posts in this series. The experiment was fairly simple: I prepared followers of my blog and my Twitter feed (as of this writing I have roughly the same number of blog subscribers and Twitter followers, about 1,600 on each service) by noting that I would reveal a historical puzzle at a particular time. At the beginning of my talk in New York, my blog auto-posted the scan of an object found in a Victorian archaeological dig, which I simultaneously tweeted.

I asked those following me online to work together to figure out what the object was. Participants in the experiment could post live comments on Twitter, and others could follow along by searching for the #digdil09 hashtag. (A hashtag is a hopefully unique string of characters that enables a search of Twitter to reveal all comments at a specific conference or on a particular subject.) I encouraged everyone to talk to each other and leverage each other’s knowledge. In addition, I set up what in the age of the print journal would have been a ridiculous deadline: only one hour for the crowd to solve the mystery. For a bit of theater (“stunt lecturing”?) I flashed the Twitter stream behind me from time to time during my talk.

It took much less time than an hour for a solution: nine minutes, to be exact, for a preliminary answer and 29 minutes for a fairly rich description of the object to emerge from the collective responses of roughly a hundred participants. Solution: the object was an ornamental gorget from the Cahokia tribe.

[Image: spider_tweet_2]

What happened along the way was as interesting as the result (which I have to admit was rather satisfying given the possibility of a live crowd in NYC laughing at me for using Twitter). First, Twitter was remarkably effective in multiplying my voice. Indeed, in the first five minutes about a dozen others on Twitter retweeted (rebroadcast) my mystery to their followers. This “Twitter multiplier effect” meant that within minutes many thousands of people got word of my experiment; over 1,900 actually viewed the object on my blog. And I’m lucky enough to have a particularly knowledgeable crowd following me on Twitter, as you can see from the word cloud of my followers’ bios.

Once the race was on, solvers took two distinct paths toward a solution. The first path was the one I was trying to encourage: some quick thoughts about facets of the object, followed by scholarly debate. I mentioned that the object was made out of shell but was found far away from water in the Midwest (of the U.S.), which led to some interesting speculation about origins and movement of Native Americans, Europeans, and Africans. Others focused on the iconography of the spider; what could it symbolize and which cultures used it? These were decent lines of inquiry that one could imagine in the back pages of a Victorian journal.

[Image: spider_tweet_5]

[Image: spider_tweet_4]

Twitter is mocked for its almost comical terseness, but even the most hardened Twitter skeptic must admit tweets such as these are far from useless assistance. And the power of this crowdsourcing is even more evident as you look at the full discussion trail as researchers pick up information from each other to take their speculations a step further.

The experiment was not, however, an unalloyed success, partly due to a mistake I made in setting it up. In hindsight I gave away too much in my original post, mentioning St. Clair and the fact that the piece was made out of shell. Alas, Googling keywords such as these (as well as the obvious “spider”) immediately gets one hot on the trail of the solution. It’s clear from the stream of tweets that a good portion of the solving audience took the “Google knows all” approach rather than the “scholarly discussion” approach.

I suppose even this aspect of the experiment is not uninteresting; I’ll leave it to others in the comments below to discuss the merits of the “Google” approach, as well as the merits (and demerits) of this experiment in general.

[Afterword: As many have pointed out on Twitter, the experiment would have been better had I not posted an object that could be found online. To be honest, I thought I had found an unusual object with no scanned version; it shows how much has been digitized, and how good search is even on a small amount of metadata.]

The Spider and the Web: What Is This?

In 1882, a young anthropologist from Washington, D.C., went west to collect objects for the Smithsonian. He found this object buried in a small hill in St. Clair County, Illinois. It’s about three inches (8 cm) across, and seems to be made of a shell. It has two holes in it.

Confused about what this was, the anthropologist brought the object back and presented it to his colleagues. I would like to reproduce that activity digitally by presenting the object online, to see what readers of this blog and my followers on Twitter can make of it, individually and by talking to each other. Although you can post some conjectures in the comments on the blog, if you’re reading this at 3pm Eastern/Noon Pacific/20:00 GMT on Thursday, April 16, 2009, please post ideas via Twitter by @ replying to me or by using the hashtag #digdil09. You only have one hour.

I’ll be posting the full results of this experiment in this space in a day or two.

So: What is this?

http://www.dancohen.org/images/what_is_this.jpg

The Spider and the Web: A Crowdsourcing Experiment

If you read your blog posts on the same day they’re written, please join in later today for an experiment in scholarly crowdsourcing. I’ll be posting a historical mystery on this blog at exactly 3pm Eastern/Noon Pacific/20:00 GMT on Thursday, April 16, 2009, and will be linking to it from Twitter. I’ll be asking my followers on Twitter and blog subscribers to see if they can figure out what an unusual object is within one hour. You can follow the crowdsourced analysis live on Twitter, or find the results in this space in a day or two.

Items of Interest for June 12, 2008