Dan Cohen

Archive for the ‘Scholarship’ Category

Open Access Publishing and Scholarly Values

Thursday, May 27th, 2010

[A contribution to the Hacking the Academy book project. Tom Scheinfeldt and I are crowdsourcing the content of that book in one week.]

In my post The Social Contract of Scholarly Publishing, I noted that there is a supply side and a demand side to scholarly communication:

The supply side is the creation of scholarly works, including writing, peer review, editing, and the form of publication. The demand side is much more elusive—the mental state of the audience that leads them to “buy” what the supply side has produced. In order for the social contract to work, for engaged reading to happen and for credit to be given to the author (or editor of a scholarly collection), both sides need to be aligned properly.

I would now like to analyze and influence that critical mental state of the scholar by appealing to four emotions and values, to try both to increase the supply of open access scholarship and to prod scholars to be more receptive to scholarship that takes place outside of the traditional publishing system.

1. Impartiality

In my second year in college I had one of those late-night discussions where half-baked thoughts are exchanged and everyone tries to impress each other with how smart and hip they are. A sophomoric gabfest, literally and figuratively. The conversation inevitably turned to music. I reeled off the names of bands I thought would get me the most respect. Another, far more mature student then said something that caught everyone off guard: “Well, to be honest, I just like good music.” We all laughed—and then realized how true that statement was. And secretly, we all did like a wide variety of music, from rock to bluegrass to big band jazz.

Upon reflection, many of the best things we discover in scholarship—and life—are found in this way: by disregarding popularity and packaging and approaching creative works without prejudice. We wouldn’t think much of Moby-Dick if Carl Van Doren hadn’t looked past decades of mixed reviews to find the genius in Melville’s writing. Art historians have similarly unearthed talented artists who did their work outside of the royal academies or art schools. As the unpretentious wine writer Alexis Lichine shrewdly said in the face of fancy labels and appeals to mythical “terroir”: “There is no substitute for pulling corks.”

Writing is writing and good is good, no matter the venue of publication or what the crowd thinks. Scholars surely understand that on a deep level, yet many persist in valuing venue and medium over the content itself. This is especially true at crucial moments, such as promotion and tenure. Surely we can reorient ourselves to our true core value, to honor creativity and quality, which will still guide us to many traditionally published works but will also allow us to consider works in some nontraditional venues such as new open access journals, blogs or articles written and posted on a personal website or institutional repository, or non-narrative digital projects.

2. Passion

Do you get up in the morning wondering what journal you’re going to publish in next or how you’re going to spend your $10 royalty check? Neither do I, nor do most scholars. We wake up with ideas swirling around inside our head about the topic we’re currently thinking about, and the act of writing is a way to satisfy our obsession and communicate our ideas to others. Being a scholar is an affliction of which scholarship is a symptom. If you’re publishing primarily for careerist reasons and don’t deeply care about your subject matter, let me recommend you find another career.

The entire commercial apparatus of the existing publishing system merely leeches on our scholarly passion and the writing that passion inevitably creates. The system is far from perfect for maximizing the spread of our ideas, not to mention the economic bind it has put our institutions in. If you were designing a system of scholarly communication today, in the age of the web, would it look like the one we have today? Disparage bloggers all you like, but they control their communication platform, the outlet for their passion, and most scholars and academic institutions don’t.

3. Shame

This spring Ithaka, the nonprofit that runs JSTOR and that has a research wing to study the transition of academia into the digital age, put out a report based on their survey of faculty in 2009. The report has two major conclusions. First, scholars are increasingly using online resources like Google Books as a starting point for their research rather than the physical library. That is, they have become comfortable in certain respects with “going digital.”

But at the same time the Ithaka report notes that they remain stubbornly wedded to their old ways when it comes to using the digital realm for the composition and communication of their research. In other words, it has somehow become acceptable to use digital media and technology for some parts of our work while resisting them in others.

This divide is striking. The professoriate may be more liberal politically than the most latte-filled ZIP code in San Francisco, but we are an extraordinarily conservative bunch when it comes to scholarly communication. Look carefully at this damning chart from the Ithaka report:

Any faculty member who looks at this chart should feel ashamed. We professors care less about sharing our work (even with underprivileged nations that cannot afford access to gated resources) than about making sure we impress our colleagues. Indeed, there was actually a sharp drop in professors who cared about open access between 2003 and the present.

This would be acceptable, I suppose, if we understood ourselves to be ruthless, bottom-line driven careerists. But those are not the caring educators we often pretend to be. Humanities scholars in particular have taken pride in the last few decades in uncovering and championing the voices of those who are less privileged and powerful, but here we are in the ivory tower, still preferring to publish in ways that separate our words from those of the unwashed online masses.

We can’t even be bothered to share our old finished articles, already published and our reputation suitably burnished, by putting them in an open institutional repository:

I honestly can’t think of any other way to read these charts than as shameful hypocrisy.

4. Narcissism

The irony of this situation is that in the long run it very well may be better for the narcissistic professor in search of reputation to publish in open access venues. When scholars do the cost-benefit analysis about where to publish, they frequently think about the reputation of the journal or press. That’s the reason many scholars consider open access venues to be inferior, because they do not (yet) have the same reputation as the traditional closed-access publications.

But in their cost-benefit calculus they often forget to factor in the hidden costs of publishing in a closed way. The largest hidden cost is the invisibility of what you publish. When you publish somewhere that is behind gates, or in paper only, you are resigning all of that hard work to invisibility in the age of the open web. You may reach a few peers in your field, but you miss out on the broader dissemination of your work, including to potential other fans.

The dirty little secret about open access publishing is that although you may give up a line in your CV (though not necessarily), your work can be discovered much more easily by other scholars (and the general public), can be fully indexed by search engines, and can be easily linked to from other websites and social media (rather than producing the dreaded “Sorry, this is behind a paywall”).

Let me be utterly narcissistic for a moment. As of this writing this blog has 2,300 subscribers. That’s 2,300 people who have actively decided that they would like to know when I have something new to say. Thousands more read this blog on my website every month, and some of my posts, such as “Is Google Good for History?”, garner tens of thousands of readers. That’s more readers than most academic journals.

I suppose I could have spent a couple of years finding traditional homes for longer pieces such as “Is Google Good for History?” and gotten some supposedly coveted lines on my CV. But I would have lost out on the accumulated reputation from a much larger mass of readers, including many within the academy in a variety of disciplines beyond history.

* * *

When the mathematician Grigori Perelman solved one of the greatest mathematical problems in history, the Poincaré conjecture, he didn’t submit his solution to a traditional journal. He simply posted it to an open access website and let others know about it. For him, just getting the knowledge out there was enough, and the mathematical community responded in kind by recognizing and applauding his work for what it was. Supply and demand intersected; scholarship was disseminated and credited without fuss over venue, and the results could be accessed by anyone with an internet connection.

Is it so hard to imagine this as a simpler (and more virtuous) model for the future of scholarly communication?

The Promise of Digital History

Thursday, September 11th, 2008

Back in January of this year I mentioned in this space that I was participating in an online discussion on digital history for the Journal of American History. That discussion has just been published in the September 2008 issue under the title “The Promise of Digital History.” The discussion ended up being extremely wide-ranging, including research possibilities in the digital age, the future of scholarly communication, training, and teaching. I’m obviously biased since I’m one of the interlocutors, but I believe the article is the perfect introduction to digital history for those who are new to the subject, and it also includes some important debates about where the field is headed. The article is available online at the History Cooperative, which is, alas, gated. Open access is another topic discussed in the article; I hope the JAH will make the article freely available soon.

Many thanks to the seven other digital historians—Bill Turkel, Will Thomas, Amy Murrell Taylor, Patrick Gallagher, Michael Frisch, Kristen Sword, and Steven Mintz—who participated in such a lively exchange!

Mills Kelly on Making Digital Scholarship Count

Friday, June 27th, 2008

If you haven’t already been reading Mills Kelly’s outstanding series “Making Digital Scholarship Count,” (part 1, part 2, part 3) you should put it on your must-read list. Mills finished the series today with a perfectly sensible conclusion about how academia might assess digital work for promotion and tenure. I completely agree.

Oh, and yes, even though Mills published this work on his blog rather than in a journal, it is scholarship. And it should count.

The Pirate Problem

Tuesday, April 22nd, 2008

Last summer, a few blocks from my house, a new pub opened. Normally this would not be worth noting, except for the fact that this bar is staffed completely by pirates, with eye patches, swords, and even the occasional bird on the shoulder. These are not real pirates, of course, but modern men and women dressed up as pirates. But they wear the pirate garb with no hint of irony or thespian affect whatsoever; these are dedicated, earnest pirates.

At this point I should note that I do not live in Orlando, Florida, or any other place devoted to make-believe, but in a sleepy suburb of Washington, D.C., that is filled with Very Serious Professionals. When the pirate pub opened, the neighborhood VSPs (myself very much included) concluded that it was strange and silly and that it was an incontrovertible fact that no one would patronize the place. Or if they did, it would be as a lark.

We clung to this belief for approximately 24 hours, until, upon a casual stroll by the storefront, we witnessed six pirate-garbed pubgoers outside. Singing sea chanteys. Without sheet music. The tavern has been filled ever since.

Such an experience usefully reminds us that there are ways of acting and thinking that we can’t understand or anticipate. Who knew that there was a highly developed pirate subculture, and that it thrived among the throngs of politicos and think-tankers and professors of Washington? Who are these people?

My thoughts turned to pirates during my experience at a workshop at the University of North Carolina at Chapel Hill a week ago, which was devoted to the digitization of the unparalleled Southern Historical Collection, and—in a less obvious way—to thinking about the past and future of humanities scholarship. Dozens of historians came to the workshop to discuss the way in which the SHC, the source of so many books and articles about the South and the home of 16 million archival documents, should be put on the web.

I gave the keynote, which I devoted to prodding the attendees into recognizing that the future of archives and research might not be like the past, and I showed several examples from my work and the work of CHNM that used different ways of searching and analyzing documents that are in digital, rather than analog, forms. Longtime readers of this blog will remember some of the examples, including an updated riff on what a future historian might learn about the state of religion in turn-of-the-century America by data mining our September 11 Digital Archive.

The most memorable response from the audience was from an award-winning historian I know from my graduate school years, who said that during my talk she felt like “a crab being lowered into the warm water of the pot.” Behind the humor was the difficult fact that I was saying that her way of approaching an archive and understanding the past was about to be replaced by techniques that were new, unknown, and slightly scary.

This resistance to thinking in new ways about digital archives and research was reflected in the pre-workshop survey of historians. Extremely tellingly, the historians surveyed wanted the online version of the SHC to be simply a digital reproduction of the physical SHC:

With few exceptions, interviewees believed that the structure of the collection in the virtual space should replicate, not obscure, the arrangement of the physical collection. Thus, navigating a manuscript collection online would mimic the experience of navigating the physical collection, and the virtual document containers—e.g., folders—and digital facsimiles would map clearly back to the physical containers and documents they represent. [Laura Clark Brown and David Silkenat, "Extending the Reach of Southern Sources," p. 10]

In other words, in the age of Google and advanced search tools and techniques, most historians just want to do their research the way they’ve always done it, by taking one letter out of the box at a time. One historian told of a critical moment in her archival work, when she noticed a single word in a letter that touched off the thought that became her first book.

So in Chapel Hill I was the pirate with the strange garb and ways of behaving, and this is a good lesson for all boosters of digital methods within the humanities. We need to recognize that the digital humanities represent a scary, rule-breaking, swashbuckling movement for many historians and other scholars. We must remember that these scholars have had—for generations and still in today’s graduate schools—a very clear path for how they do their work, publish, and get rewarded. Visit archive; do careful reading; find examples in documents; conceptualize and analyze; write monograph; get tenure.

We threaten all of this. For every time we focus on text mining and pattern recognition, traditionalists can point to the successes of close reading—on the power of a single word. We propose new methods of research when the old ones don’t seem broken. The humanities have an order, and we, mateys, threaten to take that calm ship into unknown waters.

[Image credit: &y.]

Project Bamboo Launches

Monday, March 24th, 2008

If you’re interested in the present and future of the digital humanities, you’ll be hearing a lot about Project Bamboo over the next two years, including in this space. I was lucky enough to read and comment upon the Bamboo proposal a few months ago and was excited by its promise to begin to understand how technology, especially technology connected by web services, might be able to transform scholarship and academia. Bamboo is somewhat (and intentionally) amorphous right now (this doesn’t do it justice, but you can think of its initial phase as a listening tour), but I expect big things from the project in the not-so-distant future. From the brief description on the project website:

Bamboo is a multi-institutional, interdisciplinary, and inter-organizational effort that brings together researchers in arts and humanities, computer scientists, information scientists, librarians, and campus information technologists to tackle the question:

How can we advance arts and humanities research through the development of shared technology services?

A good question, and the right time to ask it. And the overall goal?

If we move toward a shared services model, any faculty member, scholar, or researcher can use and reuse content, resources, and applications no matter where they reside, what their particular field of interest is, or what support may be available to them. Our goal is to better enable and foster academic innovation through sharing and collaboration.

Project Bamboo was funded by the Andrew W. Mellon Foundation.

Methodology, Not Ideology

Thursday, March 13th, 2008

Tom Scheinfeldt hits the nail on the head with a brilliant blog post about how the game-changing nature of digital media and technology means that scholarship will have to shift back, after a theory-centric century of monographs, to an emphasis on methodological questions.

The Vision of ORE

Monday, March 3rd, 2008

One form of serious intellectual work that could use much more respect and appreciation within the humanities is the often unglamorous (but occasionally revolutionary) work of creating technical standards. At their best, such standards transcend the code itself to envision new forms of human interaction or knowledge creation that would not be possible without a lingua franca. We need only think of the web; look at what the modest HTML 1.0 spec has wrought.

The Object Reuse and Exchange (ORE) specification that was unveiled today at Johns Hopkins University has, beyond all of the minute technical details, a very clear and powerful vision of scholarly research and communication in a digital age. It is thus worth following the specification as it moves toward a final version in the fall of 2008, and worth beginning to think about how we might use it in the humanities (even though it will undoubtedly be adopted faster in the sciences).

The vision put forth by Carl Lagoze, Herbert Van de Sompel, and others in the ORE working group for the first time tries to map the true nature of contemporary scholarship onto the web. The ORE community realized in 2006 that neither basic web pages nor advanced digital repositories truly capture today’s scholarship.

This scholarship cannot be contained by web pages or PDFs put into an institutional repository, but rather consists of what the ORE team has termed “aggregates,” or constellations of digital objects that often span many different web servers and repositories. For instance, a contemporary astronomy article might consist of a final published PDF, its metadata (author, title, publication info, etc.), some internal images, and then—here’s the important part—datasets, telescope imagery, charts, several publicly available drafts, and other matter (often held by third parties) that does not end up in the PDF. Similarly, an article in art history might consist of the historian’s text, paintings that were consulted in a museum, low-resolution copies of those paintings that are available online (perhaps a set of photos on Flickr of the referenced paintings), citations to other works, and perhaps an associated slide show.

How can one reliably reference and take full advantage of such scholarly constellations given the current state of the web? As Herbert Van de Sompel put it, ORE tries to identify in a commonsensical way “identified, bounded aggregations of related objects that form a logical whole.” In other words, ORE attempts to shift the focus from repositories for scholarship to the complex products of scholarship themselves.

By forging semantic links between the pieces entailed in a work of scholarship, ORE keeps those links active and dynamic and allows humans, as well as machines that wish to make connections, to easily find these related objects. It also allows for a much better preservation path for digital scholarship because repositories can use ORE to get the entirety of a work and its associated constellation rather than grabbing just a single published instantiation of the work.

The implementation of ORE is perhaps less commonsensical for those who do not wish to dive into lots of semantic web terms and markup languages, but put simply, the approach the ORE group has taken is to provide a permanent locator (i.e., a URI, like a web address) that links to what they call a “resource map,” which in turn describes an aggregation. Think of a constellation in the night sky. We have Orion, which consists of certain stars; a star map specifies which stars make up Orion and where to find each of them. The creators of ORE have chosen to use widely adopted formats like RDF and Atom to “serialize” (or make available in a machine-readable and easily exchangeable text format) their resource maps. [Geeks can read the full specification in their user guide.]
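The resource-map idea can be sketched in a few lines of Python. This is only a minimal illustration, not the ORE working group's reference implementation: the example.org URIs and the helper name `resource_map_triples` are hypothetical, and the output is plain N-Triples text assembled by hand rather than the RDF/XML or Atom serializations the specification describes. The two predicates, `describes` and `aggregates`, are taken from the ORE vocabulary.

```python
# Namespace of the ORE vocabulary terms.
ORE = "http://www.openarchives.org/ore/terms/"


def resource_map_triples(rem_uri, aggregation_uri, resource_uris):
    """Return N-Triples lines for a resource map that describes one aggregation.

    All URIs passed in are illustrative; nothing here is fetched or validated.
    """
    # The resource map (the "star map") describes the aggregation.
    lines = [f"<{rem_uri}> <{ORE}describes> <{aggregation_uri}> ."]
    # The aggregation (the "constellation") lists each of its member resources.
    for uri in resource_uris:
        lines.append(f"<{aggregation_uri}> <{ORE}aggregates> <{uri}> .")
    return lines


# Hypothetical constellation for one astronomy article.
triples = resource_map_triples(
    "http://example.org/rem/article-1",
    "http://example.org/aggregation/article-1",
    [
        "http://example.org/article-1.pdf",
        "http://example.org/data/telescope-run-7.csv",
        "http://example.org/drafts/article-1-v2.pdf",
    ],
)
print("\n".join(triples))
```

The point of the design is that the aggregation gets its own address, so a citation or a repository harvest can refer to the whole constellation rather than to the PDF alone.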

In the afternoon today several compelling examples of ORE in action were presented. Ray Plante of the NCSA and National Virtual Observatory showed how astronomers could use ORE and a wiki to create aggregates and updates about unusual events like supernovas, as different observatories add links to images and findings about each event (again, think of Van de Sompel’s “logical whole”). Several presenters mentioned our Zotero project as an ideal use case for ORE, since it already downloads associated objects as part of a single parent item (e.g., it stores metadata, a link to the page it got an item from, and perhaps a PDF or web snapshot). Zotero is already ORE Lite, in a way, and it will be good to try out a full Zotero translator for ORE resource maps that would permit Zotero users to grab aggregates for their research and subsequently publish aggregates back onto the web—object reuse and exchange in action.

Obviously it’s still very early and the true impact of ORE remains to be seen. But it would be a shame if humanities scholars fail to participate in the creation of scholarly standards like ORE, or to help envision their uses in research, communication, and collaboration.

There has been much talk recently of the social graph, the network of human connections that sites like Facebook bring to light and take advantage of. If widely adopted, ORE could help create the scholarly graph, the networked relations of scholars, publications, and resources.

Zotero and the Internet Archive Join Forces

Wednesday, December 12th, 2007

I’m pleased to announce a major alliance between the Zotero project at the Center for History and New Media and the Internet Archive. It’s really a match made in heaven: a project to provide free and open source software and services for scholars joining together with the leading open library. The vision and support of the Andrew W. Mellon Foundation have made this possible, as they have made possible the major expansion of the Zotero project over the last year.

You will hear much more about this alliance in the coming months on this blog, but I wanted to outline five key elements of the project.

1. Exposing and Sharing the “Hidden Archive”

The Zotero-IA alliance will create a “Zotero Commons” into which scholarly materials can be added simply via the Zotero client. Almost every scholar and researcher has documents that they have scanned (some of which are in the public domain), finding aids they have created, or bibliographies on topics of interest. Currently there is no easy way to share these; giving them a central home at the Internet Archive will archive them permanently (before they are lost on personal hard drives) and make them broadly available to others.

We understand that not everyone will be willing to share everything (some may not be willing to share anything, even though almost every university commencement reminds graduates that they are joining a “community of scholars”), but we believe that the Commons will provide a good place for shareable materials to reside. The architectural historian with hundreds of photographs of buildings, the researcher who has scanned in old newspapers, and scholars who wish to publish materials in an open access environment will find this a helpful addition to Zotero and the Internet Archive. Some researchers may of course deposit materials only after finishing, say, a book project; what I have called “secondary scholarly materials” (e.g., bibliographies) will perhaps be more readily shared.

But we hope the second part of the project will further entice scholars to contribute important research materials to the Commons.

2. Searching the Personal Library

Most scholars have not yet figured out how to take full advantage of the digitized riches suddenly available on their computers. Indeed, the abundance of digital documents has actually exacerbated the problems of some researchers, who now find themselves overwhelmed by the sheer quantity of available material. Moreover, the major advantage of digital research—the ability to scan large masses of text quickly—is often unavailable to scholars who have done their own scanning or copying of texts.

A critical second part to this alliance of IA and Zotero is to bring robust and seamless Optical Character Recognition (OCR) to the vast majority of scholars who lack the means or do not know how to convert their scans into searchable text. In addition, this process will let others search through such newly digitized texts. After a submission to the Commons, the Internet Archive will subsequently return an OCRed version of each donated document to enable searchability. This text will be incorporated into the donor’s local index (on the Zotero client) and thus made searchable in Zotero’s powerful quick search and advanced search panes. In short, this process will provide a tremendous incentive for scholars to donate to the Commons, since it will help them with their own research.
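To make the indexing step concrete, here is a minimal sketch of how returned OCR text could be folded into a local search index. It is an illustration under stated assumptions, not Zotero's actual indexing code: the class `LocalIndex`, the item identifiers, and the sample text are all hypothetical, and tokenization is a bare lowercase word split.

```python
import re
from collections import defaultdict


class LocalIndex:
    """Toy inverted index mapping each word to the items that contain it."""

    def __init__(self):
        self._postings = defaultdict(set)

    def add_ocr_text(self, item_id, text):
        # Lowercase and keep runs of letters; real OCR output needs more care.
        for word in re.findall(r"[a-z]+", text.lower()):
            self._postings[word].add(item_id)

    def search(self, query):
        # Return the items that contain every word in the query.
        words = re.findall(r"[a-z]+", query.lower())
        if not words:
            return set()
        results = set(self._postings.get(words[0], set()))
        for word in words[1:]:
            results &= self._postings.get(word, set())
        return results


# Hypothetical OCR text returned for two donated scans.
index = LocalIndex()
index.add_ocr_text("scan-001", "Letter from Chapel Hill, dated 1862")
index.add_ocr_text("scan-002", "Newspaper clipping about Chapel Hill elections")
print(sorted(index.search("chapel hill")))  # matches both scans
print(sorted(index.search("letter")))       # matches only scan-001
```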

3. Enabling Networked References and Annotations

One of the pillars of scholarship is the ability for distributed scholars to be sure they are referencing the same text or evidence. As noted in #1, one of the great advantages of the Zotero Commons at IA will be the transport of scholarly materials currently residing on personal hard drives to a public space with stable, rather than local, addresses. These addresses will become critical as scholars begin to use, refer to, and cite items in the Commons.

Yet the IA/Zotero partnership has another benefit: as scholars begin to use not only traditional primary sources that have been digitized but also “born digital” materials on the web (blogs, online essays, documents transcribed into HTML), the possibility arises for Zotero users to leverage the resources of IA to ensure a more reliable form of scholarly communication. One of the Internet Archive’s great strengths is that it has not only archived the web but also given each page a permanent URI that includes a time and date stamp in addition to the URL.
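As a rough sketch of what such a time-stamped address looks like, the Wayback Machine's public URLs place a 14-digit UTC timestamp (YYYYMMDDhhmmss) in front of the original URL. The helper name `wayback_uri` is hypothetical; the code only builds a string and does not check that a capture actually exists for that moment.

```python
from datetime import datetime, timezone

WAYBACK_PREFIX = "https://web.archive.org/web"


def wayback_uri(original_url: str, captured_at: datetime) -> str:
    """Build a Wayback-style URI: prefix, 14-digit UTC timestamp, original URL."""
    stamp = captured_at.astimezone(timezone.utc).strftime("%Y%m%d%H%M%S")
    return f"{WAYBACK_PREFIX}/{stamp}/{original_url}"


# Hypothetical capture time; example.org stands in for a cited page.
uri = wayback_uri(
    "http://example.org/essay.html",
    datetime(2007, 12, 12, 15, 30, 0, tzinfo=timezone.utc),
)
print(uri)
```

Because the timestamp is part of the address, two scholars who cite the same URI are pointing at the same capture, even if the live page later changes.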

Currently when a scholar using Zotero wishes to save a web page for their research they simply store a local copy. For some, perhaps many, purposes this is fine. But for web documents that a scholar believes will be important to share, cite, or collaboratively annotate (e.g., among a group of coauthors of an article or book) we will provide a second option in the Zotero web save function to grab a permanent copy and URI from IA’s web archive. A scholar who shares this item in their library can then be sure that all others who choose to use it will be referring to the exact same document.

Moreover, unlike most research software, the sophisticated annotation tools built into Zotero (the ability to highlight passages and to add virtual Post-It notes as well as regular notes on the overall document) maintain these annotations separately from the underlying document. This presents the exciting possibility for collaborative scholarly annotation of web pages.
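One way to picture annotations kept apart from the document is a "standoff" record that points at a stable URI and a character range instead of modifying the page. The sketch below is an assumption for illustration, not Zotero's data model: the `Annotation` class, its fields, and the URI are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """A note that points into a document without altering it."""
    document_uri: str  # stable address of the archived page
    start: int         # character offset where the highlight begins
    end: int           # character offset where the highlight ends (exclusive)
    author: str
    note: str


def highlighted_text(document_text, annotation):
    """Return the passage an annotation refers to."""
    return document_text[annotation.start:annotation.end]


# The document text is never edited; two coauthors annotate the same copy.
page = "The humanities have an order, and we threaten it."
uri = "http://example.org/archived/page"
notes = [
    Annotation(uri, 4, 14, "coauthor-a", "Define the scope here."),
    Annotation(uri, 37, 45, "coauthor-b", "Too strong?"),
]
for a in notes:
    print(a.author, "->", highlighted_text(page, a))
```

Offsets like these only stay meaningful if every collaborator works from the identical copy of the page, which is why a permanent, shared address matters for this kind of annotation.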

4. Simplifying Collaborative Sharing

Groups of scholars also have the need to create more private “commons,” e.g., for documents that they would like to share in a limited way. In addition to the fully open Zotero Commons we will establish a mechanism for such restricted sharing. Via the Zotero Server, a user will be able to create a special collection with a distinct icon that shows up in the client interface (left column) for every member of the group.

Files added to these collections will be stored on the Internet Archive but will have restricted access. We believe that having these files reside on the IA server will encourage the donation of documents at the end of a collaborative project. The administrator of a shared collection will be able to move its contents into the fully open Zotero Commons via a single click in the administrative interface on the Zotero Server.

5. Facilitating Scholarly Discovery

The multiple libraries of content created by Zotero users and the multi-petabyte digital collections of the Internet Archive are resources that can potentially be of great use to the scholarly community. We believe that neither has experienced the level of exploration and usage we believe is possible through further development and collaboration.

The combined digital collections present opportunities for scholars to find primary research materials, to discover one another’s work, to identify materials that are already available in digital form and therefore do not need to be located and scanned, to find other scholars with similar interests and to share their own insights broadly. We plan to leverage the combined strengths of the Zotero project and the Internet Archive to work on better discovery tools.

Symposium on the Future of Scholarly Communication

Thursday, November 15th, 2007

For those who missed it, between October 12 and 27, 2007, there was a very thoughtful and insightful online discussion of how the publication of scholarship is changing (or trying to change) in the digital age. Participating in the discussion were Ed Felten, David Robinson, Paul DiMaggio, and Andrew Appel from Princeton University (the symposium was hosted by the Center for Information Technology Policy at Princeton), Ira Fuchs of the Mellon Foundation, Peter Suber of the indispensable Open Access News blog (and philosophy professor at Earlham College), Stan Katz, the President Emeritus of the American Council of Learned Societies, and Laura Brown of Ithaka (and formerly the President of Oxford University Press USA).

The symposium is really worth reading from start to finish. (Alas, one of the drawbacks of hosting a symposium on a blog is that it keeps everything in reverse chronological order; it would be great if CITP could flip the posts now that the discussion has ended.) But for those of us in the humanities the most relevant point is that we are going to have a much harder transition to an online model of scholarship than in the sciences. The main reason for this is that for us the highest form of scholarship is the book, whereas in the sciences it is the article, which is far more easily put online, posted in various forms (including as pre- and e-prints), and networked to other articles (through, e.g., citation analysis). In addition, we’re simply not as technologically savvy. As Paul DiMaggio points out, “every computer scientist who received his or her Ph.D. in computer science after 1980 or so has a website” (on which they can post their scholarly production), whereas the number is about 40% for political scientists and I’m sure far less for historians and literature professors.

I’m planning a long post in this space on the possible ways for humanities professors to move from print to open online scholarship; this discussion is great food for thought.

Tony Grafton on Digital Texts and Reading

Monday, November 5th, 2007

Anthony Grafton was the first person to turn me on to intellectual history. His seminar on ideas in the Renaissance was one of the most fascinating courses I took at Princeton, and I still remember well Tony rocking in his seat, looking a bit like a young Karl Marx, making brilliant connections among a broad array of sources.

So it’s not unexpected given his wide-ranging interests but still terrific to see a scholar who has spent so much time with early books thinking deeply about “digitization and its discontents” in his article “Future Reading” in the latest issue of The New Yorker. And it’s even more gratifying to see Tony note in his online companion piece to “Future Reading,” “Adventures in Wonderland,” that “One of the best ways to get a handle on the sprawling world of digital sources is through George Mason University’s Center for History and New Media.”