Visualizing the Uniqueness, and Conformity, of Libraries

Tucked away in a presentation on the HathiTrust Digital Library are some fascinating visualizations of libraries by John Wilkin, the Executive Director of HathiTrust and an Associate University Librarian at the University of Michigan. Although I’ve been following the progress of HathiTrust closely, I missed these charts, and I want to highlight them as a novel method for revealing a library fingerprint or signature using shared metadata.

With access to the catalogs of HathiTrust member libraries, Wilkin ran some comparisons of book holdings. His ingenious idea was not only to count how many libraries held each particular work, but also to create a visualization of each member library based on how widely each book in its collection is held by other libraries.

In Wilkin’s graphs for each library, the X axis is the number of libraries containing a book (including the library the visualization represents), and the Y axis is the number of books. That is, each chart has columns running from 1 (the member library is the only one with a particular book) to 41 (every library in HathiTrust has a physical copy of a book). Let’s look at an example:

Reading the chart from left to right, the University of Illinois at Urbana-Champaign library has a small number of books that it alone holds (~1,000), around 25,000 that only one other library has (the “2” column), 36,000 that two other libraries have, etc.
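To make the construction concrete, here is a minimal sketch, not Wilkin's actual code, of how such a profile could be computed from shared holdings data; the book identifiers and library names are illustrative assumptions. Plotting the resulting histogram, holding-library count on the X axis and book count on the Y axis, yields charts like the ones discussed here.

```python
# A minimal sketch (not Wilkin's code) of computing a "Wilkin Profile":
# for each book a library holds, count how many member libraries
# (including that library) also hold it, then histogram those counts.
from collections import Counter

# Illustrative holdings: book identifier -> set of member libraries holding it.
holdings = {
    "book-001": {"Illinois", "Harvard", "Michigan"},
    "book-002": {"Illinois"},
    "book-003": {"Harvard", "Michigan"},
    # ...millions of records in the real HathiTrust data
}

def wilkin_profile(library, holdings):
    """Return a histogram mapping 'number of libraries holding a book'
    to 'number of such books in this library's collection'."""
    profile = Counter()
    for book, libraries in holdings.items():
        if library in libraries:
            profile[len(libraries)] += 1
    return profile

print(sorted(wilkin_profile("Illinois", holdings).items()))
# [(1, 1), (3, 1)] -> one book unique to Illinois, one held by three libraries
```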

What’s fascinating is that the overall curvature of a graph tells us a great deal about a particular library.

There are three basic types of libraries we can speak of using this visualization technique. First, there are left-leaning libraries, which have a high number of books that do not exist in many other libraries. These libraries have spent considerable effort and resources acquiring rare volumes. For example, Harvard, which has hundreds of thousands of books that only a handful of other libraries also have:

On the other side, there are right-leaning libraries, which consist mostly of books that are nearly universally held by other libraries. These libraries generally carry only the most circulated volumes, books that are expected to be found in any academic research library. For instance, Lafayette College:

Finally, there are rounded libraries, which don’t have many popular books or many rare books, but mostly works that an average number of similar libraries have. These libraries roughly echo their cohort (in this case, large university research libraries in the United States). They could be called—my apologies—well-rounded in their collecting, likely acquiring many scholarly monographs while still remaining selective rather than comprehensive. For instance, Northwestern University:

Of course, the library curve is often highly correlated with the host institution’s age, since older universities are more likely to have rare old books or unusual (e.g., local or regional) books. This correlation is apparent in this sequence of graphs of the University of California schools, from oldest to newest:





Beyond the three basic types, there are interesting anomalies as well. The University of Virginia is, unsurprisingly, a left-leaning library, but not quite as left-leaning as I would have expected:

Cornell is also left-leaning, but also clearly has a large, idiosyncratic collection containing works that no other library has—note the spike at position “1”:

Moreover, one could imagine using Wilkin Graphs (I’m going to go ahead and name it that to give John full credit) to analyze the relative composition of other kinds of libraries. For instance, LibraryThing has a project called Legacy Libraries, containing the records of personal libraries of famous historical figures such as Thomas Jefferson. A researcher could create Wilkin Graphs for Jefferson and other American founders (in relation to each other), or among intellectuals from the Enlightenment.

Update: Sherman Dorn suggests Wilkin Profile rather than Wilkin Graph. Sure, rolls off the tongue better: Prospective college student on a campus visit asks the tour guide, “So what’s your library’s Wilkin Profile?” According to Constance Malpas, OCLC has created such profiles for 160 libraries. These graphs can be created with the WorldCat Collection Analysis service (which, alas, is not openly available).

Clarification: John Wilkin comments below that the reason for the spike in position 1 in the Cornell Wilkin Profile is that Cornell had a digitization program that added many unique materials to HathiTrust. This made me realize, with some help from Stanford Library’s Chris Bourg and Penn State’s Mike Furlough, that the numbers here are only for the shared HathiTrust collection (although that collection is very large—millions of items). Nevertheless, the general profile shapes should hold for more comprehensive datasets, although likely with occasional left and right shifts for certain libraries depending on additional unique book collections that have not been digitized. (That may explain the University of Virginia Wilkin Profile.) Note also that Google influenced the numbers here, since many of the scanned books come from the Google Books (née Google Library) project, introducing some selection bias which is only now being corrected—or worsened?—by individual institutional digitization initiatives, like Cornell’s.

Digital History at the 2013 AHA Meeting

It’s time for my annual list of digital history sessions at the American Historical Association meeting, this year in New Orleans, January 3-6, 2013. This year’s program extends last year’s surging interest in the effect digital media and technology are having on research and the profession. In addition, a special track for the 2013 meeting is entitled “The Public Practice of History in and for a Digital Age.” Looks like a good and varied program, including digital research methods (such as GIS, text mining, and network analysis), the construction and use of digital archives, the history of new media and its impact on social movements, scholarly communication, public history and writing for a general audience on the web, and practical concerns (e.g., getting grants for digital work).

Hope to see some of you there, and to interact with the rest of you about the meeting via other means. (Speaking of which, I hereby declare the hashtag to be #aha13. I know we care about exact dates, fellow historians, but we really don’t need that “20” in our hashtags.)

Thursday, January 3

9am-5pm

THATCamp (The Humanities and Technology Camp) AHA

1-3pm

Henry Morton Stanley, New Orleans, and the Contested Origins of an African Explorer: Public History and Teaching Perspectives

3:30-5:30pm

Spatial Narratives of the Holocaust: GIS, Geo-Visualization, and the Possibilities for Digital Humanities

Presidential Panel: H-Net and the Discipline: Changes and Challenges

8-10pm

Plenary Session: The Public Practice of History in and for a Digital Age

Friday, January 4

8:30-10am

Roundtable on Place in Time: What History and Geography Can Teach Each Other

Public History Meets Digital History in Post-Katrina New Orleans

“To See”: Visualizing Humanistic Data and Discovering Historical Patterns in a Digital Age

Viewfinding: A Discussion of Photography, Landscape, and Historical Memory

Scholarly Societies and Networking through H-Net

H-Net in Asia, Latin America, and the Caribbean: Building New Online Audiences

Applying to NEH Grant Programs

10:30am-noon

Self Defense, Civil Rights, and Scholarship: Panels in Honor of Gwendolyn Midlo Hall, Part 1: Gwendolyn Midlo Hall’s Africans in Colonial Louisiana Twenty Years Later

Online Reviewing: Before and After It Was de Rigueur

Gender, Sexuality, and Ethnicity: Household Space and Lived Experience in Colonial and Early National Mexico

The United States and Its Informants: The Cold War and the War on Terror

2:30-4:30pm

Front Lines: Early-Career Scholars Doing Digital History

From the March on Washington to Tahrir Square and Beyond: Tactics, Technology, and Social Movements

Are There Costs to “Internationalizing” History?, Part 2: The Domestic Politics of Teaching and Outreach

Saturday, January 5

9-11am

H-Net in Africa: Building New Online Audiences

Scholarly Communications and Copyright

Oral History and Intellectual History in Conversation: Methodological Innovation in Modern South Asia

Research Support Services for History Scholars: A Study of Evolving Research Methods in History

Comparative Reflections on the History Major Capstone Experience: A Roundtable

The Power of Cartography: Remapping the Black Death in the Age of Genomics and GIS

11:30am-1:30pm

Mapping the Past: Historical Geographic Information Science (GIS)

Beyond “Plan B” for Renaissance Studies: A Roundtable

11:30am-2pm – Poster Session 1

Hell Towns, Butternuts, and Spotted Cows: Bringing the History of a Small Town in the Hudson Valley into the Digital Age

2:30-4:30pm

Peer Review, History Journals, and the Future of Scholarly Research

Space, Place, and Time: GIS Technology in Ancient and Medieval European History

Factionalism and Violence across Time and Space: An Exploration of Digital Sources and Methodologies

Connecting Classroom and Community: H-Net Networks and Public History

The Deep History of Africa: New Narrative Approaches

First Steps: Getting Started as a History Professional

Renegotiating Identity: The Process of Democratization in Postauthoritarian Spain and Portugal

2:30-5pm – Poster Session 2

Digital History: Tools and Tricks to Learn the New Trade

Building the Dissertation Digitally

The Global Shipwreck

Picturing a Transnational Pulp Archive

Sunday, January 6

8:30-10:30am

Building a Swiss Army Knife: A Panel on DocTracker, a Multi-Tool for Digital Documentary Editions

11am-1pm

Teaching Digital Methods for History Graduate Students

Public History in the Federal Government: Continuing Trends and New Innovations

Using Oral History for Social Justice Activism

Generous Interfaces for Scholarly Sites

From time to time administrators ask me what I think the home page of a university website should look like. I tell them it should look like the music site The Sixty One, which simply puts a giant photograph of a musician or band in your face, stretched (or shrunk) to the size of your screen:

Menus are contextual, hidden, and modest; the focus is always on the experience of music. It’s very effective. I am not surprised, however, that university administrators have trouble with this design—what about all of those critical menus and submenus for students, faculty, staff, alumni, parents, visitors, news, views…? Of course, the design idea of a site like The Sixty One is to put engagement before information.

Universities have actually moved slightly in this direction in the past year; many of them now have a one-third slice of the screen devoted to rotating photographs: a scientist swirling blue liquid in a beaker, a string quartet bowing, a circle of students laughing on the grass. (Expect a greater rotational frequency for that classic last image, as it is the most effective anti-MOOC advertising imaginable.) But they still have all of those menus and submenus cluttering up the top and bottom, and news items running down the side, competing for attention. Information before engagement. The same is true for most cultural heritage institutions.

In a break from the normal this fall, the Rijksmuseum went all-in for The Sixty One’s philosophy in their site redesign, which fills the screen with a single image (albeit with a few key links tastefully striped across it):

As effective as it is, engagement-before-information can be an offputting design philosophy for those of us in the scholarly realm. The visual smacks of popularization, as opposed to textually rich, informationally dense designs. Yet we know that engagement can entice us to explore and discover. Home page designs like the Rijksmuseum’s should stimulate further discussion about a more visual mode for scholarly sites.

Take the standard online library catalog. (Please.) Most catalogs show textual search results with plenty of metadata but poor scannability. Full-screen visual browsing—especially using the principle of small multiples, or grids of images—can be very effective as a scholarly research aid, facilitating comparison, discovery, and serendipity.

Oddly enough, one of the first examples I know of this design concept for a research collection comes from the Hard Rock Cafe, which launched a site years ago to display thousands of items from its memorabilia archive on a single screen. You can zoom in if something catches your eye—a guitar or handwritten lyrics.

Mitchell Whitelaw of the University of Canberra has been experimenting with similar ideas on his Visible Archive blog. This interface for the Manly Library uses the National Library of Australia’s Trove API to find and display archival documents in a visual-first way:

The images on the search page are categorized by topic (or date) and rotate gently over time without the researcher having to click through ten-items-to-a-page, text-heavy search results. It’s far easier to happen upon items of interest.
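As a rough illustration of the plumbing behind such an interface, the sketch below pulls picture records from the Trove API for an image-grid display. The endpoint, parameters, and field names reflect my reading of the Trove v2 API and should be treated as assumptions, not as a description of Whitelaw's implementation.

```python
# A rough sketch (not Whitelaw's code) of fetching image records from the
# Trove API to feed a visual-first, grid-style display. Endpoint, parameters,
# and field names are assumptions based on the Trove v2 API; check the
# National Library of Australia's documentation before relying on them.
import requests

TROVE_API = "https://api.trove.nla.gov.au/v2/result"  # assumed v2 endpoint
API_KEY = "YOUR_TROVE_KEY"  # a free key is available from the NLA

def picture_records(query, count=20):
    """Return (title, Trove URL) pairs for picture-zone works matching the query."""
    params = {
        "q": query,
        "zone": "picture",   # image-like materials
        "encoding": "json",
        "n": count,
        "key": API_KEY,
    }
    data = requests.get(TROVE_API, params=params).json()
    works = []
    for zone in data.get("response", {}).get("zone", []):
        works.extend(zone.get("records", {}).get("work", []))
    return [(w.get("title"), w.get("troveUrl")) for w in works]

# Feed these into an image grid rather than a ten-items-to-a-page result list.
for title, url in picture_records("Manly beach"):
    print(title, url)
```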

Whitelaw has given this model a great name—a “generous interface”:

Collection interfaces dominated by search are stingy, or ungenerous: they don’t provide adequate context, and they demand the user make the first move. By contrast, there seems to be a move towards more open, exploratory and generous ways of presenting collections, building on familiar web conventions and extending them.

I can imagine generous interfaces working extremely well for many other university, library, and museum sites.

Update: Mitchell Whitelaw let me know about another good generous interface he has worked on, Trove Mosaic:

And I should have remembered Tim Sherratt’s “Faces” interface for Invisible Australians:

Trevor Owens connects the generous interface to recent commercial services such as Pinterest. (I would add Flickr’s 2012 redesign.) Thinking about how scholarly generous interfaces are like and unlike these popular websites is important.

DPLA Audience & Participation Workshop and Hackfest at the Center for History and New Media

On December 6, 2012, the Digital Public Library of America will have two concurrent and interwoven events at the Roy Rosenzweig Center for History and New Media at George Mason University in Fairfax, VA. The Audience and Participation workstream will be holding a meeting that will be livestreamed, and next door those interested in fleshing out what might be done with the DPLA will hold a hackfest, which follows on a similar, successful event last month in Chattanooga, TN. (Here are some of the apps that were built.)

Anyone who is interested in experimenting with the DPLA—from creating apps that use the library’s metadata to thinking about novel designs to bringing the collection into classrooms—is welcome to attend or participate from afar. The hackfest is not limited to those with programming skills, and we welcome all those with ideas, notions, or the energy to collaborate in envisioning novel uses for the DPLA.

The Center for History and New Media will provide space for a group as large as 30 in the main hacking area, with couches, tables, whiteboards, and unlimited coffee. There will also be breakout areas for smaller groups of designers and developers to brainstorm and work. We ask that anyone who would like to attend the hackfest please register in advance via this registration form.

We anticipate that the Audience and Participation workstream and the hackfest will interact throughout the day, which will begin at 10am and conclude at 5pm EST. Breakfast will be provided at 9am, and lunch at midday.

The Center for History and New Media is on the fourth floor of Research Hall on the Fairfax campus of George Mason University. There is parking across the street in the Shenandoah Parking Garage. (Here are directions and a campus map.)

The Digital Public Library of America: Coming Together

I’m just back from the Digital Public Library of America meeting in Chicago, and like many others I found the experience inspirational. Just two years ago a small group convened at the Radcliffe Institute and came up with a one-sentence sketch for this new library:

An open, distributed network of comprehensive online resources that would draw on the nation’s living heritage from libraries, universities, archives and museums in order to educate, inform and empower everyone in the current and future generations.

In a word: ambitious. Just two short years later, out of the efforts of that steering committee, the workstream members (I’m a convening member of the Audience and Participation workstream), over a thousand people who participated in online discussions and at three national meetings, the tireless efforts of the secretariat, and the critical leadership of Maura Marx and John Palfrey, the DPLA has gone from the drawing board to an impending beta launch in April 2013.

As I was tweeting from the Chicago meeting, distant respondents asked what the DPLA is actually going to be. What follows is what I see as some of its key initial elements, though it will undoubtedly grow substantially. (One worry expressed by many in Chicago was that the website launch in April will be seen as the totality of the DPLA, rather than a promising starting point.)

The primary theme in Chicago is the double-entendre subtitle of this post: coming together. It was clear to everyone at the meeting that the project was reaching fruition, garnering essential support from public funders such as the National Endowment for the Humanities and the Institute of Museum and Library Services, and private foundations such as Sloan, Arcadia, and (most recently) Knight. Just as clear was the idea that what distinguishes the DPLA from—and means it will be complementary to—other libraries (online and off) is its potent combination of local and national efforts, and digital and physical footprints.

Ponds->Lakes->Ocean

The foundation of the DPLA will be a huge store of metadata (and potentially thumbnails), culled from hundreds of sources across America. A large part of the initial collection will come from recently freed metadata about books, videos, audio recordings, images, manuscripts, and maps from large institutions like Harvard, provided under the couldn’t-be-more-permissive CC0 license. Wisely, in my estimation (perhaps colored by the fact that I’m a historian), the DPLA has sought out local archival content that has been digitized but is languishing in places that cannot solicit a large audience, and that do not have the know-how to enable modern web services such as APIs.

As I put it on Twitter, one can think of this initial set of materials (beyond the millions of metadata records from universities) as content from local ponds—small libraries, archives, museums, and historic sites—sent through streams to lakes—state digital libraries, which already exist in 40 states (a surprise to many, I suspect)—and then through rivers to the ocean—the DPLA. The DPLA will run a sophisticated technical infrastructure that will support manifold uses of this aggregation of aggregations.

Plan Nationally, Scan Locally

Since the Roy Rosenzweig Center for History and New Media has worked with many local archives, museums, and historic sites, especially through our Omeka project (which has been selected as the software to run online exhibits for the DPLA), I was aware of the great cultural heritage materials that are out there in this country. The DPLA is right: much of this incredible content is effectively invisible, failing to reach national and international audiences. The DPLA will bring huge new traffic to local scanning efforts. Funding agencies such as the Institute of Museum and Library Services have already provided the resources to scan numerous items at the local level; as IMLS Director Susan Hildreth pointed out, their grant to the DPLA meant that they could bring that already-scanned content to the world—a multiplier effect.

In Chicago we discussed ways of gathering additional local content. My thought was that local libraries can brand a designated computer workstation with the blue DPLA banner, with a scanner and a nice screen showing the cultural riches of the community in slideshow mode. Directions and help will be available to scan in new documents from personal or community collections.

[My very quick mockup of a public library DPLA workstation; underlying Creative Commons photo by Flickr user JennieB]

Others envisioned “Antiques Roadshow”-type events, and Emily Gore, Director of Content at the DPLA, who coined the great term Scannebagos, spoke of mobile scanning units that could digitize content across the country.

The DPLA is not alone in sensing this great unmet need for public libraries and similar institutions to assist communities in the digital preservation of personal and local history. For instance, Bill LeFurgy, who works at the Library of Congress with the National Digital Information Infrastructure and Preservation Program (NDIIPP), recently wrote:

Cultural heritage organizations have a great opportunity to fulfill their mission through what I loosely refer to as personal digital archiving…Cultural heritage institutions, as preserving entities with a public service orientation, are well-positioned to help people deal with their growing–and fragile–personal digital archives. This is a way for institutions to connect with their communities in a new way, and to thrive.

I couldn’t agree more, and although Bill focused mostly on the born-digital materials that we all have in abundance today, this mission of digital preservation can easily extend back to analog artifacts from our past. As the University of Wisconsin’s Dorothea Salo has put it, let’s turn collection development inside out, from centralized organizations to a distributed model.

When Roy and I wrote Digital History: A Guide to Gathering, Preserving, and Presenting the Past on the Web, we debated the merits of “preservation through digitization.” While it may be problematic for certain kinds of rare materials, there is no doubt that local and personal collections could use this pathway. Given recent (and likely forthcoming) cuts to local archives, this seems even more meritorious.

The Best of the Digital and the Physical

The core strength, and unique feature, of the DPLA is thus that it will bring together the power and reach of the digital realm with the local community and trust in the thousands of American public libraries, museums, and historical sites—an extremely compelling combination. We are going through a difficult transition from print to digital reading, in which people are buying ebooks they cannot share or pass down to their children. The ephemerality of the digital is likely to become increasingly worrisome in this transition. At the same time, people are demanding greater digital engagement from their local libraries.

Ideally the DPLA can help public libraries and vice versa. With a stable, open DPLA combined with on-the-ground libraries, we can begin to articulate a model that protects and makes accessible our cultural heritage through and beyond the digital transition. For the foreseeable future public libraries will continue to house physical materials—the continued wonders of the codex—as well as provide access to the internet for the still significant minority without such access. And the DPLA can serve as a digital attic and distribution center for those libraries.

The key point, made by DPLA board member Laura DeBonis, is that with this physical footprint in communities the DPLA can do things that Google and other dotcoms cannot. She did not mean this as a criticism of Google Books (a project she was involved with when she worked at Google), which has done impressive work in scanning over 20 million books. But the DPLA has an incredible potential local network it can take advantage of to reach out to millions of people and have them share their history—in general, to democratize the access to knowledge.

It is critical to underline this point: the DPLA will be much more than its technical infrastructure. It will succeed or fail not on its web services but on its ability to connect with localities across the United States and have them use—and contribute to—the DPLA.

A Community-Oriented Platform

Having said that, the technical infrastructure is looking solid. But here, too, the Technical Aspects workstream is keeping foremost in their mind community uses. As workstream member David Weinberger has written, we can imagine a future library as a platform, one that serves communities:

In many instances, those communities will be defined geographically, whether it’s a town’s local library or a university community; in some instances, the community will be defined by interest, not by geography. In either case, serving a defined community has two advantages. First, it enables libraries to accomplish the mission they’ve been funded to accomplish. Second, user networks depend upon and assume local knowledge, interests, and norms. While a local library platform should interoperate with the rest of the world’s library platforms, it may do best if it is distinctively local…

Just as each project created by a developer makes it easier for the next developer to create the next app, each interaction by users ought to make the library platform a little smarter, a little wiser, a little more tuned to its users’ interests. Further, the visible presence of neighbors and the availability of their work will not only make the library an ever more essential piece of the locality’s infrastructure, it can make the local community itself more coherent and humane.

Conceiving of the library as a platform not only opens a range of new services and provides for a continuous increase in the library’s value, it also does something libraries urgently need to do: it changes the criteria of success. A library platform should be measured less on the circulation of its works than in the circulation of the ideas and passions these works spark — from how many works are checked out to the community’s engagement with its own grappling with those works. This is not only a metric that libraries-as-platforms can excel at, it is in fact a measure of what has always been the truest value of libraries.

In that sense, by becoming a platform the library can better fulfill the abiding mission it set itself: to be a civic institution essential to democracy.

Nicely put.

New Uses for Local History

It’s not hard to imagine many apps and sites incorporating the DPLA’s aggregation of local historical content. It struck me that an easy first step is incorporation of the DPLA into existing public library apps. Here in Fairfax, Virginia, our county has an app that is fairly rudimentary but quickly becoming popular because it replaces that library card you can never find. (The app also can alert you to available holds and new titles, and search the catalog.)

I fired up the Fairfax Library app on my phone at the Chicago meeting, and although the county doesn’t know it yet, there’s already a slot for the DPLA in the app. That “local” tab at the bottom can sense where you are and direct you to nearby physical collections; through the DPLA API it will be trivial to also show people digitized items from their community or current locale.
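To be concrete about what that “local” tab might do, here is a speculative sketch. The DPLA API has not launched yet, so the endpoint, parameters, and field names below are my own placeholders, not a documented interface.

```python
# A speculative sketch of the "local" tab idea: ask a (hypothetical) DPLA
# items endpoint for digitized materials near the phone's current location.
# The DPLA API has not launched as of this writing, so the endpoint,
# parameters, and field names here are placeholders, not documented calls.
import requests

DPLA_ITEMS = "https://api.dp.la/v2/items"  # hypothetical endpoint
API_KEY = "YOUR_DPLA_KEY"                  # hypothetical per-app key

def items_near(latitude, longitude, count=10):
    """Return titles of digitized items whose spatial metadata falls
    nearest the given coordinates (assumed proximity filter)."""
    params = {
        "sourceResource.spatial.coordinates": f"{latitude},{longitude}",
        "page_size": count,
        "api_key": API_KEY,
    }
    response = requests.get(DPLA_ITEMS, params=params).json()
    return [doc.get("sourceResource", {}).get("title")
            for doc in response.get("docs", [])]

# e.g., a user opening the app in Fairfax, Virginia:
print(items_near(38.846, -77.306))
```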

Granted, Fairfax County is affluent and has a well-capitalized public library system that can afford a smartphone app. But my guess is the app is fairly simple and was probably built from a framework other libraries use (indeed, it may be part of Fairfax County’s ILS vendor package), so DPLA integration could happen with many public libraries in this way. For libraries without such resources, I can imagine local hackfests lending a hand, perhaps working from a base app that can be customized for different public libraries easily.

Long-time readers of this blog can identify dozens of other apps that will be hungry for DPLA content. The idea of marrying geolocation with historical materials has flourished in the last two years, with apps like HistoryPin showing how people can find out about the history around them.

Even Google has gotten into the act of location + history with its recently launched Field Trip app. I suspect countless similar projects will be enhanced by, or based on, the DPLA API.

Moreover, geolocating historical documents is but one way to use the technical infrastructure of the DPLA. As the technical working group has wisely noted, the platform exists for unintended uses as well as obvious ones. To explore the many possibilities, there will next be an “Appfest” at the Chattanooga Public Library on November 8-9, 2012. And I’m planning a DPLA hacking session here at the Roy Rosenzweig Center for History and New Media for December 6, 2012, concurrent with an Audience and Participation workstream meeting. Stay tuned for details.

The Speculative

Only hinted at in Chicago, but worthy of greater thought, is what else we might do with the combination of thousands of public libraries and the DPLA. This area is more speculative, for reasons ranging from legal considerations to the changing nature of reading. The strong fair use arguments that won the day in the Authors Guild v. HathiTrust case (the ruling was handed down the day before DPLA Midwest) may—may—enable new kinds of sharing of digital materials within geofenced areas such as public libraries. (Chicago did not have a report from DPLA’s legal workstream, so we await their understanding of the shifting copyright and fair use landscape in the wake of landmark positive rulings in the HathiTrust and Georgia State cases.)

Perhaps the public library can achieve, in the medium term, some kind of hybrid physical-digital browsability as imagined in this video of a French bookstore from the near future, in which a simple scan of a book using a tablet transfers an e-text to the tablet. The video gets at the ongoing need for in-person reading advice and the superior browsability of physical bookshelves.

I’ve been tracking a number of these speculative exercises, such as the student projects in Harvard Graduate School of Design’s Library Test Kitchen, which experiments with media transformations of libraries. I suspect that bookfuturists will think of other potential physical/digital hybrids.

But we need not get fancy. More obvious benefits abound. The DPLA will be widely used by teachers and students, with scans being placed into syllabi and contextualized by scholars. Judging by the traffic RRCHNM’s educational sites and digital archives get, I expect a huge waiting audience for this. I can also anticipate local groups of readers and historical enthusiasts gathering in person to discuss works from the DPLA.

Momentum, but Much Left to Do

To be sure, many tough challenges still await the DPLA. Largely absent from the discussion in Chicago, with its focus on local history, is the need to see what the digital library can do with books. After all, the majority of circulations from public libraries are popular, in-copyright works, and despite great unique local content, the public may expect the P in DPLA to provide a bit more of what they are used to from their local library. Finding ways to have big publishers share at least some books through the system—or perhaps start with smaller publishers willing to experiment with new models of distribution—will be an important piece of the puzzle.

As I noted at the start, the DPLA now has funding from public and private sources, but it will have to raise much, much more, not easy in these austere times. It needs a staff with the energy to match the ambition of the project, and the chops to execute a large digital project that also has in-person connections in 50 states.

A big challenge, indeed. But who wouldn’t like a public, open, digital library that draws from across the United States “to educate, inform and empower everyone”?

 

Second Year of Mason’s Digital History Doctoral Research Awards

I just wanted to remind potential doctoral students in history that George Mason University and the Roy Rosenzweig Center for History and New Media have Digital History Research Awards for students entering the History and Art History doctoral program. Students receiving these awards will get five years of fully funded studies, as follows: $20,000 research stipends in years 1 and 2; research assistantships at RRCHNM in years 3, 4, and 5. Awards include full-time tuition waivers and student health insurance. For more information, contact Professor Cynthia A. Kierner (Director of the Ph.D. Program) at ckierner@gmu.edu, or yours truly at dcohen@gmu.edu. The deadline for applications is January 15, 2013.

The Journal of Digital Humanities Hits Full Stride

If you haven’t checked out the Journal of Digital Humanities yet, now’s the time to do so. My colleagues Joan Fragaszy Troyano, Jeri Wieringa, and Sasha Hoffman, along with our new editors-at-large and the many scholars who have taken democratic ownership of this open-access journal, have quickly gotten the production model down to a science. There’s also an art to it, as you can see from these shots of the new issue (thanks, Sasha!):

  

  

As I’ve explained in this space before, there is no formal submission process for the journal. Instead, we look to “catch the good” from across the open web, and take the very best of the good to develop into JDH on a quarterly basis. We believe this leads not only to a high-quality journal that can hold its own against submit-and-wait academic serials, but also to a better measure of what is important to, and engaging for, the entire digital humanities community.

But don’t take my word for it; judge for yourself at the Journal of Digital Humanities website, and pick your favorite format to read the journal in: HTML, ePub, iBook, or PDF.

Treading Water on Open Access

A statement from the governing council of the American Historical Association, September 2012:

The American Historical Association voices concerns about recent developments in the debates over “open access” to research published in scholarly journals. The conversation has been framed by the particular characteristics and economics of science publishing, a landscape considerably different from the terrain of scholarship in the humanities. The governing Council of the AHA has unanimously approved the following statement. We welcome further discussion…

In today’s digital world, many people inside and outside of academia maintain that information, including scholarly research, wants to be, and should be, free. Where people subsidized by taxpayers have created that information, the logic of free information is difficult to resist…

The concerns motivating these recommendations are valid, but the proposed solution raises serious questions for scholarly publishing, especially in the humanities and social sciences.

A statement from Roy Rosenzweig, the Vice President of Research of the American Historical Association, in May 2005:

Historical research also benefits directly (albeit considerably less generously [than science]) through grants from federal agencies like the National Endowment for the Humanities; even more of us are on the payroll of state universities, where research support makes it possible for us to write our books and articles. If we extend the notion of “public funding” to private universities and foundations (who are, of course, major beneficiaries of the federal tax codes), it can be argued that public support underwrites almost all historical scholarship.

Do the fruits of this publicly supported scholarship belong to the public? Should the public have free access to it? These questions pose a particular challenge for the AHA, which has conflicting roles as a publisher of history scholarship, a professional association for the authors of history scholarship, and an organization with a congressional mandate to support the dissemination of history. The AHA’s Research Division is currently considering the question of open—or at least enhanced—access to historical scholarship and we seek the views of members.

Two requests for comment from the AHA on open access, seven years apart. In 2005, the precipitating event for the AHA’s statement was the NIH report on “Enhancing Public Access to Publications Resulting from NIH-Funded Research”; yesterday it was the Finch report on “Accessibility, sustainability, excellence: how to expand access to research publications” [pdf]. History has repeated itself.

We historians have been treading water on open access for the better part of a decade. This is not a particular failure of our professional organization, the AHA; it’s a collective failure by historians who believe—contrary to the lessons of our own research—that today will be like yesterday, and tomorrow like today. Article-centric academic journals, a relatively recent development in the history of publishing, apparently have existed, and will exist, forever, in largely the same form and with largely the same business model.

We can wring our hands about open access every seven years when something notable happens in science publishing, but there’s much to be said for actually doing something rather than sitting on the sidelines. The fact is that the scientists have been thinking and discussing but also doing for a long, long time. They’ve had a free preprint service for articles since the beginning of the web in 1991. In 2012, our field has almost no experience with how alternate online models might function.

If we’re solely concerned with the business model of the American Historical Review (more on that focus in a moment), the AHA had on the table possible economic solutions that married open access with sustainability over seven years ago, when Roy wrote his piece. Since then other creative solutions have been proposed. I happen to prefer the library consortium model, in which large research libraries who are already paying millions of dollars for science journals are browbeaten into ponying up a tiny fraction of the science journal budget to continue to pay for open humanities journals. As a strong believer in the power of narcissism and shame, I could imagine a system in which libraries that pay would get exalted patron status on the home page for the journal, while free riders would face the ignominy of a red bar across the top of the browser when viewed on a campus that dropped support once the AHR went open access. (“You are welcome to read this open scholarship, but you should know that your university is skirting its obligation to the field.” The Shame Bar could be left off in places that cannot afford to pay.)

Regardless of the method and the model, the point is simply that we haven’t tried very hard. Too many of my colleagues, in the preferred professorial mode of focusing on the negative, have highlighted perceived problems with open access without actually engaging it. Yet somehow over 8,000 open access journals have flourished in the last decade. If the AHA’s response is that those journals aren’t flagship journals, well, I’m not sure that’s the one-percenter rhetoric they want to be associated with as representatives of the entire profession.

Furthermore, if our primary concern is indeed the economics of the AHR, wouldn’t it be fair game to look at the full economics of it—not just the direct costs on AHA’s side (“$460,000 to support the editorial processes”), but the other side, where much of the work gets done: the time professional historians take to write and vet articles? I would wager those in-kind costs are far larger than $460,000 a year. That’s partly what Roy was getting at in his appeal to the underlying funding of most historical scholarship. Any such larger economic accounting would trigger more difficult questions, such as Hugh Gusterson’s pointed query about why he’s being asked to give his peer-review labor for free but publishers are gating the final product in return—thanks for your gift labor, now pay up. That the AHA is a small non-profit publisher rather than a commercial giant doesn’t make this question go away.

There is no doubt that professional societies outside of the sciences are in a horrible bind between the drive toward open access and the need for sustainability. But history tells us that no institution has the privilege of remaining static. The American Historical Association can tinker with payments for the AHR as much as it likes under the assumption that the future will be like the past, just with a different spreadsheet. I’d like to see the AHA be bolder—supportive not only of its flagship but of the entire fleet, which now includes fledgling open access journals, blogs, and other nascent online genres.

Mostly, I’d like to see a statement that doesn’t read like this one does: anxious and reactive. I’d like to see a statement that says: “We stand ready to nurture and support historical scholarship whenever and wherever it might arise.”

Normal Science and Abnormal Publishing

When the Large Hadron Collider locates its elusive quarry under the sofa cushion of the universe, Nature will be there to herald the news of the new particle and the scientists who found it. But below these headline-worthy discoveries, something fascinating is going on in science publishing: the race, prompted by the hugely successful PLoS ONE and inspired by the earlier revolution of arXiv.org, to provide open access outlets for any article that is technically sound, without trying to assess impact ahead of time. These outlets are growing rapidly and are likely to represent a significant percentage of published science in the years ahead.

Last week the former head of PLoS ONE announced a new company and a new journal, PeerJ, that takes the concept one step further, providing an all-you-can-publish buffet for a minimal lifetime fee. And this week saw the launch of Scholastica, which will publish a peer-reviewed article for a mere $10. (Scholastica is accepting articles in all fields, but I suspect it will be used mostly by scientists used to this model.) As stockbrokers would say, it looks like we’re going to test the market bottom.

Yet the economics of this publishing is far less interesting than its inherent philosophy. At a steering committee meeting of the Coalition for Networked Information, the always-shrewd Cliff Lynch summarized a critical mental shift that has occurred: “There’s been a capitulation on the question of importance.” Exactly. Two years ago I wrote about how “scholars have uses for archives that archivists cannot anticipate,” and these new science journals flip that equation from the past into the future: aside from rare and obvious discoveries (the 1%), we can’t tell what will be important in the future, so let’s publish as much as possible (the 99%) and let the community of scholars rather than editors figure that out for themselves.

Lynch noted that capitulation on importance allows for many other kinds of scientific research to come to the fore, such as studies that try to reproduce experiments to ensure their validity and work that fails to prove a scientist’s hypothesis (negative outcomes). When you think about it, traditional publishing encourages a constant stream of breakthroughs, when in reality actual breakthroughs are few and far between. Rather than trumpeting every article as important in a quest to be published, these new venues encourage scientists to publish more of what they find, and in a more honest way. Some of that research may in fact prove broadly important in a field, while other research might simply be helpful for its methodological rigor or underlying data.

As a historian of science, all of this reminds me of Thomas Kuhn’s conception of normal science. Kuhn is of course known for the “paradigm shift,” a notion that, much to Kuhn’s chagrin, has escaped the bounds of his philosophy of science into nearly every field of study (and frequently business seminars as well). But to have a paradigm shift you have to have a paradigm, and just as crucial as the shifting is the not-shifting. Kuhn called this “normal science,” and it represents most of scientific endeavor.

Kuhn famously described normal science as “mopping-up operations,” but that phrase was not meant to be disparaging. “Few people who are not actually practitioners of a mature science,” he wrote in The Structure of Scientific Revolutions, “realize how much mop-up work of this sort a paradigm leaves to be done or quite how fascinating such work can prove in the execution.” Scientists often spend years or decades fleshing out and refining theories, testing them anew, applying them to new evidence and to new areas of a field.

There is nothing wrong with normal science. Indeed, it can be good science. It’s just not often the science that makes headlines. And now it has found a good match in the realm of publishing.

One on One

I’m not going to try to name it (ahem), but I do want to highlight its existence while it’s still young: a new web genre in which one person recommends one thing (often for one day). It’s another manifestation of modern web minimalism, akin to what is happening in web design. We are sick of the rococo web: the endless, illustrated, hyperlinked streams of social media, the ornate playlists, the overabundant recommendations in every corner of our screen. Too many things to look at and read.

The solution has occurred to several people at once: vastly reduce the choices for the recommender and the recommendee, the better to focus their attention. (Were I a staff writer for the New Yorker I would insert a pithy reference to Barry Schwartz’s The Paradox of Choice: Why More Is Less here.)

In music, there’s This is My Jam: one person, one song. For writing, The Listserve: one person, one message to a global audience via email. Perhaps most intriguing was the short-lived project Last Great Thing, which asked one person a day to name the most interesting, compelling work they had encountered recently. Recommendations included many websites but also novels, videos, music, and plays. As editors Jake Levine and Justin Van Slembrouck put it:

Last Great Thing was designed to take our mission to its extreme: from the endless stream of great content on the web, how would we go about creating an experience around a single compelling thing?

It’s worth reading their entire justification for the project, and what they learned. I suspect the model could be helpfully extended to other areas. The genre recaptures the advantages of scarcity that print had, in the same way that Readability and Instapaper recapture the advantages of distraction-free legibility for reading.

So, out with the rococo aesthetic, in with the Shaker aesthetic.