Category: Books

On the Response to My Atlantic Essay on the Decline in the Use of Print Books in Universities

I was not expecting—but was gratified to see—an enormous response to my latest piece in The Atlantic, “The Books of College Libraries Are Turning Into Wallpaper,” on the seemingly inexorable decline in the circulation of print books on campus. I’m not sure that I’ve ever written anything that has generated as much feedback, commentary, and hand-wringing. I’ve gotten dozens of emails and hundreds of social media messages, and The Atlantic posted (and I responded in turn to) some passionate letters to the editor. Going viral was certainly not my intent: I simply wanted to lay out an important and under-discussed trend in the use of print books in the libraries of colleges and universities, and to outline why I thought it was happening. I also wanted to approach the issue both as the dean of a library and as a historian whose own research practices have changed over time.

I think the piece generated such a large response because it exposed a significant transition in the way that research, learning, and scholarship happens, and what that might imply for the status of books and the nature of libraries—topics that often touch a raw nerve, especially at a time when popular works extol libraries—I believe correctly—as essential civic infrastructure.

But those works focus mostly on public libraries, and this essay focused entirely on research libraries. People are thankfully still going to and extensively using libraries, both research and public (there were over a billion visits to public libraries in the U.S. last year), but they are doing so in increasingly diversified ways.

The key to my essay were these lines:

“The decline in the use of print books at universities relates to the kinds of books we read for scholarly pursuits rather than pure pleasure…A positive way of looking at these changes is that we are witnessing a Great Sorting within the [research] library, a matching of different kinds of scholarly uses with the right media, formats, and locations.”

Although I highlighted statistics from Yale and the University of Virginia (which, alas, was probably not very kind to my friends at those institutions, although I also used stats from my own library at Northeastern University), the trend I identified seems to be very widespread. Although I only mentioned specific U.S. research libraries, my investigations showed that the same decline in the use of print collections is happening globally, albeit not necessarily universally. In most of the libraries I examined, or from data that was sent to me by colleagues at scores of universities, the circulation of print books within research libraries is declining at about 5-10% per year per student (or FTE).

For example, in the U.K. and Ireland, over the three years between the 2013-14 school year and the 2016-17 school year, the circulation of print books per student declined by 27%, according to the Society of College, National and University Libraries (SCONUL), which represents all university libraries in the U.K. and Ireland. Meanwhile, SCONUL reports that visits to these libraries have actually increased during this period. (SCONUL’s other core metric, print circulations per student visit to the library, has thus declined even more, by 33% over three years.) Similarly, the Canadian Association of Research Libraries (CARL), which maintains the statistics for university libraries in Canada, notes that during these same three years, the average yearly print circulation at their member libraries dropped from 200,000 to 150,000 books, and their per-student circulation number also dropped by 25%.

Again, this is just over three recent years. The decline becomes even more severe as one goes further back in time. In the 2005-6 school year, the average Canadian research library circulated 30 books per student, which slid to 25 in 2008-9; by 2016-17 that number was just 5. Readers of my article were shocked that UVA students had only checked out 60,000 books last year, compared to 238,000 a decade ago, but had I gone all the way back in the UVA statistics to two decades ago, the comparison would have been even more stark. The total circulation of books in the UVA library system was 1,085,000 in 1999-2000 and 207,000 in 2016-17. Here’s the overall graph of print circulation (in “initial circs,” which do not include renewals) from the Association of Research Library (U.S.), showing a 58% decline between 1991 and 2015, but an even larger decline since Peak Book and an even larger decline on a per student basis, since during this same period the student body at these universities increased 40%.

These longer time frames underline how this is an ongoing, multi-decade shift in the ways that students and faculty interact with and use the research library. All research libraries are experiencing such forces and pressing additional demands—the need for new kinds of services and spaces as well as the surging use of digital resources and data—while at the same time continuing to value physical artifacts (archives and special collections) and printed works. It’s a very complicated, heterogeneous environment for learning and scholarship. Puzzling through the correct approach to these shifts, rather than ignoring them and sticking more or less with the status quo, was what I was trying to prod everyone to think about in the essay, and if I was at all successful, that’s hopefully all to the good.

June 6, 2019

Kathleen Fitzpatrick’s Generous Thinking

Generosity and thoughtfulness are not in abundance right now, and so Kathleen Fitzpatrick‘s important new book, Generous Thinking: A Radical Approach to Saving the University, is wholeheartedly welcome. The generosity Kathleen seeks relates to lost virtues, such as listening to others and deconstructing barriers between groups. As such, Generous Thinking can be helpfully read alongside of Alan Jacobs’s How to Think, as both promote humility and perspective-taking as part of a much-needed, but depressingly difficult, re-socialization. Today’s polarization and social media only make this harder.

Fitzpatrick’s analysis of the university’s self-inflicted wounds is painful to acknowledge for those of us in the academy, but undoubtedly true. Scholars are almost engineered to cast a critical eye on all that passes before them, and few articulate their work well to broader audiences. Administrators are paying less attention than in the past to the communities that surround their campuses. Perhaps worst of all, the incentive structures of universities, such as the tenure process and college rankings, strongly reinforce these issues.

I read Generous Thinking in a draft form last year and thought an appropriate alternate title might be The Permeable University. Many of Fitzpatrick’s prescriptions involve dissolving the membrane of the academy so that it can integrate in a mutually beneficial way with the outside world, on an individual and institutional level. You will be unsurprised to hear that I agree completely with many of her suggestions, such as open access to scholarly resources and the importance of scholars engaging with the public. Like Fitzpatrick, I have had a career path that has alternated between the nonprofit and academic worlds in the pursuit of platforms and initiatives that try to maximize those values.

With universities currently receiving withering criticism from both the right and left, it is critical for all of us in the academy to take Generous Thinking seriously, and to think about other concrete steps we can take to open our doors and serve the wider public. The deep incentive structures will be very hard to change, but we can all take more modest steps such as thinking about how new media like podcasts can play a role in a more publicly approachable and helpful university, or how we might be able to provide services (e.g., archival services) to local communities. Fitzpatrick’s Humanities Commons, a site for scholars to connect not just with each other but with the public, is another venue for making the generosity she seeks a reality.

Much more needs to be done on this front, and so I encourage you to read Kathleen Fitzpatrick’s new book.

February 28, 2019

Humility and Perspective-Taking: A Review of Alan Jacobs’s How to Think

In Alan Jacobs’s important new book How to Think: A Survival Guide for a World at Odds, he locates thought within our social context and all of the complexities that situation involves: our desire to fit into our current group or an aspirational in-group, our repulsion from other groups, our use of a communal (but often invisibly problematic) shorthand language, our necessarily limited interactions and sensory inputs. With reference to recent works in psychology, he also lays bare our strong inclination to bias and confusion.

However, Jacobs is not by trade a social scientist, and having obsessed about many of the same works as him (Daniel Kahneman’s Thinking, Fast and Slow looms large for both of us), it’s a relief to see a humanist address the infirmity of the mind, with many more examples from literature, philosophy, and religion, and with a plainspoken synthesis of academic research, popular culture, and politics.

How to Think is much more fun than a book with that title has the right to be. Having written myself about the Victorian crisis of faith, I am deeply envious of Jacobs’s ability to follow a story about John Stuart Mill’s depression with one about Wilt Chamberlain’s manic sex life. You will enjoy the read.

But the approachability of this book masks only slightly the serious burden it places on its readers. This is a book that seeks to put us into uncomfortable positions. In fact, it asks us to assume a position from which we might change our positions. Because individual thinking is inextricably related to social groups, this can lead to exceedingly unpleasant outcomes, including the loss of friends or being ostracized from a community. Taking on such risk is very difficult for human beings, the most social of animals. In our age of Twitter, the risk is compounded by our greater number of human interactions, interactions that are exposed online for others to gaze upon and judge.

So what Jacobs asks of us is not at all easy. (Some of the best passages in How to Think are of Jacobs struggling with his own predisposition to fire off hot takes.) It can also seem like an absurd and unwise approach when the other side shows no willingness to put themselves in your shoes. Our current levels of polarization push against much in this book, and the structure and incentives of social media are clearly not helping.

Like any challenge that is hard and risky, overcoming it requires a concerted effort over time. Simple mental tricks will not do. Jacobs thus advocates for, in two alliterative phrases that came to mind while reading his book, habits of humility and practices of perspective-taking. To be part of a healthy social fabric—and to add threads to that fabric rather than rend it—one must constantly remind oneself of the predisposition to error, and one must repeatedly try to pause and consider, if only briefly, the source of other views you are repulsed by. (An alternative title for this book could have been How to Listen.)

Jacobs anticipates some obvious objections. He understands that facile calls for “civility,” which some may incorrectly interpret as Jacobs’ project, is often just repression in disguise. Jacobs also notes that you can still hold strong views, or agree with your group much of the time, in his framing. It’s just that you need to have a modicum of flexibility and ability to see past oneself and one’s group. Disagreements can then be worked out procedurally rather than through demonization.

Indeed, those who accept Jacobs’s call may not actually change their minds that often. What they will have achieved instead, in Jacobs’s most memorable phrase, is “a like-hearted, rather than like-minded,” state that allows them to be more neighborly with those around them and beyond their group. Enlarging the all-too-small circle of such like-hearted people is ultimately what How to Think seeks.

October 25, 2017 1 Comment

What’s the Matter with Ebooks: An Update

In an earlier post I speculated about the plateau in ebook adoption. According to recent statistics from publishers we are now actually seeing a decline in ebook sales after a period of growth (and then the leveling off that I discussed before). Here’s my guess about what’s going on—an educated guess, supported by what I’m hearing from my sources and network.

First, re-read my original post. I believe it captured a significant part of the story. A reminder: when we hear about ebook sales we hear about the sales from (mostly) large publishers and I have no doubt that ebooks are a troubled part of their sales portfolio. But there are many other ebooks than those reported by the publishers that release their stats, and ways to acquire them, and thus there’s a good chance that there’s considerable “dark reading” (as I called it) that accounts for the disconnect between the surveys that say that e-reading is growing while sales (again, from the publishers that reveal these stats) are declining.

The big story I now perceive is a bifurcation of the market between what used to be called high and low culture. For genre fiction (think sexy vampires) and other genres where there is a lot of self-publishing, readers seem to be moving to cheap (often 99 cent) ebooks from Amazon’s large and growing self-publishing program. Amazon doesn’t release its ebook sales stats, but we know that they already have 65% of the ebook market and through their self-publishing program may reach a disturbing 90% in a few years. Meanwhile, middle- and high-brow books for the most part remain at traditional publishers, where advances still grease the wheels of commerce (and writing).

Other changes I didn’t discuss in my last post are also happening that impact ebook adoption. Audiobook sales rose by an astonishing 40% over the last year, a notable story that likely impacts ebook growth—for the vast majority of those with smartphones, they are substitutes (see also the growth in podcasts). In addition, ebooks have gotten more expensive in the past few years, while print (especially paperback) prices have become more competitive; for many consumers, a simple Econ 101 assessment of pricing accounts for the ebook stall.

I also failed to account in my earlier post for the growing buy-local movement that has impacted many areas of consumption—see vinyl LPs and farm-to-table restaurants—and is, in part, responsible for the turnaround in bookstores—once dying, now revived—an encouraging trend pointed out to me by Oren Teicher, the head of the American Booksellers Association. These bookstores were clobbered by Amazon and large chains late last decade but have recovered as the buy-local movement has strengthened and (more behind the scenes, but just as important) they adopted technology and especially rapid shipping mechanisms that have made them more competitive.

Personally, I continue to read in both print and digitally, from my great local public library and from bookstores, and so I’ll end with an anecdotal observation: there’s still a lot of friction in getting an ebook versus a print book, even though one would think it would be the other way around. Libraries still have poor licensing terms from publishers that treat digital books like physical books that can only be loaned to one person at a time despite the affordances of ebooks; ebooks are often not that much cheaper, if at all, than physical books; and device-dependency and software hassles cause other headaches. And as I noted in my earlier post, there’s still not a killer e-reading device. The Kindle remains (to me and I suspect many others) a clunky device with a poor screen, fonts, etc. In my earlier analysis, I probably also underestimated the inertial positive feeling of physical books for most readers—which I myself feel as a form of consumption that reinforces the benefits of the physical over the digital.

It seems like all of these factors—pricing, friction, audiobooks, localism, and traditional physical advantages—are combining to restrict the ebook market for “respectable” ebooks and to shift them to Amazon for “less respectable” genres. It remains to be seen if this will hold, and I continue to believe that it would be healthy for us to prepare for, and create, a better future with ebooks.

July 12, 2016 2 Comments

What’s the Matter with Ebooks?

[As you may have noticed, I haven’t posted to this blog for over a year. I’ve been extraordinarily busy with my new job. But I’m going to make a small effort to reinvigorate this space, adding my thoughts on evolving issues that I’d like to explore without those thoughts being improperly attributed to the Digital Public Library of America. This remains my personal blog, and you should consider these my personal views. I will also be continuing to post on DPLA’s blog, as I have done on this topic of ebooks.]

Over the past two years I’ve been tracking ebook adoption, and the statistics are, frankly, perplexing. After Amazon released the Kindle in 2007, there was a rapid growth in ebook sales and readership, and the iPad’s launch three years later only accelerated the trend.

Then something odd happened. By most media accounts, ebook adoption has plateaued at about a third of the overall book market, and this stall has lasted for over a year now. Some are therefore taking it as a Permanent Law of Reading: There will be electronic books, but there will always be more physical books. Long live print!

I read both e- and print books, and I appreciate the arguments about the native advantages of print. I am a digital subscriber to the New York Times, but every Sunday I also get the printed version. The paper feels expansive, luxuriant. And I do read more of it than the daily paper on my iPad, as many articles catch my eye and the flipping of pages requires me to confront pieces that I might not choose to read based on a square inch of blue-tinged screen. (Also, it’s Sunday. I have more time to read.) Even though I read more ebooks than printed ones at this point, it’s hard not to listen to the heart and join the Permanent Law chorus.

But my mind can’t help but disagree with my heart. Yours should too if you run through a simple mental exercise: jump forward 10 or 20 or 50 years, and you should have a hard time saying that the e-reading technology won’t be much better—perhaps even indistinguishable from print, and that adoption will be widespread. Even today, studies have shown that libraries that have training sessions for patrons with iPads and Kindles see the use of ebooks skyrocket—highlighting that the problem is in part that today’s devices and ebook services are hard to use. Availability of titles, pricing (compared to paperback), DRM, and a balkanization of ebook platforms and devices all dampen adoption as well.

But even the editor of the New York Times understands the changes ahead, despite his love for print:

How long will print be around? At a Loyola University gathering in New Orleans last week, the executive editor [of the Times], Dean Baquet, noted that he “has as much of a romance with print as anyone.” But he also admitted, according to a Times-Picayune report, that “no one thinks there will be a lot of print around in 40 years.”

Forty years is a long time, of course—although it is a short time in the history of the book. The big question is when the changeover will occur—next year, in five years, in Baquet’s 2055?

The tea leaves, even now, are hard to read, but I’ve come to believe that part of this cloudiness is because there’s much more dark reading going on than the stats are showing. Like dark matter, dark reading is the consumption of (e)books that somehow isn’t captured by current forms of measurement.

For instance, usually when you hear about the plateauing of ebook sales, you are actually hearing about the sales of ebooks from major publishers in relation to the sales of print books from those same publishers. That’s a crucial qualification. But sales of ebooks from these publishers is just a fraction of overall e-reading. By other accounts, which try to shine light on ebook adoption by looking at markets like Amazon (which accounts for a scary two-thirds of ebook sales), show that a huge and growing percentage of ebooks are being sold by indie publishers or authors themselves rather than the bigs, and a third of them don’t even have ISBNs, the universal ID used to track most books.

The commercial statistics also fail to account for free e-reading, such as from public libraries, which continues to grow apace. The Digital Public Library of America and other sites and apps have millions of open ebooks, which are never chalked up as a sale.

Similarly, while surveys of the young continue to show their devotion to paper, yet other studies have shown that about half of those under 30 read an ebook in 2013, up from a quarter of Millennials in 2011—and that study is already dated. Indeed, most of the studies that highlight our love for print over digital are several years old (or more) at this point, a period in which large-format, high-resolution smartphone adoption (much better for reading) and new all-you-can-read ebook services, such as Oyster, Scribd, and Kindle Unlimited, have emerged. Nineteen percent of Millennials have already subscribed to one of these services, a number considered low by the American Press Institute, but which strikes me as remarkably high, and yet another contributing factor to the dark reading mystery.

I’m a historian, not a futurist, but I suspect that we’re not going to have to wait anywhere near forty years for ebooks to become predominant, and that the “plateau” is in part a mirage. That may cause some hand-wringing among book traditionalists, an emotion that is understandable: books are treasured artifacts of human expression. But in our praise for print we forget the great virtues of digital formats, especially the ease of distribution and greater access for all—if done right.

March 24, 2015

A Conversation with Data: Prospecting Victorian Words and Ideas

[An open access, pre-print version of a paper by Fred Gibbs and myself for the Autumn 2011 volume of Victorian Studies. For the final version, please see Victorian Studies at Project MUSE.]

Introduction

“Literature is an artificial universe,” author Kathryn Schulz recently declared in the New York Times Book Review, “and the written word, unlike the natural world, can’t be counted on to obey a set of laws” (Schulz). Schulz was criticizing the value of Franco Moretti’s “distant reading,” although her critique seemed more like a broadside against “culturomics,” the aggressively quantitative approach to studying culture (Michel et al.). Culturomics was coined with a nod to the data-intensive field of genomics, which studies complex biological systems using computational models rather than the more analog, descriptive models of a prior era. Schulz is far from alone in worrying about the reductionism that digital methods entail, and her negative view of the attempt to find meaningful patterns in the combined, processed text of millions of books likely predominates in the humanities.

Historians largely share this skepticism toward what many of them view as superficial approaches that focus on word units in the same way that bioinformatics focuses on DNA sequences. Many of our colleagues question the validity of text mining because they have generally found meaning in a much wider variety of cultural artifacts than just text, and, like most literary scholars, consider words themselves to be context-dependent and frequently ambiguous. Although occasionally intrigued by it, most historians have taken issue with Google’s Ngram Viewer, the search company’s tool for scanning literature by n-grams, or word units. Michael O’Malley, for example, laments that “Google ignores morphology: it ignores the meanings of words themselves when it searches…[The] Ngram Viewer reflects this disinterest in meaning. It disambiguates words, takes them entirely out of context and completely ignores their meaning…something that’s offensive to the practice of history, which depends on the meaning of words in historical context.” (O’Malley)

Such heated rhetoric—probably inflamed in the humanities by the overwhelming and largely positive attention that culturomics has received in the scientific and popular press—unfortunately has forged in many scholars’ minds a cleft between our beloved, traditional close reading and untested, computer-enhanced distant reading. But what if we could move seamlessly between traditional and computational methods as demanded by our research interests and the evidence available to us?

In the course of several research projects exploring the use of text mining in history we have come to the conclusion that it is both possible and profitable to move between these supposed methodological poles. Indeed, we have found that the most productive and thorough way to do research, given the recent availability of large archival corpora, is to have a conversation with the data in the same way that we have traditionally conversed with literature—by asking it questions, questioning what the data reflects back, and combining digital results with other evidence acquired through less-technical means.

We provide here several brief examples of this combinatorial approach that uses both textual work and technical tools. Each example shows how the technology can help flesh out prior historiography as well as provide new perspectives that advance historical interpretation. In each experiment we have tried to move beyond the more simplistic methods made available by Google’s Ngram Viewer, which traces the frequency of words in print over time with little context, transparency, or opportunity for interaction.

The Victorian Crisis of Faith Publications

One of our projects, funded by Google, gave us a higher level of access to their millions of scanned books, which we used to revisit Walter E. Houghton’s classic The Victorian Frame of Mind, 1830-1870 (1957). We wanted to know if the themes Houghton identified as emblematic of Victorian thought and culture—based on his close reading of some of the most famous works of literature and thought—held up against Google’s nearly comprehensive collection of over a million Victorian books. We selected keywords from each chapter of Houghton’s study—loaded words like “hope,” “faith,” and “heroism” that he called central to the Victorian mindset and character–and queried them (and their Victorian synonyms, to avoid literalism) against a special data set of titles of nineteenth-century British printed works.

The distinction between the words within the covers of a book and those on the cover is an important and overlooked one. Focusing on titles is one way to pull back from a complete lack of context for words (as is common in the Google Ngram Viewer, which searches full texts and makes no distinction about where words occur), because word choice in a book’s title is far more meaningful than word choice in a common sentence. Books obviously contain thousands of words which, by themselves, are not indicative of a book’s overall theme—or even, as O’Malley rightly points out, indicative of what a researcher is looking for. A title, on the other hand, contains the author’s and publisher’s attempt to summarize and market a book, and is thus of much greater significance (even with the occasional flowery title that defies a literal description of a book’s contents). Our title data set covered the 1,681,161 books that were published in English in the UK in the long nineteenth century, 1789-1914, normalized so that multiple printings in a year did not distort the data. (The public Google Ngram Viewer uses only about half of the printed books Google has scanned, tossing—algorithmically and often improperly—many Victorian works that appear not to be books.)

Our queries produced a large set of graphs portraying the changing frequency of thematic words in titles, which were arranged in grids for an initial, human assessment (fig. 1). Rather than accept the graphs as the final word (so to speak), we used this first, prospecting phase to think through issues of validity and significance.

Fig. 1. A grid of search results showing the frequency of a hundred words in the titles of books and their change between 1789 and 1914. Each yearly total is normalized against the total number of books produced that year, and expressed as a percentage of all publications.

Upon closer inspection, many of the graphs represented too few titles to be statistically meaningful (just a handful of books had “skepticism” in the title, for instance), showed no discernible pattern (“doubt” fluctuates wildly and randomly), or, despite an apparently significant trend, were unhelpful because of the shifting meaning of words over time.

However, in this first pass at the data we were especially surprised by the sharp rise and fall of religious words in book titles, and our thoughts naturally turned to the Victorian crisis of faith, a topic Houghton also dwelled on. How did the religiosity and then secularization of nineteenth-century literature parallel that crisis, contribute to it, or reflect it? We looked more closely at book titles involving faith. For instance, books that have the words “God” or “Christian” in the title rise as a percentage of all works between the beginning of the nineteenth century and the middle of the century, and then fall precipitously thereafter. After appearing in a remarkable 1.2% of all book titles in the mid-1850s, “God” is present in just one-third of one percent of all British titles by the first World War (fig. 2). “Christian” titles peak at nearly one out of fifty books in 1841, before dropping to one out of 250 by 1913 (fig. 3). The drop is particularly steep between 1850 and 1880.

Fig. 2. The percentage of books published in each year in English in the UK from 1789-1914 that contain the word “God” in their title.

Fig. 3. The percentage of books published in each year in English in the UK from 1789-1914 that contain the word “Christian” in their title.

These charts are as striking as any portrayal of the crisis of faith that took place in the Victorian era, an important subject for literary scholars and historians alike. Moreover, they complicate the standard account of that crisis. Although there were celebrated cases of intellectuals experiencing religious doubt early in the Victorian age, most scholars believe that a more widespread challenge to religion did not occur until much later in the nineteenth century (Chadwick). Most scientists, for instance, held onto their faith even in the wake of Darwin’s Origin of Species (1859), and the supposed conflict of science and religion has proven largely illusory (Turner). However, our work shows that there was a clear collapse in religious publishing that began around the time of the 1851 Religious Census, a steep drop in divine works as a portion of the entire printed record in Britain that could use further explication. Here, publishing appears to be a leading, rather than a lagging, indicator of Victorian culture. At the very least, rather than looking at the usual canon of books, greater attention by scholars to the overall landscape of publishing is necessary to help guide further inquiries.

More in line with the common view of the crisis of faith is the comparative use of “Jesus” and “Christ.” Whereas the more secular “Jesus” appears at a relatively constant rate in book titles (fig. 4, albeit with some reduction between 1870 and 1890), the frequency of titles with the more religiously charged “Christ” drops by a remarkable three-quarters beginning at mid-century (fig. 5).

Fig. 4. The percentage of books published in each year in English in the UK from 1789-1914 that contain the word “Jesus” in their title.

Fig. 5. The percentage of books published in each year in English in the UK from 1789-1914 that contain the word “Christ” in their title.

Open-ended Investigations

Prospecting a large textual corpus in this way assumes that one already knows the context of one’s queries, at least in part. But text mining can also inform research on more open-ended questions, where the results of queries should be seen as signposts toward further exploration rather than conclusive evidence. As before, we must retain a skeptical eye while taking seriously what is reflected in a broader range of printed matter than we have normally examined, and how it might challenge conventional wisdom.

The power of text mining allows us to synthesize and compare sources that are typically studied in isolation, such as literature and court cases. For example, another text-mining project focused on the archive of Old Bailey trials brought to our attention a sharp increase in the rate of female bigamy in the late nineteenth century, and less harsh penalties for women who strayed. (For more on this project, see http://criminalintent.org.) We naturally became curious about possible parallels with how “marriage” was described in the Victorian age—that is, how, when, and why women felt at liberty to abandon troubled unions. Because one cannot ask Google’s Ngram Viewer for adjectives that describe “marriage” (scholars have to know what they are looking for in advance with this public interface), we directly queried the Google n-gram corpus for statistically significant descriptors in the Victorian age. Reading the result set of bigrams (two-word couplets) with “marriage” as the second word helped us derive a more narrow list of telling phrases. For instance, bigrams that rise significantly over the nineteenth century include “clandestine marriage,” “forbidden marriage,” “foreign marriage,” “fruitless marriage,” “hasty marriage,” “irregular marriage,” “loveless marriage,” and “mixed marriage.” Each bigram represents a good opportunity for further research on the characterization of marriage through close reading, since from our narrowed list we can easily generate a list of books the terms appear in, and many of those works are not commonly cited by scholars because they are rare or were written by less famous authors. Comparing literature and court cases in this way, we have found that descriptions of failed marriages in literature rose in parallel with male bigamy trials, and approximately two decades in advance of the increase in female bigamy trials, a phenomenon that could use further analysis through close reading.

To be sure, these open-ended investigations can sometimes fall flat because of the shifting meaning of words. For instance, although we are both historians of science and are interested in which disciplines are characterized as “sciences” in the Victorian era (and when), the word “science” retained its traditional sense of “organized knowledge” so late into the nineteenth century as to make our extraction of fields described as a “science”—ranging from political economy (368 occurrences) and human [mind and nature] (272) to medicine (105), astronomy (86), comparative mythology (66), and chemistry (65)—not particularly enlightening. Nevertheless, this prospecting arose naturally from the agnostic searching of a huge number of texts themselves, and thus, under more carefully constructed conditions, could yield some insight into how Victorians conceptualized, or at least expressed, what qualified as scientific.

Word collocation is not the only possibility, either. Another experiment looked at what Victorians thought was sinful, and how those views changed over time. With special data from Google, we were able to isolate and condense the specific contexts around the phrase “sinful to” (50 characters on either side of the phrase and including book titles in which it appears) from tens of thousands of books. This massive query of Victorian books led to a result set of nearly a hundred pages of detailed descriptions of acts and behavior Victorian writers classified as sinful. The process allowed us to scan through many more books than we could through traditional techniques, and without having to rely solely on opaque algorithms to indicate what the contexts are, since we could then look at entire sentences and even refer back to the full text when necessary.

In other words, we can remain close to the primary sources and actively engage them following computational activity. In our initial read of these thousands of “snippets” of sin (as Google calls them), we were able to trace a shift from biblically freighted terms to more secular language. It seems that the expanding realm of fiction especially provided space for new formulations of sin than did the more dominant devotional tracts of the early Victorian age.

Conclusion

Experiments such as these, inchoate as they may be, suggest how basic text mining procedures can complement existing research processes in fields such as literature and history. Although detailed exegeses of single works undoubtedly produce breakthroughs in understanding, combining evidence from multiple sources and multiple methodologies has often yielded the most robust analyses. Far from replacing existing intellectual foundations and research tactics, we see text mining as yet another tool for understanding the history of culture—without pretending to measure it quantitatively—a means complementary to how we already sift historical evidence. The best humanities work will come from synthesizing “data” from different domains; creative scholars will find ways to use text mining in concert with other cultural analytics.

In this context, isolated textual elements such as n-grams aren’t universally unhelpful; examining them can be quite informative if used appropriately and with its limitations in mind, especially as preliminary explorations combined with other forms of historical knowledge. It is not the Ngram Viewer or Google searches that are offensive to history, but rather making overblown historical claims from them alone. The most insightful humanities research will likely come not from charting individual words, but from the creative use of longer spans of text, because of the obvious additional context those spans provide. For instance, if you want to look at the history of marriage, charting the word “marriage” itself is far less interesting than seeing if it co-occurs with words like “loving” or “loveless,” or better yet extracting entire sentences around the term and consulting entire, heretofore unexplored works one finds with this method. This allows for serendipity of discovery that might not happen otherwise.

Any robust digital research methodology must allow the scholar to move easily between distant and close reading, between the bird’s eye view and the ground level of the texts themselves. Historical trends—or anomalies—might be revealed by data, but they need to be investigated in detail in order to avoid conclusions that rest on superficial evidence. This is also true for more traditional research processes that rely too heavily on just a few anecdotal examples. The hybrid approach we have briefly described here can help scholars discover exactly which books, chapters, or pages to focus on, without relying solely on sophisticated algorithms that might filter out too much. Flexibility is crucial, as there is no monolithic digital methodology that can applied to all research questions. Rather than disparage the “digital” in historical research as opposed to the spirit of humanistic inquiry, and continue to uphold a false dichotomy between close and distant reading, we prefer the best of both worlds for broader and richer inquiries than are possible using traditional methodologies alone.

Bibliography

Chadwick, Owen. The Victorian Church. New York: Oxford University Press, 1966.

Houghton, Walter Edwards. The Victorian Frame of Mind, 1830-1870. New Haven: Published for Wellesley College by Yale University Press, 1957.

Schulz, Kathryn. “The Mechanic Muse – What Is Distant Reading?” The New York Times 24 Jun. 2011, BR14.

Michel, Jean-Baptiste et al. “Quantitative Analysis of Culture Using Millions of Digitized Books.” Science 331.6014 (2011): 176 -182.

O’Malley, Michael. “Ngrammatic.” The Aporetic, December 21, 2010, http://theaporetic.com/?p=1369.

Turner, Frank M. Between Science and Religion; the Reaction to Scientific Naturalism in Late Victorian England. New Haven: Yale University Press, 1974.

May 30, 2012 10 Comments

Reading and Believing

Rather than focusing on a new technology or website in our year-end review on the Digital Campus podcast, I chose reading as the big story of 2011. Surely 2011 was the year that digital reading came of age, with iPad and Kindle sales skyrocketing, apps for reading flourishing, and sites for finding high-quality long-form writing proliferating. It was apropos that Alan Jacobs‘s wonderful book The Pleasures of Reading in an Age of Distraction was published in 2011.

Indeed, the relationship between reading and distraction was one of the things that caught my eye reading Daniel Kahneman‘s essential Thinking, Fast and Slow. Kahneman speaks of two systems in the mind—he eschews “intuition” and “reason” for the more neutral “System 1” and “System 2″—with the first making quick, unconscious assessments and the second engaging in much more studious, and laborious, calculations. Since our minds (like our bodies) are naturally lazy, we prefer to stick with System 1’s judgments as much as possible, unless jarred out of it into the grumpier System 2.

In the fifth chapter of Thinking, Fast and Slow, Kahneman addresses the act of reading, and the impulse—even in what is normally thought of as the most cerebral of human acts—to fall back on System 1, to associate the ease of reading with the truth of what is read:

How do you know that a statement is true? If it is strongly linked by logic or association to other beliefs or preferences you hold, or comes from a source you trust and like, you will feel a sense of cognitive ease. The trouble is that there may be other causes for your feeling of ease—including the quality of the font and the appealing rhythm of the prose—and you have no simple way of tracing your feelings to their source.

Thus the context writing exists in and other aspects unrelated to the actual content are critical to the reception that writing receives. In addition to studies on the effects of different fonts on credibility, Kahneman also cites experiments that show the importance of the quality of paper (for printed materials), of the contrast between a font and its background, and of the presence of distractions that reduce the cognitive ease of reading. In short, environments that make it easy to read also make it easy to believe what is being read. Perhaps the most unsettling aspect of this mixture of context and content is that is it extremely difficult for you to separate the two.

So legibility and the absence of distractions are not just design niceties; when a reader chooses to move an article into an app like Instapaper, they are strongly increasing the odds that they will like what they read and agree with it. And since readers often make that relocation at the recommendation of a trusted source, the written work is additionally “framed” as worthy even before the act of reading has begun.

Commercial publishers may not like the use of Instapaper or Readability, which strip the distractions otherwise known as ads from a cluttered website to focus solely on the text at hand, but they are an unalloyed good for writers.

February 7, 2012 9 Comments

Some Thoughts on the Hacking the Academy Process and Model

I’m delighted that the edited version of Hacking the Academy is now available on the University of Michigan’s DigitalCultureBooks site. Here are some of my quick thoughts on the process of putting the book together. (For more, please read the preface Tom Scheinfeldt and I wrote.)

1) Be careful what you wish for. Although we heavily promoted the submission process for HTA, Tom and I had no idea we would receive over 300 contributions from nearly 200 authors. This put an enormous, unexpected burden on us; it obviously takes a long time to read through that many submissions. Tom and I had to set up a collaborative spreadsheet for assessing the contributions, and it took several months to slog through the mass. We also had to make tough decisions about what kind of work to include, since we were not overly prescriptive about what we were looking for. A large number of well-written, compelling pieces (including many from friends of ours) had to be left out of the volume, unfortunately, because they didn’t quite match our evolving criteria, or didn’t fit with other pieces in the same chapter.

2) Set aside dedicated time and people. Other projects that have crowdsourced volumes, such as Longshot Magazine, have well-defined crunch times for putting everything together, using an expanded staff and a lot of coffee. I think it’s fair to say (and I hope not haughty to say) that Tom and I are incredibly busy people and we had to do the assembly and editing in bits and pieces. I wish we could have gotten it done much sooner to sustain the energy of the initial week. We probably could have included others in the editing process, although I think we have good editorial consistency and smooth transitions because of the more limited control.

3) Get the permissions set from the beginning. One of the delays on the edited volume was making sure we had the rights to all of the materials. HTA has made us appreciate even more the importance of pushing for Creative Commons licenses (especially the simple CC-BY) in academia; many of our contributors are dedicated to open access and already had licensed their materials under a permissive reproduction license, but we had to annoy everyone else (and by “we,” I mean the extraordinary helpful and capable Shana Kimball at MPublishing). This made the HTA process a little more like a standard publication, where the press has to hound contributors for sign-offs, adding friction along the way.

4) Let the writing dictate the form, not vice versa. I think one of the real breakthroughs that Tom and I had in this process is realizing that we didn’t need to adhere to a standard edited-volume format of same-size chapters. After reading through odd-sized submissions and thinking about form, we came up with an array of “short, medium, long” genres that could fit together on a particular theme. Yes, some of the good longer pieces could stand as more-or-less standard essays, but others could be paired together or set into dialogues. It was liberating to borrow some conventions from, e.g., magazines and the way they handle shorter pieces. In some cases we also got rather aggressive about editing down articles so that they would fit into useful spaces.

5) This is a model that can be repeated. Sure, it’s not ideal for some academic cases, and speed is not necessarily of the essence. But for “state of the field” volumes, vibrant debates about new ideas, and books that would benefit from blended genres, it seems like an improvement upon the staid “you have two years to get me 8,000 words for a chapter” model of the edited book.

September 8, 2011 12 Comments

Using WordPress as a Book-Writing Platform

I’ve had a few people ask about the writing environment I’m using for The Ivory Tower and the Open Web (introduction posted a couple of days ago). I’m writing the book entirely in WordPress, which really has matured into a terrific authoring platform. Some notes:

1) The addition of the TinyMCE WYSIWYG text-editing tools made WordPress today’s version of the beloved Word 5.1, the lean, mean, writing machine that Word used to be before Microsoft bloated it beyond recognition.

2) WordPress 3.2 joined the distraction-free trend mainstreamed by apps like Scrivener and Instapaper, where computer administrative debris (as Edward Tufte once called the layers of eye-catching controls that frame most application windows) fades away. If you go into full-screen mode in the editor everything disappears but your text. WordPress devs even thoughtfully added a zen “Just write” prompt to get you going. Go full-screen in your browser for extra zen.

3) For footnotes, I’m using the excellent WP-Footnotes plugin, which is not only easy to use but (perhaps critically for the future) degrades gracefully into parenthetical embedded citations outside of WordPress.

4) I’m of course using Zotero to insert and format those footnotes, using one of the features that makes Zotero better (IMHO) than other research managers: the ability to drag and drop formatted citations right from the Zotero interface into a textarea in the browser. (WP-Footnotes handles the automatic numbering.)

5) I’ve done a few tweaks to WordPress’s wp-admin CSS to customize the writing environment (there’s an “editorcontainer” that styles the textarea). In particular, I found the default width too wide for comfortable writing or reading. So I resized it to 500 pixels, which is roughly the line width of a standard book.

July 28, 2011 11 Comments

The Ivory Tower and the Open Web: Burritos, Browsers, and Books

In the summer of 2007, Nate Silver decided to conduct a rigorous assessment of the inexpensive Mexican restaurants in his neighborhood, Chicago’s Wicker Park. Figuring that others might be interested in the results of his study, and that he might be able to use some feedback from an audience, he took his project online.

Silver had no prior experience in such an endeavor. By day he worked as a statistician and writer at Baseball Prospectus—an innovator, to be sure, having created a clever new standard for empirically measuring the value of players, an advanced form of the “sabermetrics” vividly described by Michael Lewis in Moneyball. ((Nate Silver, “Introducing PECOTA,” in Gary Huckabay, Chris Kahrl, Dave Pease et al., eds., Baseball Prospectus 2003 (Dulles, VA: Brassey’s Publishers, 2003): 507-514. Michael Lewis, Moneyball: The Art of Winning an Unfair Game (New York: W. W. Norton & Company, 2004).)) But Silver had no experience as a food critic, nor as a web developer.

In time, his appetite took care of the former and the open web took care of the latter. Silver knit together a variety of free services as the tapestry for his culinary project. He set up a blog, The Burrito Bracket, using Google’s free Blogger web application. Weekly posts consisted of his visits to local restaurants, and the scores (in jalapeños) he awarded in twelve categories.

: Home page of Nate Silver’s Burrito Bracket

: Ranking system (upper left quadrant)

Being a sports geek, he organized the posts as a series of contests between two restaurants. Satisfying his urge to replicate March Madness, he modified another free application from Google, generally intended to create financial or data spreadsheets, to produce the “bracket” of the blog’s title.

: Google Spreadsheets used to create the competition bracket

Like many of the savviest users of the web, Silver started small and improved the site as he went along. For instance, he had started to keep a photographic record of his restaurant visits and decided to share this documentary evidence. So he enlisted the photo-sharing site Flickr, creating an off-the-rack archive to accompany his textual descriptions and numerical scores. On August 15, 2007, he added a map to the site, geolocating each restaurant as he went along and color-coding the winners and losers.

: Flickr photo archive for The Burrito Bracket (flickr.com)

: Silver’s Google Map of Chicago’s Wicker Park (shaded in purple) with the location of each Mexican restaurant pinpointed

Even with its do-it-yourself enthusiasm and the allure of carne asada, Silver had trouble attracting an audience. He took to Yelp, a popular site for reviewing restaurants to plug The Burrito Bracket, and even thought about creating a Super Burrito Bracket, to cover all of Chicago. ((Frequently Asked Questions, The Burrito Bracket, http://burritobracket.blogspot.com/2007/07/faq.html)) But eventually he abandoned the site following the climactic “Burrito Bowl I.”

With his web skills improved and a presidential election year approaching, Silver decided to try his mathematical approach on that subject instead—”an opportunity for a sort of Moneyball approach to politics,” as he would later put it. ((http://www.journalism.columbia.edu/system/documents/477/original/nate_silver.pdf)) Initially, and with a nod to his obsession with Mexican food, he posted his empirical analyses of politics under the chili-pepper pseudonym “Poblano,” on the liberal website Daily Kos, which hosts blogs for its engaged readers.

Then, in March 2008, Silver registered his own web domain, with a title that was simultaneously and appropriately mathematical and political: fivethirtyeight.com, a reference to the total number of electors in the United States electoral college. He launched the site with a slight one-paragraph post on a recent poll from South Dakota and a summary of other recent polling from around the nation. As with The Burrito Bracket it was a modest start, but one that was modular and extensible. Silver soon added maps and charts to bolster his text.

: FiveThirtyEight two months after launch, in May 2008

Nate Silver’s real name and FiveThiryEight didn’t remain obscure for long. His mathematical modeling of the competition between Barack Obama and Hillary Clinton for the Democratic presidential nomination proved strikingly, almost creepily, accurate. Clear-eyed, well-written, statistically rigorous posts began to be passed from browsers to BlackBerries, from bloggers to political junkies to Beltway insiders. From those wired early subscribers to his site, Silver found an increasingly large audience of those looking for data-driven, deeply researched analysis rather than the conventional reporting that presented political forecasting as more art than science.

FiveThiryEight went from just 800 visitors a day in its first month to a daily audience of 600,000 by October 2008. ((Adam Sternbergh, The Spreadsheet Psychic, New York, Oct 12, 2008, http://nymag.com/news/features/51170/)) On election day, FiveThiryEight received a remarkable 3  million  visitors, more than most daily newspapers . ((http://www.journalism.columbia.edu/system/documents/477/original/nate_silver.pdf))

All of this attention for a site that most media coverage still called, with a hint of deprecation, a “blog,” or “aggregator” of polls, despite Silver’s rather obvious, if latent, journalistic skills. (Indeed, one of his roads not taken had been an offer, straight out of college, to become an assistant at The Washington Post. ((http://www.journalism.columbia.edu/system/documents/477/original/nate_silver.pdf)) ) An article in the Colorado Daily on the emergent genre represented by FiveThirtyEight led with Ken Bickers, professor and chair of the political science department at the University of Colorado, saying that such sites were a new form of “quality blogs” (rather than, evidently, the uniformly second-rate blogs that had previously existed). The article then swerved into much more ominous territory, asking whether reading FiveThirtyEight and similar blogs was potentially dangerous, especially compared to the safe environs of the traditional newspaper. Surely these sites were superficial, and they very well might have a negative effect on their audience:

Mary Coussons-Read, a professor of psychology at CU Denver, says today’s quick turnaround of information helps to make it more compelling.

“Information travels so much more quickly,” she says. “(We expect) instant gratification. If people have a question, they want an answer.”

That real-time quality can bring with it the illusion that it’s possible to perceive a whole reality by accessing various bits of information.

“There’s this immediacy of the transfer of information that leads people to believe they’re seeing everything … and that they have an understanding of the meaning of it all,” she says.

And, Coussons-Read adds, there is pleasure in processing information.

“I sometimes feel like it’s almost a recreational activity and less of an information-gathering activity,” she says.

Is it addiction?

[Michele] Wolf says there is something addicting about all that data.

“I do feel some kind of high getting new information and being able to process it,” she says. “I’m also a rock climber. I think there are some characteristics that are shared. My addiction just happens to be information.”

While there’s no such mental-health diagnosis as political addiction, Jeanne White, chemical dependency counselor at Centennial Peaks Hospital in Louisville, says political information seeking could be considered an addictive process if it reaches an extreme. ((Cindy Sutter, “Hooked on information: Can political news really be addicting?” The Colorado Daily, November 3, 2008, http://www.coloradodaily.com/ci_13105998))

This stereotype of blogs as the locus of “information” rather than knowledge, of “recreation” rather than education, was—and is—a common one, despite the wide variety of blogs, including many with long-form, erudite writing. Perhaps in 2008 such a characterization of FiveThirtyEight was unsurprising given that Silver’s only other credits to date were the Player Empirical Comparison and Optimization Test Algorithm (PECOTA) and The Burrito Bracket. Clearly, however, here was an intelligent researcher who had set his mind on a new topic to write about, with a fresh, insightful approach to the material. All he needed was a way to disseminate his findings. His audience appreciated his extraordinarily clever methods—at heart, academic techniques—for cutting through the mythologies and inadequacies of standard political commentary. All they needed was a web browser to find him.

A few journalists saw past the prevailing bias against non-traditional outlets like FiveThirtyEight. In the spring of 2010, Nate Silver bumped into Gerald Marzorati, the editor of the New York Times Magazine, on a train platform in Boston. They struck up a conversation, which eventually turned into a discussion about how FiveThirtyEight might fit into the universe of the Times, which ultimately recognized the excellence of his work and wanted FiveThirtyEight to enhance their political reporting and commentary. That summer, a little more than two years after he had started FiveThirtyEight, Silver’s “blog” merged into the Times under a licensing deal. ((Nate Silver, “FiveThirtyEight to Partner with New York Times, http://www.fivethirtyeight.com/2010/06/fivethirtyeight-to-partner-with-new.html)) In less time than it takes for most students to earn a journalism degree, Silver had willed himself into writing for one of the world’s premier news outlets, taking a seat in the top tier of political analysis. A radically democratic medium had enabled him to do all of this, without the permission of any gatekeeper.

: FiveThirtyEight on the New York Times website, 2010

* * *

The story of Nate Silver and FiveThirtyEight has many important lessons for academia, all stemming from the affordances of the open web. His efforts show the do-it-yourself nature of much of the most innovative work on the web, and how one can iterate toward perfection rather than publishing works in fully polished states. His tale underlines the principle that good is good, and that the web is extraordinarily proficient at finding and disseminating the best work, often through continual, post-publication, recursive review. FiveThirtyEight also shows the power of openness to foster that dissemination and the dialogue between author and audience. Finally, the open web enables and rewards unexpected uses and genres.

Undoubtedly it is true that the path from The Burrito Bracket to The New York Times may only be navigated by an exceptionally capable and smart individual. But the tools for replicating Silver’s work are just as open to anyone, and just as powerful. It was with that belief, and the desire to encourage other academics to take advantage of the open web, that Roy Rosenzweig and I wrote Digital History: A Guide to Gathering, Preserving, and Presenting the Past on the Web. ((Daniel J. Cohen and Roy Rosenzweig, Digital History: A Guide to Gathering, Preserving, and Presenting the Past on the Web (University of Pennsylvania Press, 2006).)) We knew that the web, although fifteen years old at the time, was still somewhat alien to many professors, graduate students, and even undergraduates (who might be proficient at texting but know nothing about HTML), and we wanted to make the medium more familiar and approachable.

What we did not anticipate was another kind of resistance to the web, based not on an unfamiliarity with the digital realm or on Luddism but on the remarkable inertia of traditional academic methods and genres—the more subtle and widespread biases that hinder the academy’s adoption of new media. These prejudices are less comical, and more deep-seated, than newspapers’ penchant for tales of internet addiction. This resistance has less to do with the tools of the web and more to do with the web’s culture. It was not enough for us to conclude Digital History by saying how wonderful the openness of the web was; for many academics, this openness was part of the problem, a sign that it might be like “playing tennis with the net down,” as my graduate school mentor worriedly wrote to me. ((http://www.dancohen.org/2010/11/11/frank-turner-on-the-future-of-peer-review/))

In some respects, this opposition to the maximal use of the web is understandable. Almost by definition, academics have gotten to where they are by playing a highly scripted game extremely well. That means understanding and following self-reinforcing rules for success. For instance, in history and the humanities at most universities in the United States, there is a vertically integrated industry of monographs, beginning with the dissertation in graduate school—a proto-monograph—followed by the revisions to that work and the publication of it as a book to get tenure, followed by a second book to reach full professor status. Although we are beginning to see a slight liberalization of rules surrounding dissertations—in some places dissertations could be a series of essays or have digital components—graduate students infer that they would best be served on the job market by a traditional, analog monograph.

We thus find ourselves in a situation, now more than two decades into the era of the web, where the use of the medium in academia is modest, at best. Most academic journals have moved online but simply mimic their print editions, providing PDF facsimiles for download and having none of the functionality common to websites, such as venues for discussion. They are also largely gated, resistant not only to access by the general public but also to the coin of the web realm: the link. Similarly, when the Association of American University Presses recently asked its members about their digital publishing strategies, the presses tellingly remained steadfast in their fixation on the monograph. All of the top responses were about print-on-demand and the electronic distribution and discovery of their list, with a mere footnote for a smattering of efforts to host “databases, wikis, or blogs.” ((Association of American University Presses, “Digital Publishing in the AAUP Community; Survey Report: Winter 2009-2010,” http://aaupnet.org/resources/reports/0910digitalsurvey.pdf, p. 2)) In other words, the AAUP members see themselves almost exclusively as book publishers, not as publishers of academic work in whatever form that may take. Surveys of faculty show comfort with decades-old software like word processors but an aversion to recent digital tools and methods. ((See, for example, Robert B. Townsend, “How Is New Media Reshaping the Work of Historians?”, Perspectives on History, November 2010, http://www.historians.org/Perspectives/issues/2010/1011/1011pro2.cfm)) The professoriate may be more liberal politically than the most latte-filled ZIP code in San Francisco, but we are an extraordinarily conservative bunch when in comes to the progression and presentation of our own work. We have done far less than we should have by this point in imagining and enacting what academic work and communication might look like if it was digital first.

To be sure, as William Gibson has famously proclaimed, “The future is already here—it’s just not very evenly distributed.” ((National Public Radio, “Talk of the Nation” radio program, 30 November 1999, timecode 11:55, http://discover.npr.org/features/feature.jhtml?wfId=1067220)) Almost immediately following the advent of the web, which came out of the realm of physics, physicists began using the Los Alamos National Laboratory preprint server (later renamed ArXiv and moved to arXiv.org) to distribute scholarship directly to each other. Blogging has taken hold in some precincts of the academy, such as law and economics, and many in those disciplines rely on web-only outlets such as the Social Science Research Network. The future has had more trouble reaching the humanities, and perhaps this book is aimed slightly more at that side of campus than the science quad. But even among the early adopters, a conservatism reigns. For instance, one of the most prominent academic bloggers, the economist Tyler Cowen, still recommends to students a very traditional path for their own work. ((“Tyler Cowen: Academic Publishing,” remarks at the Institute for Humane Studies Summer Research Fellowship weekend seminar, May 2011, http://vimeo.com/24124436)) And far from being preferred by a large majority of faculty, quests to open scholarship to the general public often meet with skepticism. ((Open access mandates have been tough sells on many campuses, passing only by slight majorities or failing entirely. For instance, such a mandate was voted down at the University of Maryland, with evidence of confusion and ambivalence. http://scholarlykitchen.sspnet.org/2009/04/28/umaryland-faculty-vote-no-oa/))

If Digital History was about the mechanisms for moving academic work online, this book is about how the digital-first culture of the web might become more widespread and acceptable to the professoriate and their students. It is, by necessity, slightly more polemical than Digital History, since it takes direct aim at the conservatism of the academy that twenty years of the web have laid bare. But the web and the academy are not doomed to an inevitable clash of cultures. Viewed properly, the open web is perfectly in line with the fundamental academic goals of research, sharing of knowledge, and meritocracy. This book—and it is a book rather than a blog or stream of tweets because pragmatically that is the best way to reach its intended audience of the hesitant rather than preaching to the online choir—looks at several core academic values and asks how we can best pursue them in a digital age.

First, it points to the critical academic ability to look at any genre without bias and asks whether we might be violating that principle with respect to the web. Upon reflection many of the best things we discover in scholarship are found by disregarding popularity and packaging, by approaching creative works without prejudice. We wouldn’t think much of the meandering novel Moby-Dick if Carl Van Doren hadn’t looked past decades of mixed reviews to find the genius in Melville’s writing. Art historians have similarly unearthed talented artists who did their work outside of the royal academies and the prominent schools of practice. As the unpretentious wine writer Alexis Lichine shrewdly said in the face of fancy labels and appeals to mythical “terroir”: “There is no substitute for pulling corks.” ((Quoted in Frank J. Prial, “Wine Talk,” New York Times, 17 August 1994, http://www.nytimes.com/1994/08/17/garden/wine-talk-983519.html.))

Good is good, no matter the venue of publication or what the crowd thinks. Scholars surely understand that on a deep level, yet many persist in the valuing venue and medium over the content itself. This is especially true at crucial moments, such as promotion and tenure. Surely we can reorient ourselves to our true core value—to honor creativity and quality—which will still guide us to many traditionally published works but will also allow us to consider works in some nontraditional venues such as new open access journals or articles written and posted on a personal website or institutional repository, or digital projects.

The genre of the blog has been especially cursed by this lack of open-mindedness from the academy. Chapter 1, “What is a Blog?”, looks at the history of the blog and blogging, the anatomy and culture of a genre that is in many ways most representative of the open web. Saddled with an early characterization as being the locus of inane, narcissistic writing, the blog has had trouble making real inroads in academia, even though it is an extraordinarily flexible form and the perfect venue for a great deal of academic work. The chapter highlights some of the best examples of academic blogging and how they shape and advance arguments in a field. We can be more creative in thinking about the role of the blog within the academy, as a venue for communicating our work to colleagues as well as to a lay audience beyond the ivory tower.

This academic prejudice against the blog extends to other genres that have proliferated on the open web. Chapter 2, “Genres and the Open Web,” examines the incredible variety of those new forms, and how, with a careful eye, we might be able to import some of them profitably into the academy. Some of these genres, like the wiki, are well-known (thanks to Wikipedia, which academics have come to accept begrudgingly in the last five years). Other genres are rarer but take maximal advantage of the latitude of the open web: its malleability and interactivity. Rather than imposing the genres we know on the web—as we do when we post PDFs of print-first journal articles—we would do well to understand and adopt the web’s native genres, where helpful to scholarly pursuits.

But what of our academic interest in validity and excellence, enshrined in our peer review system? Chapter 3, “Good is Good,” examines the fundamental requirements of any such system: the necessity of highlighting only a minority of the total scholarly output, based on community standards, and of disseminating that minority of work to communities of thought and practice. The chapter compares print-age forms of vetting with native web forms of assessment and review, and proposes ways that digital methods can supplement—or even replace—our traditional modes of peer review.

“The Value, and Values, of Openness,” Chapter 4, broadly examines the nature of the web’s openness. Oddly, this openness is both the easiest trait of the web to understand and its most complex, once one begins to dig deeper. The web’s radical openness not only has led to calls for open access to academic work, which has complicated the traditional models of scholarly publishers and societies; it has also challenged our academic predisposition toward perfectionism—the desire to only publish in a “final” format, purged (as much as possible) of error. Critically, openness has also engendered unexpected uses of online materials—for instance, when Nate Silver refactored poll numbers from the raw data polling agencies posted.

Ultimately, openness is at the core of any academic model that can operate effectively on the web: it provides a way to disseminate our work easily, to assess what has been published, and to point to what’s good and valuable. Openness can naturally lead—indeed, is leading—to a fully functional shadow academic system for scholarly research and communication that exists beyond the more restrictive and inflexible structures of the past.

July 26, 2011 40 Comments

Older posts →