Books – Page 3 – Dan Cohen

Sol LeWitt and the Soul of Creative and Intellectual Work

I won’t get there until the summer, but I’m already looking forward to experiencing the Sol LeWitt retrospective at the always entertaining and often thought-provoking Massachusetts Museum of Contemporary Art, better known as MASS MoCA. (For previous thoughts provoked by MASS MoCA, see my post “The Artistic and the Digital.”)

For those who can’t make it to the retrospective—and really, you have no excuse, since its limited engagement runs through 2033—the museum has just put online a terrific website for the retrospective (one that exhibits many of the principles of good design, including the use of small multiples).

The site also has mesmerizing timelapse films showing how some of the giant works of wall art were created. This being LeWitt, the works were of course created not by him but by a team of (sixty-five) artists, including many students. LeWitt died last year, but his wall drawings were always made in this way. He “merely” created the plan for a wall drawing; others carried it out, and most of the works at MASS MoCA have been produced multiple times, on walls of different sizes and in different contexts.

Among LeWitt’s many innovations was this utter disdain toward a particular instance of a creative or intellectual work. The “artwork” was not what was on the wall (or the many walls a specific design had been placed on); it was in the ideas and feelings the artist had and the communication of these ideas and feelings to the viewer. The notion of a nicely framed work of art, a work of art that gained its value from its trappings or price or uniqueness, seemed hopelessly traditional, sentimental, and superficial. It missed the point of art.

My thoughts naturally turned to Sol LeWitt and the lessons we might learn from him as I mulled over the future of books and music this weekend. On an interesting listserv I’m subscribed to, a debate raged about ebooks and the joys (the heft, the feel, the smell, the cover) of physical books; at the same time, the New York Times lionized Gabriel Roth, who is recreating classic soul and funk by eschewing digital technology and who speaks of the joys (the heft, the feel, the smell, the cover) of vinyl records.

My musical tastes happen to run toward classic soul and funk, but even I can’t help but feel that in Roth’s yearning for “real” vinyl and that rare 45, and in book lovers’ similar idealization of hardcovers and that rare edition, there is something odd going on that LeWitt would have instantly recognized and scorned: the fetishization of the object rather than its underlying ideas, a nostalgia that improperly finds authenticity in packaging.

When Gabriel Roth tells Cliff Driver, a 75-year-old keyboardist, to replace his electronic Roland with an upright piano, Driver calls him “an old, traditional type” and the Times reporter notes that “Driver and his peers would just as well leave [such analog sound] in the past with their Afros and bell-bottoms.”

The soul of soul isn’t in the vinyl; it’s in the talent and creativity of its makers. The soul of books isn’t in their format; it’s in the ideas of their authors. Sol LeWitt understood that.

December 7, 2008 1 Comment

Digital Campus #33 – Classroom Action Settlement

After an unplanned month off (our apologies, things have been more than a little busy around here), the Digital Campus podcast triumphantly returns to the airwaves with a discussion of the recent Google Book Search settlement. Also up for analysis are Microsoft’s move to the cloud, the new Google phone, and, as always, recommendations from Tom, Mills, and me about helpful sites, tools, and publications. [Subscribe to this podcast.]

November 3, 2008

First Impressions of the Google Books Settlement

Just announced is the settlement of the class action lawsuit that the Authors Guild, the Association of American Publishers and individual authors and publishers filed against Google for its Book Search program, which has been digitizing millions of books from libraries. (Hard to believe, but the lawsuit was first covered on this blog all the way back in November 2005.) Undoubtedly this agreement is a critical one not only for Google and the authors and publishers, but for all of us in academia and others who care about the present and future of learning and scholarship.

It will obviously take some time to digest this agreement; indeed, the Google post on it is fairly sketchy and we still need to hear details, such as the cost structure for the full access the agreement now provides. But here are my first impressions of some key points:

The agreement really focuses on in-copyright but out-of-print books. That is, books that can’t normally be copied but also can’t be purchased anywhere. Highlighting these books (which are numerous; most academic books, e.g., are out-of-print and have virtually no market) was smart for Google since it seems to provide value without stepping on publishers’ toes.

A second (also smart, but probably more controversial) focus is on access to the Google Books collection via libraries:

We’ll also be offering libraries, universities and other organizations the ability to purchase institutional subscriptions, which will give users access to the complete text of millions of titles while compensating authors and publishers for the service. Students and researchers will have access to an electronic library that combines the collections from many of the top universities across the country. Public and university libraries in the U.S. will also be able to offer terminals where readers can access the full text of millions of out-of-print books for free.

Again, we need to hear more details about this part of the agreement. We also need to begin thinking about how this will impact libraries, e.g., in terms of their own book acquisition plans and their subscriptions to other online databases.

Finally, and perhaps most interesting and surprising to those of us in the digital humanities, is an all-too-brief mention of computational access to these millions of books:

In addition to the institutional subscriptions and the free public access terminals, the agreement also creates opportunities for researchers to study the millions of volumes in the Book Search index. Academics will be able to apply through an institution to run computational queries through the index without actually reading individual books.

For years in this space I have been arguing for the necessity of such access (first envisioned, to give due credit, by Cliff Lynch of CNI). Inside Google they have methods for querying and analyzing these books that we academics could greatly benefit from, and that could enable new kinds of digital scholarship.

Update: The Association of American Publishers now has a page answering frequently asked questions about the agreement (have we had time to ask?).

October 28, 2008 12 Comments

Digital Campus #29 – Making It Count

Tom, Mills, and I take up the much-debated issue of whether and how digital work should count toward promotion and tenure on this episode of the podcast. We also examine the significance of university presses putting their books on Amazon’s Kindle device, and the release of better copyright records. [Subscribe to this podcast.]

Happy 4th of July!

July 4, 2008 2 Comments

Mass Digitization of Books: Exit Microsoft, What Next?

So Microsoft has left the business of digitizing millions of books—apparently because they saw it as no business at all.

This leaves Microsoft’s partner (and our partner on the Zotero project), the Internet Archive, somewhat in the lurch, although Microsoft has done the right thing and removed the contractual restrictions on the books they digitized so they may become part of IA’s fully open collection (as part of the broader Open Content Alliance), which now has about 400,000 volumes. Also still on the playing field is the Universal Digital Library (a/k/a the Million Books Project), which has 1.5 million volumes.

And then there’s Google and its Book Search program. For those keeping score at home, my sources tell me that Google, which coyly likes to say it has digitized “over a million books” so far, has actually finished scanning five million. It will be hard for non-profits like IA to catch up with Google without some game-changing funding or major new partnerships.

Foundations like the Alfred P. Sloan Foundation have generously made substantial (million-dollar) grants to add to the digital public domain. But with the cost of digitizing 10 million pre-1923 books at around $300 million, where might this scale of funds and new partners come from? To whom can the Open Content Alliance turn to replace Microsoft?

Frankly, I’ve never understood why institutions such as Harvard, Yale, and Princeton haven’t made a substantial commitment to a project like OCA. Each of these universities has seen its endowment grow into the tens of billions in the last decade, and each has the means and (upon reflection) the motive to do a mass book digitization project of Google’s scale. $300 million sounds like a lot, but it’s less than 1% of Harvard’s endowment and my guess is that the amount is considerably less than all three universities are spending to build and fund laboratories for cutting-edge sciences like genomics. And a project to digitize 10 million public-domain books is just the kind of outrageously grand project HYP should be doing, especially if they value the humanities as much as the sciences.

Moreover, Harvard, Yale, and Princeton find themselves under enormous pressure to spend more of their endowment for a variety of purposes, including tuition remission and the public good. (Full and rather vain disclosure: I have some relationship to all three institutions; I complain because I love.) Congress might even get into the act, mandating that universities like HYP spend a more generous minimum percentage of their endowment every year, just like private foundations, which benefit (as does HYP, though in an indirect way) from the federal tax code.

In one stroke HYP could create enormous good will with a moon-shot program to rival Google’s: free books for the world. (HYP: note the generous reaction to, and the great press for, MIT’s OpenCourseWare program.) And beyond access, the project could enable new forms of scholarship through computational access to a massive corpus of full texts.

Alas, Harvard and Princeton partnered with Google long ago. Princeton has committed to digitizing about one million volumes with Google; Harvard’s number is unclear, but probably smaller. The terms of the agreement with Google are non-exclusive; Harvard and Princeton could initiate their own digitization projects or form other partnerships. But I suspect that would be politically difficult since the two universities are getting free digitization services from Google and would have to explain to their overseers why they want to replace free with very expensive. (The answer sounds like Abbott and Costello: the free program produces something that’s not free, while the expensive one is free.)

If Google didn’t exist, Harvard would probably be the most obvious candidate to pull off the Great Digitization of Widener. Not only does it have the largest endowment; historian Robert Darnton, a leader in thinking about the future (and the past) of the book, is now the director of the Harvard library system. Harvard also recently passed an open access mandate for the publications of its faculty.

Princeton has the highest per-student endowment of any university, and could easily undertake a mass digitization project of this scale. Perhaps some of the many Princeton alumni who went on to vast riches on the Web, such as eBay’s Meg Whitman (who has already given $100 million to Princeton) or Amazon’s Jeff Bezos, could pitch in.

But Harvard’s and Princeton’s “non-exclusive” partnership with Google makes these outcomes unlikely, as does the general resistance in these universities to spending science-scale funds outside of the sciences (unless it’s for a building).

That leaves Yale. Yale chose Microsoft last year to do its digitization, and has now been abandoned right in the middle of its project. Since Microsoft is apparently leaving its equipment and workflow in place at partner institutions, Yale could probably pick up the pieces with an injection of funding from its endowment or from targeted alumni gifts. Yale just spent an enormous amount of money on a new campus for the sciences, and this project could be seen as a counterbalance for the humanities.

Or, HYP could band together and put in a mere $100 million each to get the job done.
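For anyone who wants to check the back-of-the-envelope arithmetic, here is a minimal sketch. It uses only the figures quoted in this post ($300 million, 10 million books, three universities); the function names are made up for illustration and no new data is introduced.

```python
# Figures quoted in the post; nothing here is new data.
TOTAL_COST_USD = 300_000_000   # estimated cost of the whole project
BOOKS_TO_DIGITIZE = 10_000_000  # pre-1923 books
PARTNERS = 3                    # Harvard, Yale, Princeton


def cost_per_book(total_usd, books):
    """Average digitization cost per volume."""
    return total_usd / books


def share_per_partner(total_usd, partners):
    """Equal split of the total cost among the partners."""
    return total_usd / partners


def endowment_floor(total_usd, percent=1):
    """Endowment size at which total_usd equals `percent` percent of it.

    If the cost is "less than 1%" of an endowment, the endowment
    must be larger than this value.
    """
    return total_usd * 100 // percent


if __name__ == "__main__":
    print(cost_per_book(TOTAL_COST_USD, BOOKS_TO_DIGITIZE))
    print(share_per_partner(TOTAL_COST_USD, PARTNERS))
    print(endowment_floor(TOTAL_COST_USD))
```

In other words, the estimate works out to about $30 a volume, and the "less than 1%" claim only requires an endowment above $30 billion, which is consistent with "tens of billions."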

Is this likely to happen? Of course not. HYP and other wealthy institutions are being asked to spend their prodigious endowments on many other things, and are reluctant to up their spending rate at all. But I believe a HYP or HYP-like solution is much more likely than public funding of the kind the Human Genome Project received.

May 29, 2008 9 Comments

Still Waiting for a Real Google Book Search API

For years on this blog, at conferences, and even in direct conversations with Google employees I have been agitating for an API (application programming interface) for Google Book Search. (For a summary of my thoughts on the matter, see my imaginatively titled post, “Why Google Books Should Have an API.”) With the world’s largest collection of scanned books, I thought such an API would have major implications for doing research in the humanities. And I looked forward to building applications on top of the API, as I had done with my Syllabus Finder.

So why was I disappointed when Google finally released an API for their book scanning project a couple of weeks ago?

My suspicion began with the name of the API itself. Even though the URL for the API is http://code.google.com/apis/books/, suggesting that this is the long-awaited API for the kind of access to Google Books that I’ve been waiting for, the rather prosaic and awkward title of the API suggests otherwise: The Google Book Search Book Viewability API. From the API’s home page:

The Google Book Search Book Viewability API enables developers to:

Link to Books in Google Book Search using ISBNs, LCCNs, and OCLC numbers

Know whether Google Book Search has a specific title and what the viewability of that title is

Generate links to a thumbnail of the cover of a book

Generate links to an informational page about a book

Generate links to a preview of a book

These are remarkably modest goals. Certainly the API will be helpful for online library catalogs and other book services (such as LibraryThing) that wish to embed links to Google’s landing pages for books and (when copyright law allows) links to the full texts. The thumbnails of book covers will make OPACs look prettier.
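To make the modesty concrete, here is a rough sketch of what a catalog developer might do with the API: build a lookup request for a handful of identifiers. The endpoint and the parameter names (`bibkeys`, `jscmd=viewapi`, `callback`) are assumptions that do not appear in this post, so check them against the API documentation before relying on them. The helper names and the OCLC number are hypothetical placeholders.

```python
from urllib.parse import urlencode

# Assumed endpoint; it is not given in the post.
BASE_URL = "http://books.google.com/books"

# The three identifier types the API's home page mentions.
SUPPORTED_SCHEMES = {"ISBN", "LCCN", "OCLC"}


def bib_key(scheme, value):
    """Format one identifier as 'SCHEME:value' (hypothetical helper)."""
    scheme = scheme.upper()
    if scheme not in SUPPORTED_SCHEMES:
        raise ValueError(f"unsupported identifier type: {scheme}")
    return f"{scheme}:{value}"


def viewability_url(keys, callback="handle_books"):
    """Build a lookup URL for several identifiers at once.

    The query parameter names are assumptions, as is the idea that
    the response is delivered to a JavaScript callback.
    """
    query = urlencode({
        "bibkeys": ",".join(keys),
        "jscmd": "viewapi",
        "callback": callback,
    })
    return f"{BASE_URL}?{query}"


if __name__ == "__main__":
    keys = [bib_key("isbn", "0451526538"), bib_key("oclc", "36792831")]
    print(viewability_url(keys))
```

Note what is missing: there is a way to ask whether a book exists and how much of it can be viewed, but nothing in a request like this returns the text itself.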

But this API does nothing to advance the kind of digital scholarship I have advocated for in this space. To do that the API would have to provide direct access to the full OCRed text of the books, to provide the ability to mine these texts for patterns and to combine them with other digital tools and corpora. Undoubtedly copyright concerns are part of the story here, hobbling what Google can do. But why not give full access to pre-1923 books through the API?

I’m not hopeful that there are additional Google Book Search APIs coming. If that were the case the URL for the viewability API would be http://code.google.com/apis/books/viewability/. The result is that this API simply seems like a way to drive traffic to Google Books, rather than to help academia or to foster an external community of developers, as other Google APIs have done.

March 31, 2008 6 Comments

Google Book Search Begins Adding Quality Control Measures

As predicted in this space six months ago, Google has added the ability for users to report missing or poorly scanned pages in their Book Search. (From my post “Google Books: Champagne or Sour Grapes?”: “Just as they have recently added commentary to Google News, they could have users flag problematic pages.”)

I’ll say it again: criticism of Google Book Search that focuses on quality chases a red herring—something that Google can easily fix. Let’s focus instead on more substantive issues, such as the fact that Google’s book archive is not truly open.

February 20, 2008

The Case for Open Access Books

This month’s First Monday has one of the most pragmatic, sensible articles I’ve read about the promise and perils of open access books. In “Open access book publishing in writing studies: A case study,” by Charles Bazerman, David Blakesley, Mike Palmquist, and David Russell, the authors describe their experience deciding to eschew a traditional publication arrangement with an academic press (what supposedly gives our monographs the sheen of value and gets us tenure). Instead they publish an edited volume straight to the web.

Along the way the authors discover that many of the concerns that humanities scholars have about publishing in a free and open way are either overblown or simply myths. Only one junior scholar (out of the 20 scholars asked to contribute) worries about promotion and tenure. And indeed all of the scholars who contribute to the edited volume receive credit for their chapters. More important, the editors and contributors are surprised to discover that the book makes its way rapidly and powerfully into the consciousness of their field:

[The] initial reaction [to the book] did not prepare us for the acceptance the book ultimately received from the academic communities to which it was addressed.

Since its publication, the Writing Selves/Writing Societies Web page has been visited more than 85,000 times by more than 36,000 unique visitors. The trend, interestingly, has been a steady increase in visits over the past four years, with more than 30,000 occurring in the past 12 months. Since its publication, the book has been downloaded in its entirety more than 36,000 times. Individual essays have been downloaded more than 108,000 times. In terms of perceived quality of the scholarly work in the collection, the book has been well received by the field. Within six months of publication, the book was positively reviewed by four journals: two print and two electronic. One year after its publication, in the keynote address to the Conference on College Composition and Communication, the major annual conference in writing studies, Kathleen Blake Yancey quoted extensively from chapters in the book. And the book has continued to figure prominently in scholarly work subsequently published in the field of composition and rhetoric.

According to a search of Google Scholar, which indexes scholarly publications available on the Web (29 September 2006), the book or individual chapters in it has been cited 68 times, according to a search of Google Scholar. Although we do not have comprehensive comparison data for print publications, we suspect that this is a higher rate. A print-only collection with about the same number of chapters (15) published in the same year as Writing Selves/Writing Societies (and winner of a best book award given by a leading journal in the field), had far fewer citations: 10. Our experience suggests that open access scholarly books follow a pattern of citation similar to journals, which indicate that open access journal articles in a wide range of fields are both more likely to be cited and likely to be cited more quickly. Our experience with Writing Selves/Writing Societies supports this…

Overall, Writing Selves/Writing Societies appears to have entered into the system of book publishing neatly, in spite of the fact that it was not published by a traditional academic publisher and was being offered at no charge.

Beyond the questions of business models, scholarly influence, and promotion and tenure, there is also the nagging question Roy Rosenzweig posed in “Should Historical Scholarship Be Free?” At the time Roy was the Vice President for Research at the American Historical Association, and was pushing for open access to the American Historical Review. (Ultimately he got the powers that be to agree to put AHR articles online for free, although the book reviews remain behind gates.)

Besides the ethical good of publishing in an open access model—sharing educational and scholarly materials—Roy noted that the work of most scholars is funded, directly or indirectly, by the public. Citing the National Institutes of Health’s recent mandate that grantees share their work openly with the public, Roy wrote:

The new policy affects few historians, but its implications ought to give us serious pause. After all, historical research also benefits directly (albeit considerably less generously) through grants from federal agencies like the National Endowment for the Humanities; even more of us are on the payroll of state universities, where research support makes it possible for us to write our books and articles. If we extend the notion of “public funding” to private universities and foundations (who are, of course, major beneficiaries of the federal tax codes), it can be argued that public support underwrites almost all historical scholarship.

Do the fruits of this publicly supported scholarship belong to the public? Should the public have free access to it?

Roy, of course, thought this meant that like NIH grantees we should provide open access to our articles, such as those in the AHR. But doesn’t the same argument hold true for books?

[Postscript: Some scientists have been wondering the same thing.]

January 24, 2008 3 Comments

MacEachern and Turkel, The Programming Historian

Bill Turkel, the always creative mind behind Digital History Hacks (logrolling disclosure: Bill is a friend of CHNM, a collaborator on various fronts, and was the thought-provoking guest on Digital Campus #9; still, he deserves the compliments), and his colleague at the University of Western Ontario, Alan MacEachern, are planning to write a book entitled The Programming Historian. Better yet, the book will be open access and hosted on the Network in Canadian History & Environment (NiCHE) site. Bill’s summary of the book on his blog sounds terrific. Can’t wait to read it and use it in my classes.

January 14, 2008

The Digital Critique of “To Read or Not To Read”

More healthy debate about the NEA’s jeremiad To Read or Not To Read is happening on the Institute for the Future of the Book’s blog. Let me try to summarize my critique of the NEA report, and you should be sure to read the whole report so as not to be swiftly criticized by the evidently touchy authors and their supporters.

I have no doubt that book reading is declining. My offense at the report has to do with the second-class status of the digital realm throughout. Sunil Iyengar, the Director of Research & Analysis for the NEA, states on p. 23 of the report:

Unless “book-reading” is specifically mentioned, study results on voluntary reading should be taken as referencing all varieties of leisure reading (e.g., magazines, newspapers, online reading), and not books alone. [my emphasis]

But the rest of the report makes it almost impossible to see how “online reading” was actually included as “voluntary reading” and lauded as such. While there are indeed charts about “book reading,” most charts are at best ambiguous about what “reading” means and at worst seem to make the online world devoid of words. For example, Table 3E, on p. 40, lists the “Weekly Average Hours and/or Minutes Spent on Various Activities by American Children, 2002-3.” But bizarrely “computer activities” (2:45) are distinct from “reading” (1:17), as if no reading occurred during those online hours.

More generally—and this is what I think many of us in the digital humanities are reacting to—the report is suffused with the nostalgic view of armchair leisure book reading (a nostalgia I share, by the way, and indeed deeply yearn for as an overstretched father of young children with a very busy day job). The report thus belittles the work of all of us trying to move serious reading and scholarship where it will surely go in the coming decades—online. As a historian, it reminds me of the early modern disparagements of writing and reading in the vernacular, back when only Latin would do for “serious” study and scholarship.

The double standard for digital reading versus paper reading can be seen in a letter to the Chronicle of Higher Ed by Mark Bauerlein. Bauerlein’s retort to Matt Kirschenbaum is to look at “what eye-tracking technology reveals about how users scan Web pages.” I assume his point is that these studies reveal the ADHD that we “votaries of online and screen reading” have, skimming and grazing rather than “really” reading. But can you imagine what would be revealed in eye tracking studies of readers of newspapers and magazines? Ad agencies have long known—indeed, it is the first principle of graphic design in advertising—that most pages are glanced at for mere seconds or even a fraction of a second, not “read.” (The report, sensing this potential criticism and keeping to its theme, emphasizes on p. 51 that teenagers are more likely to “skim” the “news sections.”)

But what of books? I’m sure I’m not the only academic who would like to strap eye trackers onto the heads of the book prize committees for professional academic organizations, who are supposed to read dozens or hundreds of books in short order—but surely skim (or worse). Matt Kirschenbaum and many others are simply making what, upon reflection, is a rather commonsensical point: that “reading” has always included multiple styles, including deep linear styles and more flighty ones. As Roy Rosenzweig and I point out in our book Digital History, we academics should be finding ways to encourage long-form reading on the screen (where all reading will ultimately head anyway) rather than, in our bookish nostalgia, ceding the medium to web usability specialists who encourage blurb writing for short attention spans.

Ultimately, To Read or Not To Read seems strangely dated in 2008. On its pages it remains obsessed with TV just at the point when kids’ leisure time pursuits are moving swiftly online. In an age when an “academic blog” is no longer an oxymoron, the report inexplicably mentions “blogs”—the source of so much online reading and writing and now even part of so many classrooms—on a single page out of 98, and only to dismiss them as pseudo-reading and writing in a worn critique that resorts to quoting from Sven Birkerts’ early-Web Gutenberg Elegies (1994). The report also oddly dismisses the exponential rise in online newspaper readership while lamenting the 2 or 3 percent yearly decline in paper “subscribers.”

After reading the civics portion of the report (pp. 86-92), which particularly emphasizes the importance of book reading (see pp. 88-89), a question came to mind: might email, IM, texting, social networking and other online pursuits enhance “civic engagement” and understanding more than reading a good thick policy treatise? The smartphone-bearing, Facebook-using teenagers currently working (often virtually) on the presidential primaries in the United States have little time for leisure reading, and a good number of them are probably not “voluntary readers” of the Platonic sort envisioned in To Read or Not To Read. But they are learning—and doing, and reading—much more in the digital realm than this myopic report can conceive.

January 10, 2008 9 Comments