Copyright – Dan Cohen

Digital Campus #29 – Making It Count

Tom, Mills, and I take up the much–debated issue of whether and how digital work should count toward promotion and tenure on this episode of the podcast. We also examine the significance of university presses putting their books on Amazon’s Kindle device, and the release of better copyright records. [Subscribe to this podcast.]

Happy 4th of July!

July 4, 2008 2 Comments

Digital Campus #24 – Running from the Law

On the first podcast of our second year of the Digital Campus podcast, we discuss some of the legal constraints and threats that academic content providers and digital tool builders face—namely, an increasingly confusing and nightmarish patchwork of regulations from copyright to patents. We talk about the ways in which we have tried to pursue fair use and new technology without getting sued. In the news roundup we cover the launch of offline Google Docs and Internet safety classes for kids. [Subscribe to this podcast.]

April 8, 2008 1 Comment

A Closer Look at the National Archives-Footnote Agreement

I’ve spent the past two weeks trying to get a better understanding of the agreement signed by the National Archives and Footnote, about which I raised several concerns in my last post. Before making further (possibly unfounded) criticisms I thought it would a good idea to talk to both NARA and Footnote. So I picked up the phone and found several people eager to clarify things. At NARA, Jim Hastings, director of access programs, was particularly helpful in explaining their perspective. (Alas, NARA’s public affairs staff seemed to have only the sketchiest sense of key details.) Most helpful—and most eager to rebut my earlier post—were Justin Schroepfer and Peter Drinkwater, the marketing director and product lead at Footnote. Much to their credit, Justin and Peter patiently answered most of my questions about the agreement and the operation of the Footnote website.

Surprisingly, everyone I spoke to at both NARA and Footnote emphasized that despite the seemingly set-in-stone language of the legal agreement, there is a great deal of latitude in how it is executed, and they asked me to spread the word about how historians and the general public can weigh in. It has received virtually no publicity, but NARA is currently in a public comment phase for the Footnote (a/k/a iArchives) agreement. Scroll down to the bottom of the “Comment on Draft Policy” page at NARA’s website and you’ll find a request for public comment (you should email your thoughts to Vision@nara.gov). It’s a little odd to have a request for comment after the ink is dry on an agreement or policy, and this URL probably should have been included in the press release of the Footnote agreement, but I do think after speaking with them that both NARA and Footnote are receptive to hearing responses to the agreement. Indeed, in response to this post and my prior post on the agreement, Footnote has set up a web page, “Finding the Right Balance,” to receive feedback from the general public on the issues I’ve raised. They also asked me to round up professional opinion on the deal.

I assume Footnote will explain their policies in greater depth on their blog, but we agreed that it would be helpful to record some important details of our conversations in this space. Here are the answers Justin and Peter gave to a few pointed questions.

When I first went to the Footnote site, I was unpleasantly surprised that it required registration even to look at “milestone” documents like Lincoln’s draft of the Gettysburg Address. (Unfortunately, Footnote doesn’t have a list of all of its free content yet, so it’s hard to find such documents.) Justin and Peter responded that when they launched the site there was an error in the document viewer, so they had to add authentication to all document views. A fix was rolled out on January 23, and it’s now possible to view these important documents without registering.

You do need to register, however, to print or download any document, whether it’s considered “free” or “premium.” Why? Justin and Peter candidly noted that although they have done digitization projects before, the National Archives project, which contains millions of critical—and public domain—documents, is a first for them. They are understandably worried about the “leakage” of documents from their site, and want to take it one step at a time. So to start they will track all downloads to see how much escapes, especially in large batches. I noted that downloading and even reusing these documents (even en masse) very well might be legal, despite Footnote’s terms of service, because the scans are “slavish” copies of the originals, which are not protected by copyright. Footnote lawyers are looking at copyright law and what other primary-source sites are doing, and they say that they view these initial months as a learning experience to see if the terms of service can or should change. Footnote’s stance on copyright law and terms of usage will clearly be worth watching.

Speaking of terms of usage, I voiced a similar concern about Footnote’s policies toward minors. As you’ll recall, Footnote’s terms of service say the site is intended for those 18 and older, thus seeming to turn away the many K-12 classes that could take advantage of it. Justin and Peter were most passionate on this point. They told me that Footnote would like to give free access to the site for the K-12 market, but pointed to the restrictiveness of U.S. child protection laws. Because the Footnote site allows users to upload documents as well as view them, they worry about what youngsters might find there in addition to the NARA docs. These laws also mandate the “over 18” clause because the site captures personal information. It seems to me that there’s probably a technical solution that could be found here, similar to the one PBS.org uses to provide K-12 teaching materials without capturing information from the students.

Footnote seems willing to explore such a possibility, but again, Justin and Peter chalked up problems to the newness of the agreement and their inexperience running an interactive site with primary documents such as these. Footnote’s lawyers consulted (and borrowed, in some cases) the boilerplate language from terms of service at other sites, like Ancestry.com. But again, the Footnote team emphasized that they are going to review the policies and look into flexibility under the laws. They expect to tweak their policies in the coming months.

So, now is your chance to weigh in on those potential changes. If you do send a comment to either Footnote or NARA, try to be specific in what you would like to see. For instance, at the Center for History and New Media we are exploring the possibility of mining historical texts, which will only be possible to do on these millions of NARA documents if the Archives receives not only the page images from Footnote but also the OCRed text. (The handwritten documents cannot be automatically transcribed using optical character recognition, of course, but there are many typescript documents that have been converted to machine-readable text.) NARA has not asked to receive the text for each document back from Footnote—only the metadata and a combined index of all documents. There was some discussion that NARA is not equipped to handle the flood of data that a full-text database would entail. Regardless, I believe it would be in the best interest of historical researchers to have NARA receive this database, even if they are unable to post it to the web right away.

February 5, 2007 Add Comment

The Flawed Agreement between the National Archives and Footnote, Inc.

I suppose it’s not breaking news that libraries and archives aren’t flush with cash. So it must be hard for a director of such an institution when a large corporation, or even a relatively small one, comes knocking with an offer to digitize one’s holdings in exchange for some kind of commercial rights to the contents. But as a historian worried about open access to our cultural heritage, I’m a little concerned about the new agreement between Footnote, Inc. and the United States National Archives. And I’m surprised that somehow this agreement has thus far flown under the radar of all of those who attacked the troublesome Smithsonian/Showtime agreement. Guess what? From now until 2012 it will cost you $100 a year, or even more offensively, $1.99 a page, for online access to critical historical documents such as the Papers of the Continental Congress.

This was the agreement signed by Archivist of the United States Allen Weinstein and Footnote, Inc., a Utah-based digital archives company, on January 10, 2007. For the next five years, unless you have the time and money to travel to Washington, you’ll have to fork over money to Footnote to take a peek at Civil War pension documents or the case files of the early FBI. The National Archives says this agreement is “non-exclusive”—I suppose crossing their fingers that Google will also come along and make a deal—but researchers shouldn’t hold their breaths for other options.

Footnote.com, the website that provide access to these millions of documents, charges for anything more than viewing a small thumbnail of a page or photograph. Supposedly the value-added of the site (aside from being able to see detailed views of the documents) is that it allows you to save and annotate documents in your own library, and share the results of your research (though not the original documents). Hmm, I seem to remember that there’s a tool being developed that will allow you to do all of that—for free, no less.

Moreover, you’ll also be subject to some fairly onerous terms of usage on Footnote.com, especially considering that this is our collective history and that all of these documents are out of copyright. (For a detailed description of the legal issues involved here, please see Chapter 7 of Digital History, “Owning the Past?”, especially the section covering the often bogus claims of copyright on scanned archival materials.) I’ll let the terms speak for themselves (plus one snide aside): “Professional historians and others conducting scholarly research may use the Website [gee, thanks], provided that they do so within the scope of their professional work, that they obtain written permission from us before using an image obtained from the Website for publication, and that they credit the source. You further agree that…you will not copy or distribute any part of the Website or the Service in any medium without Footnote.com’s prior written authorization.”

Couldn’t the National Archives have at least added a provision to the agreement with Footnote to allow students free access to these documents? I guess not; from the terms of usage: “The Footnote.com Website is intended for adults over the age of 18.” What next? Burly bouncers carding people who want to see the Declaration of Independence?

January 15, 2007 2 Comments

Understanding the 2006 DMCA Exemptions

If Emerson was correct that genius is the ability to hold two contradictory ideas in the mind simultaneously, the American legal system just gained enough IQ points to join Mensa. Already, our collective legal mind was showing its vast intelligence trying to square the liberties of the people with the demands of government and industry. For instance, in Alaska you can possess up to an ounce of marijuana legally, but can be charged with a felony for possessing more than four ounces or for selling the “illegal” drug. (Lesson: don’t buy in bulk.) If you’re gay you can legally join the United States military, but you can’t talk about being gay, because that’s illegal and you will be discharged. And now, more pretzel logic: as of last week, it is illegal to break the copy protection on a DVD or distribute “circumvention” technologies, but if you’re a film or media studies professor you can break the copy protection for pedagogical uses. But how, you might ask, would a film or media studies professor with no background in encryption, programming, and hacking crack the copy protection on a DVD?

Good question. It was the first question I posed last weekend to Peter Decherney as my addled brain tried to grasp the significance of the new exemptions to the DMCA granted by the Librarian of Congress, James Billington. Peter is a professor at the University of Pennsylvania and deserves all of our thanks for spearheading the effort to put some cracks into the DMCA. (Full disclosure: Peter is a very good friend. But I still think—objectively—that he deserves an enormous amount of praise for persevering in the face of the MPAA’s lawyers to get the exemption for film professors. He told me the MPAA doggedly fights every proposed exemption, reasonable or not, so this was a long way from a trivial exercise.) It’s unfortunate to see many initial reactions to the new exemptions lamenting that they are only for three years or that they merely enshrine the DMCA’s destruction of fair use principles.

Well, sure. These new exemptions are indeed limited in scope and in an ideal world Peter and his colleagues should not have had to ask for these rights or fight for months to get them. (And then do the process all over again in 2009.) But there are a few bright spots here for those of us who believe that the balance between the rights of copyright owners and users of their content has swung much too far in the direction of the former.

First, as Peter pointed out to me, the exemption for film and media studies professors is the first time an exemption has been carved out for a class of people. It’s not hard to imagine how this opens the door for other groups of people to evade the strict rules of the DMCA. Most obviously, many of my colleagues in the History and Art History department at George Mason University use film clips in their courses. Shouldn’t they be exempt too? Shouldn’t a psychology professor who wants to store clips from films on her hard drive to show in class as illustrations of mental phenomena be allowed to do so? The MPAA will undoubtedly say no every step of the way, but you can see how a well-reasoned and reasonable march of exemptions will begin to restore some sanity to the copyright regime. Academia could merely be the beachhead.

Second, and related to the first point, getting a DMCA exemption is a daunting task, especially for those of us without legal training. Peter and his colleagues have provided a blueprint for academics seeking other exemptions in the future. It would be good if they could pass along their wisdom. Thankfully, they have already set up a website that will serve as a clearinghouse of information for the “educational use of media” exemption. A plainspoken description of how they got the exemption in the first place would be helpful as well.

Finally, the new exemptions have raised the odd contradiction I mentioned in the introduction to this piece, a contradiction that helpfully highlights the absurdity of current law. Film professors can now legally proceed in their work (saving clips from DVDs for their classes), except that they have to break the law to do this legal work (by encouraging and participating in an illegal market for cracking software). Similar absurdities abound in the digital realm; recently the MPAA went after a company that fills iPods with video from DVDs the iPod owners have bought.

So now the question becomes: Does our legal system follow the dictates of Emerson’s genius, or of common sense? And how do those moderate pot smokers in Alaska get their marijuana, anyway?

December 3, 2006 Add Comment

Doing Digital History June 2006 Workshop

If your work deals in some way with the history of science, technology, or industry, and you would like to learn how to create online history projects, the Echo Project at the Center for History and New Media is running another one of our free, week-long workshops. The workshop covers the theory and practice of digital history; the ways that digital technologies can facilitate the research, teaching, writing and presentation of history; genres of online history; website infrastructure and design; document digitization; the process of identifying and building online history audiences; and issues of copyright and preservation.

As one of the teachers for this workshop, I can say somewhat immodestly that it’s really a great way to get up to speed on the many (sometimes complicated) elements necessary for website development. Unfortunately space is limited, so be sure to apply online by March 10, 2006. The workshop will take place from June 12-16, 2006, at George Mason University’s Arlington campus, right outside of Washington, DC. It is co-sponsored by the American Historical Association and the National History Center, and funded by the Alfred P. Sloan Foundation. There is no registration fee, and a limited number of fellowships are available to defray the costs of travel and lodging for graduate students and young scholars. Hope to see you there!

February 14, 2006 Add Comment

Impact of Field v. Google on the Google Library Project

I’ve finally had a chance to read the federal district court ruling in a case, Field v. Google, that has not been covered much (except in the technology press), but which has obvious and important implications for the upcoming battle over the legality of Google’s library digitization project. The case, Field v. Google, involved a lawyer who dabbles in some online poetry, and who was annoyed that Google’s spider cached a version of his copyrighted ode to delicious tea (“Many of us must have it iced, some of us take it hot and combined with milk, and others are not satisfied unless they know that only the rarest of spices and ingredients are contained therein…”). Field sued Google for copyright infringement; Google argued fair use. Field lost the case, with most of his points rejected by the court. The Electronic Frontier Foundation has hailed Google’s victory as a significant one, and indeed there are some very good aspects of the ruling for the book copying case. But there also seem to be some major differences between Google’s wholesale copying of websites and its wholesale copying of books that the court implicitly recognized. The following seem to be the advantages and disadvantages of this ruling for Google, the University of Michigan, and others who wish to see the library project reach completion.

Courts have traditionally used four factors to determine fair use—the purpose of the copying, the nature of the work, the extent of the copying, and the effect on the market of the work.

On purpose, the court ruled that Google’s cache was not simply a copy of that work, but added substantial value that was important to users of Google’s search engine. Users could still read Field’s poetry even if his site was down; they could compare Google’s cache with the original site to see if any changes had been made; they could see their search terms highlighted in the page. Furthermore, with a clear banner across the top Google tells its users that this is a copy and provides a link to the original. It also provides methods for website owners to remove their pages from the cache. This emphasis on opt out seems critical, since Google has argued that book publishers can simply tell them if they don’t want their books digitized. Also, the court ruled that the Google’s status as a commercial enterprise doesn’t matter here. Advantage for Google et al.

On the nature of the work, the court looked less at the quality of Field’s writing (“Simple flavors, simple aromas, simple preparation…”) than at Field’s intentions. Since he “sought to make his works available to the widest possible audience for free” by posting his poems on the Internet, and since Field was aware that he could (through the robots.txt file) exclude search engines from indexing his site, the court thought Field’s case with respect to this fair use factor was weakened. But book publishers and authors fighting Google will argue that they do not intend this free and wide distribution. Disadvantage for Google et al.

One would think that the third factor, the extent of the copying, would be a clear loser for Google, since they copy entire web pages as a matter of course. But the Nevada court ruled that because Google’s cache serves “multiple transformative and socially valuable purposes…that could not be effectively accomplished by using only portions” of web pages, and because Google points users to the original texts, this wholesale copying was OK. You can see why Google’s lawyers are overjoyed by this part of the ruling with respect to the book digitization project. Big advantage for Google et al.

Perhaps the cruelest part of the ruling had to do with the fourth factor of fair use, the effect on the market of the work. The court determined from its reading of Field’s ode to tea that “there is no evidence of any market for Field’s works.” Ouch. But there is clearly a market for many books that remain in copyright. And since the Google library project has just begun we don’t have any economic data about Google Book Search’s impact on the market for hard copies. No clear winner here.

In additional, the Nevada court added a critical fifth factor for determining fair use in this case: “Google’s Good Faith.” By providing ways to include and exclude materials from its cache, by providing a way to complain to the company, and by clearly spelling out its intentions in the display of the cache, the court determined that Google was acting in good faith—it was simply trying to provide a useful service and had no intention to profit from Field’s obsession with tea. Google has a number of features that replicate this sense of good faith in its book program, like providing links to libraries and booksellers, methods for publishers and authors to complain, and techniques for preventing user copies of copyrighted works. Advantage for Google et al.

A couple of final points that may work against Google. First, the court made a big deal out of the fact that the cache copying was completely automated, which the Google book project is clearly not. Second, the ruling constantly emphasizes the ability of Field to opt out of the program, but upset book publishers and authors believe this should be opt in, and it’s quite possible another court could agree with that position, which would weaken many of the points made above.

February 9, 2006 Add Comment

Google, the Khmer Rouge, and the Public Good

Like Daniel into the lion’s den, Mary Sue Coleman, the President of the University of Michigan, yesterday went in front of the Association of American Publishers to defend her institution’s participation in Google’s massive book digitization project. Her speech, “Google, the Khmer Rouge and the Public Good,” is an impassioned defense of the project, if a bit pithy at certain points. It’s worth reading in its entirety, but here are some highlights with commentary.

In two prior posts, I wondered what will happen to those digital copies of the in-copyright books the university receives as part of its deal with Google. Coleman obviously knew that this was a major concern of her audience, and she went overboard to satisfy them: “Believe me, students will not be reading digital copies of ‘Harry Potter’ in their dorm rooms…We will safeguard the entirety of this archive with the same diligence we accord our most sensitive materials at the University: medical records, Defense Department data, and highly infectious disease agents used in research.” I’m not sure if books should be compared to infectious disease agents, but it seems clear that the digital copies Michigan receives are not likely to make it into “the wild” very easily.

Coleman reminded her audience that for a long time the books in the Michigan library did not circulate and were only accessible to the Board of Regents and the faculty (no students allowed, of course). Finally Michigan President James Angell declared that books were “not to be locked up and kept away from readers, but to be placed at their disposal with the utmost freedom.” Coleman feels that the Google project is a natural extension of that declaration, and more broadly, of the university’s mission to disseminate knowledge.

Ultimately, Coleman turns from more abstract notions of sharing and freedom to the more practical considerations of how students learn today: “When students do research, they use the Internet for digitized library resources more than they use the library proper. It’s that simple. So we are obligated to take the resources of the library to the Internet. When people turn to the Internet for information, I want Michigan’s great library to be there for them to discover.” Sounds about right to me.

February 7, 2006 Add Comment

2006: Crossroads for Copyright

The coming year is shaping up as one in which a number of copyright and intellectual property issues will be highly contested or resolved, likely having a significant impact on academia and researchers who wish to use digital materials in the humanities. In short, at stake in 2006 are the ground rules for how professors, teachers, and students may carry out their work using computer technology and the Internet. Here are three major items to follow closely.

Item #1: What Will Happen to Google’s Massive Digitization Project?

The conflict between authors, publishers, and Google will probably reach a showdown in 2006, with either the beginning of court proceedings or some kind of compromise. Google believes it has a good case for continuing to digitize library books, even those still under copyright; some authors and most publishers believe otherwise. So far, not much in the way of compromise. Indeed, if you have been following the situation carefully, it’s clear that each side is making clever pre-trial maneuvers to bolster their case. Google cleverly changed the name of its project to Google Book Search from Google Print, which emphasizes not the (possibly illegal) wholesale digitization of printed works but the fact that the program is (as Google’s legal briefs assert) merely a parallel project to their indexing of the web. The implication is that if what they’re doing with their web search is OK (for which they also need to make copies, albeit of born-digital pages), then Google Book Search is also OK. As Larry Lessig, Siva Vaidhyanathan, and others have highlighted, if the ruling goes against Google given this parallelism (“it’s all in the service of search”), many important web services might soon be illegal as well.

Meanwhile, the publishers have made some shrewd moves of their own. They have announced a plan to work with Amazon to accept micropayments for a few page views from a book (e.g., a recipe). And HarperCollins recently decided to embark on its own digitization program, ostensibly to provide book searches through its website. If you look at the legal basis of fair use (which Google is professing for its project), you’ll understand why these moves are important to the publishers: they can now say that Google’s project hurts the market for their works, even if Google shows only a small amount of a copyrighted book. In addition, a judge can no longer rule that Google is merely providing a service of great use to the public that the publishers themselves are unable or unwilling to provide. And I thought the only smart people in this debate were on Google’s side.

If you haven’t already read it, I recommend looking at my notes on what a very smart lawyer and a digital visionary have to say about the impending lawsuits.

Item #2: Chipping Away at the DMCA

In the first few months of 2006, the Copyright Office of the United States will be reviewing the dreadful Digital Millenium Copyright Act—one of the biggest threats to scholars who wish to use digital materials. The DMCA has effectively made many researchers, such as film studies professors, criminals, because they often need to circumvent rights management protection schemes on devices like DVDs to use them in a classroom or for in-depth study (or just to play them on certain kinds of computers). This circumvention is illegal under the law, even if you own the DVD. Currently there are only four minor exemptions to the DMCA, so it is critical that other exemptions for teachers, students, and scholars be granted. If you would like to help out, you can go to the Copyright Office’s website in January and sign your name to various efforts to carve out exemptions. One effort you can join, for instance, is spearheaded by Peter Decherney and others at the University of Pennsylvania. They want to clear the way for fully legal uses of audiovisual works in educational settings. Please contact me if you would like to add your name to that important effort.

Item #3: Libraries Reach a Crossroads

In an upcoming post I plan to discuss at length a fascinating article (to be published in 2006) by Rebecca Tushnet, a Georgetown law professor, that highlights the strange place at which libraries have arrived in the digital age. Libraries are the center of colleges and universities (often quite literally), but their role has been increasingly challenged by the Internet and the protectionist copyright laws this new medium has engendered. Libraries have traditionally been in the long-term purchasing and preservation business, but they increasing spend their budgets on yearly subscriptions to digital materials that could disappear if their budgets shrink. They have also been in the business of sharing their contents as widely as possible, to increase knowledge and understanding broadly in society; in this way, they are unique institutions with “special concerns not necessarily captured by the end-consumer-oriented analysis with which much copyright scholarship is concerned,” as Prof. Tushnet convincingly argues. New intellectual property laws (such as the DMCA) threaten this special role of libraries (aloof from the market), and if they are going to maintain this role, 2006 will have to be the year they step forward and reassert themselves.

December 29, 2005 Add Comment

Clifford Lynch and Jonathan Band on Google Book Search

The topic for the November 2005 Washington DC Area Forum on Technology and the Humanities focused on “Massive Digitization Programs and Their Long-Term Implications: Google Print, the Open Content Alliance, and Related Developments.” The two speakers at the forum, Clifford Lynch and Jonathan Band, are among the most intelligent and thought-provoking commentators on the significance of Google’s Book Search project (formerly known as Google Print, with the Google Print Library Project being the company’s attempt to digitize millions of books at the University of Michigan, Stanford, Harvard, Oxford, and the New York Public Library). These are my notes from the forum, highlighting not the basics of the project, which have been covered well in the mainstream media, but angles and points that may interest the readers of this blog.

Clifford Lynch has been the Director of the Coalition for Networked Information (CNI) since July 1997. CNI, jointly sponsored by the Association of Research Libraries and Educause, includes about 200 member organizations concerned with the use of information technology and networked information to enhance scholarship and intellectual productivity. Prior to joining CNI, Lynch spent 18 years at the University of California Office of the President, the last 10 as Director of Library Automation. Lynch, who holds a Ph.D. in Computer Science from the University of California, Berkeley, is an adjunct professor at Berkeley’s School of Information Management and Systems.

Jonathan Band is a Washington-based attorney who helps shape the laws governing intellectual property and the Internet through a combination of legislative and appellate advocacy. He has represented library and technology clients with respect to the drafting of the Digital Millennium Copyright Act (DMCA), database protection legislation, and other statutes relating to copyrights, spam, cybersecurity, and indecency. He received his BA from Harvard College and his JD from Yale Law School. He worked in the Washington, D.C. office of Morrison & Foerster for nearly 20 years before opening his own law firm earlier this year.

Clifford Lynch

one of things that have made conversion of back runs of journals easy is the concentration of copyright in the journal owners, rather than the writers of articles
contrast this with books, where copyrights are much more elusive
strange that the university presses of these same univs. in the google print library project were among the first complainers about the project
there’s a lot more to the availability of out of copyright material than copyright law—for instance, look at the policies of museums, which don’t let you take photographs of their out of copyright paintings
same thing will likely happen with google print
while there has been a lot of press about the dynamic action plan for european digitization, it is probably a plan w/o a budget
important to remember that there has been a string of visionary literature—e.g., H.G. Wells’s “worldbrain”—promoting making the world’s knowledge accessible to everyone—knowledge’s power to make people’s lives better—not a commercial view—this feeling was also there at the beginning of the Internet
legal justifications have been made for policy decisions that are really bad
large scale open access corpora are now showing great value, using data mining applications: see the work of the intelligence community, pharmaceutical industry—will the humanities follow with these large digitization projects
we are entering an era that will give new value to ontologies, gazetteers, etc., to aid in searching large corpora
if google loses this case, search engines might be outlawed [Lawrence Lessig makes this point on his blog too —DC]
because of insane copyright law like sonny bono act there might be a bifurcation of the world into the digitized world of pre-1923 and the copyrighted, gated post-1923 world

Jonathan Band

fair use is at base about economics and morality—thus the cases (authors, publishers) against google are interesting cases in a broad social sense, not just pure law
only 20% of the books being digitized are out of copyright (approx.)
for certain works, like a dictionary, where even a snippet would have an economic impact on the copyright holder, google will probably not make even a snippet available
copyright owners say copyright is opt-in, not opt-out (as Google is making it in their progam)—it seems dumb, but this is a big legal issue for these cases
owners are correct that copyright is normally an opt-in experience—the owner must be contacted first before you make a use of their work, except when it’s fair use—then you don’t need to ask
thus the case will really be about fair use
key precendent: kelly vs. arribasoft: image search, found in favor of the search engine; kelly was a cantankerous photographer of the West who posted his photos on his website but didn’t want them copied by arribasoft (2 years ago; ended in 9th circuit); court found that search engine was a transformative use and useful for the public, even though it’s commercial use; court couldn’t find any negative economic impact on the market for kelly’s work [this case is covered in chapter 7 of Digital History —DC]
google’s case compares very favorably with arribasoft
publishers have weaker case because they are now saying that putting something on the web means that you’re giving an implied license to copy (no implied license for books)—but they’ve argued before that copyright applies just as strongly on the web
bot exclusion headers (robots.txt)—respected by search enginesvbut that sounds like opt-out, not opt-in—so publishers also probably shouldn’t be pointing to that in their case
publishers are also pointing to the google program for publishers, in which publishers allow google to scan their books and then they share in revenues—publishers are saying that the google library program is undermining this market, where publishers license their material; transaction costs of setting up a similar program for library books would be enormous–indeed it can’t be done: google is probably spending $750 million to scan 30 mil. books (at $25/bk); it would probably cost $1000/bk if you had to clear rights for scanning; no one would ever be able to pay for clearing rights like this, so what google is doing is broad and shallow vs. deep but narrow, which is what you could do if you cleared rights—many of these other digitization projects (e.g., Microsoft) are only doing 100K books at most
if google doesn’t succeed at this project, no one else will be able to do it—so if we agree that this book search project is a useful thing, then as a social matter Google should be allowed to do it under fair use
what’s the cost to the authors other than a little loss of control?

November 28, 2005 1 Comment