Open Source – Dan Cohen

A Conversation with Richard Stallman about Open Access

[An email exchange with Richard Stallman, father of free software, copyleft, GNU, and the GPL, reprinted here in redacted form with Stallman’s permission. Stallman tutors me in the important details of open access and I tutor him in the peculiarities of humanities publishing.]

RS: [Your] posting [“Open Access Publishing and Scholarly Values”] doesn’t specify which definition of “open access” you’re arguing for — but that is a fundamental question.

When the Budapest Declaration defined open access, the crucial condition was that users be free to redistribute copies of the articles. That is an ethical imperative in its own right, and a requisite for proper and safe archiving of the work.

People paid more attention to the other condition specified in the Budapest Declaration: that the publication site allow access by anyone. This is a good thing, but need not be explicitly required, because the other condition (freedom to redistribute) will have this as a consequence. Many universities and labs will set up mirror sites, and everyone will thus have access.

More recently, some have started using a modified definition of “open access” which omits the freedom to redistribute. As a result, “open access” is no longer a clear rallying point. I think we should now campaign for “redistributable publication.”

What are your thoughts on this?

DC: I probably should have been clearer in my post that I’m for the maximal access—and distribution—of which you speak. Alas, the situation is actually worse than you imagine, especially in the humanities, where I work, and which is about a decade behind the sciences in open access. Beyond the muddying of the waters through terms like “Green OA” and “Gold OA” is the fact that academic publishing is horribly wrapped up (again, more so in the humanities) with structural problems related to reputation, promotion, and tenure. So my colleagues worry more about truly open publications “counting” vs. publications that are simply open to reading on a commercial publisher’s website. That is why I think the big question is not the licensing or the technology of decentralized publishing, posting and free distribution of papers, etc., but the social realm in which academic publishing sits. I’m working now on pragmatic ways to change that very conservative realm.

Put another way: when software developers write good (open) code, other developers recognize that quality, independent of where the code resides; in humanities publishing, packaging (including the imprimatur of a press, the sense that a work has jumped some (often mythical) peer-review hurdle) counts for too much right now.

RS: [“Green OA” and “Gold OA”] are new to me — can you tell me what they mean?

So my colleagues worry more about truly open publications “counting” vs. publications that are simply open to reading on a commercial publisher’s website.

I don’t understand that sentence.

That is why I think the big question is not the licensing or the technology of decentralized publishing, posting and free distribution of papers, etc., but the social realm in which academic publishing sits.

Ethically speaking, what matters is the license used. That’s what determines whether the publishing is ethical or not. Are you saying that the social realm contains the obstacle to the adoption of ethical publication methods?

Put another way: when software developers write good (open) code, other developers recognize that quality, independent of where the code resides.

Programmers can tell if code is well-written, assuming they are allowed to read it, but how does that relate? Are you saying that in the humanities people often judge work based on where it is published, and have no other way to determine what is good or bad?

DC: Green O[pen] A[ccess] = when a professor deposits her finished article in a university repository after it is published. Theoretically that article will then be available (if people can find the website for the institution’s repository), even if the journal keeps it gated.

Gold OA = ~~when an author pays a journal (often around $1-3K) to make their submission open access~~. when the journal itself (rather than the repository) is open access; may involve the author paying a submission fee. Still probably doesn’t have a redistribution license, but it’s not behind a publisher’s digital gates.

Counting = counting in the academic promotion and tenure process. Much of the problem here is (I believe misplaced) concern about the effect of open access on one’s career.

Are you saying that the social realm contains the obstacle to the adoption of ethical publication methods?

Correct. And much of it has to do with the meekness of academics (especially in the humanities, bastion of liberalism in most other ways) to challenge the system to create a more ethical publication system, one controlled by the community of scholars rather than commercial publishers who profit from our work.

Are you saying that in the humanities people often judge work based on where it is published, and have no other way to determine what is good or bad?

Amazing as it may sound, many academics do indeed judge a work that way, especially in tenure and promotion processes. There are some departments that actually base promotion and tenure on the number of pages published in the top (mostly gated) journals.

RS: [Terms like “Green OA” and “Gold OA” provide] even more reason to reject the term “open access” and demand redistributable publication.

Maybe some leading scholars could be recruited to start a redistributable journal. Their names would make it prestigious.

DC: That’s what PLoS did (http://plos.org) in the sciences. Unclear if the model is replicable in the humanities, but I’m trying.

UPDATE: This was an off-hand conversation with Stallman, and my apologies for the quick (and poor) descriptions of a couple of open access options. But I think the many commenters below who are focusing on the fine differences between kinds of OA are missing the central themes of this conversation.

November 23, 2010 13 Comments

Idealism and Pragmatism in the Free Culture Movement

[A review of Gary Hall’s Digitize This Book! The Politics of New Media, or Why We Need Open Access Now (University of Minnesota Press, 2009). Appeared in the May/June 2009 issue of Museum.]

Beginning in the late 1970s with Richard Stallman’s irritation at being unable to inspect or alter the code of software he was using at MIT, and accelerating with 22-year-old Linus Torvalds’s release of the whimsically named Linux operating system and the rise of the World Wide Web in the early 1990s, with its emphasis on openly available, interlinked documents, the free software and open access movements are among the most important developments of our digital age.

These movements can no longer be considered fringe. Two-thirds of all websites run on open source software, and although many academic resources remain closed behind digital gates, the Directory of Open Access Journals reports that nearly 4,000 publications are available to anyone via the Web, a number that grows rapidly each year. In the United States, the National Institutes of Health mandated recently that all articles produced under an NIH grant—a significant percentage of current medical research—must be available for free online.

But if the movement toward shared digital openness seems like a single groundswell, it masks an underlying tension between pragmatism and idealism. If Stallman was a seer and the intellectual justifier of “free software” (“free” meaning “liberated”), it was Torvalds’s focus on the practical, as well as a less radical name (“open source”), that convinced tech giant IBM to commit billions of dollars to Linux starting in the late 1990s. Similarly, open access efforts like the science article sharing site arXiv.org have flourished because they provide useful services (including narcissistic ones such as establishing scientific precedent) while furthering idealistic goals. Successful movements need both Stallmans and Torvaldses, as uneasily as they may coexist.

Gary Hall’s Digitize This Book! clearly falls more on the idealistic side of today’s open movements than the pragmatic side. Although he acknowledges the importance of practice (and he has practiced open access himself), Hall emphasizes that theory must be primary, since, unlike any particular website or technology, theory contains the full potential of what digitization might bring. He pursues this idealism by drawing from the critical theory (and the critical posture) of cultural studies, one of the most vociferous antagonists to traditional structures in higher education and politics.

Hall’s book is less accessible than others on the topic because of long stretches involving this cultural theory, with some chapters rife with the often opaque language developed by Jacques Derrida and his disciples. Digitize This Book! gets its name, of course, from Abbie Hoffman’s 1971 hippie classic, Steal This Book, which provided practical advice on a variety of uniformly shady (and often illegal) methods for rebelling against The Man. But Digitize This Book! reads less like a Hoffmanesque handbook for the digital age and more like a throw-off-your-chains political manifesto couched in academic lingo.

Those unaccustomed to the lingo and associated theoretical constructions might find the book off-putting, but its impressive intellectual ambition makes Digitize This Book! an important addition to a growing literature on the true significance of digital openness. Hall imagines open access not merely in terms of the goods of universal availability and the greater dissemination of knowledge, but as potentially leading to energetic opposition to the “marketization and managerialization of the university,” that is, the growing tendency of administrations to treat universities as businesses rather than as places of learning and free intellectual exchange, a development that has upset many people, including some well beyond cultural studies departments. Similar worries, of course, cloud cultural heritage institutions such as museums and libraries.

Despite his emphasis on theory, Hall knows that any positive transformation must ultimately come from effective action in addition to advocacy. As Stallman unhappily discovered after starting the Free Software Foundation in 1985 and working for many years on his revolutionary software called GNU, it was Torvalds, a clever tactician and amiable community builder rather than theoretician or firebrand, who helped (along with others of similar disposition) to break open source into the mainstream by finding pathways for his Linux operating system to insinuate itself into institutions and companies that normally might have rejected the mere idea of it out of hand.

Hall does understand this pragmatism, and much to his credit he has real experience with creating open access materials rather than simply thinking about how they might affect the academy. He is a co-founder of the Open Humanities Press, a founder and co-editor of the open access journal Culture Machine, and is director of CSeARCH, an arXiv.org for cultural studies.

Yet Hall sees his efforts as ongoing “experiments,” not the final (digital) word. Indeed, he worries that his compatriots in the open access and open source software movements are congratulating themselves too early, and for accomplishing lesser goals. Yes, open source software has made significant inroads, Hall acknowledges, but it has also been “coopted” by the giants of industry, as the IBM investment shows. (The book would have benefited from a more comprehensive analysis of open source, especially in the Third World, where free software is more radically challenging the IBMs and Microsofts.) Similarly, Hall claims, open access journals are flourishing, but too often these journals merely bring online the structures and strictures of traditional academia.

Here is where Hall’s true radicalism comes to the fore, building toward a conclusion with more expansive aims (and more expansive words, such as “hypercyberdemocracy” and “hyperpolitics”). He believes that open access provides a rare opportunity to completely rethink and remake the university, including its internal and external relationships. Paper journals ratified what and who was important in ways we may not want to replicate online, Hall argues. Even if one disagrees with his (hyper)politics, Hall’s insight that new media forms are often little more than unimaginative digital reproductions of the past, which bring forward old conventions and inequities, seems worthy of consideration.

A wag might note at this point that Digitize This Book! is oddly not itself available as a digital reproduction. (As part of the research for this review, I looked in the shadier parts of the Internet but could not locate a free electronic download of the book, even in the shadows.) Other recent books on the open access movement are available for free online (legally), including James Boyle’s The Public Domain: Enclosing the Commons of the Mind (Yale University Press) and John Willinsky’s The Access Principle: The Case for Open Access to Research and Scholarship (MIT Press). Drawing attention to this disconnect is less a cheap knock against Hall than a recognition that the actualization of open access and its transformative potential are easier said than done.

Assuming things will not change overnight and that few professors, curators, or librarians are ready to move, like Abbie Hoffman, to a commune (though many might applaud the lack of administrators there), the key questions are, How does one take concrete steps toward a system in which open access is the normal mode of publishing? Which structures must be dissolved and which created, and how to convince various stakeholders to make this transition together?

These are the kinds of practical—political—questions that advocates of open access must address. Gary Hall has helpfully provided the academic purveyors of open access much food for thought. Now comes the difficult work of crafting recipes to reach the future he so richly imagines.

May 12, 2009 7 Comments

Omeka Wins $50,000 MATC Award

FAIRFAX, Va., December 8, 2008 — The Center for History and New Media at George Mason University received a $50,000 Mellon Award for Technology Collaboration (MATC) for Omeka, a software project that greatly simplifies and beautifies the online publication of collections and exhibits. The award was given at the Coalition for Networked Information meeting Dec. 8 in Washington, D.C.

MATC awards recognize not-for-profit organizations that are making substantial contributions of their own resources toward the development of open source software and the fostering of collaborative communities to sustain open source development.

Omeka is a free and open source web publishing platform for scholars, librarians, archivists, museum professionals, educators and cultural enthusiasts. Its “five-minute setup” makes launching an online exhibition as easy as launching a blog. Omeka is designed with non-IT specialists in mind, allowing users to focus on content and interpretation rather than programming. It brings Web 2.0 technologies and approaches to academic and cultural web sites to foster user interaction and participation. It makes top-shelf design easy with a simple and flexible templating system. Its robust open-source developer and user communities underwrite Omeka’s stability and sustainability.

“Until now, scholars and cultural heritage professionals looking to publish collections-based research and online exhibitions required either extensive technical skills or considerable funding for outside vendors,” said Tom Scheinfeldt, project co-lead and managing director of CHNM. “By making standards-based, serious online publishing easy, Omeka puts the power and reach of the web in the hands of academics and cultural professionals themselves.”

Scheinfeldt accepted the award from Vinton Cerf, vice president and chief Internet evangelist at Google, who chaired the blue-ribbon prize committee. The committee also included Tim Berners-Lee, creator of the World Wide Web; John Gage, chief researcher and director of the Science Office at Sun Microsystems, Inc.; Mitchell Baker, CEO of the Mozilla Corporation; Tim O’Reilly, founder and CEO of O’Reilly Media; John Seely Brown, former chief scientist at Xerox Corp.; Ira Fuchs, vice president of the Andrew W. Mellon Foundation; and Donald J. Waters, program officer in the Program in Scholarly Communication at the Mellon Foundation.

December 8, 2008 3 Comments

Digital Campus #26 – Free for All

On this episode of the Digital Campus podcast we wrestle with how to keep open access/open source educational resources and tools sustainable for the long run. Mills elaborates on some of his ideas about a “freemium” business model for higher ed, and Tom and I explain the dilemma from the perspective of large academic software projects. We also debate whether laptops are a distraction in the classroom, among other topics in the news roundup and picks of the week.

May 8, 2008

Washington Post on Zotero, Open Academia

It was nice to see the Zotero project covered on the front page of the Washington Post yesterday in the article “Internet Access Is Only Prerequisite For More and More College Classes.” Also nice to see a quotation at the end of the article from yours truly about the movement in higher ed toward open tools and resources.

January 1, 2008

The Strange Dynamics of Technology Adoption and Promotion in Academia

Kudos to Bruce D’Arcus for writing the blog post I’ve been meaning to write for a while. Bruce notes with some amazement the resistance that free and open source projects like Zotero meet when they encounter the institutional buying patterns and tech evangelism that are all too common in academia. The problem here seems to be that the people purchasing the software (often the libraries at colleges and universities for reference managers like EndNote or RefWorks, and the IT departments for course management systems) are not the end users, nor do they have the proper incentives to choose free alternatives.

As Roy Rosenzweig and I noted in Digital History, the exorbitant yearly licensing fee for Blackboard or WebCT (loathed by every professor I know) could be exchanged for an additional assistant professor, or another librarian. But for some reason a certain portion of academic technology purchasers feel they need to buy something for each of these categories (reference managers, CMS), and then, because they have invested time, money, and long-term contracts in those somethings, they feel they need to promote those tools exclusively, without listening to the evolving needs and desires of the people they serve. Nor do they have the incentive to try new technologies or tools.

Any suggestions on how to properly align these needs and incentives? Break out the technology spending in students’ bills (“What, my university is spending that much on Blackboard?”)?

November 5, 2007 18 Comments

Nineteenth-Century Open Source

Near where we’re staying on vacation there is a small but excellent Shaker museum. As a historian who in part studies nineteenth-century religion, I know a bit about the Shakers, one of the more remarkable and unusual revival Christian sects. (Note to those wishing to create a new sect that flourishes: eschew celibacy, even if you do make amazing furniture.) It is easy to think of the Shakers as from another age (or perhaps world), living in massive “families” of 50 to 100 “brothers and sisters” and focusing on the simple life of agriculture and crafts (in addition to very serious and often ecstatic forms of worship). But the museum brings to life the Shakers’ less well-known technological sophistication. They were innovators of the first order, constantly refining the efficiency of their families’ production (the simple lines of Shaker furniture made them easier to clean, important when your dining room seats 100).

What really struck me was their patented technologies. That’s right, the sect occasionally took advantage of U.S. patent law. The Shaker family near us invented a massive, semi-automated washing machine, among other things. And what they did with their patents is most interesting. They patented these machines so that no one would steal the designs, and then they licensed the designs for free to other Shaker communities, which did the same in return with their innovations. Sound familiar?

[Photograph of a Shaker chair by chrisjfry.]

August 2, 2007 4 Comments

2007 Mellon Awards for Technology Collaboration

The Andrew W. Mellon Foundation has launched the nominating process for the second annual Mellon Awards for Technology Collaboration (MATC). The awards, given by tech luminaries such as Tim Berners-Lee and Vint Cerf, honor not-for-profit organizations for leadership in the collaborative development of open source software tools with particular application to higher education and not-for-profit activities.

March 14, 2007

Creating a Blog from Scratch, Part 8: Full Feeds vs. Partial Feeds

One seemingly minor aspect of blogs I failed to consider carefully when I programmed this site was the composition of its feed. (Frankly, I was more concerned with the merely technical question of how to write code that spits out a valid RSS or Atom feed.) Looking at a lot of blogs and their feeds, I just assumed that the standard way of doing it was to put a small part of the full post in the feed—e.g., the first 50 words or the first paragraph—and then let the reader click through to the full post on your site. I noticed that some bloggers put their entire blog in their feed, but as a new blogger—one who had just spent a lot of time redesigning his old website to accommodate a blog—I couldn’t figure out why one would want to do that since it rendered your site irrelevant. It may seem minor, but a year later I’ve realized that there is, in part, a philosophical difference between a full and partial feed. Choosing which type of feed you are going to use means making a choice about the nature of your blog—and, surprisingly, the nature of your ego too. Subscribers to this blog’s feed have probably noticed that as of my last post I’ve switched from a partial feed to a full feed, so you already know the outcome of the debate I’ve had in my head about this distinction, but let me explain my reasoning and the advantages and disadvantages of full and partial feeds.
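To make the distinction concrete, here is a minimal sketch of the choice in Python (not the code that runs this site). The function names, the 50-word cutoff, and the example URL are all assumptions for illustration; the only difference between the two kinds of feed is what goes into each item's description.

```python
import xml.etree.ElementTree as ET


def summarize(text, max_words=50):
    """Return the first max_words words of text, marking any truncation."""
    words = text.split()
    if len(words) <= max_words:
        return text
    return " ".join(words[:max_words]) + " ..."


def build_item(title, link, body, full=True, max_words=50):
    """Build one RSS <item>; the description is the whole post or an excerpt."""
    item = ET.Element("item")
    ET.SubElement(item, "title").text = title
    ET.SubElement(item, "link").text = link
    description = body if full else summarize(body, max_words)
    ET.SubElement(item, "description").text = description
    return item


if __name__ == "__main__":
    # Hypothetical post used only to show the two outputs side by side.
    post = " ".join(["word"] * 120)
    for full in (False, True):
        item = build_item("Example post", "http://example.org/post", post, full=full)
        print(ET.tostring(item, encoding="unicode")[:80])
```

With full=False the reader has to click through to the site for everything past the excerpt; with full=True the feed carries the entire post.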

Putting the entire content of your blog into your feed has many practical advantages. Most obviously, it saves your readers the extra step of clicking on a link in their feed reader to view your full post. They can read your blog offline as well as online, and more easily access it on a non-computer device like a cell phone. Machine audiences can also take advantage of the full feed, searching it for keywords desired by other machines or people. For instance, most blog search engines allow you to set up feeds for posts from any blogger that contain certain words or phrases.
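The point about machine audiences can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how any particular blog search engine works: the function name and the two tiny sample feeds are made up, and a keyword that appears late in a post is only findable when the feed carries the full text.

```python
import xml.etree.ElementTree as ET


def items_matching(rss_xml, keyword):
    """Return titles of feed items whose description mentions keyword."""
    root = ET.fromstring(rss_xml)
    needle = keyword.lower()
    titles = []
    for item in root.iter("item"):
        description = item.findtext("description", default="")
        if needle in description.lower():
            titles.append(item.findtext("title", default=""))
    return titles


# Hypothetical feeds: the same post, once in full and once as an excerpt.
FULL_FEED = """<rss version="2.0"><channel><title>Example</title>
<item><title>Post one</title>
<description>Opening sentence. Much later the post discusses Zotero.</description>
</item></channel></rss>"""

PARTIAL_FEED = """<rss version="2.0"><channel><title>Example</title>
<item><title>Post one</title>
<description>Opening sentence.</description>
</item></channel></rss>"""

if __name__ == "__main__":
    print(items_matching(FULL_FEED, "zotero"))
    print(items_matching(PARTIAL_FEED, "zotero"))
```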

More important, providing a full feed conforms better with a philosophy I’ve tried to promote in this space, one of open access and the sharing of knowledge. A full feed allows for the easy redistribution of your writing and the combination of your posts with others on similar topics from other bloggers. A full feed is closer to “open source” than a feed that is tied to a particular site. For this reason, until the advent of in-feed advertising, most professional bloggers had partial feeds so readers would have to view advertising next to the full text of a post.

Even from the perspective of a non-commercial blogger—or more precisely the perspective of that blogger’s ego—full feeds can be slightly problematic. A liberated, full feed is less identifiably from you. As literary theorists know well, reading environments have a significant impact on the reception of a text. A full feed means that most of your blog’s audience will be reading it without the visual context of your site (its branding, in ad-speak), instead looking at the text in the homogenized reading environment of a feed reader. I’ve just switched from NetNewsWire to Google Reader to browse other blogs, and I especially like the way that Google’s feed reader provides a seamless stream of blog posts, one after the other, on a scrolling web page. I’m able to scan the many blogs I read quickly and easily. That reading style and context, however, makes me much less aware of specific authors. It makes the academic blogosphere seem like a stream of posts by a collective consciousness. Perhaps that’s fine from an information consumption standpoint, but it’s not so wonderful if you believe that individual voices and perspectives matter a great deal. Of course, some writers cut through the clutter and make me aware of their distinctive style and thoughts, but most don’t.

At the Center for History and New Media, we’ve been thinking a lot about the blog as a medium for academic conversation and publication—and even promotion and tenure—and the homogenized feed reader environment is a bit unsettling. Yes, it can be called academic narcissism, but maintaining authorial voice and also being able to measure the influence of individual voices is important to the future of academic blogging.

I’ve already mentioned in this space that I would like to submit this blog as part of my tenure package, for my own good, of course, but also to make a statement that blogs can and should be a part of the tenure review process and academic publication in general. But tenure committees, which generally focus on peer-reviewed writing, will need to see some proof of a blog’s use and impact. Right now the best I can do is to provide some basic stats about the readership of this blog, such as subscriptions to the feed.

But with a full feed, you can slowly lose track of your audience. Providing your entire posts in the feed allows anyone to resyndicate it, aggregate it, mash it up, or simply copy it. I must admit, I am a little leery of this possibility. To be sure, there are great uses for aggregation and resyndication. This blog is resyndicated on a site dedicated to the future of the academic cyberinfrastructure, and I’m honored that someone thought to include this modest blog among so many terrific blogs charting the frontiers of libraries, technology, and research. On the other hand, even before I started this blog I had experiences where content from my site appeared somewhere else for less virtuous reasons. I don’t have time to tell the full story here, but in 2005 an unscrupulous web developer used text from my website and a small trick called a “302 redirect” to boost the Google rankings of one of his clients. It was more amusing than infuriating: for a while a dentist in Arkansas had my bio instead of his. More seriously, millions of spam blogs scrape content from legitimate blogs, a process made much easier if you provide a full feed. And there are dozens of feed aggregators that will create a website from other people’s content without their permission. Regardless of the purpose, above board or below, I have no way of knowing about readers or subscribers to my blog when it appears in these other contexts.

But these concerns do not outweigh the spirit and practical advantages of a full feed. So enjoy the new feed—unless you’re that Arkansas dentist.

Part 9: The Conclusion

January 11, 2007