Thursday, November 6, 2008

Achieving Community Goals in Our Decentralized Environment

Roger Schonfeld of Ithaka spoke about the American academic community, and what might be called impediments to achieving common goals. The American system of higher education is highly decentralized. There are private and state universities, there are various consortia, but there is no federal, top-down oversight, as there is in many other countries. Nonetheless, the academic community shares several goals:

- improve access to higher education
- maximize impact of research output
- preserve information necessary for scholarship

Improve access to higher education:

Traditionally, this meant expanding access to higher education through financial aid.

Digitally, this means distribution of educational materials more broadly via the Internet, reaching new communities via OpenCourseWare, Berkeley iTunes, and related initiatives.

In India, several sci/tech universities are collaborating to share a single curriculum online: the best economics 101, the best geology 101, etc. The goal is to make the best educational opportunities available online, and the effort is funded by the Indian government.

Maximize impact of research output:

Traditionally, this meant document sharing through ILL and other means.

Digitally, this means increasing accessibility by using low pricing or open publishing platforms.

Ithaka surveyed faculty about what matters to them when choosing where to publish their research output. The factors, in order of preference:

- the journal must be widely read in their field
- no cost to authors to publish
- preservation of work is assured
- the journal is highly selective
- accessible in the developing world
- available for free

Making their research freely available is the least important factor.

Preserve information necessary for scholarship:

Traditionally, this meant research material was purchased, retained, and stored.

Digitally, this means licensing e-collections, participation in digital preservation, and hoping that print collections are retained somewhere (else).

Preservation of print: how many copies do we need? Ithaka's research suggests:

- 22 light copies, which serve as use copies (light, i.e., verified at the volume level)
- 6 dark copies (dark, i.e., verified at the page level; JSTOR is doing this)

If we have somewhere between 6 and 22 copies, we can reliably say we are preserving print.

Common Theme: in a decentralized environment, such as the US, the incentives to achieve community goals don't always line up with the realities of higher education.

To improve access to higher education:

- community-wide course dissemination
- significant central funding (impossible in the US)

To maximize the impact of research output:

- enforceable mandates seem required to counter the pervasive, competing incentives faculty face in their publishing strategies

To preserve scholarship for future research:

LOCKSS and Portico participation is voluntary. They represent an effort to develop new social norms around preservation. Social norms have yet to trump the "free-rider" problem (signing on to LOCKSS or Portico can be viewed as being a good library citizen). Will new taxes (on scholarship) or central funding be necessary to preserve digital output?

Know Logo: Brand, Trust, and Developing the Epistemic Infrastructure of Scholarly Communication

Geoff Bilder, who is the director of strategic initiatives for CrossRef, gave a great talk on logos and branding as a way to communicate many aspects of a product (including scholarly communication). I'm afraid my notes won't do this session justice.

We have the Internet Trust Problem: phishing, spam, urban myths, dodgy content. Using a logo or branding system could signify quality or indicate that the content is from a trustworthy site.

We already have a system of signifiers: websites that have a URL with a tilde (~) in them are automatically suspect, as they signify that it's some individual's website, not necessarily backed by a trusted institution.

The Internet Anti-Trust Pattern (how sites evolve from "good" to "bad"):

A digital community starts with a self-selected group of core specialists. The system is touted as "authority-less" and non-hierarchical (but that's not really true). The unwashed begin using the system, and it nearly breaks down under the strain of all those untrustworthy users. A regulatory system is put in place (e.g., a moderator). The system is again touted as "authority-less" and non-hierarchical.

eBay developed a trust metric through its feedback rating system. Amazon developed its review system. Slashdot developed a system of voting on the quality of feedback and comments. More votes indicate a more trusted and useful post.

Google's trust metric is link counting. But this is flawed (in the same way that citation counting is flawed) because links often deliberately lead to "bad" or poor-quality sites.
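To make the flaw concrete, here is a toy sketch in Python. It is only an illustration of naive link counting as described in the talk, not Google's actual algorithm, and the page names and "intent" labels are made up for the example. A link written to warn readers away from a site still counts as a vote for it.

```python
from collections import Counter

# Hypothetical link graph: (source page, target page, why the author linked).
# All names and intents here are invented for illustration.
links = [
    ("blog-a", "good-journal", "recommends"),
    ("blog-b", "good-journal", "recommends"),
    ("blog-a", "dodgy-site", "warns against"),
    ("blog-b", "dodgy-site", "warns against"),
    ("blog-c", "dodgy-site", "warns against"),
]


def inbound_link_counts(links):
    """Count inbound links per target page, ignoring the linker's intent."""
    return Counter(target for _source, target, _intent in links)


counts = inbound_link_counts(links)

# Rank pages from most-linked to least-linked.
ranking = [page for page, _n in counts.most_common()]
print(ranking)
```

The site everyone is warning against ends up ranked first, because the count has no way to tell a recommendation from a criticism. Citation counts have the same blind spot.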

In scholarly communication, we have a paucity of heuristics. "We have to figure out ways to help people NOT have to read." Reading takes time, and no one has it. If we had symbols to indicate "peer reviewed," or "this is a quality website," for example, we could save the reader's time.

The logo of Penguin Press tells you something about the quality of the content. The journals Nature, Cell, and PNAS convey the same message: this publication, by dint of reputation, contains good stuff. We need to extend this system to websites and other digital objects.

Bilder suggested something he called the "crossmark," which would signify that information about an article was available. A user who clicked on the crossmark logo could see whether the paper was peer reviewed, who funded the research, whether there are any retractions, whether the paper cites retracted work, etc. A widely known logo could be used to convey a wealth of information about an article and save the reader's time.

Pat Schroeder - News from the Publishing World

The most interesting part of Pat Schroeder's talk was about the Google settlement. The case was many years in the making, and the settlement is complex, with ramifications for publishers, authors, Google, and libraries. Schroeder believes this case will be studied in law schools for years to come.

It's a win-win for all constituencies, she believes. The settlement opens up the content of about 7 million books that are still under copyright but not commercially available (1920s forward). Most books in this category are considered out of print, so copyright has reverted to the author.

Under this agreement, authors get $60 per book digitized. They can choose to take the $60 and opt out; if they do so, their book will be digitized but cannot be displayed to users. Or, authors can choose to stay in, which means they will allow their titles to be viewed, printed, and sold. Also, their titles may be sold via subscriptions, and advertising may be sold and placed around the book (not within it). The expectation is that most rights holders will leave their content in and make it accessible.

Public access license: one license for every public library in the U.S., and one license for every college and university. The license allows the printing of books, and licensees must charge per-page fees. Google will facilitate the money handling. (The mechanics clearly have to be worked out.) [Not clear about the "one license" concept: what will it mean for universities? What does it mean that a public library has "one license"?]

A registry will be established to manage fees. Google will absorb the costs of establishing the registry. The registry will hire an executive director, and be run by a board of four authors and four publisher representatives.

Important point: The digitized books will NOT include photographs or illustrations, as their copyright owners were not part of this class action suit.

They're still working on provisions for orphan works. If it is determined that there is no rights holder, then Google will pay $60 into the registry, and the title will be digitized. If no one claims the rights within five years, the registry will "own" the rights.

The settlement applies only to the United States; digitized works can only be displayed within the States. A title by a British author (who opts in) held by a US library cannot be displayed in the UK.

A "fairness" hearing during the summer of 2009 will finalize the settlement.