Re: Call for Commentary: http://www.text-e.org/debats/

From: Stevan Harnad <harnad_at_cogprints.soton.ac.uk>
Date: Thu, 22 Nov 2001 04:39:57 +0000

Problems, Solutions, and Non-Problems http://www.text-e.org/debats/

Stevan Harnad

SMITH: "The problem has existed since the 1970s, when journal prices
began to escalate... I do wonder if the change can take place
completely within the existing journal publication infrastructure."

The problem we are addressing directly is a much, much more important
one than the library serials budget crisis. It is that we have reached
a technical and practical stage at which it is at last possible to remove
all access barriers to refereed research on-line (by self-archiving
it).

A side-effect of immediately doing what is within researchers' power
to do -- which is to free their own refereed research by
self-archiving it online in their own institutional Eprint Archives
immediately -- will be an eventual easing of the library serials
budget crisis.

But please, let the (library-budget-problem) tail not try to wag the
(research-access-problem) dog at this point, particularly when the dog
is itself still in a confused tailspin, confusing its tail with its
elbow!

What Lorre Smith is contemplating are older scenarios for bringing
eventual relief for library budgets (change journals' policies, find
alternative journals or alternatives to journals). What I am proposing
is a new strategy http://www.arl.org/sc/subversive/ for
bringing immediate relief to research and researchers -- relief from a
problem they hardly realized they had, not yet having worked out in
their minds the implications of the old Gutenberg toll-based
access/impact blockage, nor the (no-longer-so-new) PostGutenberg
capability of putting an end to it at last online (through
self-archiving in OAI-compliant, hence interoperable, Eprint
Archives).

Nor is my strategy so original or new: The authors of 150,000 physics
papers have been successfully (and instinctively) deploying it for 10
years http://arxiv.org and the authors of over 1000 cognitive science
papers for 3 http://cogprints.soton.ac.uk. The authors of over 500,000
papers in computer science have likewise (but less successfully,
because not OAI-compliantly) been deploying a similar strategy thanks
to http://citeseer.nj.nec.com/cs and so have (still less successfully,
because likewise not OAI-compliantly, and without the harvesting help
of NEC) many other researchers in many other disciplines on their own
institutional websites.

The proposal now is merely to standardize, systematize and
universalize this practice for all the annual papers in all 20,000
refereed journals worldwide by creating and filling OAI-compliant
institutional Eprint Archives (with the help of the free
http://www.eprints.org archive-creating software, adaptable to all
languages, disciplines, and institutions).

None of this calls for any change in journal policy. It is Zeno's
Paralysis of the worst kind to sit worrying about possible untoward
consequences when at least a million papers have already been
self-archived with no untoward consequences whatsoever, only positive
ones:

Lawrence, S. (2001) Online or Invisible? Nature 411 (6837): 521.
http://www.neci.nec.com/~lawrence/papers/online-nature01/

SMITH: "I'm not talking about reforming the system to the extent that
it destroys the essence of peer review, but reforming the format, so
to speak, so that peer review can take place in a variety of
technologies."

Peer review is already being implemented online by most active
journals (see, for example, http://www.bbsonline.org), so that is old
news. But again, apart from untested peer review reform, exactly what
is being contemplated here?

A refereed journal, apart from providing an obsolescent PRODUCT, the
text (be it on-paper or on-line), is also the provider of an essential
SERVICE: peer review, and its certification (by the journal name).

Now this service is medium-independent (and, as I said, it is in any
case being much more efficiently implemented on-line these days, by
unreconstructed journals); so, again, what is the specific point
here?

But whoever is the provider of that service, and however they
implement it (on-paper or on-line), that service-provider is a
refereed journal!

There are 20,000 established service-providers of this sort at the
moment. Most of them also happen to sell a product, the text (on-paper
and on-line) and the access-tolls for that product are used in part to
finance the essential service, the refereeing.

Now Lorre, please spell out what changes you have in mind: Persuading
the 20,000 service providers to lower their tolls? Persuading them to
give away their texts on-line? Creating competing service-providers
that will do it in their place? Please be specific.

SMITH: "It is my opinion that quality control truly is the remaining
difficulty, or unknown. Researchers must trust that aspect of the
system before any change in paradigm will take place and the
literature, as you say, is "freed"."

I don't understand at all. 20,000 journals are practising that quality
control right now, and certifying it with their journal names. That
is the literature we are talking about freeing online access to here.
Now, what unknown are you referring to?

SMITH: "researchers must be prepared to deal with publisher refusal to
play along in the way that you describe."

You mean refusing to change their copyright transfer policy? But that
has already been discussed explicitly in the paper under discussion.
Please say specifically what the "refusal to play along" problem is
that you have in mind, in light of the specific strategy that has
already been proposed:

6. HOW TO GET AROUND RESTRICTIVE COPYRIGHT LEGALLY
http://www.ecs.soton.ac.uk/~harnad/Tp/resolution.htm#Harnad/Oppenheim

SMITH: "That copyright policy is no small part of your whole scheme."

Indeed. But it is for you now to show how and why it is inadequate --
particularly as the authors of 150,000 papers have been practising it
successfully for 10 years.

See: THE LOS ALAMOS LEMMA: "If you think you know an alleged obstacle
to public self-archiving -- let us call the obstacle "X" [X could be
copyright, preservation, plagiarism, whatever], an obstacle that must
allegedly be overcome before we can self-archive, and yet X did NOT
stop Los Alamos, then X is not an obstacle to public self-archiving."
http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/0471.html

SMITH: "journal publishers do currently also provide indexing and
abstracting services"

Yes, some do (mostly this is done by so-called secondary publishers,
but never mind). So what?

We are talking about freeing access to the full-text contents of the
20,000. You are now talking about indexing and abstracting. Fine, we
can talk about that too: The Eprint Archives also happen to provide
some basic indexing and abstracting services (q.v.
http://cogprints.soton.ac.uk/perl/search).

But, ready for deployment and development as the many Eprint Archives
fill are new generations of Open Archives Service Providers
http://www.openarchives.org/service/listproviders.html such as
http://cite-base.ecs.soton.ac.uk/help/index.php3 that are ready to
provide all sorts of new and powerful navigational capabilities -- all
they are waiting for is the full-text literature to apply them to!

Meanwhile, you seem to be recommending that the online-archiving of
that full-text literature should be waiting... for what?

And why? The for-fee full-text literature already exists, on-paper and
on-line, for those who can afford it. So do the indexing and
abstracting services. We are talking about SUPPLEMENTING all these
toll-based goodies with for-free ones, by self-archiving the full
texts. This is not even a SUBSTITUTE, yet, but merely a supplement.

Now exactly what is the force of the indexing/abstracting worry with
regard to the desirability, feasibility, or optimality of
self-archiving?

I ask this to reduce the possibility that we may be talking at
cross-purposes, as you focus on easing the library budget crisis and I
focus on freeing online access to the refereed literature.

SMITH: "editors can influence the publisher role in access to the
significant body of literature of the past, both recent and distant."

Yes, perhaps. But the immediate problem is the immediate literature,
the present, not the past.

Besides, the past will probably take care of itself, as not much
revenue is at stake there for journal publishers. Indeed, some are
already announcing new policies of freeing on-line access to their
full-text contents 6-12 months after publication:

NEJM's NEW WEBSITE AND NEW POLICY
http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/1334.html

But for research and researchers, this is all too little and too late:
Researchers do not give away their research reports for PUBLICation in
order to have them embargoed behind financial firewalls for 6-12
months:

Harnad, S. (2001) AAAS's Response: Too Little, Too Late. Science
dEbates [online] 2 April 2001.
http://www.sciencemag.org/cgi/eletters/291/5512/2318b Fuller version:
http://www.ecs.soton.ac.uk/~harnad/Tp/science2.htm

SMITH: "By giving up the distribution rights to publishers, and
standing by for all these decades while publishers have essentially
broken the system of scholarly communication (by price increases),
researchers have got themselves into a pickle. How can researchers
have influence on the Post-Gutenberg Anomaly? [by] expanding our
responsibilities and awareness not only in self-archiving, but in our
editing roles, and in the way our societies handle the publication of
our journals"

I regret that I cannot follow this. The journal pricing/budget problem
started in the Gutenberg era, and if we were still in the Gutenberg
era, I think our prospects of getting out of the "pickle" would have
been quite slim.

But we are in the PostGutenberg era -- and we are not talking about
the journal pricing/budget problem! We are talking about eliminating
obsolete and unnecessary access barriers by self-archiving all
refereed research online.

Now it seems to me that self-archiving is all that's needed there.
Exactly what (apart from updating copyright transfer policy, which
would be a convenience, but not a necessity) is it about the "handling
of publication" that you would like researchers to exert their
influence to change, and why?

SMITH: "a careful approach to inflicting your proposal on publishers
may be in order. How many prestigious journals can disappear before a
given field is in disarray? After that point, there is nowhere for
anyone to publish. Shouldn't researchers be prepared even a little bit
for this possibility?"

I am not trying to inflict anything on publishers! I am trying to wake
up researchers to what is in the best interest of their own research,
and how to go about making it happen.

You have a hypothesis here. You think self-archiving will make
journals disappear and cast fields into disarray. What evidence do you
have for that worry? Papers have been self-archived in physics for 10
years and the journals seem to be doing fine.

You are perhaps worrying about what journal publishers will eventually
do if and when the literature is free online and if and when
user-preference for the free versions causes cancellations and revenue
shrinkage? This was discussed in the paper under discussion too. See:

4.2 HYPOTHETICAL SEQUEL:
http://www.publications.parliament.uk/pa/cm200304/cmselect/cmsctech/399/399we152.htm

This worry, by the way, is #17 of the 22 Prima-Facie FAQs for Overcoming
Zeno's Paralysis:

17. PUBLISHERS' FUTURE
http://www.ecs.soton.ac.uk/~harnad/Tp/resolution.htm#17.Publishers

"I worry about self-archiving because of what it might do to journal
publishers' future."

See the replies about PAYING THE PIPER (8
http://www.ecs.soton.ac.uk/~harnad/Tp/resolution.htm#8), DOWNSIZING
(9
http://www.ecs.soton.ac.uk/~harnad/Tp/resolution.htm#9.Downsizing),
and CAPITALISM (14
http://www.ecs.soton.ac.uk/~harnad/Tp/resolution.htm#14.Capitalism).

Those journal publishers who are willing and able to scale down to
their new PostGutenberg niche can do so. New online-only journal
publishers are ready to take over the titles in the cases where they
are not. The remaining QC/C (quality-control and certification) service
costs per submitted paper can be paid for by the author-institution out
of 10% of its annual 100% S/L/P (subscription, site-license,
pay-per-view) savings. And refereed journal publication is only a small
portion of
publication, most of the rest of which, being non-give-away, will
proceed on-line much the way it does on-paper.

There is no rational deterrent to immediate self-archiving in worries
about publishers' future.

SMITH: "I'm not so sure, as you can see, that the existing
publications will hold together under the pressure of change."

Maybe they will, maybe they won't. If they don't, their editorial
boards and titles can migrate to publishers that will (i.e.,
publishers ready to fill the downsized niche of being peer-review and
certification service-providers). http://www.biomedcentral.com/

But at the moment, the worry is how to get researchers to put content
up there (as the physicists are doing).

SMITH: "To convince publishers to continue titles with highly eroded
profit value may be utterly impossible."

Fine. It will be then, if/when they no longer want to hold onto the
titles, that new publishers can take them over. Not now, when there is
no reason for the editorial boards or authors to switch. Now is the
time to free the literature, through self-archiving.

SMITH: "I still contend that you underplay the costs of OAI-compliant
archives... Let's add in a few more costs: consistent and effective
indexing, creating metadata, network maintenance, search engine
maintenance and development and hardware. Let's also add in the costs
of archiving data for more than a few years. Hardware must be
upgraded. New technologies emerge and data and software must be
migrated. For a single document real costs can be as large as $10.00
US per year."

These costs are either nonexistent, irrelevant, or already borne by
other existing online services at every university. (Archiving a
university's annual refereed research paper output is a mere flea on
the tail of its online dog.) The direct costs of the archiving itself
are negligible, especially as they are also an investment in
reciprocal access, enhanced impact, and perhaps even eventual library
subscription savings!

Regarding the preservation worry, see:

http://www.ecs.soton.ac.uk/~harnad/Tp/resolution.htm#1.Preservation

SMITH: "It's very easy to think of the individual researcher, but
self-archiving sits upon a very large infrastructure that incurs very
high costs."

Most of it already fully resourced, so the added flea carries
virtually no marginal weight.

SMITH: "Intellectual access to individual researcher documents is not
a simple matter."

But it would be, if researchers simply self-archived it all.

SMITH: "OAI-compliant metadata may guarantee a certain level of rough
retrieval, but how long are researchers really going to put up with
the retrieval of thousands of irrelevant documents for any given
search of the networked archives? Intellectual access questions are
very difficult to resolve when speaking of archives that contain
millions upon millions of items. An OAI architecture forms only the
most rough basis for retrieval. Any effective engine for relevance
involves significant labor cost, either in automated index functions
or human labor."

The very same kinds of search tools that make Medline, or
Web-of-Science, or Inspec work, will work even better, and spawn even
better navigation tools, once the corpus of 20,000 is online and
freely accessible.

Try:

http://cite-base.ecs.soton.ac.uk/cgi-bin/search or
http://arc.cs.odu.edu/ or http://opcit.eprints.org

And imagine if it really did range across the millions of items that
Medline or Inspec range over! Even an OAI-specific google is
feasible:

An OAI Gateway Service for Web Crawlers:
http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/1666.html
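
To make the retrieval point concrete, here is a minimal sketch (not the
code behind any of the services named above) of how a service provider
might turn harvested OAI metadata into a searchable index. It assumes
the XML layout and namespaces of OAI-PMH 2.0 with unqualified Dublin
Core; the sample records, the example.org identifiers, and the function
names are all invented for illustration.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespaces assumed from OAI-PMH 2.0 and unqualified Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Invented sample of what a ListRecords response could look like.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>A hypothetical paper on symbol grounding</dc:title>
          <dc:creator>Author, A.</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
    <record>
      <header><identifier>oai:example.org:2</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>A hypothetical paper on citation linking</dc:title>
          <dc:creator>Author, B.</dc:creator>
          <dc:creator>Author, C.</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
"""


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build a ListRecords request URL for an archive's OAI base URL."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


def parse_records(xml_text):
    """Extract identifier, title and creators from each harvested record."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iterfind(".//oai:record", NS):
        records.append({
            "identifier": rec.findtext(
                "oai:header/oai:identifier", default="", namespaces=NS),
            "title": rec.findtext(".//dc:title", default="", namespaces=NS),
            "creators": [c.text or "" for c in rec.iterfind(".//dc:creator", NS)],
        })
    return records


def search(records, term):
    """Case-insensitive title search over the harvested metadata."""
    term = term.lower()
    return [r for r in records if term in r["title"].lower()]


if __name__ == "__main__":
    harvested = parse_records(SAMPLE_RESPONSE)
    for hit in search(harvested, "citation"):
        print(hit["identifier"], hit["title"])
```

A title match is of course only the roughest kind of retrieval; the
sketch is meant to show only that records harvested from many archives
land in one uniform structure, on which citation linking or relevance
ranking could then be layered.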

SMITH: "I applaud your call for action, but commend you to consider
further the retrieval problems of self-archiving."

I appreciate the applause but would be grateful now if you would
consider the solutions I have proposed (as well as the suggestions
that some of these problems may be non-problems), both in this reply
and in the paper under discussion.

Stevan Harnad, Wednesday 21 November, 23:24 (Paris time)
http://www.text-e.org/debats/
Received on Thu Nov 22 2001 - 04:40:29 GMT