Re: Central vs. Distributed Archives from Stevan Harnad on 2003-04-16 (American-Scientist-Open-Access-Forum)

From: Stevan Harnad <harnad_at_ecs.soton.ac.uk>
Date: Wed, 16 Apr 2003 21:13:06 +0100

Subject Threads:
         http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/1583.html
         http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/0293.html

> From: [identity removed]
>
> What I wish to emphasize... is the big difference between posting
> one's production on line in one's personal site, and sending it to an
> international server such as ArXiv...

Yes, you are quite right that there is this difference. See:

    "Open Letter to Philip Campbell, Editor, Nature"
    http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/2601.html

in which this point is explicitly discussed. Indeed, this difference
(between central-disciplinary and distributed-institutional
self-archiving) is one of the three reasons I switched my own support
several years ago from central, discipline-based archiving (back) to local,
institution-based archiving (where I had started:
http://www.arl.org/sc/subversive/ ).

My three reasons for switching back were:

(1) OAI-interoperability has made central and distributed self-archiving
interoperable, hence jointly harvestable, searchable and navigable,
hence equivalent.

(2) Researchers and their institutions share a common interest in
maximizing their (shared) research impact (and its rewards), whereas
researchers and their disciplines do not. Institutions are hence in a
position to use "publish or perish" carrots and sticks to encourage
institutional self-archiving. Disciplines cannot (although of course any
disciplinary "culture" of self-archiving can be equally directed toward
central or institutional self-archiving). Hence institutional
self-archiving, once it catches on, can grow far faster than
disciplinary self-archiving.
http://www.ecs.soton.ac.uk/~harnad/Temp/archpolnew.html
http://www.ecs.soton.ac.uk/~harnad/Temp/Ariadne-RAE.htm

(3) Institutional self-archiving is truly *self*-archiving -- by the
author, of his own institutional research output, in his own
institution's research archive. And it is restricted *only* to the
output from researchers of that institution, made openly accessible
purely to maximize its impact. It is hence in a position to benefit from
the growing number of progressive self-archiving policies on the part of
publishers:
http://www.lboro.ac.uk/departments/ls/disresearch/romeo/Romeo%20Publisher%20Policies.htm

In contrast, a central, 3rd-party archive runs the risk of falling under
the (understandable) efforts of the publisher not to let *other*
publishers re-publish the work to which the original publisher has added
the value. (Of course, in the online and interoperable age this is moot
for give-away open-access research, because if something is openly
accessible to one and all on the web, it makes no difference whatsoever
whether it is openly accessible from this website or from that
website!) But central, 3rd-party archives are a psychological deterrent
because being 3rd-party, rather than "self" as the author's institution
is, leaves them -- in principle, but so far of course never in practice
-- open to publishers' claims of 3rd-party copyright infringement by
a rival publisher. The author himself (and hence his own institution)
is immune to this, and hence can be the beneficiary of the retention of
the *self-archiving* right where a 3rd-party, central archive cannot.

Anyway, since all OAI archives are interoperable and equivalent, I see
no reason, at a time when self-archiving is still growing much too
slowly (compared to what would so easily be possible), to retard its
growth in any unnecessary way: Focussing on central discipline-based
archives and self-archiving is no longer necessary. Distributed
institution-based archives and self-archiving achieve the exact same end,
with at least one fewer obstacle (and at least one more incentive).

> Yes, as you say, most publishers allow authors to do the first thing
> [institution-based but not central self-archiving]: the APS, for instance,
> changed its copyright transfer form a few years ago to make this perfectly
> legal. I think that EPS did the same. But sending a document to a more
> general server such as ArXiv is another matter, and this is not permitted
> - at least for the moment (APS does not allow it for instance).

APS does not (yet) allow their *PDF files* to be
self-archived in ArXiv, but it does allow the final, revised
text to be self-archived. So this problem is trivial.
http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/0749.html
http://forms.aps.org/author/copytrnsfr.pdf

What is less trivial (because it is *perceived* by authors as a
deterrent) is publishers' expressed opposition to 3rd-party (i.e.,
central) "self"-archiving. The simple and obvious solution is distributed
institutional self-archiving, linked by the glue of OAI.

> Most private websites are not permanent; experience shows that they are
> often not updated, not stable, and that their URLs sometimes disappear
> after a few years. This is, by the way, why we need a centralized structure
> to ensure long term preservation of all our documents.

Author home websites are not reliable or stable, but distributed
institutional/departmental OAI eprint archives certainly are, and will
be, just as stable as central disciplinary ones (if not more so, for a
university is an enduring physical entity, whereas a discipline is merely
a distributed, "virtual" one!). Moreover, once the institutional eprint
archives grow, OAI harvesters will help provide backups, redundancy,
mirroring, etc. too.
http://www.ecs.soton.ac.uk/~lac/archpol.html

> Now, if we allow documents [to be deposited in {central-archive name
> deleted} even when 3rd-party-archiving is forbidden in the copyright
> transfer agreement], the publisher [might] sue [deleted] for using public
> money for unfair competition with private enterprise, and [deleted]
> will be stopped immediately. Even a mere threat of a lawsuit may be
> sufficient to put us in a difficult situation.

These are the pitfalls of 3rd-party archiving. If an archive is carrying
content other than its own institutional research output then it is (in
principle -- not in practice: look at the Physics ArXiv, unchallenged
for well over a decade now) susceptible to claims of re-publishing a
publisher's copyrighted contents. The solution is simple: Let every
research institution self-archive only its *own* research output, and
leave it to OAI-interoperability to do the (virtual) integration with the
research output of other research institutions. One needless retardant
on self-archiving successfully avoided!
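
As a rough illustration of that (virtual) integration, here is a minimal
sketch in Python. It assumes each institutional archive answers an OAI-PMH
ListRecords request with Dublin Core (oai_dc) metadata; the archive host
names, identifiers and titles are hypothetical, and the responses are
canned strings rather than live HTTP requests. A real harvester would also
have to fetch the responses over HTTP and follow resumption tokens, which
this sketch leaves out.

```python
import xml.etree.ElementTree as ET

# Namespaces used by OAI-PMH 2.0 responses carrying Dublin Core metadata.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def sample_response(archive_host, titles):
    """Build a canned ListRecords-style response (hypothetical data)."""
    records = "".join(
        "<record>"
        f"<header><identifier>oai:{archive_host}:{n}</identifier></header>"
        "<metadata>"
        '<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"'
        ' xmlns:dc="http://purl.org/dc/elements/1.1/">'
        f"<dc:title>{title}</dc:title>"
        "</oai_dc:dc>"
        "</metadata>"
        "</record>"
        for n, title in enumerate(titles, start=1)
    )
    return (
        '<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">'
        f"<ListRecords>{records}</ListRecords>"
        "</OAI-PMH>"
    )


def parse_records(xml_text):
    """Extract (identifier, title) pairs from one archive's response."""
    root = ET.fromstring(xml_text)
    found = []
    for rec in root.iterfind(".//oai:record", NS):
        identifier = rec.findtext(
            "oai:header/oai:identifier", default="", namespaces=NS
        )
        title = rec.findtext(".//dc:title", default="", namespaces=NS)
        found.append((identifier, title))
    return found


def harvest(responses):
    """Merge the records of many archives into one index keyed by identifier."""
    index = {}
    for xml_text in responses:
        for identifier, title in parse_records(xml_text):
            index[identifier] = title
    return index


def search(index, term):
    """Return identifiers whose title contains the term, from any archive."""
    term = term.lower()
    return sorted(i for i, t in index.items() if term in t.lower())


if __name__ == "__main__":
    # Two hypothetical institutional archives, each holding only its own output.
    responses = [
        sample_response("eprints.univ-a.example", ["Quantum widgets", "Lattice notes"]),
        sample_response("eprints.univ-b.example", ["Widgets in cognition"]),
    ]
    index = harvest(responses)
    for identifier in search(index, "widgets"):
        print(identifier, "-", index[identifier])
```

The point of the sketch is only that the searcher queries one merged
index and need not know or care which institution's archive a paper is
actually deposited in.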

> When I started [central archive-name, deleted], many objections came
> from all sorts of directions, sometimes contradictory. Now that I
> have something which exists, I do not want to take risks to see it
> cancelled. This is why I am careful. As an individual, I would take a
> totally different attitude.

If your archive consisted instead of just a network of independent,
individual, modular institutional (indeed *departmental*) OAI eprint
archives, each one reserved for self-archiving by that institute's
researchers alone, no one would be vulnerable, and nothing would be lost.
http://software.eprints.org/

But don't worry too much anyway. All a publisher can do is request
that a given paper or papers be removed, if they have in hand a signed
contract that permits only self-archiving for those papers. The authors
are the ones who archived the paper, so the archive itself is merely
an ISP, like a bulletin-board, bound only to have the designated papers
removed, if legal cause is shown (e.g., they are pornographic). The
rest of the papers are fine (and that will be most of them anyway). And,
judging from ArXiv, no publishers will bother to request removal anyway!

But true institution-based self-archiving will not only avoid this
needless nuisance; it will also accelerate and strengthen
self-archiving.

Stevan Harnad
Received on Wed Apr 16 2003 - 21:13:06 BST
