Re: Central vs. Distributed Archives

From: Stevan Harnad <harnad_at_coglit.ecs.soton.ac.uk>
Date: Fri, 3 Nov 2000 08:24:44 +0000

On Thu, 2 Nov 2000, Greg Kuperberg wrote:

> We have had much more success by moving in the opposite direction,
> i.e., by strengthening distributed open archival with a centralized
> foundation.

And continued good success to the math arXiv project!

But why restrict efforts to centralized ones only? The whole point of
OAI interoperability is that it should no longer make any difference
whether a refereed paper is archived in a central archive or a
distributed archive or both! (The only alternative we want to avoid is
"neither"!)

By way of example of how it no longer makes any difference, CogPrints
<http://cogprints.soton.ac.uk> is a centralized archive for
cognitive science -- but it is using EXACTLY the same OAI-compliant
Eprints architecture that has been developed for distributed,
institution-based archiving by http://www.eprints.org. In fact, the
OAI-compliant Eprints software was DERIVED from the prior centralized
CogPrints software!

And institutions are institutions, whether they mount centralized
archives or institutional archives.

And mirroring and harvesting for reliability and permanence are
available to both.

So why keep repeating that centralized archiving accelerated math
archiving more than the prior (pre-OAI) distributed archiving did?
True, but things didn't stop there. And linear growth is still linear
growth, whereas what we need is exponential growth, across all
disciplines, if we are to reach the optimal and inevitable before we
expire!

So let 1000 flowers bloom, central and distributed. Interoperability
will harvest them all.

> The MPRESS project (http://mathnet.preprints.org/)
> has a lot in common with OAI, and it was started before the universal
> math arXiv. It has its own metadata standard, "Dublin Core", and it
> has a number of institutional preprint series among its data feeds.
> But it hasn't yet caught on.

Maybe that was because it was going it alone, instead of distributing
its efforts across disciplines, as the Open Archives Initiative is
doing. It's one thing to adopt a standard, quite another to get others
to adopt it too.

(This is why your advocacy of centralized archiving and anti-advocacy
of distributed archiving is divisive and counterproductive: We should
be supporting every effort that gets all the refereed literature up
there, online, accessible, searchable, navigable, and free for all.
Centralized archiving has not managed this alone, so let it now benefit
from the help of Distributed Archiving!)

> It doesn't seem to make much difference to
> authors whether a preprint series is indexed by MPRESS or not.

I don't understand this point. It may be another symptom of the
conflation between publishing and archiving, and between preprints and
postprints: What authors are choosing when they PUBLISH a paper is a
journal, i.e., a quality-certifier with a known level of quality, a
trusted "brand." What authors are choosing when they ARCHIVE their
eprint -- whether the journal-certified, refereed POSTprint or the
unrefereed PREprint -- is a means of making their paper maximally
visible and accessible online, for free for all. OAI-interoperability
provides that, provided the metadata-protocol is shared by all
archives, irrespective of whether they are centralized or
institutional.

MPRESS apparently did not become such a universal (we might even call it
"distributed") standard. Perhaps this was in part because it did not
initially adopt OAI's strategy of minimalism: Pick the minimal
functional metadata set, to maximize the ease of compliance, rather than
going all the way to Dublin Core from the outset. (OAI is inching
towards Dublin Core too, but thanks to minimalism and proselytising
across disciplines, it may manage to bring everyone else along with
it.)
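
A rough sketch of what a minimal shared metadata set buys (this is an
illustration only, not the actual OAI protocol or its metadata set;
the field names, archive names, and records below are all
hypothetical): each archive, central or institutional, maps its own
native fields onto a few shared ones, and a harvester can then index
them all the same way.

```python
# Hypothetical minimal shared fields; NOT the actual OAI metadata set.
SHARED_FIELDS = ("identifier", "title", "author", "date")


def to_shared(record, field_map):
    """Map one archive's native record onto the shared minimal fields."""
    return {shared: record[native] for shared, native in field_map.items()}


def harvest(archives):
    """Merge records from every archive into one index keyed by identifier."""
    index = {}
    for name, (records, field_map) in archives.items():
        for record in records:
            shared = to_shared(record, field_map)
            missing = [f for f in SHARED_FIELDS if f not in shared]
            if missing:
                raise ValueError(f"{name}: missing shared fields {missing}")
            # A paper archived in more than one place is indexed once,
            # with every archive that holds it listed under "sources".
            entry = index.setdefault(
                shared["identifier"], {**shared, "sources": []}
            )
            entry["sources"].append(name)
    return index


# Made-up example data: one central and one institutional archive.
central = (
    [{"id": "paper-1", "title": "On X", "authors": "A. Author",
      "submitted": "2000-11-01"}],
    {"identifier": "id", "title": "title", "author": "authors",
     "date": "submitted"},
)
institutional = (
    [{"handle": "paper-1", "name": "On X", "creator": "A. Author",
      "year": "2000"},
     {"handle": "paper-2", "name": "On Y", "creator": "B. Author",
      "year": "2000"}],
    {"identifier": "handle", "title": "name", "author": "creator",
     "date": "year"},
)

index = harvest({"central": central, "institutional": institutional})
```

The smaller the shared set, the less each archive has to supply in its
field map, which is the ease-of-compliance point made above; the
trade-off is that richer native metadata is not carried across.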

> Part of
> the trouble with MPRESS is that not all of its sources are providing
> as good metadata as they promised. Ironically the lion's share of good
> metadata in MPRESS comes from the math arXiv.
>
> I would like to know where OAI thinks that MPRESS went wrong. In fact
> since I maintain a "service provider" for the math arXiv, I looked into
> using OA-compliant metadata instead of the ad hoc metadata that I get from
> the arXiv. I discovered that the OA standard is an oversimplification
> of the full arXiv metadata record, to the point that I can't use the
> OA format.

I will have to leave this to OAI experts to reply to.

> But don't get me wrong. I am in favor of fragmented interoperability if
> you really can't hope for something better. And as I said, the overall
> STM literature might well have to be fragmented, for now, down to the
> level of individual disciplines (e.g. chemistry) or small groups of
> disciplines (physics+math+cs).

"Fragmented interoperability" is a tautology: The whole point of
interoperability is shared metadata standards unifying distributed
("fragmented") systems.

As to "hopes": The only pertinent hope is the freeing of the entire
refereed literature online. Centralized self-archiving alone (which I
backed for a number of years, initially advocating putting the whole
literature into arXiv) just is not progressing fast enough. Enter OAI
and distributed institution-based self-archiving to help speed it on
its way.

--------------------------------------------------------------------
Stevan Harnad                     harnad_at_cogsci.soton.ac.uk
Professor of Cognitive Science    harnad_at_princeton.edu
Department of Electronics and     phone: +44 23-80 592-582
             Computer Science     fax:   +44 23-80 592-865
University of Southampton         http://www.ecs.soton.ac.uk/~harnad/
Highfield, Southampton            http://www.princeton.edu/~harnad/
SO17 1BJ UNITED KINGDOM

NOTE: A complete archive of the ongoing discussion of providing free
access to the refereed journal literature online is available at the
American Scientist September Forum (98 & 99 & 00):

    http://amsci-forum.amsci.org/archives/American-Scientist-Open-Access-Forum.html

You may join the list at the site above.

Discussion can be posted to:

    american-scientist-open-access-forum_at_amsci.org
Received on Mon Jan 24 2000 - 19:17:43 GMT
