Nonstandard Archives, Standards, and Content
I agree with Lee Giles completely. Standards are important, desirable,
and will certainly prevail, but the primary objective is to get the
research literature online and free, NOW; and there is already a lot of
it up there, with wonderful resources like Lee's & Steve's splendid
researchindex to harvest and navigate it as one big virtual archive.
Try researchindex and prepare to be astounded. Bravo!
On Fri, 22 Jun 2001, Lee Giles wrote:
> Standards are great and often make the difference between the success
> and failure of an endeavor. But in some cases other standards can be
> used without putting additional burdens on authors and users. It's
> possible to set up an open archive that's useful and requires no
> additional work from authors except putting their papers on their web
> site in some eformat. This works because there are already a few widely
> used, accepted standards for publishing documents - pdf, doc, postscript,
> html, etc. (It would be easy to include new ones such as xml.)
>
> The archive works by being active instead of passive. A smart crawler
> spiders the web searching for manuscripts. After finding the
> edocuments, an indexer converts the documents to text, indexes them and
> provides a query engine that allows search based on key words, phrases
> and citations. Other features such as cocitation, active
> bibliographies, collaborative filtering, etc. can be installed. Links
> to the original papers can be maintained.
>
> This entire process is automated except for the requirement that the
> authors place their papers in some standard eformat on an accessible
> web site. Because this is automated, some errors do occur.
> Subsequently, authors and others can ask for corrections.
>
> As an example, see researchindex.org and cora.whizbang.com which have
> archives for computer science papers. These two archives have over
> 300,000 papers, 500,000 unique authors and 3 million citations. In
> addition, they receive about 100,000 page views a day. The
> researchindex software is free for noncommercial use and cora has
> established a new archive for statistics papers.
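
To make the index-and-query step in the quoted message concrete, here is a minimal sketch in Python. It is not the researchindex or cora code; the class name TinyIndex, its methods, and the example.org URLs are all hypothetical, and it assumes the crawler and the pdf/postscript-to-text converter have already produced plain text for each paper.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


class TinyIndex:
    """Hypothetical positional inverted index: word -> doc id -> positions."""

    def __init__(self):
        self.postings = defaultdict(lambda: defaultdict(list))
        self.urls = {}  # doc id -> link back to the original paper

    def add(self, doc_id, url, text):
        """Index one paper that has already been converted to plain text."""
        self.urls[doc_id] = url
        for pos, word in enumerate(tokenize(text)):
            self.postings[word][doc_id].append(pos)

    def search_keywords(self, query):
        """Return ids of papers containing every word of the query."""
        words = tokenize(query)
        if not words:
            return set()
        result = set(self.postings.get(words[0], {}))
        for word in words[1:]:
            result &= set(self.postings.get(word, {}))
        return result

    def search_phrase(self, phrase):
        """Return ids of papers containing the words consecutively."""
        words = tokenize(phrase)
        hits = set()
        for doc_id in self.search_keywords(phrase):
            for start in self.postings[words[0]][doc_id]:
                if all(start + i in self.postings[w][doc_id]
                       for i, w in enumerate(words)):
                    hits.add(doc_id)
                    break
        return hits


if __name__ == "__main__":
    idx = TinyIndex()
    idx.add(1, "http://example.org/a.pdf", "Neural networks for citation indexing")
    idx.add(2, "http://example.org/b.ps", "Indexing of citation networks")
    for doc_id in sorted(idx.search_phrase("citation indexing")):
        print(doc_id, idx.urls[doc_id])
```

Keeping the URL beside each document id is what lets the archive link back to the original paper on the author's site. Citation search and cocitation would need a separate step that parses reference lists, which this sketch does not attempt.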
Received on Wed Jan 03 2001 - 19:17:43 GMT