---- (2) Tim Brody (ECS, Southampton):

>> Henk Ellermann, Google Searches Repositories: So What Does Google
>> Search For? http://eepi.ubib.eur.nl/iliit/archives/000479.html
>>
>> But it is not only the quantity. Even when documents are available, that
>> does not mean they are available to everyone. And if a document is available
>> to everyone, you still cannot be sure that the system is running...
>>
>> What we badly need is a continuous and authoritative review of existing
>> Institutional Repositories. The criteria for judging the repositories would
>> have to include:
>>
>>  * number of documents (with a breakdown per document type)
>>  * percentage of freely accessible documents
>>  * up-time
>>
>> It is great that Google is becoming part of the Institutional Repositories
>> effort, but we should learn to give fair and honest [data] about what we
>> have to offer. There is actually not that much at the moment. We can only
>> hope that what Google exposes amounts to more than the message "amateurs
>> at work".

I would agree with Henk that the current -- early -- state of 'Institutional
Repositories' (a.k.a. Eprint Archives) is not yet the promised land of open
access to research material.

Institutional research archives (and hence the services built on them) will
succeed or fail depending on whether there is the drive within the institution
to enhance its visibility and impact by mandating that its author-employees
deposit all of their refereed research output. Once an archive achieves
critical mass, it can sustain itself as part of the culture of the
institution: the archive is the public record of "the best the institution
has done". So those archives that Henk refers to, with their patchy, minimal
contents, need to look at what is going into this public record of their
research output and decide whether it reflects the institution's achievements.
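Henk's three review criteria are simple enough to compute mechanically once a
repository has been surveyed. A minimal sketch in Python, assuming hypothetical
harvested data (the `Record` shape and the boolean uptime probes are
illustrative assumptions, not any real repository's API):

```python
from dataclasses import dataclass

@dataclass
class Record:
    # Hypothetical per-item data a repository survey might yield.
    doc_type: str           # e.g. "preprint", "postprint", "thesis"
    has_free_fulltext: bool # is a full-text openly downloadable?

def review_metrics(records, up_checks):
    """Summarise a repository against the three criteria:
    document count (per type), % freely accessible, and up-time.
    `up_checks` is a list of True/False results from periodic probes."""
    by_type = {}
    for r in records:
        by_type[r.doc_type] = by_type.get(r.doc_type, 0) + 1
    free = sum(1 for r in records if r.has_free_fulltext)
    pct_free = 100.0 * free / len(records) if records else 0.0
    uptime = 100.0 * sum(up_checks) / len(up_checks) if up_checks else 0.0
    return {
        "total": len(records),
        "by_type": by_type,
        "pct_free_fulltext": pct_free,
        "pct_uptime": uptime,
    }
```

The hard part in practice is not this arithmetic but gathering honest inputs:
deciding what counts as a "document", and probing availability often enough
for the up-time figure to mean something.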
As a technical aside, DP9 was developed some time ago for exposing OAI-PMH
repositories to Web crawlers:
http://arc.cs.odu.edu:8080/dp9/about.jsp

I would be surprised if Google were to base any long-term service on an
archive's contents alone. Without the linking structure of the Web, a search
engine is left with only keyword-frequency techniques, which the Web has shown
fail to scale to very large data sets. For my money,
Google-over-Citebase/Citeseer-over-Institutional-Archives is much more
interesting: the archive provides the editorial management, Citebase/Citeseer
the linking structure, and Google the search wizardry.

> Stevan Harnad:
>
> Eprints, for example, has over 120 archives worldwide of exactly the same
> kind, with over 40,000 papers in them:
> http://archives.eprints.org/eprints.php?action=analysis

I have revised the description on that page to say that a *record* is not
necessarily a full-text -- and, of course, a full-text is not necessarily a
peer-reviewed postprint. It would help bean-counters like myself if
repository/archive administrators would tag, in an obvious place, what their
content types are (i.e. what type of material is in the system) and how the
number of metadata records corresponds to the number of publicly accessible
full-texts.

Tim Brody
Southampton University
http://citebase.eprints.org/

Received on Tue Apr 13 2004 - 17:13:40 BST