Re: Enhanced metadata, interoperability and searchability from David Goodman on 2006-10-27 (American-Scientist-Open-Access-Forum)

From: David Goodman <dgoodman_at_Princeton.EDU>
Date: Fri, 27 Oct 2006 14:33:11 -0400

I do not see how work on finding aids will in the least divert attention from the
need to increase the amount of OA available.

For the content to do any good, people must be able to find it.
If people cannot easily find what has been deposited now, they will
be somewhat less ready to bother later, or to arrange for their
own articles.

An important incentive for OA at this point might be the findability and
use of the 15%. There's nothing like seeing a functional system to encourage
use, and even the present 15% is a great many articles.

The judgment about the usefulness of the Google method will depend not on
theoretical considerations, but on how well it works. Does anyone have any
actual data yet?

David Goodman, Ph.D., M.L.S.
previously:
Bibliographer and Research Librarian
Princeton University Library

dgoodman_at_princeton.edu

----- Original Message -----
From: Stevan Harnad <harnad_at_ECS.SOTON.AC.UK>
Date: Friday, October 27, 2006 12:04 pm
Subject: [AMERICAN-SCIENTIST-OPEN-ACCESS-FORUM] Enhanced metadata, interoperability and searchability
To: AMERICAN-SCIENTIST-OPEN-ACCESS-FORUM_at_LISTSERVER.SIGMAXI.ORG

> > at CCLRC (so-called formalised DC)... it is strongly interlinked with
> > CERIF (the data/metadata standard for research information maintained
> > by euroCRIS http://www.eurocris.org). Stevan in particular will
> > remember all this from the CRIS2006 conference which he attended.
> http://ct.eurocris.org/CRIS2006/
> I not only remember the Eurocris meeting, but I strongly endorse CERIF
> and CRISs! But let there be no misunderstanding: OA's priority today is
> *content*, not search, or enhanced interoperability. OAI (or even less)
> is enough for now: What's missing and urgent is not enhanced
> interoperability but OA content (the absent 85% of it). On no account
> should either self-archivers or self-archiving mandaters wait for or
> weigh themselves down with enhanced metadata schemes at this time! On
> the contrary, once OA content heads reliably and rapidly towards 100%,
> the enhanced interoperability will not be far behind!
>
> Stevan Harnad
>
> On Fri, 27 Oct 2006, Jeffery, KG (Keith) wrote:
>
> > All -
> >
> > I agree with Les that we still need repositories and that Google, even
> > if customised, is still a rather blunt instrument.
> >
> > The problem with the metadata is fairly obvious; it is machine readable
> > but not machine understandable, i.e. the syntax is rather loose and the
> > semantics almost non-existent. This results in the end-user having to
> > browse on screen to achieve the required degree of recall and relevance
> > - a time-consuming and non-scalable way forward.
> >
> > If this is compared with the formal metadata of a DBMS schema, or that
> > associated with any particular, specialised domain of scientific
> > research for data exchange / access then the difference is obvious
> > immediately.
> >
> > We need formalised metadata that ensures (heterogeneous) computer
> > software systems can interoperate using it. We have to resolve character
> > set, language, syntax and semantics. We've had a go at this at CCLRC
> > (so-called formalised DC) and it is strongly interlinked with CERIF (the
> > data/metadata standard for research information maintained by euroCRIS
> > www.eurocris.org). Stevan in particular will remember all this from the
> > CRIS2006 conference which he attended.
> > ------------------------------------------------------------------------
> > Prof Keith G Jeffery Director Information Technology
> > and International Strategy
> > kgj_at_rl.ac.uk CCLRC Rutherford Appleton Laboratory
> > T:+44 1235 44 6103 Chilton, Didcot, OXON OX11 0QX UK
> > F:+44 1235 44 5147
> > WWW Person: http://www.bitd.clrc.ac.uk/Person/K.G.Jeffery
> > Department: http://www.bitd.clrc.ac.uk
> > President ERCIM & CCLRC Director: http://www.ercim.org/
> > W3C Office at CLRC-RAL http://www.w3.org/
> > President euroCRIS http://www.eurocris.org/
> > VLDB Trustee Emeritus: http://www.vldb.org/
> > EDBT Board Member http://www.edbt.org/
> >
> > -----Original Message-----
> > From: American Scientist Open Access Forum
> > To: AMERICAN-SCIENTIST-OPEN-ACCESS-FORUM_at_LISTSERVER.SIGMAXI.ORG
> > Date: Fri, 27 Oct 2006 10:15:52 +0100
> > From: Leslie Carr <lac_at_ecs.soton.ac.uk>
> > To: JISC-REPOSITORIES_at_JISCMAIL.AC.UK
> > Subject: Re: OpenDOAR Search
> >
> > On 26 Oct 2006, at 19:00, Hubbard Bill wrote:
> >
> > > Please find below an announcement from OpenDOAR for a search facility
> > > based on OpenDOAR holdings.
> >
> > This is a very interesting service!
> >
> > There was a discussion on this list at the beginning of August about
> > "Search Engines for Repositories Only". There were several attempts to
> > define constrained searches using RollYO or similar, but they all
> > suffered from one defect or another (too few sites, or logins required
> > etc). The Google Custom Search that OpenDOAR have set up seems much more
> > suitable to the repository community needs. Further, it would seem to be
> > fairly simple to set up Country-specific searches (a la UKOLN's EPrints
> > UK) by providing location-identifying annotations for each repository.
> >
> > I have had a go with this, and created a ROAR-based Repository Search
> > Engine at
> > http://google.com/coop/cse?cx=009118135948994945300%3Agvogitng0da
> > You can search all the ROAR repositories for a keyword and then Derek
> > Law can click on 'Scottish Research' to reduce the set of results to
> > those coming from the Scottish repositories (the "small and smart"
> > ones, according to his recent keynote at Open Scholarship :-)
> >
> > There is a serious point that this opens up: why would we bother with
> > OAI-based repositories, if you can do it all with Google? The advantage
> > that OAI provided us was "metadata", i.e. the possibility of providing
> > more accurate resource identification. The advantage of repositories
> > was that they provided an identifiable source of (well-maintained)
> > research material. Of course, the one can be simulated by the other,
> > and if Google could support a simple quality control "refereed
> > material" tag then we could get by without OAI and without
> > repositories.
> >
> > Well, it doesn't, and so OAI still seems our best hope. However, even
> > with five years of OAI our repositories are not doing a very good job of
> > sharing metadata that helps a service to comprehend the status of the
> > holdings that it harvests (is this a published, refereed journal
> > article or equivalent? Is this a paper from an unrefereed workshop?
> > Is this a chemical data file?) Too much is still down to interpretation
> > and subsequent data mining of the web pages. The Eprints Application
> > Profile
> > (http://www.ukoln.ac.uk/repositories/digirep/index/Eprints_Application_Profile)
> > seems to be doing a good job in achieving consensus in the use of Dublin
> > Core, but there is an urgent need for it to be implemented by all
> > repositories!
> >
> > We've spent a lot of time and effort on advocacy and policies over the
> > last couple of years, but I think it's time that we went back to some of
> > the technical fundamentals and made sure that our information
> > interoperability is up to scratch, otherwise we'll find ourselves in a
> > universe where the only thing you can do is a keyword search!
> > --
> > Les
> > (just my opinion)
> >
>
Received on Fri Oct 27 2006 - 22:13:32 BST
