Here is a reply by Maurits van der Graaf to some questions
from Prof. T.D. Wilson and myself about whether the Driver study
http://www.driver-support.eu/documents/DRIVER%20Inventory%20study%202007.pdf
http://www.pleiade.nl/wiki
over-estimates current deposit rates. The main result seems to
be that about 15% of repository content (average 9000 items) is
full-text journal articles. What percentage this represents of
annual institutional output is not determined, but when asked to
estimate it, authors' estimates averaged 37%. [There is a
need to check on what the authors based their estimates.] -- SH
---------- Forwarded message ----------
Date: Mon, 7 May 2007
From: Maurits van der Graaf <m.vdgraaf--pleiade.nl>
To: harnad_at_ecs.soton.ac.uk
Subject: DRIVER study data on repositories
Dear Prof. Harnad and Prof. Wilson,
Thank you very much for your interesting and important comments on the
results of the DRIVER inventory study, posted on the BOAI forum.
Please find attached my response, also mailed to the BOAI forum to be
posted there. I hope this addresses the topics you raised adequately.
Yours sincerely,
Maurits van der Graaf
Pleiade Management & Consultancy
+31(0)20-4889397
http://www.pleiade.nl
Professor Wilson raises in his comments on the DRIVER Inventory Study a
very interesting topic: how successful are the (mostly institutional)
repositories in covering the research output of their institutes?
In our study we asked the managers of the repositories for research output
throughout the European Union to provide data on the contents such as the
type of material covered, numbers in total etc. Using a number of sources,
we identified approximately 230 institutes with a possible repository for
research output in the European Union and approached them to participate in
this inventory study. In all, 114 repositories from 17 countries
participated.
Based on figures provided by 104 repositories, it appears
that on average digital repositories contained nearly 9000 records (8984,
as assessed in the second half of 2006). The large majority of these
records (90%) relate to textual materials: these records can be split into
metadata-only records (61%) and full-text records (29%).
(5% of the records relate to non-textual materials such as images, video,
music and primary datasets. The 5% 'other materials' relate to learning
materials, student papers etc.)
What types of textual materials [of the 29% of the average 9000 records
per repository] are deposited? More than half of the textual materials
relate to journal articles (54%), a smaller share are for books or book
chapters (19%). Theses, proceedings and working papers - often labelled
as grey literature - have a share of 29%.
[I.e., about 15% of the average 9000 are journal articles]
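
The bracketed 15% figure can be checked with a few lines of arithmetic. The sketch below is only a rough check: it assumes, as the bracketed note does, that the 54% journal-article share applies to the 29% of records that are full text. The constant and function names are illustrative and do not come from the study.

```python
# Figures quoted in the message (DRIVER inventory study, second half of 2006).
AVG_RECORDS = 8984       # average number of records per repository
FULL_TEXT_SHARE = 0.29   # share of all records that are full-text
ARTICLE_SHARE = 0.54     # share of textual materials that are journal articles


def full_text_article_share(full_text=FULL_TEXT_SHARE, articles=ARTICLE_SHARE):
    """Share of all records that are full-text journal articles.

    Assumption: the journal-article share among textual materials
    also holds among the full-text records.
    """
    return full_text * articles


share = full_text_article_share()
count = share * AVG_RECORDS
print(f"{share:.1%} of records, about {count:.0f} items per repository")
```

Under that assumption the share comes out slightly above 15%, which corresponds to roughly 1400 full-text journal articles in an average repository.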
In another question we asked the respondents of this survey to estimate the
percentage of the research output from their institute of 2005 deposited in
their repository. The average percentage estimated was 37%.
How do these figures relate to other studies? A recent study by the
Association of Research Libraries surveyed 87 research institutes with 31
operational institutional repositories. They find that a typical
institutional repository holds about 3800 digital objects (SPEC KIT 292,
Institutional Repositories, July 2006, ARL).
A much broader survey (2147 libraries in the USA contacted, 446
participants) identifies 48 operational repositories. Of those, 50%
contain fewer than 1000 digital documents, and nearly 20% contain more
than 5000 items.
Although lower than our figures, these American surveys suggest that
research repositories contain thousands of items, instead of the hundreds
of items found by the study of Professor Wilson among 22 UK research
repositories.
The discrepancy between our numbers and the numbers of Professor Wilson
could be caused by:
- A different selection of institutional repositories: our selection
  (although the largest study on operational repositories so far) might
  be biased towards more active and more successful repositories.
- Timing of the survey: Professor Wilson includes numbers up to 2004;
  since then, many research institutes have accelerated their activities
  with regard to repositories for research output.
- Different records identified: we also included metadata-only records
  in the numbers; the inclusion criteria of the other surveys are not
  explicitly stated.
The DRIVER project aims to put a test-bed in place across Europe to assist
the development of a knowledge infrastructure, based on repositories for
research output. For that purpose, our survey aimed to make an inventory of
the current state of repositories in the European Union. Based on its
results, we believe that the situation with regard to covering research
output by institutional repositories is better than suggested by
Professor Wilson. But even with this more positive outlook, coverage of
research output remains a crucial element in the further development of
repositories and the proposed knowledge infrastructure.
For the entire study and options to comment on it, please see
http://www.pleiade.nl/wiki
On Thu, 3 May 2007, Stevan Harnad wrote:
> Forwarded from BOAI Forum: Important corrections from Professor Wilson
> regarding the true rate and proportion of spontaneous self-archiving
> of article full-texts. This rate and proportion is almost certainly
> over-estimated by the Driver Study -- but that only reinforces its
> recommendation that self-archiving needs to be mandated. -- SH
> http://www.driver-support.eu/documents/DRIVER%20Inventory%20study%202007.pdf
>
> Date: Wed, 2 May 2007 23:23:21 +0100
> From: Prof. Tom Wilson <t.d.wilson--sheffield.ac.uk>
> To: BOAI Forum <boai-forum--ecs.soton.ac.uk>
>
> One of the items in Peter Suber's OA News made me raise my eyebrows:
> http://www.earlham.edu/~peters/fos/2007_04_29_fosblogarchive.html#5931097067404812219
> he quotes from the DRIVER report that:
>
> "On average, the estimated percentage of research output of 2005 deposited in
> the digital repositories is 37%"
>
> This, of course, is a Europe-wide study and perhaps the success of repositories
> varies considerably from country to country. In the study I did last year of
> the UK repositories, I estimated that they contained something in the order of
> 3% of the research output from the UK in 2004 - I would be very surprised if
> they had improved tenfold in one year.
>
> Looking further into the report I see that only 57 UK institutions were invited
> to respond to the investigation and, of these, only 51% responded (and we can
> assume that those who respond are most interested in the subject under
> investigation). The report also notes that:
>
> "On average a digital repository contains in total 8,984 items."
>
> Meaning, of course, items of all kinds, not solely journal papers. Again, this
> figure contrasts starkly with the situation I found in the UK, where the
> combined total of journal papers in ALL of 21 repositories available for study
> was 9,739 - an average (meaningless, of course, as any average of a skewed
> distribution) of 464 items per repository. In fact the totals by institution
> ranged from 2 items in total to 5,139 items, with a median value of 78 items.
>
> A comparison of these data with those from the DRIVER report still leaves me
> puzzled :-)
>
> Professor T.D. Wilson, PhD, Hon.PhD
> Publisher/Editor in Chief
> Information Research
> InformationR.net
> e-mail: t.d.wilson--shef.ac.uk
> Web site: http://InformationR.net/
>
Received on Mon May 07 2007 - 13:50:51 BST