Microsoft Research Faculty Summit 2007
Microsoft Conference Center, Redmond, Washington, July 16
http://research.microsoft.com/workshops/FS2007/
ESCIENCE: DATA CAPTURE TO SCHOLARLY COMMUNICATION
Tony Hey, Microsoft Research (Chair)
Research Communication, Navigation, Evaluation, and Impact in the
Open Access Era, Stevan Harnad, University of Southampton
http://research.microsoft.com/workshops/FS2007/agenda_mon.aspx
The global research community is moving toward the optimal and
inevitable outcome in the online age: All research articles as well as
the data on which they are based will be openly accessible for free for
all on the web, deposited in researchers' own OAI-compliant
Institutional Repositories, and mandated by their institutions and
funders. Research users, funders, evaluators, and analysts, as well as
teachers and the general public, will have an unprecedented capacity not
only to read, assess and use research findings, but to comment upon
them, entering into the global knowledge growth process. Prepublication
preprints, published postprints, data, analytic tools and commentary
will all be fully and navigably interlinked. Scientometrics will
generate powerful new ways to navigate, analyze, rank, and evaluate this
Open Access corpus, its past history, and its future trajectory. A vast
potential for providing services that mine and manage this rich global
research database will be open both to the academic community and
to enterprising industries. [See: "Publication-Archiving, Data-Archiving
and Scientometrics," forthcoming in CTWatch]
http://users.ecs.soton.ac.uk/harnad/Temp/ctwatch.doc
The Digital Data Universe
Chris Greer, National Science Foundation
CyberInfrastructure to Support Scientific Exploration and
Collaboration, Dennis Gannon, Indiana University
Funding for experimental and computational science has undergone a
dramatic shift from being dominated by single-investigator
research projects to large, distributed, and multidisciplinary
collaborations tied together by powerful information technologies.
Because cutting-edge science now requires access to vast data resources,
extremely high-powered computation, and state-of-the-art tools, the
individual researcher with a great idea or insight is at a serious
disadvantage compared to large, well-financed groups. However, just as
the Web is now able to provide most of humanity with access to nearly
unlimited data, theory, and knowledge, a transformation is also underway
that can broaden participation in basic scientific discovery and empower
entirely new communities with the tools needed to bring about a paradigm
shift in basic research techniques.
The roots of this transformation can be seen in the emergence of
on-demand supercomputing and vast data storage available from companies
like Amazon and the National Science Foundation's TeraGrid Science
Gateways program, which takes the concept of a Web portal and turns it
into an access point for state-of-the-art data archives and scientific
applications that run on back-end supercomputers. However, this
transformation is far from complete. What we are now seeing emerge is a
redefinition of "computational experiment" from simple reporting of the
results from simulations or data analysis to a documented and repeatable
workflow in which every derived data product has an automatically
generated provenance record. This talk extrapolates these ideas to the
broader domain of scholarly workflow and scientific publication, and
qualitative as well as quantitative data, and ponders the possible
impact of multicore, ubiquitous gigabyte bandwidth and personal exabyte
storage.
Received on Sat Jul 14 2007 - 13:52:00 BST