Chemistry International
Vol. 24, No. 6
November 2002
Scientific and Technical Information
Cometh A Digital Dark Age?
by Tony Davies
I was
fortunate enough to represent IUPAC at the recent ICSTI (International
Council for Scientific and Technical Information) seminar on the Digital
Preservation of the Records of Science hosted by UNESCO in Paris over
14 and 15 February. The topics covered were an eye-opener for an analytical
spectroscopist. I had thought that over the years we had managed to
supply our field with a range of widely implemented international data
standards capable of guaranteed long-term digital archiving. I suppose
I was rather proud of what we had achieved as a community of users,
manufacturers, and industry. I now realize we are the lucky ones. The
rest of the scientific world is currently running scared of what now
appears to be the advent of the so-called "Digital Dark Ages."
In this issue I will highlight the reasons for the meeting at UNESCO
and what urgently needs to be achieved on a global scale.
There
is a general worry in the international scientific community that the
moves toward electronic production and presentation of scientific data
will lead to serious deficits in the archiving of the records of science.
The first meeting on this topic was organized in January 2000 by ICSTI.
A progress review in 2001 established the urgent need for a second meeting,
which was hosted by UNESCO this February.
The objectives
of the February meeting were outlined as follows:
- to ensure all the interests in digital preservation in science are aware
  of all current activities in the field
- to evaluate the needs for coordination of the efforts
- to create any necessary structures and work programs to ensure coordination
  of the activities
It was
decided that future meetings should also deal with the following issues:
- What are the varieties and future uses of scientific and technological
  information that must and will be archived?
- What is the minimum amount of information (data fields) needed to locate
  and identify information and who is creating what kinds of standards
  related to location and basic identification?
- What business and information models are appropriate and how should access
  to the digital archives be arranged?
- Where are the common issues with the preservation of more general cultural
  archives and how can these be accommodated?
The Seminar
The
seminar started with the usual welcoming speeches and an explanation
of the interests of the sponsoring organizations. There then followed
two days of specialist presentations from interested scientific organizations,
international representative bodies, and renowned speakers from the
scientific publishing industry.
For me,
one of the most worrying revelations during the two-day meeting was
the current acute fear amongst science historians, which was reported
by William Anderson of CODATA. He used a phrase, which at the time was
completely new to me, in revealing that there is imminent danger in
the arrival of a new "Dark Age," wherein our scientific cultural
heritage may be permanently lost through the exclusive use of electronic
media. This Dark Age will become more severe when electronic laboratory
notebooks finally become integrated into the normal working environment.
A worrying
example was used to highlight the problems archivists are now struggling
with when asked to archive electronic material.
Following the death of an eminent British scientist, his widow presented
his archive material to the British Library for posterity. The problem
was, however, that there is effectively no infrastructure available
in this national archive for handling two old personal computers and
boxes of old format disks!
Unfortunately,
science archivists have well-established practices for handling paper
legacies but currently have terrible problems when presented with
digital content.
The Debate on What to Archive?
A
large amount of time was devoted to discussion on exactly what should
be archived; however, no general agreement was reached. The data community
(probably heavily influenced by the FDA 21 CFR Part 11 rules currently
revolutionizing pharmaceutical IT) thought that all information needed
to be stored, whereas the traditional archivists looked at the logistics
and demanded that only selected content land in the electronic archives.
Although
there is a legal requirement for publishers to deposit to their national
archives all material printed in that particular country, there is no
equivalent law requiring that electronic-only publications be archived.
This international legal loophole urgently needs to be closed and will
apparently be addressed during the Spanish presidency of the European
Union. It will be interesting to see how the Council of Ministers deals
with this thorny subject.
On an
international level it was clear that the classic role of the librarian
as archivist is outdated and being continually undermined by the digital
presentation of scientific publications. An ever-increasing proportion
of library budgets is being spent on digital-only subscriptions to peer-reviewed
scientific journals. These electronic journals are maintained off-site
and accessed through the Internet, often on a pay-per-view basis. The
librarians cannot archive this material, as it never physically lands
in the individual organizations. It was generally agreed that it is
foolish to expect the publishers to take over the role of archivists
and so another mechanism needs to be put in place.
A series
of presentations dealt with individual limited-term projects that were
or had been run in various countries funded by the Mellon Foundation,
the EU, and by different national governments. What was strikingly clear
was that the projects were not coordinated and any benefit would probably
end with the funding.
Not Just a Problem for Scientists!
Having
only just become aware of the phrase "Digital Dark Age," you
can imagine my complete surprise when browsing through one of the bookshops
at Newark International Airport two weeks later, I discovered a brand
new book, Dark Ages II: When Digital Data Die by Bryan Bergeron, a teacher
at Harvard Medical School and MIT (published by Prentice Hall PTR, Upper
Saddle River, New Jersey 07458, USA, ISBN 0-13-066107-4, www.phptr.com).
Much of this interestingly written book, which contains many anecdotes,
directly addresses the problem of long-term data archiving. Written
in clear, normal language, it is not a tacky techie tome for IT freaks.
Instead, it has good advice for everyone from home computer users to
managers of corporate networks. Bergeron attacks "Bloatware"
succinctly and provides many useful links to more detailed information
sources such as the US NARA (National Archives and Records Administration)
Center for Electronic Records guidelines. The table below, extracted
from the book, gives an idea of the level of the advice the book offers.
Expected Media Lifetimes under Ideal and Typical Conditions. (Extracted
and adapted from Dark Ages II, Chapter 3, page 82.)

| Storage medium     | Ideal lifetime (years) | Typical lifetime (years) | Comments |
|--------------------|------------------------|--------------------------|----------|
| CD-R               | 5-100                  | 2-30                     | Dye less stable than pits used in commercial CD-ROMs |
| CD-ROM             | 30-200                 | 5-50                     | Uses pits on a metal surface to encode data; fragile surface |
| DVD                | 100                    | 20                       | Higher density susceptible to environmental changes |
| DVD-R              | 20-30                  | 10                       | As with CD-R, less stable than commercial media |
| Hard disks         | ?100                   | 10-20                    | Lifetime is down to stability of the mechanical parts |
| Magnetic tape      | 30-100                 | 5-20                     | Rewind periodically to release tension |
| WORM               | 30-200                 | 5-50                     | Formats not as standardized as for CD-ROMs and DVDs |
| Paper (buffered)   | ?500                   | 50-500                   | ! |
| Photographic print | ?200                   | ?100                     | Assuming non-acid paper and stored out of light (non-Polaroids!) |
| Microfilm          | 500                    | 100-200                  | Standard for archives |
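The lifetimes tabulated above can be turned into a rough recopy schedule. The following is only a minimal sketch, not a method described at the seminar or in the book: for each medium it takes the lower bound of the shorter of the two lifetime ranges quoted in the table as a conservative figure, and it assumes a hypothetical `safety_factor` policy (recopy after a fixed fraction of that lifetime). Media whose figures carry a "?" in the relevant column are left out, and the function and dictionary names are illustrative.

```python
from datetime import date, timedelta

# Conservative lifetimes in years: the lower bound of the shorter of the two
# ranges quoted for each medium in the table above. Entries whose shorter
# figure is marked "?" in the table (e.g. photographic print) are omitted.
CONSERVATIVE_LIFETIME_YEARS = {
    "CD-R": 2,
    "CD-ROM": 5,
    "DVD": 20,
    "DVD-R": 10,
    "Hard disk": 10,
    "Magnetic tape": 5,
    "WORM": 5,
    "Paper (buffered)": 50,
    "Microfilm": 100,
}


def refresh_due(medium: str, written: date, safety_factor: float = 0.5) -> date:
    """Return the date by which data on `medium` should be recopied.

    `safety_factor` is an assumed policy knob, not a value from the article:
    0.5 means "recopy once half of the conservative lifetime has elapsed".
    """
    if not 0 < safety_factor <= 1:
        raise ValueError("safety_factor must be in the interval (0, 1]")
    years = CONSERVATIVE_LIFETIME_YEARS[medium] * safety_factor
    # Approximate a year as 365.25 days; precise enough for planning purposes.
    return written + timedelta(days=round(years * 365.25))


if __name__ == "__main__":
    burned = date(2002, 11, 1)
    for medium in ("CD-R", "Magnetic tape", "Microfilm"):
        print(f"{medium}: recopy by {refresh_due(medium, burned)}")
```

A schedule like this only addresses decay of the physical medium. It says nothing about whether the hardware and software needed to read the data will still exist, which is a separate problem.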
Meeting Outcome
One
of the messages that came out of the meeting was the clear need for
a more active advocacy effort to make scientists aware of the encroaching
danger and, especially, of the heritage value of their work, which they
should be careful to make available to archivists. As digital preservation
will not be a cheap exercise, it was seen as important that the need
be expressed at many levels in order to convince those who control the
different budget sources of the vital nature of this work. ICSTI will
take the lead in this area.
The different
needs of the text archivists as opposed to the data archivists were
clear to all by the end of the meeting. This was especially the case
during discussions on metadata content. From my own recent experiences
working with FDA 21 CFR Part 11 compliance systems, I can see that the
issues of exactly what metadata is worthy of storage and how to obtain
it are still critical factors in an industry well advanced in archiving
digital content. Among those sciences just feeling their way into this
field, there are those who cannot currently agree on what constitutes
metadata!
I was
surprised by the depth of thought given to this issue by many of the
contributors to the seminar. There were a number of well-constructed
arguments, such as those presenting the desire for a "technology
watch" on current archival computing systems. This technology watch
will need to be established in order to warn in advance of upcoming
migration needs when computer hardware or software on which the archives
rely is about to become outdated.
Developing
countries reported that they need help, not only in the area of the
preservation of science information, but also with more exposure, which
they currently lack.
Conclusions
Okay,
all I can say is worry! Basically, we should all be rather worried about
the current status of born-digital scientific information. Fortunately,
the current precarious state of our science legacy has been spotted
and there are now international initiatives underway at a political
level to secure the significant funding required for establishing the
necessary infrastructure. We can only hope that they are successful.
Maybe by talking about the problems with our colleagues we can raise
awareness and support those striving to find appropriate solutions.
Tony Davies
is secretary of the IUPAC Committee on Printed and Electronic Publications,
and external professor at the University of Glamorgan, United Kingdom.
Reprinted
from <www.spectroscopyeurope.com>