Electronic Data Publication in Geochemistry:
A Plea for “Full Disclosure”
Editorial 2001GC000234 -- Published 11 October 2001

GERM Steering Committee
Hubert Staudigel, Francis Albarede, Don Anderson, Lou Derry, Albrecht Hofmann, Charlie Langmuir, Bill McDonough, Henry Shaw, Bill White and Alan Zindler



Computer technology and automated analytical instruments have resulted in an explosion in the quantity of geochemical data produced over the past several decades. While this is an extremely positive development for science, it has had the unfortunate effects of diminishing the value of individual analyses and creating a reluctance on the part of authors, editors, and publishers to publish data. Data, it seems, like much of the rest of what our society produces, have become disposable. After a short appearance in glossy illustrations, the fate of analytical results is to be buried, if not in landfills, then at least on shelves in investigators' offices.

One man's castaway can be another man's treasure. Data are useful well beyond the initial publication in a scientific paper. One might argue that interpretations of data may change and the data are the only permanent aspect of a paper. There are a number of important scientific papers based entirely, or almost so, on previously published data. If, however, the data are not published, such contributions are not possible. There is a good chance that geochemical samples have to be reanalyzed simply because data are not preserved for future use, beginning a new cycle of data generation and burial. Data are a most important heritage for geochemistry that has to be preserved for the future. Therefore our plea is for full publication of all data contributing to papers in geochemistry.

This is already the policy of the American Geophysical Union (AGU). The policy adopted by AGU's publications committee in 1993 states "Data sets discussed in data papers published in AGU books and journals must be publicly available and accessible to the scientific community indefinitely," and "data sets that are available only from the author, through miscellaneous public network services, or academic, government or commercial institutions not chartered specifically for archiving data, may not be cited in AGU publications" (available at http://www.agu.org/pubs/data_policy.html). Our plea, then, is that authors and editors respect this policy and that other publishers adopt a similar one.

Computer technology, which has largely led to the data "devaluation" mentioned above, also provides the potential solution through information technology (IT). Like most other earth science disciplines, geochemistry stands to reap substantial benefits from embracing IT. These benefits include wider dissemination of geochemical data, increased ease of use of data by different earth science subdisciplines, and more efficient storage and retrieval of archived data. In the "Gutenberg" age, data had to be transmitted on the printed page. The cost of doing so was the main reason for the reluctance to publish large volumes of data. In the current "electronic" age, communicating data this way is no longer necessary, nor is it desirable. For most of us, data are most useful in our computers, where we can analyze them, compare results, create visualizations, etc.

Our plea is, thus, a plea for full electronic publication of data. Most publishers now make electronic supplements available through the World Wide Web. This is certainly a positive development, and it is hoped that publishers will commit to permanently archiving these data. Authors, however, need to make use of these supplements, and editors need to ensure that they do. Publishers need to make these web sites easy to use and readily accessible.

Electronic publication of analytical results, however, does not fully solve the problem. There are currently no standards for the submission of data and metadata, which substantially limits the utility of many data sets. Data formats do not have to be restrictive, but they have to be clearly described so that differing units or normalizations can be transformed into a common format. Metadata, or data about data, should include a full characterization of the samples analyzed, the precise location where each sample was taken, and full disclosure of analytical techniques and sample processing. Such data can be coded as text. Defining data and metadata formats for electronic data supplements will set new standards for scholarly data publication, but these standards do not have to be rigid. They need to allow for expansion as science proceeds, or for omission when particular metadata types are no longer useful.
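
To make the idea of a clearly described but non-rigid format concrete, here is a minimal Python sketch of one possible sample record. It is an illustration only and not a GERM or AGU format: the class names, field names, and the choice of ppm as the common unit are all assumptions made for this example, and the sample values are invented placeholders rather than real analyses.

```python
from dataclasses import dataclass, field

# Assumed conversion factors to a common unit (ppm by weight).
# A real format would have to document these explicitly.
TO_PPM = {"wt%": 1.0e4, "ppm": 1.0, "ppb": 1.0e-3}


@dataclass
class Measurement:
    analyte: str   # element or oxide name, e.g. "Sr"
    value: float   # reported value, in the reported unit
    unit: str      # must be a key of TO_PPM
    method: str    # analytical technique, coded as free text

    def in_ppm(self) -> float:
        """Convert the reported value to the common unit (ppm)."""
        if self.unit not in TO_PPM:
            raise ValueError(f"undocumented unit: {self.unit!r}")
        return self.value * TO_PPM[self.unit]


@dataclass
class SampleRecord:
    sample_id: str
    description: str   # characterization of the sample
    latitude: float    # sampling location, decimal degrees
    longitude: float
    processing: str    # sample processing, coded as free text
    measurements: list = field(default_factory=list)
    # Open-ended metadata: new types can be added, unused ones omitted.
    extra: dict = field(default_factory=dict)

    def to_common_units(self) -> dict:
        """Map each analyte to its value in ppm."""
        return {m.analyte: m.in_ppm() for m in self.measurements}


# Invented placeholder values, for illustration only.
record = SampleRecord(
    sample_id="EXAMPLE-001",
    description="hypothetical basalt glass",
    latitude=0.0,
    longitude=0.0,
    processing="hand-picked chips, acid leached (placeholder text)",
    measurements=[
        Measurement("Sr", 0.05, "wt%", "placeholder method"),
        Measurement("Rb", 1200.0, "ppb", "placeholder method"),
    ],
)
common = record.to_common_units()
```

The fixed fields carry the metadata the text calls essential (sample characterization, location, technique, processing), while the open `extra` dictionary is one way to leave room for expansion or omission without changing the format. Normalizations are not handled in this sketch; they could be documented and converted in the same way as units.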

The Geochemical Earth Reference Model (GERM) Initiative has made some proposals toward the definition of data and metadata formats; these are open to discussion and community input at http://earthref.org/metadata/GERM/. We invite suggestions or alternative proposals from the geochemical community.