Congratulations on completing your research project. Now it is time to publish.
Open research statements are now a common requirement when publishing research. These support reuse, validation, and citation and often take the form of Data availability, Data access, Code availability, Open Research, and Software availability statements. “Data available upon request” is increasingly considered insufficient by many funding agencies and publishers. Availability of research data and methods also addresses the need for reviewers to have access to sufficient information to evaluate the publication. In doing so, it aids the reproducibility and replicability of research, an integral part of the scientific method.
Here we adopt the definitions of reproducibility and replicability from NAS (2019):
Reproducibility is obtaining consistent results using the same input data; computational steps, methods, and code; and conditions of analysis.
Replicability is obtaining consistent results across studies aimed at answering the same scientific question, each of which has obtained its own data. Two studies may be considered to have replicated if they obtain consistent results given the level of uncertainty inherent in the system under study.
A description of methods, experimental data, and parameters has always been required when publishing peer-reviewed research. However, in the modern era of big data and computation, fulfilling this requirement means making code, input, and output files openly available. In some places, the term “computational reproducibility” may be used.
Below we specify what elements should be deposited into a trusted repository for models to be reproducible but not executable. We adopt the philosophy that, upon execution from a binary, container, or compilation, a learner (someone new to the code and not yet a skilled user) should be able to reproduce all model results within the range of reasonable numerical and system tolerances.
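As a minimal sketch of what "within the range of reasonable numerical and system tolerances" can mean in practice, the Python snippet below compares a reproduced series of values against a reference series. The function name, the tolerance values, and the sample numbers are assumptions chosen for illustration; appropriate tolerances depend on the software and the problem.

```python
import math


def compare_outputs(reference, reproduced, rel_tol=1e-6, abs_tol=1e-12):
    """Return (index, reference, reproduced) for every value outside tolerance.

    rel_tol and abs_tol are illustrative defaults, not recommended values.
    """
    if len(reference) != len(reproduced):
        raise ValueError("outputs differ in length")
    mismatches = []
    for i, (ref, new) in enumerate(zip(reference, reproduced)):
        if not math.isclose(ref, new, rel_tol=rel_tol, abs_tol=abs_tol):
            mismatches.append((i, ref, new))
    return mismatches


# Hypothetical values standing in for a quantity read from two log files.
reference = [1.0, 2.5e3, 0.0]
reproduced = [1.0000000001, 2.5e3, 1e-15]
print(compare_outputs(reference, reproduced))
```

An empty list means every value agrees within the chosen tolerances. The absolute tolerance matters for values near zero, where a relative tolerance alone would never be satisfied.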
When constructing your computational model, you make choices: geometry, meshes, initial conditions, boundary conditions, and material models. Choices made in compilation and numerics (e.g., solvers and time stepping) also affect your results. In addition, research sometimes requires you to modify or add to code and/or write new pre- or post-processors. The combination of these choices makes your research unique. Models can be derivatives of other work and/or part of a release package, and those works should be cited. Your modifications are what make your model distinct, and the fusion of these changes around a research hypothesis forms the so-called third pillar of science, computation; theory and experimentation are the first and second, respectively.
Along with the code repository itself, modeling software often has a defined directory structure for model inputs and outputs. When depositing, retain this directory structure as far as practicable.
Deposit any input file that you have modified and used in support of your publication and that is needed to reproduce results. Deposit any output files needed to verify that the reproduction of results was successful (e.g., log files and scripts) and to support the figures in the publication. Preserve the expected (default) directory structure. From these files, a learner should be able to recreate raw model outputs from the given input files and scripts.
Include:
Input files
See links to individual software packages for specific guidance.
Additional guidance is provided by AGU - Guidelines for Research Primarily Based on Numerical Models or Theory.
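One way to help a learner confirm that a deposit is complete and that its directory structure was preserved is to include a checksum manifest. The sketch below is an assumption for illustration, not a CIG or AGU requirement: the function names and the `MANIFEST.sha256` file name are hypothetical. It walks a deposit directory and records each file's relative path and SHA-256 digest.

```python
import hashlib
from pathlib import Path


def build_manifest(deposit_root):
    """Map each file's path (relative to deposit_root) to its SHA-256 digest."""
    root = Path(deposit_root)
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            # read_bytes() loads the whole file; hash in chunks for very large outputs.
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            manifest[path.relative_to(root).as_posix()] = digest
    return manifest


def write_manifest(deposit_root, name="MANIFEST.sha256"):
    """Write '<digest>  <relative path>' lines to a manifest file in the deposit."""
    root = Path(deposit_root)
    manifest = build_manifest(root)
    manifest.pop(name, None)  # do not list a manifest left over from an earlier run
    lines = [f"{digest}  {rel}" for rel, digest in manifest.items()]
    (root / name).write_text("\n".join(lines) + "\n")
    return manifest
```

Calling `write_manifest("path/to/deposit")` before uploading records the layout as deposited. Because paths are stored relative to the deposit root, the manifest also documents the directory structure the software expects.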
Some but not all research is based on a released and archived version of the code and/or executable binaries. Use the following guidelines to decide whether you must deposit code used in your research.
If you:
Pre- and Post-processing
Scripts
Your data and software should be deposited in a trusted community repository that assigns a persistent identifier (PID) such as a DOI and has a stated preservation policy. PIDs establish the authenticity of a resource and provide access to a resource if its location changes. A URL is NOT a PID.
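To illustrate the difference between a PID and a plain URL, the sketch below checks whether a string has the general shape of a DOI and, if so, builds the corresponding `https://doi.org/` resolver link. The regular expression is a commonly used approximation rather than an official validation rule, and the identifiers in the usage lines are placeholders, not real records.

```python
import re

# Approximate shape of a modern DOI: "10.", a registrant code, "/", a suffix.
# This pattern is an assumption for illustration, not a formal specification.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def doi_to_url(identifier):
    """Return the doi.org resolver URL for a DOI-shaped string, else None."""
    identifier = identifier.strip()
    if DOI_PATTERN.match(identifier):
        return "https://doi.org/" + identifier
    return None


# Placeholder identifiers for illustration only.
print(doi_to_url("10.5281/zenodo.0000000"))
print(doi_to_url("https://example.org/my-model/files"))
```

The first call returns a resolver link that keeps working if the repository moves the landing page. The second returns `None` because an ordinary web address carries no such guarantee.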
Repositories commonly used include:
AGU provides a listing and description of domain-discipline repositories. See also the Generalist Repository Comparison Chart and https://www.re3data.org/.
Check with your journal for recommendations. Most require FAIR-aligned repositories.
CIG has a community on Zenodo to promote discoverability of research products in geodynamics. All CIG software is part of this community. When depositing your model and software, associate your research with our Communities:
Computational Infrastructure for Geodynamics
Aid additional discoverability in Zenodo by using the recommended keywords. The Keyword metadata field is found in the Basic Information section.
Remember to fill in the Related/alternate identifiers metadata field for related publications and datasets. You can edit your metadata after you publish, but you cannot alter the deposited files.
See also our Zenodo Best practices.
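The discoverability fields described above (communities, keywords, and related identifiers) can be sketched as a metadata dictionary in the general shape accepted by Zenodo's deposit REST API. The field names reflect our understanding of that API and should be checked against the current Zenodo documentation. Every value in angle brackets is a placeholder, and the helper `missing_fields` is hypothetical.

```python
import json

# Placeholder metadata; replace every <...> value before use.
metadata = {
    "title": "Model inputs and outputs for <publication title>",
    "upload_type": "dataset",
    "description": "Input files, logs, and scripts supporting <publication>.",
    "creators": [{"name": "<Lastname, Firstname>"}],
    "keywords": ["<recommended keyword 1>", "<recommended keyword 2>"],
    "communities": [{"identifier": "<cig-community-identifier>"}],
    "related_identifiers": [
        {"relation": "isSupplementTo", "identifier": "<DOI of the publication>"},
    ],
}

# Fields this guidance asks authors to fill in to aid discoverability.
DISCOVERABILITY_FIELDS = ("communities", "keywords", "related_identifiers")


def missing_fields(record, fields=DISCOVERABILITY_FIELDS):
    """Return the names of fields that are absent or empty in the record."""
    return [name for name in fields if not record.get(name)]


print(missing_fields(metadata))
print(json.dumps({"metadata": metadata}, indent=2))
```

Running a check like `missing_fields` before publishing is a cheap way to catch an empty keyword or community list while the record can still be prepared carefully.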
Many journals are signatories to or endorse the Enabling FAIR Data Commitment Statement in the Earth, Space, and Environmental Sciences. As such, they share the joint belief that data should, to the greatest extent possible, be shared, open, and stored in community-approved FAIR-aligned repositories, and strive to adopt a shared set of author guidelines that support these principles, providing a common set of expectations for authors in the Earth, space, and environmental sciences. These include:
AGU has the most comprehensive data and software policies as well as guidance for authors. Meeting AGU's requirements should satisfy most journal policies. AGU provides the following templates for data and software:
Depending on the software used, you may consider archiving both your data and your code modifications in the same repository. Include a README file that describes the contents of your deposit and the associated publication. For some data (e.g., observational or experimental), a domain repository is the best place to deposit. Note that not all repositories support depositing both data and software. In addition, creating two citable objects clarifies what is being cited.
Follow the links below to find more information on relevant policies for specific journals.
Publisher | Journal | Resources |
American Astronomical Society Publishing | The Astrophysical Journal | Policy Statement on Software; Data Guide |
American Geophysical Union | Journal of Geophysical Research Solid Earth; Geophysical Research Letters (GRL); Geochemistry, Geophysics, Geosystems (G3); Reviews of Geophysics; Tectonics | Data and Software for Authors |
Copernicus | Solid Earth | Data Policy |
Elsevier | EPSL; Icarus | |
Elsevier | Tectonophysics | Guide for Authors |
IOP | The Astrophysical Journal | Data Availability Policy |
Nature Portfolio | Nature Geoscience; Nature Communications | Reporting standards; Authorship |
Oxford Academic | Geophysical Journal International | Data, Results, and Software Policy |
Additional software-specific guidance on data deposits and on software and data availability statements (links pending):