Diffstat (limited to 'tex')
-rw-r--r-- | tex/src/appendix-existing-tools.tex | 33 |
1 files changed, 33 insertions, 0 deletions
diff --git a/tex/src/appendix-existing-tools.tex b/tex/src/appendix-existing-tools.tex index 3aba534..1062aba 100644 --- a/tex/src/appendix-existing-tools.tex +++ b/tex/src/appendix-existing-tools.tex @@ -324,6 +324,39 @@ The team can host the Git history on a web page and collaborate through that. There are several Git hosting services for example \href{http://codeberg.org}{codeberg.org}, \href{http://gitlab.com}{gitlab.com}, \href{http://bitbucket.org}{bitbucket.org} or \href{http://github.com}{github.com} (among many others). Storing the changes in binary files is also possible in Git, however it is most useful for human-readable plain-text sources. + + + + + + + +\subsection{Archiving} +\label{appendix:archiving} + +Long-term, bytewise, checksummed archiving of software research projects is necessary for a project to be reproducible many decades later. +The Wayback Machine\footnote{\inlinecode{\url{https://archive.org}}} and similar services such as Archive Today\footnote{\inlinecode{\url{https://archive.today}}} provide on-demand long-term archiving of web pages, which is a critically important service for preserving the history of the World Wide Web. +However, research project software archiving requires the preservation of files and metadata about the files, not of web pages. +This is commonly done in public research repositories such as Zenodo\footnote{\inlinecode{\url{https://zenodo.org}}}, which publishes md5sums of uploaded files, freezes them as a DOI-identified version of record, and provides convenient maintenance of metadata by the uploading user. +Universities now regularly provide their own repositories,\footnote{E.g. 
\inlinecode{\url{https://repozytorium.umk.pl}}} many of which are registered with the \emph{Open Archives Initiative} that aims at repository interoperability.\footnote{\inlinecode{\url{https://www.openarchives.org/Register/BrowseSites}}} + +For preserving the full editing records of a software project, \emph{Software Heritage}\citeappendix{dicosmo18} is especially useful. +Software Heritage allows a user to anonymously nominate the URL of a git (or cvs) commit history of any project and request that it be archived. +The Software Heritage scripts (themselves free-licensed) download the repository and allow the repository as a whole or individual files to be accessed using a URI. + +The {\LaTeX} and figure source files for the final research paper itself are also best archived on a preprint server such as ArXiv\footnote{\inlinecode{\url{https://arXiv.org}}}, which pioneered the archiving of research papers. +ArXiv recommends that the figures of a research paper be provided in postscript, a plain-text format, to maximise longevity, and (normally) provides the source package and both postscript and pdf formats of the paper by email and on the web. +ArXiv provides long-term stable URIs, allowing versions, for each accepted research preprint.\footnote{\inlinecode{\url{https://arxiv.org/help/arxiv_identifier}}} + +An open question in archiving the full sequence of steps that go into a quantitative scientific research project is whether, and if so how, to preserve ``scholarly ephemera'' in scientific software development. +This refers to discussion about the software, such as reports on bugs or proposals to add features, which are usually referred to as ``Issues'', and ``pull requests'', which propose that a change be ``pulled'' into the main branch of a software development repository by the core developers. 
+These ephemera are not part of the git commit history of a software project, but add wider context and understanding beyond the commit history itself, and provide a record that could be used to allocate intellectual credit. +For these reasons, the \emph{Investigating \& Archiving the Scholarly Git Experience} (IASGE) project proposes that the ephemera should be archived as well as the git repositories themselves.\footnote{\inlinecode{\href{https://investigating-archiving-git.gitlab.io/updates/define-scholarly-ephemera}{https://investigating-archiving-git.gitlab.io/updates/}}\\\inlinecode{\href{https://investigating-archiving-git.gitlab.io/updates/define-scholarly-ephemera}{define-scholarly-ephemera}}} + + + + + \subsection{Job management} \label{appendix:jobmanagement} Any analysis will involve more than one logical step.