-rw-r--r--  README.md                       |  9
-rw-r--r--  paper.tex                       | 27
-rw-r--r--  tex/src/appendix-necessity.tex  |  2
-rw-r--r--  tex/src/supplement.tex          |  1
4 files changed, 18 insertions, 21 deletions
@@ -1,13 +1,14 @@
-Reproducible source for Akhlaghi et al. (2020, arXiv:2006.03018)
+Reproducible source for Akhlaghi et al. (2021, arXiv:2006.03018)
 ----------------------------------------------------------------
 
-Copyright (C) 2018-2020 Mohammad Akhlaghi <mohammad@akhlaghi.org>\
+Copyright (C) 2018-2021 Mohammad Akhlaghi <mohammad@akhlaghi.org>\
 See the end of the file for license conditions.
 
 This is the reproducible project source for the paper titled "**Towards
 Long-term and Archivable Reproducibility**", by Mohammad Akhlaghi, Raúl
-Infante-Sainz, Boudewijn F. Roukema, David Valls-Gabaud, Roberto
-Baena-Gallé, see [arXiv:2006.03018](https://arxiv.org/abs/2006.03018) or
+Infante-Sainz, Boudewijn F. Roukema, Mohammadreza Khellat, David
+Valls-Gabaud, Roberto Baena-Gallé, see
+[arXiv:2006.03018](https://arxiv.org/abs/2006.03018) or
 [zenodo.3872247](https://doi.org/10.5281/zenodo.3872247).
 
 To learn more about the purpose, principles and technicalities of this
@@ -72,12 +72,12 @@
 %% CONTEXT
 Analysis pipelines commonly use high-level technologies that are popular when created, but are unlikely to be readable, executable, or sustainable in the long term.
 %% AIM
- A set of criteria is introduced to address this problem.
+ A set of criteria is introduced to address this problem:
 %% METHOD
 Completeness (no \new{execution requirement} beyond \new{a minimal Unix-like operating system}, no administrator privileges, no network connection, and storage primarily in plain text); modular design; minimal complexity; scalability; verifiable inputs and outputs; version control; linking analysis with narrative; and free software.
 They have been tested in several research publications in various fields.
 %% RESULTS
- As a proof of concept, ``Maneage'' is introduced for storing projects in machine-actionable and human-readable plain text, enabling cheap archiving, provenance extraction, and peer verification.
+ As a proof of concept, ``Maneage'' is introduced, enabling cheap archiving, provenance extraction, and peer verification.
 %% CONCLUSION
 We show that longevity is a realistic requirement that does not sacrifice immediate or short-term reproducibility.
 The caveats (with proposed solutions) are then discussed and we conclude with the benefits for the various stakeholders.
@@ -85,15 +85,15 @@
 
 \vspace{2.5mm}
 \emph{Appendices} ---
- Two comprehensive appendices that review existing solutions; available
+ Two comprehensive appendices that review the longevity of existing solutions; available
 \ifdefined\separatesupplement
-at \href{https://arxiv.org/abs/\projectarxivid}{\texttt{arXiv:\projectarxivid}} or \href{https://doi.org/10.5281/zenodo.\projectzenodoid}{\texttt{zenodo.\projectzenodoid}}.
+as supplementary ``Web extras'' on the journal webpage.
 \else
-at the end (Appendices \ref{appendix:existingtools} and \ref{appendix:existingsolutions}).
+after main body of paper (Appendices \ref{appendix:existingtools} and \ref{appendix:existingsolutions}).
 \fi
 
 \vspace{2.5mm}
- \emph{Reproducible supplement} ---
+ \emph{Reproducibility} ---
 All products in \href{https://doi.org/10.5281/zenodo.\projectzenodoid}{\texttt{zenodo.\projectzenodoid}},
 Git history of source at \href{https://gitlab.com/makhlaghi/maneage-paper}{\texttt{gitlab.com/makhlaghi/maneage-paper}},
 which is also archived in \href{https://archive.softwareheritage.org/browse/origin/directory/?origin_url=https://gitlab.com/makhlaghi/maneage-paper.git}{SoftwareHeritage}.
@@ -147,19 +147,13 @@ Longevity is defined as the length of time that a project remains \emph{function
 Functionality is defined as \emph{human readability} of the source and its \emph{execution possibility} (when necessary).
 Many usage contexts of a project do not involve execution: for example, checking the configuration parameter of a single step of the analysis to re-\emph{use} in another project, or checking the version of used software, or the source of the input data.
 Extracting these from execution outputs is not always possible.}
-
-Longevity is as important in science as in some fields of industry, but not all; e.g., fast-evolving tools can be appropriate in short-term commercial projects.
-To highlight the necessity, a short review of commonly-used tools is provided below:
-(1) environment isolators (virtual machines, VMs, or containers);
-(2) package managers (PMs, like Conda, Nix, or Spack);
-(3) job management (like shell scripts or Make);
-(4) notebooks (like Jupyter).
-\new{A comprehensive review of existing tools and solutions is available in the
+A basic review of the longevity of commonly-used tools is provided here \new{(for a more comprehensive review, please see
 \ifdefined\separatesupplement
- \href{https://doi.org/10.5281/zenodo.\projectzenodoid}{appendices}.%
+ the supplementary appendices%
 \else%
- appendices (\ref{appendix:existingsolutions}).%
+ appendices \ref{appendix:existingtools} and \ref{appendix:existingsolutions}%
 \fi%
+ ).
 }
 
 To isolate the environment, VMs have sometimes been used, e.g., in \href{https://is.ieis.tue.nl/staff/pvgorp/share}{SHARE} (awarded second prize in the Elsevier Executable Paper Grand Challenge of 2011, but discontinued in 2019).
@@ -692,6 +686,7 @@ The Pozna\'n Supercomputing and Networking Center (PSNC) computational grant 314
 \input{tex/src/appendix-existing-tools.tex}
 \input{tex/src/appendix-existing-solutions.tex}
 \input{tex/src/appendix-used-software.tex}
+%\input{tex/src/appendix-necessity.tex}
 \bibliographystyleappendix{IEEEtran_openaccess}
 \bibliographyappendix{IEEEabrv,references}
 \fi
diff --git a/tex/src/appendix-necessity.tex b/tex/src/appendix-necessity.tex
index 325fb69..7124223 100644
--- a/tex/src/appendix-necessity.tex
+++ b/tex/src/appendix-necessity.tex
@@ -13,7 +13,7 @@
 %% ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
 %% FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
 %% for more details. See <http://www.gnu.org/licenses/>.
-\section{Necessity for reproducible research}
+\section{Necessity for reproducible research\\(not submitted to journal, here for basic review)}
 \label{appendix:necessity}
 
 The increasing volume and complexity of data analysis has been highly productive, giving rise to a new branch of ``Big Data'' in many fields of the sciences and industry.
diff --git a/tex/src/supplement.tex b/tex/src/supplement.tex
index 635efc2..317907d 100644
--- a/tex/src/supplement.tex
+++ b/tex/src/supplement.tex
@@ -85,6 +85,7 @@
 \end{abstract}
 
 %% Import the appendices.
+\appendices
 \input{tex/src/appendix-existing-tools.tex}
 \input{tex/src/appendix-existing-solutions.tex}
 \input{tex/src/appendix-used-software.tex}