Diffstat (limited to 'paper.tex')
-rw-r--r-- | paper.tex | 32 |
1 files changed, 17 insertions, 15 deletions
@@ -79,7 +79,7 @@ at the end (Appendices \ref{appendix:existingtools} and \ref{appendix:existingso
 \emph{Reproducible supplement} ---
 All products in \href{https://doi.org/10.5281/zenodo.\projectzenodoid}{\texttt{zenodo.\projectzenodoid}},
 Git history of source at \href{https://gitlab.com/makhlaghi/maneage-paper}{\texttt{gitlab.com/makhlaghi/maneage-paper}},
- which is also archived on \href{https://archive.softwareheritage.org/browse/origin/directory/?origin_url=https://gitlab.com/makhlaghi/maneage-paper.git}{SoftwareHeritage}.
+ which is also archived in \href{https://archive.softwareheritage.org/browse/origin/directory/?origin_url=https://gitlab.com/makhlaghi/maneage-paper.git}{SoftwareHeritage}.
 \end{abstract}
 % Note that keywords are not normally used for peer-review papers.
@@ -126,9 +126,10 @@ Decades later, scientists are still held accountable for their results and there
 \section{Longevity of existing tools}
 \label{sec:longevityofexisting}
 \new{Reproducibility is defined as ``obtaining consistent results using the same input data; computational steps, methods, and code; and conditions of analysis'' \cite{fineberg19}.
-Longevity is defined as the length of time during which a project remains usable.
-Usability is defined by context: for machines (machine-actionable, or executable files) \emph{and} humans (readability of the source).
-Many usage contexts do not involve execution: for example, checking the configuration parameter of a single step of the analysis to re-\emph{use} in another project, or checking the version of used software, or the source of the input data (extracting these from the outputs of execution is not always possible).}
+Longevity is defined as the length of time that a project remains \emph{usable}.
+Usability is defined by context: for machines (machine-actionable, or executable files) \emph{and/or} humans (readability of the source).
+Many usage contexts do not involve execution: for example, checking the configuration parameter of a single step of the analysis to re-\emph{use} in another project, or checking the version of used software, or the source of the input data.
+Extracting these from the outputs of execution is not always possible.}
 Longevity is as important in science as in some fields of industry, but not all; e.g., fast-evolving tools can be appropriate in short-term commercial projects.
 To highlight the necessity, a short review of commonly-used tools is provided below:
@@ -390,7 +391,7 @@ Other built files (intermediate analysis steps) cascade down in the lineage to o
 Just before reaching the ultimate target (\inlinecode{paper.pdf}), the lineage reaches a bottleneck in \inlinecode{verify.mk} to satisfy the verification criteria (this step was not yet available in \cite{akhlaghi19, infante20}).
 All project deliverables (macro files, plot or table data and other datasets) are verified at this stage, with their checksums, to automatically ensure exact reproducibility.
-Where exact reproducibility is not possible, values can be verified by any statistical means, specified by the project authors.
+Where exact reproducibility is not possible \new{(for example due to parallelization)}, values can be verified by any statistical means, specified by the project authors.
 \begin{figure*}[t]
 \begin{center} \includetikz{figure-branching}{scale=1}\end{center}
@@ -498,13 +499,14 @@ However, because the PM and analysis components share the same job manager (Make
 They later share their low-level commits on the core branch, thus propagating it to all derived projects.
 A related caveat is that, POSIX is a fuzzy standard, not guaranteeing the bit-wise reproducibility of programs.
-It has been chosen here, however, as the underlying platform because our focus is on reproducing the results (data), which does not necessarily need bit-wise reproducible software.
-POSIX is ubiquitous and low-level software (e.g., core GNU tools) are install-able on most; each internally corrects for differences affecting its functionality (partly as part of the GNU portability library).
+It has been chosen here, however, as the underlying platform \new{because our focus is on reproducing the results (output of software), not the software itself.}
+POSIX is ubiquitous and low-level software (e.g., core GNU tools) are install-able on most.
+Well written software internally corrects for differences in OS or hardware that may affect its functionality (through tools like the GNU portability library).
 On GNU/Linux hosts, Maneage builds precise versions of the compilation tool chain.
-However, glibc is not install-able on some POSIX OSs (e.g., macOS).
-All programs link with the C library, and this may hypothetically hinder the exact reproducibility \emph{of results} on non-GNU/Linux systems, but we have not encountered this in our research so far.
-With everything else under precise control, the effect of differing Kernel and C libraries on high-level science can now be systematically studied with Maneage in follow-up research.
-\new{Using continuous integration (CI) is one way to precisely identify breaking points with updated technologies on available systems.}
+However, glibc is not install-able on some POSIX OSs (e.g., macOS) and all programs link with the C library.
+This may hypothetically hinder the exact reproducibility \emph{of results} on non-GNU/Linux systems, but we have not encountered this in our research so far.
+With everything else under precise control in Maneage, the effect of differing Kernel and C libraries on high-level science can now be systematically studied in follow-up research \new{(including floating-point arithmetic or optimization differences).
+Using continuous integration (CI) is one way to precisely identify breaking points on multiple systems.}
 % DVG: It is a pity that the following paragraph cannot be included, as it is really important but perhaps goes beyond the intended goal.
 %Thirdly, publishing a project's reproducible data lineage immediately after publication enables others to continue with follow-up papers, which may provide unwanted competition against the original authors.
@@ -528,7 +530,7 @@ From the data repository perspective, these criteria can also be useful, e.g., t
 (2) Automated and persistent bidirectional linking of data and publication can be established through the published \emph{and complete} data lineage that is under version control.
 (3) Software management: with these criteria, each project comes with its unique and complete software management.
 It does not use a third-party PM that needs to be maintained by the data center (and the many versions of the PM), hence enabling robust software management, preservation, publishing, and citation.
-For example, see \href{https://doi.org/10.5281/zenodo.3524937}{zenodo.3524937}, \href{https://doi.org/10.5281/zenodo.3408481}{zenodo.3408481}, \href{https://doi.org/10.5281/zenodo.1163746}{zenodo.1163746}, where we have exploited the free-software criterion to distribute the tarballs of all the software used with each project's source as deliverables.
+For example, see \href{https://doi.org/10.5281/zenodo.1163746}{zenodo.1163746}, \href{https://doi.org/10.5281/zenodo.3408481}{zenodo.3408481}, \href{https://doi.org/10.5281/zenodo.3524937}{zenodo.3524937}, \href{https://doi.org/10.5281/zenodo.3951151}{zenodo.3951151} or \href{https://doi.org/10.5281/zenodo.4062460}{zenodo.4062460} where we have exploited the free-software criterion to distribute the source code of all software used in each project as deliverables.
 (4) ``Linkages between documentation, code, data, and journal articles in an integrated environment'', which effectively summarizes the whole purpose of these criteria.
@@ -1098,7 +1100,7 @@ In summary IDEs are generally very specialized tools, for special projects and a
 \label{appendix:jupyter}
 Jupyter (initially IPython) \citeappendix{kluyver16} is an implementation of Literate Programming \citeappendix{knuth84}.
 The main user interface is a web-based ``notebook'' that contains blobs of executable code and narrative.
-Jupyter uses the custom built \inlinecode{.ipynb} format\footnote{\url{https://nbformat.readthedocs.io/en/latest}}.
+Jupyter uses the custom built \inlinecode{.ipynb} format\footnote{\inlinecode{\url{https://nbformat.readthedocs.io/en/latest}}}.
 Jupyter's name is a combination of the three main languages it was designed for: Julia, Python and R.
 The \inlinecode{.ipynb} format, is a simple, human-readable (can be opened in a plain-text editor) file, formatted in JavaScript Object Notation (JSON).
 It contains various kinds of ``cells'', or blobs, that can contain narrative description, code, or multi-media visualizations (for example images/plots), that are all stored in one file.
@@ -1110,7 +1112,7 @@ Defining dependencies between the cells can allow non-linear execution which is
 It allows automation, run-time optimization (deciding not to run a cell if its not necessary) and parallelization.
 However, Jupyter currently only supports a linear run of the cells: always from the start to the end.
 It is possible to manually execute only one cell, but the previous/next cells that may depend on it, also have to be manually run (a common source of human error, and frustration for complex operations).
-Integration of directional graph features (dependencies between the cells) into Jupyter has been discussed, but as of this publication, there is no plan to implement it (see Jupyter's GitHub issue 1175\footnote{\url{https://github.com/jupyter/notebook/issues/1175}}).
+Integration of directional graph features (dependencies between the cells) into Jupyter has been discussed, but as of this publication, there is no plan to implement it (see Jupyter's GitHub issue 1175\footnote{\inlinecode{\url{https://github.com/jupyter/notebook/issues/1175}}}).
 The fact that the \inlinecode{.ipynb} format stores narrative text, code and multi-media visualization of the outputs in one file, is another major hurdle:
 The files can easy become very large (in volume/bytes) and hard to read.
@@ -1302,7 +1304,7 @@ Taverna is only a workflow manager and isn't integrated with a package manager,
 \label{appendix:madagascar}
 Madagascar\footnote{\inlinecode{\url{http://ahay.org}}} \citeappendix{fomel13} is a set of extensions to the SCons job management tool (reviewed in \ref{appendix:scons}).
 Madagascar is a continuation of the Reproducible Electronic Documents (RED) project that was discussed in Appendix \ref{appendix:red}.
-Madagascar has been used in the production of hundreds of research papers or book chapters\footnote{\url{http://www.ahay.org/wiki/Reproducible_Documents}}, 120 prior to \citeappendix{fomel13}.
+Madagascar has been used in the production of hundreds of research papers or book chapters\footnote{\inlinecode{\url{http://www.ahay.org/wiki/Reproducible_Documents}}}, 120 prior to \citeappendix{fomel13}.
 Madagascar does include project management tools in the form of SCons extensions.
 However, it isn't just a reproducible project management tool.