-rw-r--r-- | paper.tex | 21 |
1 files changed, 10 insertions, 11 deletions
@@ -126,11 +126,10 @@ Decades later, scientists are still held accountable for their results and there
 \section{Longevity of existing tools}
 \label{sec:longevityofexisting}
 \new{Reproducibility is defined as ``obtaining consistent results using the same input data; computational steps, methods, and code; and conditions of analysis'' \cite{fineberg19}.
-We define \emph{longevity} as the length of time that the source of a project remains:
-(1) readable for humans, \emph{and}
-(2) \emph{machine-actionable} \emph{and/or} executable for the available machines and platforms during the very time span.
-Many usage contexts do not involve execution: for example, checking the configuration parameter of a single step of the analysis to re-\emph{use} in another project, or checking the version of used software, or the source of the input data.
-Extracting these from the outputs of execution is not always possible.}
+Longevity is defined as the length of time that a project remains \emph{functional} after its creation.
+Functionality is defined as \emph{human readability} of the source and its \emph{execution possibility} (when necessary).
+Many usage contexts of a project don't involve execution: for example, checking the configuration parameter of a single step of the analysis to re-\emph{use} in another project, or checking the version of used software, or the source of the input data.
+Extracting these from execution outputs is not always possible.}
 Longevity is as important in science as in some fields of industry, but not all; e.g., fast-evolving tools can be appropriate in short-term commercial projects.
 To highlight the necessity, a short review of commonly-used tools is provided below:
@@ -276,6 +275,7 @@ In such cases, it is best to immediately convert the data upon collection, and a
+
 \section{Proof of concept: Maneage}
 With the longevity problems of existing tools outlined above, a proof-of-concept tool is presented here via an implementation that has been tested in published papers \cite{akhlaghi19, infante20}.
@@ -322,12 +322,11 @@ On GNU/Linux distributions, even the GNU Compiler Collection (GCC) and GNU Binut
 Currently, {\TeX}Live is also being added (task \href{http://savannah.nongnu.org/task/?15267}{15267}), but that is only for building the final PDF, not affecting the analysis or verification.
 \new{Finally, some software cannot be built on some CPU architectures, hence by default, the architecture is included in the final built paper automatically (see below).}
-\new{Because everything is built from source, building the core Maneage environment on an 8-core CPU takes about 1.5 hours (GCC consumes more than half of the time).
-When the analysis involves complex computations, this is negligible compared to the actual analysis.
-Also, due to the Git features blended into Maneage, it is best (from the perspective of provenance) to start a project immediately within Maneage, thereby recording the history of changes as the project matures.
-To avoid repeating the build on different systems, Maneage'd projects can be built in a container or VM.
-The \inlinecode{README.md} file \href{https://archive.softwareheritage.org/browse/origin/directory/?origin_url=http://git.maneage.org/project.git}{has instructions} on building a Maneage'd project in Docker.
-Through Docker (or VMs), users on Microsoft Windows can benefit from Maneage, and for Windows-native software that can be run in batch-mode, technologies like Windows Subsystem for Linux can be used.}
+\new{Building the core Maneage software environment on an 8-core CPU takes about 1.5 hours (GCC consumes more than half of the time).
+However, this is only necessary once for every computer, the analysis phase (which usually takes months to write for a normal project) will use the same environment later.
+To facilitate moving to another computer in the short term, Maneage'd projects can be built in a container or VM.
+The \href{https://archive.softwareheritage.org/browse/origin/directory/?origin_url=http://git.maneage.org/project.git}{\inlinecode{README.md}} file has instructions on building in Docker.
+Through Docker (or VMs), users on Microsoft Windows can benefit from Maneage, and for Windows-native software that can be run in batch-mode, evolving technologies like Windows Subsystem for Linux may be usable.}
 The analysis phase of the project however is naturally different from one project to another at a low-level.
 It was thus necessary to design a generic framework to comfortably host any project, while still satisfying the criteria of modularity, scalability, and minimal complexity.