-rw-r--r-- paper.tex 21
1 files changed, 10 insertions, 11 deletions
diff --git a/paper.tex b/paper.tex
index 08dab9f..d07eabe 100644
--- a/paper.tex
+++ b/paper.tex
@@ -126,11 +126,10 @@ Decades later, scientists are still held accountable for their results and there
\section{Longevity of existing tools}
\label{sec:longevityofexisting}
\new{Reproducibility is defined as ``obtaining consistent results using the same input data; computational steps, methods, and code; and conditions of analysis'' \cite{fineberg19}.
-We define \emph{longevity} as the length of time that the source of a project remains:
-(1) readable for humans, \emph{and}
-(2) \emph{machine-actionable} \emph{and/or} executable for the available machines and platforms during the very time span.
-Many usage contexts do not involve execution: for example, checking the configuration parameter of a single step of the analysis to re-\emph{use} in another project, or checking the version of used software, or the source of the input data.
-Extracting these from the outputs of execution is not always possible.}
+Longevity is defined as the length of time that a project remains \emph{functional} after its creation.
+Functionality is defined as \emph{human readability} of the source and its \emph{execution possibility} (when necessary).
+Many usage contexts of a project don't involve execution: for example, checking the configuration parameter of a single step of the analysis to re-\emph{use} in another project, or checking the version of used software, or the source of the input data.
+Extracting these from execution outputs is not always possible.}
Longevity is as important in science as in some fields of industry, but not all; e.g., fast-evolving tools can be appropriate in short-term commercial projects.
To highlight the necessity, a short review of commonly-used tools is provided below:
@@ -276,6 +275,7 @@ In such cases, it is best to immediately convert the data upon collection, and a
+
\section{Proof of concept: Maneage}
With the longevity problems of existing tools outlined above, a proof-of-concept tool is presented here via an implementation that has been tested in published papers \cite{akhlaghi19, infante20}.
@@ -322,12 +322,11 @@ On GNU/Linux distributions, even the GNU Compiler Collection (GCC) and GNU Binut
Currently, {\TeX}Live is also being added (task \href{http://savannah.nongnu.org/task/?15267}{15267}), but that is only for building the final PDF, not affecting the analysis or verification.
\new{Finally, some software cannot be built on some CPU architectures, hence by default, the architecture is included in the final built paper automatically (see below).}
-\new{Because everything is built from source, building the core Maneage environment on an 8-core CPU takes about 1.5 hours (GCC consumes more than half of the time).
-When the analysis involves complex computations, this is negligible compared to the actual analysis.
-Also, due to the Git features blended into Maneage, it is best (from the perspective of provenance) to start a project immediately within Maneage, thereby recording the history of changes as the project matures.
-To avoid repeating the build on different systems, Maneage'd projects can be built in a container or VM.
-The \inlinecode{README.md} file \href{https://archive.softwareheritage.org/browse/origin/directory/?origin_url=http://git.maneage.org/project.git}{has instructions} on building a Maneage'd project in Docker.
-Through Docker (or VMs), users on Microsoft Windows can benefit from Maneage, and for Windows-native software that can be run in batch-mode, technologies like Windows Subsystem for Linux can be used.}
+\new{Building the core Maneage software environment on an 8-core CPU takes about 1.5 hours (GCC consumes more than half of the time).
+However, this is only necessary once for every computer, the analysis phase (which usually takes months to write for a normal project) will use the same environment later.
+To facilitate moving to another computer in the short term, Maneage'd projects can be built in a container or VM.
+The \href{https://archive.softwareheritage.org/browse/origin/directory/?origin_url=http://git.maneage.org/project.git}{\inlinecode{README.md}} file has instructions on building in Docker.
+Through Docker (or VMs), users on Microsoft Windows can benefit from Maneage, and for Windows-native software that can be run in batch-mode, evolving technologies like Windows Subsystem for Linux may be usable.}
The analysis phase of the project however is naturally different from one project to another at a low-level.
It was thus necessary to design a generic framework to comfortably host any project, while still satisfying the criteria of modularity, scalability, and minimal complexity.