author Mohammad Akhlaghi <mohammad@akhlaghi.org> 2020-06-30 03:39:36 +0100
committer Mohammad Akhlaghi <mohammad@akhlaghi.org> 2020-06-30 03:39:36 +0100
commit fb7154a9c0595f616036b0cea4c1d1dd38863496
tree 381c903113c16e7dbd72f1f1f580cae80acabe2e
parent 991f4c25729ac2526f30abf3d68111dd820fbfed
Implemented comments by Mervyn O'Luing
Mervyn had read the paper and provided some interesting thoughts that I tried to implement. Mervyn's comments are shown below.

I just haven't addressed the last point yet, because I am afraid it may make the text too long (we are already on the boundary of the word-limit). We have already discussed that it is a good research topic, and have hopefully triggered the curiosity of the readers to test it ;-).

-------------------
Page 2:
Regarding Criterion 1: Completeness. A project must be self contained? So this includes not requiring root or administrator privileges. This suggests that the project is only made open after the development has been completed?
Regarding Criterion 5: 'a clerk can do it' -- in the pc world that we live in could this be taken as a disparaging comment?

Page 5:
'The C library is linked with all programs, and this dependence can hypothetically hinder exact reproducibility of results, but we have not encountered this so far.' - what do you think might happen if this does affect reproducibility? Do you have a plan to deal with this? Or are you going to wait until you hear of such cases as the number will probably be small? Have you done probability analysis to show that the rates are likely to be very small? Or should you have a disclaimer with maneage?
-rw-r--r-- paper.tex 36
1 files changed, 17 insertions, 19 deletions
diff --git a/paper.tex b/paper.tex
index e8b8e70..a306ce6 100644
--- a/paper.tex
+++ b/paper.tex
@@ -140,8 +140,7 @@ We will thus focus on Docker here.
Ideally, it is possible to precisely identify the Docker ``images'' that are imported with their checksums, but that is rarely practiced in most solutions that we have surveyed.
Usually, images are imported with generic operating system (OS) names; e.g., \cite{mesnard20} uses `\inlinecode{FROM ubuntu:16.04}'.
-The extracted tarball (from \url{https://partner-images.canonical.com/core/xenial}) is updated almost monthly and only the most
-recent five are archived.
+The extracted tarball (from \url{https://partner-images.canonical.com/core/xenial}) is updated almost monthly and only the most recent five are archived.
Hence, if the Dockerfile is run in different months, its output image will contain different OS components.
In the year 2024, when long-term support for this version of Ubuntu expires, the image will be unavailable at the expected URL.
Other OSes have similar issues because pre-built binary files are large and expensive to maintain and archive.
@@ -184,14 +183,14 @@ We argue and propose that workflows satisfying the following criteria can not on
\textbf{Criterion 1: Completeness.}
A project that is complete (self-contained) has the following properties.
-(1) It has no dependency beyond the Portable Operating System Interface: POSIX (a minimal Unix-like environment).
+(1) No dependency beyond the Portable Operating System Interface: POSIX (a minimal Unix-like environment).
POSIX has been developed by the Austin Group (which includes IEEE) since 1988 and many OSes have complied.
-(2) ``No dependency'' requires that the project itself must be primarily stored in plain text, not needing specialized software to open, parse, or execute.
-(3) It does not affect the host OS (its libraries, programs, or environment).
-(4) It does not require root or administrator privileges.
-(5) It builds its own controlled software for an independent environment.
-(6) It can run locally (without an internet connection).
-(7) It contains the full project's analysis, visualization \emph{and} narrative: from access to raw inputs to doing the analysis, producing final data products \emph{and} its final published report with figures \emph{as output}, e.g., PDF or HTML.
+(2) Primarily stored as plain text, not needing specialized software to open, parse, or execute.
+(3) No effect on the host OS libraries, programs or environment.
+(4) Does not require root privileges to run (during development or post-publication).
+(5) Builds its own controlled software for an independent environment.
+(6) Can run locally (without an internet connection).
+(7) Contains the full project's analysis, visualization \emph{and} narrative: including instructions to automatically access/download raw inputs, build necessary software, do the analysis, produce final data products \emph{and} final published report with figures \emph{as output}, e.g., PDF or HTML.
(8) It can run automatically, with no human interaction.
\textbf{Criterion 2: Modularity.}
@@ -215,8 +214,7 @@ A scalable project can easily be used in arbitrarily large and/or complex projec
On a small scale, the criteria here are trivial to implement, but can rapidly become unsustainable.
\textbf{Criterion 5: Verifiable inputs and outputs.}
-The project should verify its inputs (software source code and data) \emph{and} outputs.
-Reproduction should be straightforward enough so that ``\emph{a clerk can do it}''\cite{claerbout1992} (with no expert knowledge).
+The project should automatically verify its inputs (software source code and data) \emph{and} outputs, not needing any expert knowledge.
\textbf{Criterion 6: Recorded history.}
No exploratory research is done in a single, first attempt.
@@ -228,12 +226,11 @@ The derivation ``history'' of a result is thus not any the less valuable as itse
\textbf{Criterion 7: Including narrative, linked to analysis.}
A project is not just its computational analysis.
A raw plot, figure or table is hardly meaningful alone, even when accompanied by the code that generated it.
-A narrative description must also be part of the deliverables (defined as ``data article'' in \cite{austin17}): describing the purpose of the computations, and interpretations of the result, and the context in relation to
-other projects/papers.
+A narrative description is also a deliverable (defined as ``data article'' in \cite{austin17}): describing the purpose of the computations, and interpretations of the result, and the context in relation to other projects/papers.
This is related to longevity, because if a workflow contains only the steps to do the analysis or generate the plots, in time it may get separated from its accompanying published paper.
\textbf{Criterion 8: Free and open source software:}
-Reproducibility (defined in \cite{fineberg19}) can be achieved with a black box (non-free or non-open-source software); this criterion is therefore necessary because nature is already a black box.
+Reproducibility (defined in \cite{fineberg19}) is possible with a black box (non-free or non-open-source software); this criterion is therefore necessary because nature is already a black box.
A project that is free software (as formally defined), allows others to learn from, modify, and build upon it.
When the software used by the project is itself also free, the lineage can be traced to the core algorithms, possibly enabling optimizations on that level and it can be modified for future hardware.
In contrast, non-free tools typically cannot be distributed or modified by others, making it reliant on a single supplier (even without payments).
@@ -283,10 +280,10 @@ Acting as a link, the macro files build the core skeleton of Maneage.
For example, during the software building phase, each software package is identified by a \LaTeX{} file, containing its official name, version and possible citation.
These are combined at the end to generate precise software acknowledgment and citation (see \cite{akhlaghi19, infante20}), which are excluded here because of the strict word limit.
The macro files also act as Make \emph{targets} and \emph{prerequisites} to allow accurate dependency tracking and optimized execution (in parallel, no redundancies), for any level of complexity (e.g., Maneage builds Matplotlib if requested; see Figure~1 of \cite{alliez19}).
-All software dependencies are built down to precise versions of every tool, including the shell, POSIX tools (e.g.,
-GNU Coreutils) or \TeX{}Live, providing the same environment.
-On GNU/Linux distributions, even the GNU Compiler Collection (GCC) and GNU Binutils are built from source and the GNU C library (glibc) is being added (task 15390).
-Temporary relocation of a project, without building from source, can be done by building the project in a container or VM.
+All software dependencies are built down to precise versions of every tool, including the shell, POSIX tools (e.g., GNU Coreutils) and, of course, the high-level science software.
+On GNU/Linux distributions, even the GNU Compiler Collection (GCC) and GNU Binutils are built from source and the GNU C library (glibc) is being added (task \href{http://savannah.nongnu.org/task/?15390}{15390}).
+Currently {\TeX}Live is also being added (task \href{http://savannah.nongnu.org/task/?15267}{15267}), but that is only for building the final PDF, not affecting the analysis or verification.
+Temporary relocation of a built project, without building from source, can be done by building the project in a container or VM (\inlinecode{README.md} has recommendations on building a \inlinecode{Dockerfile}).
The analysis phase of the project however is naturally different from one project to another at a low-level.
It was thus necessary to design a generic framework to comfortably host any project, while still satisfying the criteria of modularity, scalability, and minimal complexity.
@@ -455,7 +452,7 @@ With everything else under precise control, the effect of differing Kernel and C
%This is a long-term goal and would require major changes to academic value systems.
%2) Authors can be given a grace period where the journal or a third party embargoes the source, keeping it private for the embargo period and then publishing it.
-Other implementations of the criteria, or future improvements in Maneage, may solve some of the caveats above, but this proof of concept already shows their many advantages.
+Other implementations of the criteria, or future improvements in Maneage, may solve some of the caveats, but this proof of concept already shows their many advantages.
For example, publication of projects meeting these criteria on a wide scale will allow automatic workflow generation, optimized for desired characteristics of the results (e.g., via machine learning).
The completeness criterion implies that algorithms and data selection can be included in the optimizations.
@@ -497,6 +494,7 @@ Johan Knapen,
Tamara Kovazh,
Terry Mahoney,
Ryan O'Connor,
+Mervyn O'Luing,
Simon Portegies Zwart,
Idafen Santana-P\'erez,
Elham Saremi,