-rw-r--r-- paper.tex 35
1 files changed, 28 insertions, 7 deletions
diff --git a/paper.tex b/paper.tex
index 63db54e..5d6b88e 100644
--- a/paper.tex
+++ b/paper.tex
@@ -52,15 +52,36 @@
%% Abstract
{\noindent\mpregular
- Over the last 30 years, many reproducible workflow solutions have been proposed, mostly using the common high-level technology of the day.
- Thus providing immediate reproducibility, but problematic in the long-term because high-level technologies evolve.
- Scientists are accountable to their results decades later, and don't have the resources to re-write their projects.
+ %%CONTEXT
+ Many reproducible workflow solutions have been proposed during recent decades.
+ Most use high-level technology that is popular, providing immediate reproducibility that is not sustainable in the long term.
This creates generational gaps between scientists and makes it hard to build upon previous work.
- In this paper, we report the result of our research project on a fundamentally new design that is founded on the principles of completeness (e.g., no dependency beyond a POSIX-compatible operating system, no administrator privileges, and no network connection), with modular and straightforward design, temporal provenance, scalability, and free software.
- It is called Maneage (managing+lineage) and is stored in machine-actionable, and human-readable plain-text format.
- Facilitating version-control, publication, archival, and automatic parsing to extract data provenance.
- It can build its environment automatically, possibly, in virtual machines, containers or any future technology as a binary blob for immediate/fast reproduction.
+ Decades after their results are published, scientists lack the resources to re-write their project software.
+ %% [This is probably the sentence in this section that could most easily be
+ %%removed: it more or less repeats the basic issue of reproducibility.]
+ %%AIM
+ We aim to introduce a standard of reproducibility criteria that is more rigorous than those previously adopted.
+ %%METHOD
+ In this paper, we propose this new standard: completeness (no dependency beyond a POSIX-compatible operating system, no administrator privileges, and no network connection); modular and straightforward design; temporal provenance; scalability; and free software.
+ %% I would suggest "free-licensed software" or "free-and-open-source
+ %% software". RMS would scream at us, but the risk is that the editor (or
+ %% reader) thinks of free-as-in-beer software. Alternatives include "free
+ %% software (as in free speech)" - but that looks a bit too informal - or
+ %% long expressions such as "free software (in the sense of the Free
+ %% Software Definition). If we have enough words available, "software
+ %% satisfying the Free Software Definition" would be clear and formal (but
+ %% probably too specific, since there's also the Open Source Software
+ %% Definition of the OSI, and Debian's DFSG).
+ %
+ %%RESULTS
+ We demonstrate that these criteria are achievable by presenting a concrete example that satisfies them.
+ "Maneage" (managing+lineage) is stored in machine-actionable, human-readable plain-text format, with version-control, archival, automatic parsing to extract data provenance, and peer-reviewable paper verification.
+ %% It can build its environment automatically or can be placed in a container as a binary blob for immediate/fast reproduction.
+ %% [This sentence is probably sort of true for many systems, and is less critical to the "research question"; I suggest dropping it.]
Maneage has already been used in several scientific publications including the present one, with snapshot \projectversion.
+ %%CONCLUSION
+ Thus, it is realistic to require that reproducibility solutions satisfy our newly proposed standard.
+
\horizontalline
\noindent