path: root/paper.tex
author: Mohammad Akhlaghi <mohammad@akhlaghi.org> 2020-05-30 02:36:03 +0100
committer: Mohammad Akhlaghi <mohammad@akhlaghi.org> 2020-05-30 02:36:03 +0100
commit: 666091b70b6ceeedf5760d2e7b16b40dbbeb7ca8
tree: 685f346debbd308ba5f2fa2a42bfae9e6fa847a0 /paper.tex
parent: 097077d8e03ec82f16b58d4aac083305e51701cc
Minor edits removing redundant sentences
Some of the redundant sentences have been removed and some minor edits made.
Diffstat (limited to 'paper.tex')
-rw-r--r-- paper.tex 7
1 files changed, 3 insertions, 4 deletions
diff --git a/paper.tex b/paper.tex
index 51e816e..b8b502b 100644
--- a/paper.tex
+++ b/paper.tex
@@ -102,9 +102,8 @@ Data Lineage, Provenance, Reproducibility, Scientific Pipelines, Workflows
Reproducible research has been discussed in the sciences for at least 30 years \cite{claerbout1992, fineberg19}.
Many reproducible workflow solutions (hereafter, ``solutions'') have been proposed, mostly relying on the common technology of the day: starting with Make and Matlab libraries in the 1990s, to Java in the 2000s and mostly shifting to Python during the last decade.
-Recently, controlling the environment has been facilitated through generic package managers (PMs) and containers.
-However, because of their high-level nature, such third-party tools for the workflow (not the analysis) develop very fast, e.g., Python 2 code often cannot run with Python 3, interrupting many projects.
+However, technologies develop very fast, e.g., Python 2 code often cannot run with Python 3, interrupting many projects in the last decade.
The cost of staying up to date within this rapidly evolving landscape is high.
Scientific projects, in particular, suffer the most: scientists have to focus on their own research domain, but to some degree they need to understand the technology of their tools, because it determines their results and interpretations.
Decades later, scientists are still held accountable for their results.
@@ -119,7 +118,7 @@ While longevity is important in science and some fields of industry, this is not
To highlight the necessity of longevity in reproducible research, some of the most commonly used tools are reviewed here from this perspective.
Most existing solutions use a common set of third-party tools that can be categorized as:
(1) environment isolators -- virtual machines (VMs) or containers;
-(2) PMs -- Conda, Nix, or Spack;
+(2) package managers (PMs) -- Conda, Nix, or Spack;
(3) job management -- shell scripts, Make, SCons, or CGAT-core;
(4) notebooks -- such as Jupyter.
@@ -273,7 +272,7 @@ Manually updating these in the narrative is prone to errors and discourages impr
The ultimate aim of any project is to produce a report accompanying a dataset, providing visualizations, or a research article in a journal.
Let's call this \inlinecode{paper.pdf}.
-Acting as a link, the macro files of each analysis step (which produce numbers, tables, figures included in the report) thus build the core structure (skeleton) of Maneage.
+Acting as a link, the macro files described above therefore build the core skeleton of Maneage.
For example, during the software building phase, each software package is identified by a \LaTeX{} file, containing its official name, version and possible citation.
These are combined in the end to generate precise software acknowledgment and citation (see \cite{akhlaghi19, infante20}; excluded here due to the strict word limit).