Diffstat (limited to 'tex/src/appendix-existing-tools.tex')
-rw-r--r-- tex/src/appendix-existing-tools.tex | 22
1 files changed, 19 insertions, 3 deletions
diff --git a/tex/src/appendix-existing-tools.tex b/tex/src/appendix-existing-tools.tex
index 794a3fe..b6068e4 100644
--- a/tex/src/appendix-existing-tools.tex
+++ b/tex/src/appendix-existing-tools.tex
@@ -85,7 +85,12 @@ On a more fundamental level, VMs or contains do not store \emph{how} the core en
This information is usually in a third-party repository, and not necessarily inside the container or VM file, making it hard (if not impossible) to track for future users.
This is a major problem for reproducibility, and it is also highlighted as a major issue for long-term reproducibility in \citeappendix{oliveira18}.
-The example of \cite{mesnard20} was previously mentioned in Section \ref{criteria}.
+The example of \cite{mesnard20} was previously mentioned in
+\ifdefined\separatesupplement
+the main body of this paper, when discussing the criteria.
+\else
+Section \ref{criteria}.
+\fi
Another useful example is the \href{https://github.com/benmarwick/1989-excavation-report-Madjedbebe/blob/master/Dockerfile}{\inlinecode{Dockerfile}} of \citeappendix{clarkso15} (published in June 2015) which starts with \inlinecode{FROM rocker/verse:3.3.2}.
When we tried to build it (November 2020), the core downloaded image (\inlinecode{rocker/verse:3.3.2}, with image ``digest'' \inlinecode{sha256:c136fb0dbab...}) was created in October 2018 (long after the publication of that paper).
In principle, it is possible to investigate the difference between this new image and the old one that the authors used, but that would require a lot of effort and may not be possible where the changes are not available in a third public repository or not under version control.
@@ -288,7 +293,12 @@ Hence we will just review Git here, but the general concept of version control i
\subsubsection{Git}
With Git, changes in a project's contents are accurately identified by comparing them with their previous version in the archived Git repository.
When the user decides the changes are significant compared to the archived state, they can ``commit'' the changes into the history/repository.
-The commit involves copying the changed files into the repository and calculating a 40 character checksum/hash that is calculated from the files, an accompanying ``message'' (a narrative description of the purpose/goals of the changes), and the previous commit (thus creating a ``chain'' of commits that are strongly connected to each other like Figure \ref{fig:branching}).
+The commit involves copying the changed files into the repository and computing a 40-character checksum/hash from the files, an accompanying ``message'' (a narrative description of the purpose/goals of the changes), and the previous commit (thus creating a ``chain'' of commits that are strongly connected to each other like
+\ifdefined\separatesupplement
+the figure on Git in the main body of the paper).
+\else
+Figure \ref{fig:branching}).
+\fi
For example \inlinecode{f4953cc\-f1ca8a\-33616ad\-602ddf\-4cd189\-c2eff97b} is a commit identifier in the Git history of this project.
Commits are commonly summarized by the checksum's first few characters, for example \inlinecode{f4953cc}.
@@ -313,7 +323,13 @@ There are many tools for managing the sequence of jobs, below we review the most
The most commonly used workflow system for many researchers is to run the commands, experiment on them and keep the output when they are happy with it.
As an improvement, some also keep a narrative description of what they ran.
At least in our personal experience with colleagues, this method is still heavily practiced by many researchers.
-Given that many researchers do not get trained well in computational methods, this is not surprizing and as discussed in Section \ref{discussion}, we believe that improved literacy in computational methods is the single most important factor for the integrity/reproducibility of modern science.
+Given that many researchers are not trained well in computational methods, this is not surprising and, as discussed in
+\ifdefined\separatesupplement
+the discussion section of the main paper,
+\else
+Section \ref{discussion},
+\fi
+we believe that improved literacy in computational methods is the single most important factor for the integrity/reproducibility of modern science.
\subsubsection{Scripts}
\label{appendix:scripts}