From 0f3d6ab47f96ef14cd197f2ff3228338b57c937a Mon Sep 17 00:00:00 2001
From: Boud Roukema
Date: Wed, 2 Dec 2020 05:13:23 +0100
Subject: URL of statistical verification

This commit adds the SWH URL of the statistical verification script to
the paper and tidies up the corresponding answer in '1-answer.txt'. The
script file includes more extensive documentation than the earlier 'make'
version of the method.
---
 paper.tex                |  4 ++--
 peer-review/1-answer.txt | 17 ++++++++++-------
 2 files changed, 12 insertions(+), 9 deletions(-)

diff --git a/paper.tex b/paper.tex
index 93f8094..98498ad 100644
--- a/paper.tex
+++ b/paper.tex
@@ -287,7 +287,7 @@ In such cases, it is best to immediately convert the data upon collection, and a
 \section{Proof of concept: Maneage}
 With the longevity problems of existing tools outlined above, a proof-of-concept tool is presented here via an implementation that has been tested in published papers \cite{akhlaghi19, infante20}.
-\new{Since the initial submission of this paper, it has also been used in \href{https://doi.org/10.5281/zenodo.3951151}{zenodo.3951151} (on the COVID-19 pandemic) and \href{https://doi.org/10.5281/zenodo.4062460}{zenodo.4062460} (which illustrates statistical reproducibility for parallelised code).}
+\new{Since the initial submission of this paper, it has also been used in \href{https://doi.org/10.5281/zenodo.3951151}{zenodo.3951151} (on the COVID-19 pandemic) and \href{https://doi.org/10.5281/zenodo.4062460}{zenodo.4062460} (which \href{https://archive.softwareheritage.org/swh:1:cnt:4217e24e4a474ba43a4d30abfb0a42b823ef4640}{illustrates statistical reproducibility}, e.g., for parallelised code).}
 It was also awarded a Research Data Alliance (RDA) adoption grant for implementing the recommendations of the joint RDA and World Data System (WDS) working group on Publishing Data Workflows \cite{austin17}, from the researchers' perspective.
 The tool is called Maneage, for \emph{Man}aging data Lin\emph{eage} (the ending is pronounced as in ``lineage''), hosted at \url{https://maneage.org}.
@@ -408,7 +408,7 @@ Other built files (intermediate analysis steps) cascade down in the lineage to o
 Just before reaching the ultimate target (\inlinecode{paper.pdf}), the lineage reaches a bottleneck in \inlinecode{verify.mk} to satisfy the verification criteria (this step was not yet available in \cite{akhlaghi19, infante20}).
 All project deliverables (macro files, plot or table data and other datasets) are verified at this stage, with their checksums, to automatically ensure exact reproducibility.
-Where exact reproducibility is not possible \new{(for example due to parallelization)}, values can be verified by any statistical means, specified by the project authors.
+Where exact reproducibility is not possible \new{(for example, due to parallelization)}, values can be verified by \new{\href{https://archive.softwareheritage.org/swh:1:cnt:4217e24e4a474ba43a4d30abfb0a42b823ef4640}{a statistical method specified}} by the project authors.
 \begin{figure*}[t]
 \begin{center} \includetikz{figure-branching}{scale=1}\end{center}
diff --git a/peer-review/1-answer.txt b/peer-review/1-answer.txt
index b837ce4..76c574e 100644
--- a/peer-review/1-answer.txt
+++ b/peer-review/1-answer.txt
@@ -686,7 +686,6 @@ ANSWER:
 Maneage is flexible enough to enable a wide range of workflows to be
 implemented. This is done by leveraging the highly modular and flexible
 nature of Makefiles run via 'Make'.
-
 ------------------------------
@@ -702,13 +701,17 @@ ANSWER:
 Floating point errors and optimizations have been mentioned in the
 discussion (Section V). The issue with parallelization has also been
 discussed in Section IV, in the part on verification ("Where exact
 reproducibility is not possible (for example due to parallelization),
-values can be verified by any statistical means, specified by the project
-authors.").
+values can be verified by a statistical method specified by the project
+authors."). We have linked keywords in the latter sentence to a Software
+Heritage URI [swh] with the specific file in the Peper and Roukema
+Maneage'd paper that illustrates an example of how statistical
+verification of parallelised code can work in practice.
+
+We would be interested to hear if any other papers already exist that use
+automatic statistical verification of parallelised code as has been done
+in this Maneage'd paper.
-#####################
-Find a good way to link to (Peper and Roukema:
-https://doi.org/10.5281/zenodo.4062460)
-#####################
+[swh] https://archive.softwareheritage.org/swh:1:cnt:4217e24e4a474ba43a4d30abfb0a42b823ef4640
 ------------------------------
-- 
cgit v1.2.1