author Boud Roukema <boud@cosmo.torun.pl> 2020-11-25 17:07:28 +0100
committer Boud Roukema <boud@cosmo.torun.pl> 2020-11-25 17:07:28 +0100
commit 5af1406a499ed3936fea303a532015e90370c4b6
tree ded2c1202a09c5a18bb819a30d271edbb3bdbdcd /peer-review
parent 8b5473136dc48846bb44b986dfd755c1e3f6f332
Reviewer points 1-15; appendix clickable links
This commit updates "paper.tex" and "peer-review/1-answer.txt" for the first 15 (out of 59!) reviewer points, excluding points 2 (not yet done) and 9 (README-hacking.md needs tidying). A fix to "reproduce/analysis/make/paper.mk" for the links in the appendices is also done in this commit (the same algorithm as for paper.tex is added). The links in the appendices are not (yet) clickable.
Diffstat (limited to 'peer-review')
-rw-r--r-- peer-review/1-answer.txt 141
1 files changed, 78 insertions, 63 deletions
diff --git a/peer-review/1-answer.txt b/peer-review/1-answer.txt
index 9c6bbd9..91cf3d8 100644
--- a/peer-review/1-answer.txt
+++ b/peer-review/1-answer.txt
@@ -45,8 +45,8 @@ ANSWER:
3. [Associate Editor] Some terminology is not well-defined
(e.g. longevity).
-ANSWER: It has now been clearly defined in the first paragraph of Section
-II. With this definition, the main argument of the paper is much clearer,
+ANSWER: Longevity has now been defined in the first paragraph of Section
+II. With this definition, the main argument of the paper is clearer,
thank you (and thank you to the referees for highlighting this).
------------------------------
@@ -59,7 +59,9 @@ thank you (and thank you to the referees for highlighting this).
categorization to characterize their longevity.
ANSWER: The longevity of the general tools reviewed in Section II is now
-mentioned immediately after each (highlighted in green).
+mentioned immediately after each (VMs, SHARE: discontinued in 2019;
+Docker: 6 months; python-dependent package managers: a few years;
+Jupyter notebooks: shortest longevity non-core python dependency).
------------------------------
@@ -70,7 +72,7 @@ mentioned immediately after each (highlighted in green).
5. [Associate Editor] Background and related efforts need significant
improvement. (See below.)
-ANSWER: This has been done, as mentioned in (1).
+ANSWER: This has been done, as mentioned in (1.) above.
------------------------------
@@ -81,7 +83,7 @@ ANSWER: This has been done, as mentioned in (1).
6. [Associate Editor] There is consistency among the reviews that
related work is particularly lacking.
-ANSWER: This has been done, as mentioned in (1).
+ANSWER: This has been done, as mentioned in (1.) above.
------------------------------
@@ -93,9 +95,10 @@ ANSWER: This has been done, as mentioned in (1).
explaining how it deals with the nagging problem of running on CPU
vs. different architectures.
-ANSWER: The CPU architecture of the running system is now reported in the
-"Acknowledgments" section and a description of the problem and its solution
-in Maneage is also added in the "Proof of concept: Maneage" Section.
+ANSWER: The CPU architecture of the running system is now reported in
+the "Acknowledgments" section and a description of the problem and its
+solution in Maneage is also added and illustrated in the "Proof of
+concept: Maneage" Section.
------------------------------
@@ -109,10 +112,11 @@ in Maneage is also added in the "Proof of concept: Maneage" Section.
architectures. Is CI employed in any way in the work presented in
this article?
-ANSWER: CI has been added in the discussion as one solution to find
-breaking points in operating system updates and new/different
-architectures. For the core Maneage branch, we have defined task #15741 [1]
-to add CI on many architectures in the near future.
+ANSWER: CI has been added in the discussion section (V) as one
+solution to find breaking points in operating system updates and
+new/different architectures. For the core Maneage branch, we have
+defined task #15741 [1] to add CI on many architectures in the near
+future.
[1] http://savannah.nongnu.org/task/?15741
@@ -147,9 +151,13 @@ README-hacking.md webpage into smaller pages that can be entered.
related work to help readers understand the clear need for a new
approach, if this is being presented as new/novel.
-ANSWER: Thank you for highlighting this important point. We saw that its
-necessary to contrast our proof of concept demonstration more directly with
-Maneage. Two paragraphs have been added in Sections II and IV for this.
+ANSWER: Thank you for highlighting this important point. We saw that
+it is necessary to contrast our Maneage proof-of-concept demonstration
+more directly against the Jupyter notebook type of approach. Two
+paragraphs have been added in Sections II and IV to clarify this (our
+criteria require and build in more modularity and longevity than
+Jupyter).
+
------------------------------
@@ -188,33 +196,32 @@ ANSWER:
and the related provenance work that has already been done and can be
exploited using these criteria and our proof of concept is indeed very
large. However, the 6250 word-count limit is very tight and if we add
- more on it in this length, we would have to remove more directly
- relevant points. Hopefully this can be the subject of a follow up
- paper.
+ more on it in this length, we would have to remove points of higher priority.
+ Hopefully this can be the subject of a follow-up paper.
3. A review of ReproZip is in Appendix B.
4. A review of Occam is in Appendix B.
5. A review of Popper is in Appendix B.
-6. A review of Whole tale is in Appendix B.
+6. A review of Whole Tale is in Appendix B.
7. A review of Snakemake is in Appendix A.
-8. CWL and WDL are described in Appendix A (job management).
-9. Nextflow is described in Appendix A (job management).
+8. CWL and WDL are described in Appendix A (Job management).
+9. Nextflow is described in Appendix A (Job management).
10. Sumatra is described in Appendix B.
-11. Podman is mentioned in Appendix A (containers).
-12. AppImage is mentioned in Appendix A (package management).
-13. Flatpak is mentioned in Appendix A (package management).
-14. nbdev and jupytext are high-level tools to generate documentation and
+11. Podman is mentioned in Appendix A (Containers).
+12. AppImage is mentioned in Appendix A (Package management).
+13. Flatpak is mentioned in Appendix A (Package management).
+14. Snap is mentioned in Appendix A (Package management).
+15. nbdev and jupytext are high-level tools to generate documentation and
packaging custom code in Conda or pypi. High-level package managers
like Conda and Pypi have already been thoroughly reviewed in Appendix A
- for their longevity issues, so we feel there is no need to include
- these.
-15. Bazel has been mentioned in Appendix A (job management).
-16. Debian's reproducible builds is only for ensuring that software
- packaged for Debian are bitwise reproducible. As mentioned in the
- discussion of this paper, the bitwise reproducibility of software is
- not an issue in the context discussed here, the reproducibility of the
+ for their longevity issues, so we feel that there is no need to
+ include these.
+16. Bazel is mentioned in Appendix A (job management).
+17. Debian's reproducible builds are only designed for ensuring that software
+ packaged for Debian is bitwise reproducible. As mentioned in the
+ discussion section of this paper, the bitwise reproducibility of software is
+ not an issue in the context discussed here; the reproducibility of the
relevant output data of the software is the main issue.
-
------------------------------
@@ -231,14 +238,18 @@ ANSWER:
* Executable/reproducible paper articles and original concepts
ANSWER: Thank you for highlighting these points. Appendix B starts with a
-subsection titled "suggested rules, checklists or criteria" that review of
-existing criteria. That include the proposed sources here (and others).
+subsection titled "suggested rules, checklists or criteria" with a review of
+existing sets of criteria. This subsection includes the sources proposed
+by the reviewer [Sandve et al; Rule et al; Nust et al] (and others).
-arXiv:1401.2000 has been added in Appendix A as an example paper using
+ArXiv:1401.2000 has been added in Appendix A as an example paper using
virtual machines. We thank the referee for bringing up this paper, because
-the link to the VM provided in the paper no longer works (the file has been
-removed on the server). Therefore added with SHARE, it very nicely
-highlighting our main issue with binary containers or VMs and their lack of
+the link to the VM provided in the paper no longer works (the URL
+http://archive.comp-phys.org/provenance_challenge/provenance_machine.ova
+redirects to
+https://share.phys.ethz.ch//~alpsprovenance_challenge/provenance_machine.ova
+which gives a 'Not Found' html response). Together with SHARE, this very nicely
+highlights our main issue with binary containers or VMs: their lack of
longevity.
------------------------------
@@ -261,17 +272,18 @@ longevity.
requires significant resources to translate or rewrite every
few years."
-ANSWER: They have been clarified in the highlighted parts of the text:
+ANSWER: These points have been clarified in the highlighted parts of the text:
-1. Many examples have been given throughout the newly added appendices. To
- avoid confusion in the main body of the paper, we have removed the "we
- have surveyed" part. It is already mentioned above it that a large
- survey of existing methods/solutions is given in the appendices.
+1. Many examples have been given throughout the newly added
+ appendices. To avoid confusion in the main body of the paper, we
+ have removed the "we have surveyed" part. It is already mentioned
+ above this point in the text that a large survey of existing
+ methods/solutions is given in the appendices.
2. Due to the thorough discussion of this issue in the appendices with
precise examples, this line has been removed to allow space for the
other points raised by the referees. The main point (high cost of
- keeping binaries) is aldreay abundantly clear.
+ keeping binaries) is already abundantly clear.
On a similar topic, Dockerhub's recent announcement that inactive images
(for over 6 months) will be deleted has also been added. The announcemnt
@@ -280,8 +292,9 @@ ANSWER: They have been clarified in the highlighted parts of the text:
https://www.docker.com/blog/docker-hub-image-retention-policy-delayed-and-subscription-updates
3. A small statement has been added, reminding the readers that almost all
- free software projects are built with Make (note that CMake is just a
- high-level wrapper over Make: it finally produces a 'Makefile').
+ free software projects are built with Make (CMake is popular, but it is just a
+ high-level wrapper over Make: it finally produces a 'Makefile'; practical
+ usage of CMake generally obliges the user to understand Make).
4. The example of Python 2 has been added.
@@ -299,23 +312,24 @@ ANSWER: They have been clarified in the highlighted parts of the text:
papers have been written on this topic). Note that CI is
well-established technology (e.g. Jenkins is almost 10 years old).
-ANSWER: Thank you for raising this issue. We had initially planned to add
-this issue also, but like many discussion points, we were forced to remove
+ANSWER: Thank you for raising these issues. We had initially planned to
+discuss CIs, but like many discussion points, we were forced to remove
it before the first submission due to the very tight word-count limit. We
have now added a sentence on CI in the discussion.
-On the initial note, indeed, the "executable" files of Bash, Git or Make
-are not bitwise reproducible/identical on different systems. However, as
-mentioned in the discussion, we are concerned with the _output_ of the
-software's executable file, _after_ the execution of its job. We (or any
-user of Bash) is not interested in the executable file itself. The
-reproducibility of the binary file only becomes important if a bug is found
-(very rare for common usage in such core software of the OS). Hence even
-though the compiled binary files of specific versions of Git, Bash or Make
+On the issue of Bash/Git/Make, indeed, the _executable_ Bash, Git and
+Make binaries are not bitwise reproducible/identical on different
+systems. However, as mentioned in the discussion, we are concerned
+with the _output_ of the software's executable file, _after_ the
+execution of its job. We (or any user of Bash) is not interested in
+the executable file itself. The reproducibility of the binary file
+only becomes important if a significant bug is found (very rare for
+ordinary usage of such core software of the OS). Hence, even though
+the compiled binary files of specific versions of Git, Bash or Make
will not be bitwise reproducible/identical on different systems, their
-outputs are exactly reproducible: 'git describe' or Bash's 'for' loop will
-have the same output on GNU/Linux, macOS or FreeBSD (that produce bit-wise
-different executables).
+scientific outputs are exactly reproducible: 'git describe' or Bash's
+'for' loop will have the same output on GNU/Linux, macOS/Darwin or
+FreeBSD (despite having bit-wise different executables).
------------------------------
@@ -326,9 +340,10 @@ different executables).
15. [Reviewer 1] Criterion has been proposed previously. Maneage itself
provides little novelty (see comments below).
-ANSWER: The previously suggested criteria that were mentioned are reviewed
-in the newly added Appendix B, and the novelty/necessity of the proposed
-criteria is shown by comparison there.
+ANSWER: The previously suggested sets of criteria that were listed by
+Reviewer 1 are reviewed by us in the newly added Appendix B, and the
+novelty and advantages of our proposed criteria are contrasted there
+with the earlier sets of criteria.
------------------------------
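
The distinction drawn in the answer on Bash/Git/Make above (executables that differ bitwise between systems, outputs that are identical) can be checked with two checksums. The following is a minimal Python sketch, not part of this commit or of Maneage: the helper names sha256_bytes, output_checksum and binary_checksum are hypothetical, and it assumes that a 'bash' executable is available on the PATH. Run on two different systems, the "binary" checksum is expected to differ while the "output" checksum should match.

```python
import hashlib
import shutil
import subprocess
from typing import Optional, Sequence


def sha256_bytes(data: bytes) -> str:
    """Return the SHA-256 hex digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


def output_checksum(command: Sequence[str]) -> str:
    """Run a command and return the SHA-256 of its standard output."""
    result = subprocess.run(command, capture_output=True, check=True)
    return sha256_bytes(result.stdout)


def binary_checksum(program: str) -> Optional[str]:
    """Return the SHA-256 of an executable found on the PATH, or None."""
    path = shutil.which(program)
    if path is None:
        return None
    with open(path, "rb") as handle:
        return sha256_bytes(handle.read())


if __name__ == "__main__":
    # Hypothetical test case: a Bash 'for' loop, as in the answer above.
    loop = ["bash", "-c", "for i in 1 2 3; do echo $i; done"]
    if shutil.which("bash") is None:
        print("bash not found on PATH; nothing to compare")
    else:
        # Expected to differ between GNU/Linux, macOS and FreeBSD builds.
        print("binary sha256:", binary_checksum("bash"))
        # Expected to be the same on all of them.
        print("output sha256:", output_checksum(loop))
```

Comparing only the second checksum across systems corresponds to the criterion described in the answer: the reproducibility of the output data, not of the executable file.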