From bb5d173399d657453533cf1bdda584203a1d096e Mon Sep 17 00:00:00 2001
From: Mohammad Akhlaghi
Date: Thu, 26 Nov 2020 04:56:07 +0000
Subject: All the referee points have been answered

There is an answer for all the referee points now. I also did some minor
edits in the paper. But we are still over the limit by around 250 words.
The only remaining point that is not yet addressed (and has '####' around
it) is the discussion on parallelization and its effect on reproducibility.
---
 peer-review/1-answer.txt | 102 +++++++++++++++++++++++++++++++----------------
 1 file changed, 68 insertions(+), 34 deletions(-)

(limited to 'peer-review')

diff --git a/peer-review/1-answer.txt b/peer-review/1-answer.txt
index 5e612f8..e0b0da1 100644
--- a/peer-review/1-answer.txt
+++ b/peer-review/1-answer.txt
@@ -32,9 +32,10 @@ reader can easily access them.

 2. [Associate Editor] There are general concerns about the paper lacking focus

-###########################
-ANSWER:
-###########################
+ANSWER: With all the corrections/clarifications that have been done in this
+review the focus of the paper should be clear now. We are very grateful for
+the thorough listing of points by the referees.
+
 ------------------------------


@@ -45,9 +46,10 @@ ANSWER:

 3. [Associate Editor] Some terminology is not well-defined (e.g. longevity).

-ANSWER: Longevity has now been defined in the first paragraph of Section
-II. With this definition, the main argument of the paper is clearer,
-thank you (and thank you to the referees for highlighting this).
+ANSWER: Reproducibility, Longevity and Usage have now been explicitly
+defined in the first paragraph of Section II. With these definitions, the
+main argument of the paper is clearer, thank you (and thank you to the
+referees for highlighting this).

 ------------------------------

@@ -132,10 +134,13 @@ future.
 is on the article, it is important that readers not be confused when
 they visit your site to use your tools.
-###########################
-ANSWER [NOT COMPLETE]: We should separate the various sections of the
-README-hacking.md webpage into smaller pages that can be entered.
-###########################
+ANSWER: Thank you for raising this important point. We have broken down the
+very long "About" page into multiple pages to help in readability:
+
+https://maneage.org/about.html
+
+Generally, the webpage will soon undergo major improvements to be even more
+clear.

 ------------------------------

@@ -597,9 +602,9 @@ level of peer-review control.
 Tutorial. A topic breakdown is interesting, as the markdown reading
 may be too long to find information.

-#####################################
-ANSWER:
-#####################################
+ANSWER: Thank you very much for this good suggestion, it has been
+implemented: https://maneage.org/about.html . The webpage will continuously
+be improved and such feedback is always very welcome.

 ------------------------------

@@ -691,9 +696,12 @@ highly modular and flexible nature of Makefiles run via 'Make'.
 which might occur because of the way the code is written, and the
 hardware architecture (including if code is optimised / parallelised).

-################################
-ANSWER:
-################################
+ANSWER: Floating point errors and optimizations have been mentioned in the
+discussion (Section V). The issue with parallelization has also been
+discussed in Section IV, in the part on verification ("Where exact
+reproducibility is not possible (for example due to parallelization),
+values can be verified by any statistical means, specified by the project
+authors.").

 ------------------------------

@@ -703,18 +711,32 @@ ANSWER:

 37. [Reviewer 4] Performance ... is never mentioned

-################################
-ANSWER:
-################################
+ANSWER: Performance is indeed an important issue for _immediate_
+reproducibility and we would have liked to discuss it.
But due to the
+strict word-count, we feel that adding it to the discussion points, without
+having adequate space to elaborate, can confuse the readers of this paper
+(which is focused on long term usability).

 ------------------------------
+
+
+
+

 38. [Reviewer 4] Tradeoff, which might affect Criterion 3 is time to
 result, people use popular frameworks because it is easier to use them.

-################################
-ANSWER:
-################################
+ANSWER: That is true. In section IV, we have given the time it takes to
+build Maneage (only once for a project on each computer) to be around 1.5
+hours on an 8-core CPU (a typical machine that may be used for data
+analysis). We therefore conclude that when the analysis is complex (and
+thus taking many hours or days to complete), this time is negligible.
+
+But if the project's full analysis takes 10 minutes or less (like the
+extremely simple analysis done in this paper which takes a fraction of a
+second), the 1.5 hour building time is indeed significant. In those cases,
+as discussed in the main body, the project can be built once in a Docker
+image and easily moved to other computers.

 ------------------------------

@@ -747,9 +769,13 @@ there.
 40. [Reviewer 4] Potentially an interesting sidebar to investigate how
 LaTeX/TeX has ensured its longevity!

-##############################
-ANSWER:
-##############################
+ANSWER: That is indeed a very interesting subject to study. We have been in
+touch with Karl Berry (one of the core people behind TeX Live, who also
+plays a prominent role in GNU) and have witnessed the TeX Live community's
+efforts to become more and more portable and longer-lived. But after
+looking at the strict word limit, we couldn't find a place to highlight
+this. But it is indeed a subject worthy of a full paper (that can be very
+useful for many software projects).

 ------------------------------

@@ -760,9 +786,11 @@ ANSWER:

 41.
[Reviewer 4] The title is not specific enough - it should refer to the
 reproducibility of workflows/projects.

-##############################
-ANSWER:
-##############################
+ANSWER: Since this journal is focused on "Computing in Science and
+Engineering", the fact that it relates to computational workflows will be
+clear to any reader. Since the other referees didn't complain about this,
+we will keep it as it was, but of course, we are open to the suggestions of
+the editors in the final title.

 ------------------------------

@@ -820,9 +848,16 @@ determined by the host kernel, usually a decade", for Python packages:
 longevity of the workflows that can be produced using these tools? What
 happens if you use a combination of all four categories of tools?

-##########################
-ANSWER:
-##########################
+ANSWER: Thank you for highlighting this. The title has been shortened and
+the section immediately starts with definitions.
+
+The aspects of the tools discussed in this section are orthogonal to each
+other. For example a VM/container, package manager, notebook: some projects
+may use any combination of the three. In some aspects using
+them together can improve the operations, but for example building a
+VM/container with or without a package manager makes no difference on the
+main issue we raise about containers (that they are large binary blobs that
+don't necessarily contain how the environment within them was built).

 ------------------------------

@@ -888,9 +923,8 @@ project. This can be generalized to any Git based collaboration model.
 ease of depositing in a repository, and ease of use by another
 researcher.

-#######################
-ANSWER:
-#######################
+ANSWER: These have been highlighted in various parts of the text (also
+reviewed in previous points).

 ------------------------------

--
cgit v1.2.1