-rw-r--r-- paper.tex                | 13
-rw-r--r-- peer-review/1-answer.txt | 73
2 files changed, 49 insertions, 37 deletions
@@ -196,7 +196,7 @@ In summary, notebooks can rarely deliver their promised potential \cite{rule18}
 An exceptional solution we encountered was the Image Processing Online Journal (IPOL, \href{https://www.ipol.im}{ipol.im}).
 Submitted papers must be accompanied by an ISO C implementation of their algorithm (which is buildable on any widely used OS) with example images/data that can also be executed on their webpage.
 This is possible owing to the focus on low-level algorithms with no dependencies beyond an ISO C compiler.
-However, many data-intensive projects commonly involve dozens of high-level dependencies, with large and complex data formats and analysis, and hence this solution is not scalable.
+However, many data-intensive projects commonly involve dozens of high-level dependencies, with large and complex data formats and analysis, so this solution is not scalable.
@@ -1225,7 +1225,7 @@ This failure to communicate in the details is a very serious problem, leading to
 \label{appendix:existingsolutions}
 As reviewed in the introduction, the problem of reproducibility has received a lot of attention over the last three decades and various solutions have already been proposed.
-The core principles that many of the existing solutions (including Maneage) aims to achieve are nicely summarized by the FAIR principles \citeappendix{wilkinson16}.
+The core principles that many of the existing solutions (including Maneage) aim to achieve are nicely summarized by the FAIR principles \citeappendix{wilkinson16}.
 In this appendix, some of the solutions are reviewed.
 The solutions are based on an evolving software landscape, therefore they are ordered by date: when the project has a webpage, the year of its first release is used for the sorting, otherwise their paper's publication year is used.
@@ -1393,15 +1393,16 @@ An IPOL paper is a traditional research paper, but with a focus on implementatio
 The published narrative description of the algorithm must be detailed to a level that any specialist can implement it in their own programming language (extremely detailed).
 The author's own implementation of the algorithm is also published with the paper (in C, C++ or MATLAB), the code must be commented well enough and link each part of it with the relevant part of the paper.
 The authors must also submit several example datasets/scenarios.
-The referee actually inspects the code and narrative, confirming that they match with each other, and with the stated conclusions of the published paper.
+The referee is expected to inspect the code and narrative, confirming that they match with each other, and with the stated conclusions of the published paper.
 After publication, each paper also has a ``demo'' button on its webpage, allowing readers to try the algorithm on a web-interface and even provide their own input.
 The IPOL model is the single most robust model of peer review and publishing computational research methods/implementations that we have seen in this survey.
 It has grown steadily over the last 10 years, publishing 23 research articles in 2019 alone.
 We encourage the reader to visit its webpage and see some of its recent papers and their demos.
-The reason it can be so thorough and complete is its a very narrow scope (image processing algorithms), where the published algorithms are highly atomic, not needing significant dependencies (beyond input/output), allowing the referees/readers to go deep into each implemented algorithm.
-In fact, high-level languages like Perl, Python or Java are not acceptable in IPOL precisely because of the additional complexities/dependencies that they require.
-If any referee/reader was inclined to do so, a paper written in Maneage (the proof-of-concept solution presented in this paper) allows for a similar level of scrutiny, but for much more complex research scenarios, involving hundreds of dependencies and complex processing on the data.
+The reason it can be so thorough and complete is its very narrow scope (image processing algorithms), where the published algorithms are highly atomic, not needing significant dependencies (beyond input/output), allowing the referees and readers to go deeply into each implemented algorithm.
+In fact, high-level languages like Perl, Python or Java are not acceptable in IPOL precisely because of the additional complexities, such as dependencies, that they require.
+If any referee or reader were inclined to do so, a paper written in Maneage (the proof-of-concept solution presented in this paper) could be scrutinised at a similar detailed level, but for much more complex research scenarios, involving hundreds of dependencies and complex processing of the data.
+
diff --git a/peer-review/1-answer.txt b/peer-review/1-answer.txt
index 91cf3d8..da92bd3 100644
--- a/peer-review/1-answer.txt
+++ b/peer-review/1-answer.txt
@@ -357,7 +357,7 @@ with the earlier sets of criteria.

 ANSWER: In the submitted version we had stated that "Ideally, it is possible
 to precisely identify the Docker “images” that are imported with
-their checksums ...". But to be more clear and directly to the point, it
+their checksums ...". But to be more clear and go directly to the point, it
 has been edited to explicity say "... to recreate an identical OS image
 later".
@@ -375,11 +375,11 @@ later".
 ANSWER: Thank you very much for raising this important point. We hadn't seen
 other reproducibility papers mention this important point and missed it.
 In the acknowledgments (where we also mention the commit hashes) we now
-explicity mention the exact CPU architecture used to build this paper:
+explicitly mention the exact CPU architecture used to build this paper:
 "This project was built on an x86_64 machine with Little Endian byte-order
 and address sizes 39 bits physical, 48 bits virtual.". This is because we
 have already seen cases where the architecture is the same, but programs
-fail because of the byte-order.
+fail because of the byte order.

 Generally, Maneage will now extract this information from the running
 system during its configuration phase and provide the users with three
@@ -396,7 +396,7 @@ different LaTeX macros that they can use anywhere in their paper.

 ANSWER: This has been clarified with the short extra statement "a minimal
 Unix-like standard that is shared between many operating systems". We would
-have liked to explain this more, but the word-limit is very constraining.
+have liked to explain this more, but the word limit is very constraining.

 ------------------------------

@@ -411,7 +411,7 @@ have liked to explain this more, but the word-limit is very constraining.
 with the measurement tool.

 ANSWER: Thank you very much for this good point. A description of a
-possible solution to this has been added after criteria 8.
+possible solution to this has been added after criterion 8.

 ------------------------------

@@ -430,23 +430,25 @@ Maneage'd software on Zenodo (https://doi.org/10.5281/zenodo.3883409) and
 also downloading from there.

 Until recently we would directly access each software's own webpage to
-download the files, and this caused many problems like this. In other
+download the source files, and this caused frequent problems of this sort. In other
 cases, we were very frustrated when a software's webpage would temporarily
-be unavailable (for maintainance reasons), this wouldn't allow us to build
-new projects.
-
-Since all the software are free, we are allowed to re-distribute them and
-Zenodo is defined for long-term archival of academic artifacts, so we
-figured that a software source code repository on Zenodo would be the most
-reliable solution. At configure time, Maneage now accesses Zenodo's DOI and
-resolves the most recent URL to automatically download any necessary
-software source code that the project needs from there.
+be unavailable (for maintenance reasons); this would be a hindrance in
+trying to build new projects.
+
+Since all the software is free-licensed, we are legally allowed to
+re-distribute it (within the conditions, such as not removing copyright
+notices) and Zenodo is defined for long-term archival of
+academic digital objects, so we decided that a software source code
+repository on Zenodo would be the most reliable solution. At configure
+time, Maneage now accesses Zenodo's DOI and resolves the most recent
+URL to automatically download any necessary software source code that
+the project needs from there.

 Generally, we also keep all software in a Git repository on our own
 webpage: http://git.maneage.org/tarballs-software.git/tree. Also, Maneage
 users can also identify their own custom URLs for downloading software,
 which will be given higher priority than Zenodo (useful for situations when
-a custom software is downloaded and built in a project branch (not the core
+custom software is downloaded and built in a project branch (not the core
 'maneage' branch).

 ------------------------------
@@ -467,7 +469,7 @@ building of GCC can optionally be disabled with the '--host-cc' option to
 significantly speed up the build when the host's GCC is similar).
 Furthermore, Maneage can be built within a Docker container.

-Generally, a paragraph has been added in Section IV on this issue (the
+A paragraph has been added in Section IV on this issue (the
 build time and building within a Docker container).
 We have also defined task #15818 [1] to have our own core Docker image that
 is ready to build a Maneaged project and will be adding it shortly.
@@ -499,9 +501,9 @@ ANSWER: "Reproducibility" has been defined along with "Longevity" and

 ANSWER: Ansible and Helm are primarily designed for distributed computing.
 For example Helm is just a high-level package manager for a
-Kubernetes cluster that is based on containers. A review of them can be
-added in the Appendix, but we feel they may not be too relevant for this
-paper.
+Kubernetes cluster that is based on containers. A review of them could be
+added to the Appendices, but we feel they this would distract somewhat
+from the main points of our current paper.

 ------------------------------

@@ -514,8 +516,9 @@ paper.
 the maintaining community, which creates another problem within the
 perspective of the article.

-ANSWER: Thank you very much for highlighting that this point was not included
-for the sake of length, it has been fitted into the introduction now.
+ANSWER: Thank you very much for highlighting this point. We had excluded
+this point for the sake of article length, but we have restored it in
+the introduction of the revised version.

 ------------------------------

@@ -527,7 +530,7 @@ for the sake of length, it has been fitted into the introduction now.
 the discussion presented by THAIN et al., 2015 is interesting to increase
 essential points of the current work.

-ANSWER: Thank you very much for pointing this the works by Thain. We
+ANSWER: Thank you very much for pointing out the works by Thain. We
 couldn't find any first-author papers in 2015, but found Meng & Thain
 (https://doi.org/10.1016/j.procs.2017.05.116) which had a related
 discussion of why they didn't use Docker containers in their work. That
@@ -542,9 +545,10 @@ paper is now cited in the discussion of Containers in Appendix A.
 26. [Reviewer 3] About the Singularity, the description article was missing
 (Kurtzer GM, Sochat V, Bauer MW, 2017).
-ANSWER: Thank you for the reference, we could not put it in the main body
-of the paper (like many others) due to the strict bibliography limit of 12,
-but it has been cited in Appendix A (where we discuss Singularity).
+ANSWER: Thank you for the reference. We are restricted in the main
+body of the paper due to the strict bibliography limit of 12
+references; we have included Kurtzer et al 2017 in Appendix A (where
+we discuss Singularity).

 ------------------------------

@@ -556,8 +560,8 @@ but it has been cited in Appendix A (where we discuss Singularity).
 (WILKINSON et al., 2016).

 ANSWER: The FAIR principles have been mentioned in the main body of the
-paper, but unfortunately we had to remove its citation the main paper (like
-many others) within the maximum limit 12 references. We have cited it in
+paper, but unfortunately we had to remove its citation in the main paper (like
+many others) to keep to the maximum of 12 references. We have cited it in
 Appendix B.

 ------------------------------
@@ -571,9 +575,16 @@ Appendix B.
 reproducibility of a publication could be better explored, which would
 further enrich the tool presented.

-#####################################
-ANSWER:
-#####################################
+
+ANSWER: Our section II discussing existing tools seems to be the most
+appropriate place to mention IPOL, so we have retained its position at
+the end of this section.
+
+We have indeed included an in-depth discussion of IPOL in Appendix B.
+We recommend it to the reader for any project written uniquely in C,
+and we comment on the readiness of Maneage'd projects for a similar
+level of peer-review control.
+

 ------------------------------