From: cise computer org
To: mohammad akhlaghi org,
    infantesainz gmail com,
    boud astro uni torun pl,
    david valls-gabaud observatoiredeparis psl eu,
    rbaena iac es
Received: Tue, 22 Sep 2020 15:28:21 -0400
Subject: Computing in Science and Engineering, CiSESI-2020-06-0048
         major revision required

--------------------------------------------------

Computing in Science and Engineering, CiSESI-2020-06-0048
"Towards Long-term and Archivable Reproducibility"
manuscript type: Reproducible Research

Dear Dr. Mohammad Akhlaghi,

The manuscript that you submitted to Computing in Science and Engineering
has completed the review process. After carefully examining the manuscript
and reviews, we have decided that the manuscript needs major revisions
before it can be considered for a second review.

Your revision is due before 22-Oct-2020. Please note that if your paper was
submitted to a special issue, this due date may be different. Contact the
peer review administrator, Ms. Jessica Ingle, at cise computer.org if you
have questions.

The reviewer and editor comments are attached below for your
reference. Please maintain our 6,250-word limit as you make your revisions.

To upload your revision and summary of changes, log on to
https://mc.manuscriptcentral.com/cise-cs, click on your Author Center, then
"Manuscripts with Decisions."  Under "Actions," choose "Create a Revision"
next to the manuscript number.

Highlight the changes to your manuscript by using the track changes mode in
MS Word, the latexdiff package if using LaTeX, or by using bold or colored
text.

When submitting your revised manuscript, you will need to respond to the
reviewer comments in the space provided.

If you have questions regarding our policies or procedures, please refer to
the magazine's Author Information page linked from the Instructions and
Forms (top right corner of the ScholarOne Manuscripts screen) or you can
contact me.

We look forward to receiving your revised manuscript.

Sincerely,
Dr. Lorena A. Barba
George Washington University
Mechanical and Aerospace Engineering
Editor-in-Chief, Computing in Science and Engineering

--------------------------------------------------





EiC comments:
Some reviewers request additions, an overview of other tools, etc. In
doing your revision, please remember the space limitations: 6,250 words
maximum, including all main body, abstract, keywords, bibliography (12
references or fewer), and biography text. See the "Write For Us" section of the
website: https://www.computer.org/csdl/magazine/cs

Comments of the Associate Editor: Associate Editor
Comments to the Author: Thanks to the authors for your submission to the
Reproducible Research department.

Thanks to the reviewers for your careful and thoughtful reviews. We would
appreciate it if you could make your reports available and share the DOI as
soon as possible, per our original invitation e-mail. We will follow up on
our original invitation to obtain your review DOI if you have not already
included it in your review comments.

Based on the review feedback, there are a number of major issues that
require attention and many minor ones as well. Please take these into
account as you prepare your major revision for another round of
review. (See the actual review reports for details.)

1. In general, there are a number of presentation issues needing
attention. There are general concerns about the paper lacking focus. Some
terminology is not well-defined (e.g. longevity). In addition, the
discussion of tools could benefit from some categorization to characterize
their longevity. Background and related efforts need significant
improvement. (See below.)

2. There is consistency among the reviews that related work is particularly
lacking and not taking into account major works that have been written on
this topic. See the reviews for details about work that could potentially
be included in the discussion and how the current work is positioned with
respect to this work.

3. The current work needs to do a better job of explaining how it deals
with the nagging problem of running on CPU vs. different architectures.  At
least one review commented on the need to include a discussion of
continuous integration (CI) and its potential to help identify problems
running on different architectures. Is CI employed in any way in the work
presented in this article?

4. The presentation of the Maneage tool is both lacking in clarity and
consistency with the public information/documentation about the tool. While
our review focus is on the article, it is important that readers not be
confused when they visit your site to use your tools.

5. A significant question raised by one review is how this work compares to
"executable" papers and Jupyter notebooks.  Does this work embody
similar/same design principles or expand upon the established alternatives?
In any event, a discussion of this should be included in
background/motivation and related work to help readers understand the clear
need for a new approach, if this is being presented as new/novel.

Reviews:

Please note that some reviewers may have included additional comments in a
separate file. If a review contains the note "see the attached file" under
Section III A - Public Comments, you will need to log on to ScholarOne
Manuscripts to view the file. After logging in, select the Author Center,
click on the "Manuscripts with Decisions" queue and then click on the "view
decision letter" link for this manuscript. You must scroll down to the very
bottom of the letter to see the file(s), if any. This will open the file
that the reviewer(s) or the Associate Editor included for you along with
their review.

--------------------------------------------------





Reviewer: 1
Recommendation: Author Should Prepare A Major Revision For A Second Review

Comments:

 * Adding an explicit list of contributions would make it easier for the
   reader to appreciate them.

 * These are not mentioned/cited and are highly relevant to this paper (in
   no particular order):

     * Git flows, both in general and in particular for research.
     * Provenance work, in general and with git in particular
     * Reprozip: https://www.reprozip.org/
     * OCCAM: https://occam.cs.pitt.edu/
     * Popper: http://getpopper.io/
     * Whole Tale: https://wholetale.org/
     * Snakemake: https://github.com/snakemake/snakemake
     * CWL https://www.commonwl.org/ and WDL https://openwdl.org/
     * Nextflow: https://www.nextflow.io/
     * Sumatra: https://pythonhosted.org/Sumatra/
     * Podman: https://podman.io
     * AppImage (https://appimage.org/), Flatpack
       (https://flatpak.org/), Snap (https://snapcraft.io/)
     * nbdev https://github.com/fastai/nbdev  and jupytext
     * Bazel: https://bazel.build/
     * Debian reproducible builds: https://wiki.debian.org/ReproducibleBuilds

     * Existing guidelines similar to the proposed "Criteria for
       longevity". Many articles of this kind exist in the form "10 simple
       rules for X"; for example (not an exhaustive list):
          * https://doi.org/10.1371/journal.pcbi.1003285
          * https://arxiv.org/abs/1810.08055
          * https://osf.io/fsd7t/

     * A model project for reproducible papers: https://arxiv.org/abs/1401.2000

     * Executable/reproducible paper articles and original concepts

 * Several claims in the manuscript are not properly justified, either in
   the text or via citation. Examples (not an exhaustive list):

     * "it is possible to precisely identify the Docker “images” that are
       imported with their checksums, but that is rarely practiced in most
       solutions that we have surveyed [which ones?]"

     * "Other OSes [which ones?] have similar issues because pre-built
       binary files are large and expensive to maintain and archive."

     * "Researchers using free software tools have also already had some
       exposure to it"

     * "A popular framework typically falls out of fashion and requires
       significant resources to translate or rewrite every few years."

 * As the authors mention in the discussion, not even Bash, Git or Make is
   reproducible, so not even Maneage can fully address the longevity
   requirements. One possible alternative is the use of continuous
   integration (CI) to ensure that papers are re-executable (several papers
   have been written on this topic). Note that CI is a well-established
   technology (e.g. Jenkins is almost 10 years old).

Additional Questions:

1. How relevant is this manuscript to the readers of this periodical?
   Please explain your rating in the Detailed Comments section.: Very
   Relevant

2. To what extent is this manuscript relevant to readers around the world?:
   The manuscript is of interest to readers throughout the world

1. Please summarize what you view as the key point(s) of the manuscript and
   the importance of the content to the readers of this periodical.: This
   article introduces desiderata for long-term archivable reproducibility
   and presents Maneage, a system whose goal is to achieve these outlined
   properties.

2. Is the manuscript technically sound? Please explain your answer in the
   Detailed Comments section.: Partially

3. What do you see as this manuscript's contribution to the literature in
   this field?: Presentation of Maneage

4. What do you see as the strongest aspect of this manuscript?: A great
   summary of Maneage, as well as its implementation.

5. What do you see as the weakest aspect of this manuscript?: Criterion has
   been proposed previously. Maneage itself provides little novelty (see
   comments below).

1. Does the manuscript contain title, abstract, and/or keywords?: Yes

2. Are the title, abstract, and keywords appropriate? Please elaborate in
   the Detailed Comments section.: Yes

3. Does the manuscript contain sufficient and appropriate references
   (maximum 12-unless the article is a survey or tutorial in scope)? Please
   elaborate in the Detailed Comments section.: Important references are
   missing; more references are needed

4. Does the introduction clearly state a valid thesis? Please explain your
   answer in the Detailed Comments section.: Could be improved

5. How would you rate the organization of the manuscript? Please elaborate
   in the Detailed Comments section.: Satisfactory

6. Is the manuscript focused? Please elaborate in the Detailed Comments
   section.: Satisfactory

7. Is the length of the manuscript appropriate for the topic? Please
   elaborate in the Detailed Comments section.: Satisfactory

8. Please rate and comment on the readability of this manuscript in the
   Detailed Comments section.: Easy to read

9. Please rate and comment on the timeliness and long term interest of this
   manuscript to CiSE readers in the Detailed Comments section. Select all
   that apply.: Topic and content are of limited interest to CiSE readers.

Please rate the manuscript. Explain your choice in the Detailed Comments
section.: Good

--------------------------------------------------





Reviewer: 2
Recommendation: Accept If Certain Minor Revisions Are Made

Comments: https://doi.org/10.22541/au.159724632.29528907

Operating System: The authors mention that Docker is usually used with an
Ubuntu image without specifying the version used. Even if users take care
to specify the version, the image is updated monthly, so the image used
will have different OS components depending on the generation time. This
difference in OS components interferes with reproducibility. I agree with
that, but I would add that this is a bad habit of users. It is possible to
generate reproducible Docker images by building them from an ISO image of
the OS. These ISO images are archived, at least for Ubuntu
(http://old-releases.ubuntu.com/releases) and for Debian
(https://cdimage.debian.org/mirror/cdimage/archive), thus allowing users to
generate an OS with identical components. Combined with the
snapshot.debian.org service, it is even possible to update a Debian release
to a specific time point back to 2005, with a precision of six hours.
Combining an ISO image with the snapshot.debian.org service makes it
possible to obtain an OS for Docker or for a VM with identical components,
even if users have to use the package manager of the OS. The authors should
indicate that, with good practices, it is possible to use Docker or a VM to
obtain an identical OS usable for reproducible research.
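
The practice I describe above can be sketched roughly as follows; the
snapshot timestamp, base image tag and package name below are illustrative
assumptions, not verified values:

```shell
# Hypothetical sketch of a pinned Debian build environment: point APT at a
# fixed snapshot.debian.org timestamp so every rebuild sees identical
# package versions. Snapshot Release files expire, hence the
# Check-Valid-Until=false option.
cat > Dockerfile <<'EOF'
FROM debian:10.4
RUN echo 'deb http://snapshot.debian.org/archive/debian/20200601T000000Z buster main' \
      > /etc/apt/sources.list \
 && apt-get -o Acquire::Check-Valid-Until=false update \
 && apt-get install -y --no-install-recommends make
EOF
echo "Dockerfile written; build with: docker build -t pinned-env ."
```

Every rebuild of such an image resolves the same package versions, because
the snapshot archive is immutable for a given timestamp.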

CPU architecture: The CPU architecture of the platform used to run the
workflow is not discussed in the manuscript. During software integration in
Debian, I have seen several pieces of software fail their unit tests due to
different behavior in the software itself or in a library dependency. This
unexpected behavior was only present on non-x86 architectures, mainly
because developers use x86 machines for their development and tests. Bug or
feature? I don't know, but nowadays it is quite frequent to see computers
with a non-x86 CPU. It would be annoying for the reproducibility step to
fail because of a difference in CPU architecture. The authors should take
the architecture into account in their workflow, or at least report it.
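
A minimal way to act on this, sketched here under the assumption that a
plain-text provenance file (the file name is illustrative, not part of
Maneage) is acceptable, is to record the platform alongside the results:

```shell
# Record the hardware architecture and kernel next to the results so that
# readers can spot architecture-dependent differences later.
# 'platform.txt' is an illustrative file name.
{
  printf 'arch: %s\n'   "$(uname -m)"
  printf 'kernel: %s\n' "$(uname -s) $(uname -r)"
  printf 'date: %s\n'   "$(date -u +%Y-%m-%dT%H:%M:%SZ)"
} > platform.txt
cat platform.txt
```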

POSIX dependency: I don't understand the "no dependency beyond POSIX"
criterion. The authors should explain more clearly what they mean by this
phrase. I completely agree that dependency hell must be avoided and that
dependencies should be used with parsimony. Unfortunately, sometimes we
need proprietary or specialized software to read raw data. For example, in
genetics, micro-array raw data are stored in binary proprietary formats; to
convert these data into a plain-text format, we need the proprietary
software provided with the measurement tool.

Maneage: I was not able to properly set up a project with Maneage. The
configuration step failed while downloading the tools used in the
workflow. This is probably due to a firewall/antivirus restriction outside
my control. How frequently does this failure happen to users? Moreover, the
time to configure a new project is quite long, because everything needs to
be compiled. The authors should compare the time required to set up a
Maneage project with that of other workflow systems, to give readers an
indication.

Disclaimer: For the sake of transparency, it should be noted that I am
involved in the development of Debian, so my comments are probably biased
in that direction.

Additional Questions:

1. How relevant is this manuscript to the readers of this periodical?
   Please explain your rating in the Detailed Comments section.: Relevant

2. To what extent is this manuscript relevant to readers around the world?:
   The manuscript is of interest to readers throughout the world

1. Please summarize what you view as the key point(s) of the manuscript and
   the importance of the content to the readers of this periodical.: The
   authors describe briefly the history of solutions proposed by
   researchers to generate reproducible workflows. Then, they report the
   problems with the current tools used to tackle the reproducibility
   problem. They propose a set of criteria for developing new reproducible
   workflows and finally they describe their proof of concept workflow
   called "Maneage". This manuscript could help researchers to improve
   their workflow to obtain reproducible results.

2. Is the manuscript technically sound? Please explain your answer in the
   Detailed Comments section.: Yes

3. What do you see as this manuscript's contribution to the literature in
   this field?: The authors try to propose a simple answer to the
   reproducibility problem by defining new criteria. They also propose a
   proof of concept workflow which can be directly used by researchers for
   their projects.

4. What do you see as the strongest aspect of this manuscript?: This
   manuscript describes a new reproducible workflow which doesn't require
   another new trendy high-level software. The proposed workflow is only
   based on low-level tools already widely known. Moreover, the workflow
   takes into account the version of all software used in the chain of
   dependencies.

5. What do you see as the weakest aspect of this manuscript?: The authors
   don't discuss the problem of reproducibility of results when analyses
   are performed on CPUs with different architectures. Some libraries
   behave differently when run on different architectures, and this could
   influence final results. The authors are probably assuming x86, but
   there is no mention of it at all in the manuscript.

1. Does the manuscript contain title, abstract, and/or keywords?: Yes

2. Are the title, abstract, and keywords appropriate? Please elaborate in
   the Detailed Comments section.: Yes

3. Does the manuscript contain sufficient and appropriate references
   (maximum 12-unless the article is a survey or tutorial in scope)? Please
   elaborate in the Detailed Comments section.: References are sufficient
   and appropriate

4. Does the introduction clearly state a valid thesis? Please explain your
   answer in the Detailed Comments section.: Yes

5. How would you rate the organization of the manuscript? Please elaborate
   in the Detailed Comments section.: Satisfactory

6. Is the manuscript focused? Please elaborate in the Detailed Comments
   section.: Satisfactory

7. Is the length of the manuscript appropriate for the topic? Please
   elaborate in the Detailed Comments section.: Satisfactory

8. Please rate and comment on the readability of this manuscript in the
   Detailed Comments section.: Easy to read

9. Please rate and comment on the timeliness and long term interest of this
   manuscript to CiSE readers in the Detailed Comments section. Select all
   that apply.: Topic and content are of immediate and continuing interest
   to CiSE readers

Please rate the manuscript. Explain your choice in the Detailed Comments
section.: Good

--------------------------------------------------





Reviewer: 3
Recommendation: Accept If Certain Minor Revisions Are Made

Comments: Longevity of workflows in a project is one of the problems for
reproducibility in different fields of computational research. Therefore, a
proposal that seeks to guarantee this longevity becomes relevant for the
entire community, especially when it is based on free software and is easy
to access and implement.

Goodman et al. (2016), Barba (2018) and Plesser (2018) observed that the
terms reproducibility and replicability appear frequently in the scientific
literature, and that their interchangeable use ends up generating confusion
when authors are not explicit. The authors should therefore briefly define
their use of these terms for their readers.

The introduction is consistent with the proposal of the article, but it
deals with the tools separately, even though many of them can be used
together to minimize some of the problems presented. The use of Ansible,
Helm and similar tools also helps to minimize these problems. When the
authors use the Python example, it is worth pointing out that version 2 has
now been discontinued by the maintaining community, which creates another
problem from the perspective of the article. Regarding the use of VMs and
containers, I believe the discussion presented by Thain et al. (2015) would
strengthen essential points of the current work. For Singularity, the
description article is missing (Kurtzer GM, Sochat V, Bauer MW, 2017). I
also believe a reference to FAIR would be valuable (Wilkinson et al.,
2016).

In my opinion, the paragraph on IPOL seems out of place relative to the
previous ones. The issue of end-to-end reproducibility of a publication
could be explored further, which would further enrich the tool presented.

The presentation of the longevity criteria was adequate in the context of
the article and explored the points that were dealt with later.

The presentation of the tool was consistent. On the project website, I
suggest that the information contained in README-hacking be presented on
the same page as the Tutorial. A breakdown by topic would help, as the
Markdown file may be too long for readers to find information.

Additional Questions:

1. How relevant is this manuscript to the readers of this periodical?
   Please explain your rating in the Detailed Comments section.: Relevant

2. To what extent is this manuscript relevant to readers around the world?:
   The manuscript is of interest to readers throughout the world

1. Please summarize what you view as the key point(s) of the manuscript and
   the importance of the content to the readers of this periodical.: In
   this article, the authors discuss the problem of the longevity of
   computational workflows, presenting what they consider to be criteria
   for longevity and an implementation based on these criteria, called
   Maneage, seeking to ensure a long lifespan for analysis projects.

2. Is the manuscript technically sound? Please explain your answer in the
   Detailed Comments section.: Yes

3. What do you see as this manuscript's contribution to the literature in
   this field?: In this article, the authors discuss the problem of the
   longevity of computational workflows, presenting what they consider to
   be criteria for longevity and an implementation based on these criteria,
   called Maneage, seeking to ensure a long lifespan for analysis projects.

   As a key point, the authors enumerate quite clear criteria that can
   guarantee the longevity of projects and present a free software-based
   way of achieving this objective. The method presented by the authors is
   not easy to implement for many end users, with low computer knowledge,
   but it can be easily implemented by users with average knowledge in the
   area.

4. What do you see as the strongest aspect of this manuscript?: One of the
   strengths of the manuscript is the implementation of Maneage entirely in
   free software and the search for completeness presented in the
   manuscript. The use of GNU software adds the guarantee of long
   maintenance by one of the largest existing software communities. In
   addition, the tool developed has already been tested in different
   publications, showing itself consistent in different scenarios.

5. What do you see as the weakest aspect of this manuscript?: For the
   proper functioning of the proposed tool, the user needs prior knowledge
   of LaTeX, Git and the command line, which may keep inexperienced users
   away. Likewise, the tool is suited to Unix users, leaving out users of
   Microsoft environments.

   Even though Unix-like environments are the majority in the areas of
   scientific computing, many users in different areas still perform their
   analyses on Windows computers or servers, with the assistance of
   package managers.

1. Does the manuscript contain title, abstract, and/or keywords?: Yes

2. Are the title, abstract, and keywords appropriate? Please elaborate in
   the Detailed Comments section.: Yes

3. Does the manuscript contain sufficient and appropriate references
   (maximum 12-unless the article is a survey or tutorial in scope)? Please
   elaborate in the Detailed Comments section.: Important references are
   missing; more references are needed

4. Does the introduction clearly state a valid thesis? Please explain your
   answer in the Detailed Comments section.: Could be improved

5. How would you rate the organization of the manuscript? Please elaborate
   in the Detailed Comments section.: Satisfactory

6. Is the manuscript focused? Please elaborate in the Detailed Comments
   section.: Could be improved

7. Is the length of the manuscript appropriate for the topic? Please
   elaborate in the Detailed Comments section.: Satisfactory

8. Please rate and comment on the readability of this manuscript in the
   Detailed Comments section.: Easy to read

9. Please rate and comment on the timeliness and long term interest of this
   manuscript to CiSE readers in the Detailed Comments section. Select all
   that apply.: Topic and content are of immediate and continuing interest
   to CiSE readers

Please rate the manuscript. Explain your choice in the Detailed Comments
section.: Excellent

--------------------------------------------------





Reviewer: 4
Recommendation: Author Should Prepare A Major Revision For A Second Review

Comments: Overall evaluation - Good.

This paper is in scope, and the topic is of interest to the readers of
CiSE. However in its present form, I have concerns about whether the paper
presents enough new contributions to the area in a way that can then be
understood and reused by others. The main things I believe need addressing
are: 1) Revisit the criteria, show how you have come to decide on them,
give some examples of why they are important, and address potential missing
criteria. 2) Clarify the discussion of challenges to adoption and make it
clearer which tradeoffs are important to practitioners. 3) Be clearer about
which sorts of research workflow are best suited to this approach.

B2. Technical soundness: here I am discussing the soundness of the paper,
rather than the soundness of the Maneage tool. There are some fundamental
additional challenges to reproducibility that are not addressed. Although
software library versions are addressed, there is also the challenge of
mathematical reproducibility, particularly in the handling of
floating-point numbers, which may arise from the way the code is written
and from the hardware architecture (including whether code is optimised or
parallelised). This could obviously be addressed through a criterion around
how code is written, but that would come with a tradeoff against
performance, which is never mentioned. Another tradeoff, which might affect
Criterion 3, is time to result: people use popular frameworks because they
are easier to use. Regarding the discussion, I would have liked to see an
explanation of how these challenges to adoption were identified: was this
anecdotal, through surveys, or participant observation? As a side note on
the technical aspects of Maneage: it uses LaTeX, which is built on TeX,
which has had many portability problems in the past due to being written
using WEB/Tangle, though with web2c this is now largely resolved. It could
be an interesting sidebar to investigate how LaTeX/TeX has ensured its
longevity!
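
The floating-point concern is easy to demonstrate; the sketch below uses
awk (a POSIX tool that computes in IEEE-754 doubles) to show that merely
reordering a sum changes the printed result:

```shell
# Summing the same three doubles in two different orders gives two
# different bit patterns: floating-point addition is not associative.
awk 'BEGIN {
  printf "left-to-right: %.17g\n", (0.1 + 0.2) + 0.3
  printf "right-to-left: %.17g\n", 0.1 + (0.2 + 0.3)
}'
```

Parallel reductions effectively change this ordering at run time, so
identical source code can yield different low-order bits on different
hardware or with different thread counts.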

C2. The title is not specific enough - it should refer to the
reproducibility of workflows/projects.

C4. As noted above, whilst the thesis stated is valid, it may not be useful
to practitioners of computational science and engineering as it stands.

C6. Manuscript focus. I would have liked a more focussed approach to the
presentation of information in II. Longevity is not defined, and whilst
various tools are discussed and discarded, no attempt is made to categorise
the magnitude of longevity for which they are relevant. For instance,
environment isolators are regarded by the software preservation community
as adequate for timescales of the order of years, but may not be suitable
for timescales of decades, where porting and emulation are used. The
title of this section "Commonly used tools and their longevity" is also
confusing - do you mean the longevity of the tools or the longevity of the
workflows that can be produced using these tools? What happens if you use a
combination of all four categories of tools?

C8. Readability. I found it difficult to follow the description of how
Maneage works. It wasn't clear to me if code was being run to generate the
results and figures in a LaTeX paper that is part of a project in
Maneage. It appears to be suggested this is the case, but Figure 1 doesn't
show how this works - it just has the LaTeX files, the data files and the
Makefiles. Is it being suggested that LaTeX itself is the programming
language, using its macro functionality? I was a bit confused on how
collaboration is handled as well - this appears to be using the Git
branching model, and the suggestion that Maneage is keeping track of all
components from all projects - but what happens if you are working with
collaborators that are using their own Maneage instance?

I would also have liked to see a comparison between this approach and
other "executable" paper approaches, e.g. Jupyter notebooks, comparing
completeness, time taken to write a "paper", ease of depositing in a
repository, and ease of use by another researcher.

Additional Questions:

1. How relevant is this manuscript to the readers of this periodical?
   Please explain your rating in the Detailed Comments section.: Relevant

2. To what extent is this manuscript relevant to readers around the world?:
   The manuscript is of interest to readers throughout the world

1. Please summarize what you view as the key point(s) of the manuscript and
   the importance of the content to the readers of this periodical.: This
   manuscript discusses the challenges of reproducibility of computational
   research workflows, suggests criteria for improving the "longevity" of
   workflows, describes the proof-of-concept tool, Maneage, that has been
   built to implement these criteria, and discusses the challenges to
   adoption.

   Of primary importance is the discussion of the challenges to adoption,
   as CiSE is about computational science which does not take place in a
   theoretical vacuum. Many of the identified challenges relate to the
   practice of computational science and the implementation of systems in
   the real world.

2. Is the manuscript technically sound? Please explain your answer in the
   Detailed Comments section.: Partially

3. What do you see as this manuscript's contribution to the literature in
   this field?: The manuscript makes a modest contribution to the
   literature through the description of the proof-of-concept, in
   particular its approach to integrating asset management, version
   control, and build, and through the discussion of challenges to
   adoption.

   The proposed criteria have mostly been discussed at length in many other
   works looking at computational reproducibility and executable papers.

4. What do you see as the strongest aspect of this manuscript?: The
   strongest aspect is the discussion of difficulties for widespread
   adoption of this sort of approach. Because the proof-of-concept tool
   received support through the RDA, it was possible to get feedback from
   researchers who were likely to use it. This has highlighted and
   reinforced a number of challenges and caveats.

5. What do you see as the weakest aspect of this manuscript?: The weakest
   aspect is the assumption that research can be easily compartmentalized
   into simple and complete packages. Given that so much of research
   involves collaboration and interaction, this is not sufficiently
   addressed. In particular, interdisciplinary work, where there may be no
   common language to describe concepts and where common workflow
   practices may differ, will be a barrier to wider adoption of the
   primary thesis and criteria.

1. Does the manuscript contain title, abstract, and/or keywords?: Yes

2. Are the title, abstract, and keywords appropriate? Please elaborate in
   the Detailed Comments section.: No

3. Does the manuscript contain sufficient and appropriate references
   (maximum 12-unless the article is a survey or tutorial in scope)? Please
   elaborate in the Detailed Comments section.: References are sufficient
   and appropriate

4. Does the introduction clearly state a valid thesis? Please explain your
   answer in the Detailed Comments section.: Could be improved

5. How would you rate the organization of the manuscript? Please elaborate
   in the Detailed Comments section.: Satisfactory

6. Is the manuscript focused? Please elaborate in the Detailed Comments
   section.: Could be improved

7. Is the length of the manuscript appropriate for the topic? Please
   elaborate in the Detailed Comments section.: Satisfactory

8. Please rate and comment on the readability of this manuscript in the
   Detailed Comments section.: Readable - but requires some effort to
   understand

9. Please rate and comment on the timeliness and long term interest of this
   manuscript to CiSE readers in the Detailed Comments section. Select all
   that apply.: Topic and content are of immediate and continuing interest
   to CiSE readers

Please rate the manuscript. Explain your choice in the Detailed Comments
section.: Good

--------------------------------------------------





Reviewer: 5
Recommendation: Author Should Prepare A Major Revision For A Second Review

Comments:

Major figures currently working in this exact field do not have their work
acknowledged in this work. In no particular order: Victoria Stodden,
Michael Heroux, Michela Taufer, and Ivo Jimenez. All of these authors have
multiple publications that are highly relevant to this paper. In the case
of Ivo Jimenez, his Popper work [Jimenez I, Sevilla M, Watkins N, Maltzahn
C, Lofstead J, Mohror K, Arpaci-Dusseau A, Arpaci-Dusseau R. The popper
convention: Making reproducible systems evaluation practical. In 2017 IEEE
International Parallel and Distributed Processing Symposium Workshops
(IPDPSW) 2017 May 29 (pp. 1561-1570). IEEE.] and the later revision that
uses GitHub Actions, is largely the same as this work. The lack of
attention to virtual machines and containers is highly problematic. While a
reader cannot rely on DockerHub or a generic OS version label for a VM or
container, these are some of the most promising tools for offering true
reproducibility. On the data side, containers have the promise to manage
data sets and workflows completely [Lofstead J, Baker J, Younge A. Data
pallets: containerizing storage for reproducibility and
traceability. In International Conference on High Performance Computing
2019 Jun 16 (pp. 36-45). Springer, Cham.] Taufer has picked up this work
and has graduated an MS student working on this topic with a published
thesis. See
also Jimenez's P-RECS workshop at HPDC for additional work highly relevant
to this paper.

Some other systems that do similar things include: reprozip, occam, whole
tale, snakemake.

While the work here is a good start, the paper needs to include the context
of the current community development level to be a complete research
paper. A revision that evaluates the suggested systems against the
proposed criteria, compares them with this work, and adds a related work
section that seriously engages with the work of the recommended authors,
among others, would make this paper worthy of publication.

Additional Questions:

1. How relevant is this manuscript to the readers of this periodical?
   Please explain your rating in the Detailed Comments section.: Very
   Relevant

2. To what extent is this manuscript relevant to readers around the world?:
   The manuscript is of interest to readers throughout the world

1. Please summarize what you view as the key point(s) of the manuscript and
   the importance of the content to the readers of this periodical.: This
   paper describes the Maneage system for reproducible workflows. It lays
   out some of the need, presents very limited related work, offers
   criteria that any system aiming for reproducibility should meet, and
   finally describes how Maneage achieves these goals.

2. Is the manuscript technically sound? Please explain your answer in the
   Detailed Comments section.: Partially

3. What do you see as this manuscript's contribution to the literature in
   this field?: Yet another example of a reproducible workflows
   project. There are numerous examples, mostly domain specific, and this
   one is not the most advanced general solution.

4. What do you see as the strongest aspect of this manuscript?: Working
   code and published artifacts

5. What do you see as the weakest aspect of this manuscript?: Lack of
   context in the field missing very relevant work that eliminates much, if
   not all, of the novelty of this work.

1. Does the manuscript contain title, abstract, and/or keywords?: Yes

2. Are the title, abstract, and keywords appropriate? Please elaborate in
   the Detailed Comments section.: Yes

3. Does the manuscript contain sufficient and appropriate references
   (maximum 12-unless the article is a survey or tutorial in scope)? Please
   elaborate in the Detailed Comments section.: Important references are
   missing; more references are needed

4. Does the introduction clearly state a valid thesis? Please explain your
   answer in the Detailed Comments section.: Could be improved

5. How would you rate the organization of the manuscript? Please elaborate
   in the Detailed Comments section.: Satisfactory

6. Is the manuscript focused? Please elaborate in the Detailed Comments
   section.: Could be improved

7. Is the length of the manuscript appropriate for the topic? Please
   elaborate in the Detailed Comments section.: Could be improved

8. Please rate and comment on the readability of this manuscript in the
   Detailed Comments section.: Easy to read

9. Please rate and comment on the timeliness and long term interest of this
   manuscript to CiSE readers in the Detailed Comments section. Select all
   that apply.: Topic and content are likely to be of growing interest to
   CiSE readers over the next 12 months

Please rate the manuscript. Explain your choice in the Detailed Comments
section.: Fair