author Boud Roukema <boud@cosmo.torun.pl> 2021-06-13 01:56:45 +0200
committer Mohammad Akhlaghi <mohammad@akhlaghi.org> 2021-06-13 15:09:19 +0100
commit a77bba6249f03062eb9849997c77245ad403d027
tree 5579d2b4b0b7256eea4b0626d9bd76c48be56938 /tex/src/appendix-existing-solutions.tex
parent 313db0b04bd3499f83d9e79fd7e92578cd367c2b
Add GHTorrent, some https, notabug
This commit adds a few sentences in relation to the first known attempt to store and make available git repository hosting ephemera (GHTorrent, introduced to us by Roberto Di Cosmo). Since one of the two sponsors of GHTorrent is Microsoft, both the ethics and practical aspects of this in the context of reproducibility and scientific ethics as expressed by the international scientific community are rather unclear, so a link to one of the well-known lists of practical and ethical issues with GitHub is included. A minor fix is made in 'tex/src/appendix-existing-solutions.tex', since the word 'data' is plural (singular is 'datum').
Diffstat (limited to 'tex/src/appendix-existing-solutions.tex')
-rw-r--r-- tex/src/appendix-existing-solutions.tex 2
1 files changed, 1 insertions, 1 deletions
diff --git a/tex/src/appendix-existing-solutions.tex b/tex/src/appendix-existing-solutions.tex
index 578d6e9..1d515e4 100644
--- a/tex/src/appendix-existing-solutions.tex
+++ b/tex/src/appendix-existing-solutions.tex
@@ -437,7 +437,7 @@ It is possible to include the build instructions of the software used within the
For the data, it is similarly not possible to extract which data server they came from.
Hence two projects that each use a 1-terabyte dataset will need a full copy of that same 1-terabyte file in their bundle, making long-term preservation extremely expensive.
Such files can be excluded from the bundle through modifications in the configuration file.
-However, this will add complexity: a higher-level script will be necessary with the ReproZip bundle, to make sure that the data and bundle are used together, or to check the integrity of the data (in case it may have changed).
+However, this will add complexity: a higher-level script will be necessary with the ReproZip bundle, to make sure that the data and bundle are used together, or to check the integrity of the data (in case they have changed).
Finally, because it is only a snapshot of one moment in a project's history, preserving the connection between the ReproZip'd bundles of various points in a project's history is likely to be difficult (for example, when software or data are updated, or when analysis methods are modified).
In other words, a ReproZip user will have to personally define an archival method to preserve the various black boxes of the project as it evolves, and tracking what has changed between the versions is not trivial.
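
The hunk above mentions a higher-level script that checks the integrity of data kept outside a ReproZip bundle. A minimal sketch of such a check follows, assuming the project records a SHA-256 checksum for each excluded data file when the bundle is created. The function names `sha256_of` and `data_matches` are hypothetical, and the sketch does not call ReproZip itself; it only illustrates the comparison step that would run before the bundle is used.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks
    so that very large datasets do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def data_matches(path, expected_sha256):
    """Hypothetical helper: True only if the file exists and its
    checksum equals the one recorded alongside the bundle."""
    p = Path(path)
    return p.is_file() and sha256_of(p) == expected_sha256.lower()
```

A wrapper script could call `data_matches` for every excluded file and refuse to run the bundle when any check fails. This covers only the integrity check; keeping the recorded checksums together with each bundle as the project evolves remains the user's responsibility, as the text notes.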