author: Mohammad Akhlaghi <mohammad@akhlaghi.org> 2020-06-01 06:29:52 +0100
committer: Mohammad Akhlaghi <mohammad@akhlaghi.org> 2020-06-01 06:29:52 +0100
commit: 7d6ee8ce978a24cd2c796734f358dd8cd49da553
tree: e51fdb90798371378cbf81cadda72095ac39b9a0
parent: 3db3859a2d5615ad9d2e2fb7361bd3cc569475a9
Minor edits to clarify some of the previous corrections
Boud's point about a "random reader" not being a good example case was correct. But "user" also gives it a software perspective that is of course not wrong; it can just be confusing. So I thought of changing it to "interested reader".

In the part about the C-library dependency of high-level software, Boud's correction showed me that it is very hard to convey what I wanted to say (that separating errors due to the C-library implementation from measurement errors will be easy, because they should be on very different scales). So I reworded it to give it a slightly better tone while making the same point: that with Maneage we can now accurately measure the effect of the C library.
-rw-r--r-- paper.tex 6
1 files changed, 3 insertions, 3 deletions
diff --git a/paper.tex b/paper.tex
index 036f89b..e405a29 100644
--- a/paper.tex
+++ b/paper.tex
@@ -350,7 +350,7 @@ For example, in Figure \ref{fig:datalineage} (bottom), \inlinecode{INPUTS.conf}
To illustrate this, we report that \cite{menke20} studied $\menkenumpapersdemocount$ papers in $\menkenumpapersdemoyear$ (which is not in their original plot).
The number \inlinecode{\menkenumpapersdemoyear} is stored in \inlinecode{demo-year.conf} and the result (\inlinecode{\menkenumpapersdemocount}) was calculated after generating \inlinecode{columns.txt}.
Both are expanded as \LaTeX{} macros when creating this PDF file.
-A user can change the value in \inlinecode{demo-year.conf} to automatically update the result in the PDF, without necessarily knowing the underlying low-level implementation.
+An interested reader can change the value in \inlinecode{demo-year.conf} to automatically update the result in the PDF, without necessarily knowing the underlying low-level implementation.
Furthermore, the configuration files are a prerequisite of the targets that use them.
If changed, Make will \emph{only} re-execute the dependent recipe and all its descendants, with no modification to the project's source or other built products.
This fast and cheap testing encourages experimentation (without necessarily knowing the implementation details, e.g., by co-authors or future readers), and ensures self-consistency.
@@ -434,12 +434,12 @@ We have thus found that more than once, advanced users add, or fix, their requir
On a related note, POSIX is a fuzzy standard that does not guarantee bit-wise reproducibility of programs.
However, it has been chosen as the underlying platform here because the results (data) are our focus, not the compiled software.
-POSIX is ubiquitous and fixed versions of low-level software (e.g., core GNU tools) are installable on most POSIX systems; each internally corrects for differences affecting its functionality.
+POSIX is ubiquitous and fixed versions of low-level software (e.g., core GNU tools) are installable on most POSIX systems; each internally corrects for differences affecting its functionality (partly as part of the GNU portability library).
On GNU/Linux hosts, Maneage builds precise versions of the GNU Compiler Collection (GCC), GNU Binutils and GNU C library (glibc).
However, glibc is not installable on some POSIX OSs (e.g., macOS).
The C library is linked with all programs.
This dependence can hypothetically hinder exact reproducibility \emph{of results}, but we have not encountered this so far.
-When present, the non-reproducibility of high-level science results due to differing C libraries has so far been traced to known sources of error in the analysis (like measurement errors); further study would be useful.
+With everything else under precise control, the effect of differing Kernel and C libraries on high-level science results can now be systematically studied with Maneage.
%Thirdly, publishing a project's reproducible data lineage immediately after publication enables others to continue with follow-up papers, which may provide unwanted competition against the original authors.
%We propose these solutions: