-rw-r--r-- | paper.tex | 19
-rw-r--r-- | reproduce/analysis/make/initialize.mk | 2
-rw-r--r-- | reproduce/software/config/installation/texlive.conf | 12
-rw-r--r-- | reproduce/software/make/high-level.mk | 12
4 files changed, 22 insertions, 23 deletions
@@ -48,12 +48,12 @@
 %% Abstract
 {\noindent\mpregular
 The era of big data has ushered an era of big responsibility.
- In the absence of reproducibility, as a test on controlling the data lineage, the result's integrity will be subject to perpetual debate.
- Maneage (management + lineage) is introduced here as a host to the computational and narrative components of an analysis.
- Analysis steps are added to a new project with lineage in mind, thus facilitating the project's execution and testing as the project evolves, while being friendly to publishing and archival because it is wholly in machine\--action\-able, and human\--read\-able, plain-text.
- Maneage is founded on the principles of completeness (e.g., no dependency beyond a POSIX-compatible operating system, no administrator privileges, or no network connection), modular and straight-forward design, temporal lineage and free software.
- The lineage is not limited to downloading the inputs and processing them automatically, but also includes building the necessary software with fixed versions and build configurations.
- Additionally, Maneage also builds the final PDF report of the project, establishing direct and automatic links between the data analysis and the narrative, with the precision of a word in a sentence.
+ In the absence of reproducibility, as a test on the reported data lineage, the result's integrity will be subject to perpetual debate.
+ To address this problem, we introduce Maneage (management + lineage) which has already been tested and used in several scientific papers.
+ Maneage is founded on the principles of completeness (e.g., no dependency beyond a POSIX-compatible operating system, no administrator privileges, or no network connection), modular and straight-forward design, temporal lineage and free software, to enable precise reproducibility.
+ The Maneage lineage, or workflow, is in machine\--action\-able, and human\--read\-able, plain-text format, facilitating version-control, publication, archival, or automatic parsing to extract data provenance.
+ The lineage is not limited to high-level processing, but also includes building the necessary software from source with fixed versions and build configurations.
+ Additionally, the project's final visualizations and narrative report are also included, establishing direct, and parse-able, links between the data analysis and the narrative or plots, with the precision of a word in a sentence or a point in a plot.
 Maneage enables incremental projects, where a new project can branch off an existing one, with moderate changes to enable experimentation on published methods.
 Once Maneage is implemented in a sufficiently wide scale, it can aid in automatic and optimized workflow creation through machine learning, or automating data management plans.
 Maneage was a recipient of the research data alliance (RDA) Europe Adoption Grant in 2019.
@@ -86,6 +86,7 @@ What operations were done on those inputs? How were the configurations or traini
 How did the quantitative results get visualized into the final demonstration plots, figures or narrative/qualitative interpretation?
 May there be a bias in the visualization?
 See Figure \ref{fig:questions} for a more detailed visual representation of such questions for various stages of the workflow.
+\tonote{Johan: add some general references.}
 In data science and database management, this type of metadata are commonly known as \emph{data provenance}, and the lower-level implementation is \emph{data lineage} (for more on the definitions, see Section \ref{sec:definitions}).
 Data lineage is being increasingly demanded for integrity checking from both the scientific and industrial/legal domains.
@@ -798,13 +799,13 @@ Once the improvements become substantial, new paper(s) will be written to comple
 \section{Discussion}
 \label{sec:discussion}
-
-
+\section{Summary and conclusion}
+\label{sec:conclusion}
 %% Acknowledgements
 \section{Acknowledgments}
-The authors wish to thank Pedram Ashofteh Ardakani, Zahra Sharbaf and Surena Fatemi for their useful suggestions and feedback on Maneage and this paper and to David Valls-Gabaud, Ignacio Trujillo, Johan Knapen, Roland Bacon for their support.
+The authors wish to thank Pedram Ashofteh Ardakani, Elham Saremi, Zahra Sharbaf and Surena Fatemi for their useful suggestions and feedback on Maneage and this paper and to David Valls-Gabaud, Ignacio Trujillo, Johan Knapen, Roland Bacon for their support.
 We also thank Julia Aguilar-Cabello for designing the Maneage logo.
 Work on the reproducible paper template has been funded by the Japanese Ministry of Education, Culture, Sports, Science, and Technology ({\small MEXT}) scholarship and its Grant-in-Aid for Scientific Research (21244012, 24253003), the European Research Council (ERC) advanced grant 339659-MUSICOS, European Union’s Horizon 2020 research and innovation programme under Marie Sklodowska-Curie grant agreement No 721463 to the SUNDIAL ITN, and from the Spanish Ministry of Economy and Competitiveness (MINECO) under grant number AYA2016-76219-P.
diff --git a/reproduce/analysis/make/initialize.mk b/reproduce/analysis/make/initialize.mk
index ce4e488..29afbfe 100644
--- a/reproduce/analysis/make/initialize.mk
+++ b/reproduce/analysis/make/initialize.mk
@@ -324,7 +324,7 @@ $(packagecontents): paper.pdf | $(texdir)
 # file. TIP: you can use the same strategy for other LaTeX packages
 # that may cause problems on the arXiv server.
 cp tex/build/build/paper.bbl $$dir/
- tltopdir=.local/texlive/2019/texmf-dist/tex/latex
+ tltopdir=.local/texlive/maneage/texmf-dist/tex/latex
 find $$tltopdir/biblatex/ -maxdepth 1 -type f -print0 \
 | xargs -0 cp -t $$dir
diff --git a/reproduce/software/config/installation/texlive.conf b/reproduce/software/config/installation/texlive.conf
index 2e86ac4..b5075c6 100644
--- a/reproduce/software/config/installation/texlive.conf
+++ b/reproduce/software/config/installation/texlive.conf
@@ -9,19 +9,19 @@
 # this notice are preserved. This file is offered as-is, without any
 # warranty.
 selected_scheme scheme-basic
-TEXDIR @installdir@/texlive/2019
-TEXMFCONFIG @installdir@/texlive2019/texmf-config
+TEXDIR @installdir@/texlive/maneage
+TEXMFCONFIG @installdir@/texlive-maneage/texmf-config
 TEXMFLOCAL @installdir@/texlive/texmf-local
-TEXMFSYSCONFIG @installdir@/texlive/2019/texmf-config
-TEXMFSYSVAR @installdir@/texlive/2019/texmf-var
-TEXMFVAR @installdir@/texlive2019/texmf-var
+TEXMFSYSCONFIG @installdir@/texlive/maneage/texmf-config
+TEXMFSYSVAR @installdir@/texlive/maneage/texmf-var
+TEXMFVAR @installdir@/texlive-maneage/texmf-var
 instopt_adjustpath 0
 instopt_adjustrepo 1
 instopt_letter 0
 instopt_portable 0
 instopt_write18_restricted 1
 tlpdbopt_autobackup 1
-tlpdbopt_backupdir @installdir@/texlive/2019/backups
+tlpdbopt_backupdir @installdir@/texlive/maneage/backups
 tlpdbopt_create_formats 1
 tlpdbopt_desktop_integration 1
 tlpdbopt_file_assocs 1
diff --git a/reproduce/software/make/high-level.mk b/reproduce/software/make/high-level.mk
index cdaaf7d..b0847a9 100644
--- a/reproduce/software/make/high-level.mk
+++ b/reproduce/software/make/high-level.mk
@@ -1261,12 +1261,10 @@ $(itidir)/texlive-ready-tlmgr: reproduce/software/config/installation/texlive.co
 # don't want the configure script to fail if it can't run.
 if ./install-tl --profile=texlive.conf -repository $(tlmirror); then
- # Put a symbolic link of the TeX Live executables in `ibdir'. The
- # main problem is that the year and build system (for example
- # `x86_64-linux') are also in the directory names, making it hard
- # to be generic. We are using wildcards here, but only in this
- # Makefile, not in any other.
- ln -fs $(idir)/texlive/20*/bin/*/* $(ibdir)/
+ # Put a symbolic link of the TeX Live executables in `ibdir' to
+ # avoid all the complexities of its sub-directories and additions
+ # to PATH.
+ ln -fs $(idir)/texlive/maneage/bin/*/* $(ibdir)/
 # Register that the build was successful.
 echo "TeX Live is ready." > $@
@@ -1326,7 +1324,7 @@ $(itidir)/texlive: reproduce/software/config/installation/texlive-packages.conf
 # Make a symbolic link of all the TeX Live executables in the bin
 # directory so we don't have to modify `PATH'.
- ln -fs $(idir)/texlive/20*/bin/*/* $(ibdir)/
+ ln -fs $(idir)/texlive/maneage/bin/*/* $(ibdir)/
 # Get all the necessary versions.
 texlive=$$(pdflatex --version | awk 'NR==1' | sed 's/.*(\(.*\))/\1/' \