Diffstat (limited to 'paper.tex')
-rw-r--r-- paper.tex 19
1 files changed, 10 insertions, 9 deletions
diff --git a/paper.tex b/paper.tex
index 43413a9..99d9d77 100644
--- a/paper.tex
+++ b/paper.tex
@@ -48,12 +48,12 @@
%% Abstract
{\noindent\mpregular
The era of big data has ushered an era of big responsibility.
- In the absence of reproducibility, as a test on controlling the data lineage, the result's integrity will be subject to perpetual debate.
- Maneage (management + lineage) is introduced here as a host to the computational and narrative components of an analysis.
- Analysis steps are added to a new project with lineage in mind, thus facilitating the project's execution and testing as the project evolves, while being friendly to publishing and archival because it is wholly in machine\--action\-able, and human\--read\-able, plain-text.
- Maneage is founded on the principles of completeness (e.g., no dependency beyond a POSIX-compatible operating system, no administrator privileges, or no network connection), modular and straight-forward design, temporal lineage and free software.
- The lineage is not limited to downloading the inputs and processing them automatically, but also includes building the necessary software with fixed versions and build configurations.
- Additionally, Maneage also builds the final PDF report of the project, establishing direct and automatic links between the data analysis and the narrative, with the precision of a word in a sentence.
+ In the absence of reproducibility, as a test on the reported data lineage, the result's integrity will be subject to perpetual debate.
+ To address this problem, we introduce Maneage (management + lineage) which has already been tested and used in several scientific papers.
+ Maneage is founded on the principles of completeness (e.g., no dependency beyond a POSIX-compatible operating system, no administrator privileges, or no network connection), modular and straight-forward design, temporal lineage and free software, to enable precise reproducibility.
+ The Maneage lineage, or workflow, is in machine\--action\-able, and human\--read\-able, plain-text format, facilitating version-control, publication, archival, or automatic parsing to extract data provenance.
+ The lineage is not limited to high-level processing, but also includes building the necessary software from source with fixed versions and build configurations.
+ Additionally, the project's final visualizations and narrative report are also included, establishing direct, and parse-able, links between the data analysis and the narrative or plots, with the precision of a word in a sentence or a point in a plot.
Maneage enables incremental projects, where a new project can branch off an existing one, with moderate changes to enable experimentation on published methods.
Once Maneage is implemented in a sufficiently wide scale, it can aid in automatic and optimized workflow creation through machine learning, or automating data management plans.
Maneage was a recipient of the research data alliance (RDA) Europe Adoption Grant in 2019.
@@ -86,6 +86,7 @@ What operations were done on those inputs? How were the configurations or traini
How did the quantitative results get visualized into the final demonstration plots, figures or narrative/qualitative interpretation?
May there be a bias in the visualization?
See Figure \ref{fig:questions} for a more detailed visual representation of such questions for various stages of the workflow.
+\tonote{Johan: add some general references.}
In data science and database management, this type of metadata are commonly known as \emph{data provenance}, and the lower-level implementation is \emph{data lineage} (for more on the definitions, see Section \ref{sec:definitions}).
Data lineage is being increasingly demanded for integrity checking from both the scientific and industrial/legal domains.
@@ -798,13 +799,13 @@ Once the improvements become substantial, new paper(s) will be written to comple
\section{Discussion}
\label{sec:discussion}
-
-
+\section{Summary and conclusion}
+\label{sec:conclusion}
%% Acknowledgements
\section{Acknowledgments}
-The authors wish to thank Pedram Ashofteh Ardakani, Zahra Sharbaf and Surena Fatemi for their useful suggestions and feedback on Maneage and this paper and to David Valls-Gabaud, Ignacio Trujillo, Johan Knapen, Roland Bacon for their support.
+The authors wish to thank Pedram Ashofteh Ardakani, Elham Saremi, Zahra Sharbaf and Surena Fatemi for their useful suggestions and feedback on Maneage and this paper and to David Valls-Gabaud, Ignacio Trujillo, Johan Knapen, Roland Bacon for their support.
We also thank Julia Aguilar-Cabello for designing the Maneage logo.
Work on the reproducible paper template has been funded by the Japanese Ministry of Education, Culture, Sports, Science, and Technology ({\small MEXT}) scholarship and its Grant-in-Aid for Scientific Research (21244012, 24253003), the European Research Council (ERC) advanced grant 339659-MUSICOS, European Union’s Horizon 2020 research and innovation programme under Marie Sklodowska-Curie grant agreement No 721463 to the SUNDIAL ITN, and from the Spanish Ministry of Economy and Competitiveness (MINECO) under grant number AYA2016-76219-P.