From 842fd2f0f6b17ce040e6c60dff6b9eb4db9e318e Mon Sep 17 00:00:00 2001
From: Mohammad Akhlaghi
Date: Fri, 1 May 2020 02:10:03 +0100
Subject: Abstract re-written to better highlight the uniqueness of Maneage

This abstract is a first step in order to put more focus on the research aspects of Maneage.
---
 paper.tex | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/paper.tex b/paper.tex
index 758c418..63db54e 100644
--- a/paper.tex
+++ b/paper.tex
@@ -52,15 +52,15 @@
 %% Abstract
 {\noindent\mpregular
- The era of big data has ushered an era of big responsibility.
- In the absence of reproducibility, as a test on understanding data lineage, the result can be the subject of perpetual debate.
- To address this problem, we introduce Maneage (management + lineage) which is founded on the principles of completeness (e.g., no dependency beyond a POSIX-compatible operating system, no administrator privileges, and no network connection), modular and straightforward design, temporal provenance, scalability, and free software.
- A project using Maneage is fully stored in machine\--action\-able, and human\--read\-able plain-text format, facilitating version-control, publication, archival, and automatic parsing to extract data provenance.
- The provided lineage is not limited to high-level processing, but also includes building the necessary software from source with fixed versions and build configurations.
- Additionally, a project's final visualizations and narrative report are also included, establishing direct links between the analysis and the narrative or visualizations, to the precision of a word within a sentence or a point in a plot.
- Maneage also enables incremental projects, where a new project can branch off an existing one, with moderate changes to enable experimentation on published methods.
- Once Maneage is implemented in a sufficiently wide scale, automatic and optimized workflow creation through machine learning, or automating data management plans, can easily be set up.
- Maneage was a recipient of a research data alliance (RDA) Europe Adoption Grant in 2019, and has already been tested and used in several scientific papers, including the present one, with snapshot \projectversion.
+ Over the last 30 years, many reproducible workflow solutions have been proposed, mostly using the common high-level technology of the day.
+ Thus providing immediate reproducibility, but problematic in the long-term because high-level technologies evolve.
+ Scientists are accountable to their results decades later, and don't have the resources to re-write their projects.
+ This creates generational gaps between scientists and makes it hard to build upon previous work.
+ In this paper, we report the result of our research project on a fundamentally new design that is founded on the principles of completeness (e.g., no dependency beyond a POSIX-compatible operating system, no administrator privileges, and no network connection), with modular and straightforward design, temporal provenance, scalability, and free software.
+ It is called Maneage (managing+lineage) and is stored in machine-actionable, and human-readable plain-text format.
+ Facilitating version-control, publication, archival, and automatic parsing to extract data provenance.
+ It can build its environment automatically, possibly, in virtual machines, containers or any future technology as a binary blob for immediate/fast reproduction.
+ Maneage has already been used in several scientific publications including the present one, with snapshot \projectversion.
 \horizontalline
 \noindent
--
cgit v1.2.1