| -rw-r--r-- | paper.tex | 18 | 
1 files changed, 9 insertions, 9 deletions
@@ -52,15 +52,15 @@
 %% Abstract
 {\noindent\mpregular
-  The era of big data has ushered an era of big responsibility.
-  In the absence of reproducibility, as a test on understanding data lineage, the result can be the subject of perpetual debate.
-  To address this problem, we introduce Maneage (management + lineage) which is founded on the principles of completeness (e.g., no dependency beyond a POSIX-compatible operating system, no administrator privileges, and no network connection), modular and straightforward design, temporal provenance, scalability, and free software.
-  A project using Maneage is fully stored in machine\--action\-able, and human\--read\-able plain-text format, facilitating version-control, publication, archival, and automatic parsing to extract data provenance.
-  The provided lineage is not limited to high-level processing, but also includes building the necessary software from source with fixed versions and build configurations.
-  Additionally, a project's final visualizations and narrative report are also included, establishing direct links between the analysis and the narrative or visualizations, to the precision of a word within a sentence or a point in a plot.
-  Maneage also enables incremental projects, where a new project can branch off an existing one, with moderate changes to enable experimentation on published methods.
-  Once Maneage is implemented in a sufficiently wide scale, automatic and optimized workflow creation through machine learning, or automating data management plans, can easily be set up.
-  Maneage was a recipient of a research data alliance (RDA) Europe Adoption Grant in 2019,  and has already been tested and used in several scientific papers, including the present one, with snapshot \projectversion.
+  Over the last 30 years, many reproducible workflow solutions have been proposed, mostly using the common high-level technology of the day.
+  Thus providing immediate reproducibility, but problematic in the long-term because high-level technologies evolve.
+  Scientists are accountable to their results decades later, and don't have the resources to re-write their projects.
+  This creates generational gaps between scientists and makes it hard to build upon previous work.
+  In this paper, we report the result of our research project on a fundamentally new design that is founded on the principles of completeness (e.g., no dependency beyond a POSIX-compatible operating system, no administrator privileges, and no network connection), with modular and straightforward design, temporal provenance, scalability, and free software.
+  It is called Maneage (managing+lineage) and is stored in machine-actionable, and human-readable plain-text format.
+  Facilitating version-control, publication, archival, and automatic parsing to extract data provenance.
+  It can build its environment automatically, possibly, in virtual machines, containers or any future technology as a binary blob for immediate/fast reproduction.
+  Maneage has already been used in several scientific publications including the present one, with snapshot \projectversion.
   \horizontalline
   \noindent
