-rw-r--r-- | paper.tex | 25 |
1 files changed, 12 insertions, 13 deletions
@@ -24,7 +24,7 @@
-\title{Acheiving long-term and archive-able reproducibility}
+\title{Towards long-term archivable reproducibility}
 \author{\large\mpregular \authoraffil{Mohammad Akhlaghi}{1,2,3},
 \large\mpregular \authoraffil{Ra\'ul Infante-Sainz}{1,2},
 \large\mpregular \authoraffil{Boudewijn F. Roukema}{4,3},
@@ -50,24 +50,23 @@
 \thispagestyle{firstpage}
 \maketitle
-%% Abstract
+%% Abstract % max 250 words for CiSE
 {\noindent\mpregular
   %% CONTEXT
   Many reproducible workflow solutions have been proposed over the recent decades.
-  Most use the popular high-level technologies when they were created, providing an immediate solution which is unlikely to be sustainable in the long term.
-  Indeed, decades later, scientists lack the resources to re-write their projects while still being accountable for their results.
-  This creates generational gaps and, due to obsolete technologies, impedes reproducibility or building upon previous work.
+  Most use the high-level technologies that were popular when they were created, providing an immediate solution which is unlikely to be sustainable in the long term.
+  Decades later, scientists lack the resources to rewrite their projects, while still being accountable for their results.
+  This creates generational gaps, which, together with technological obsolescence, impede reproducibility and building upon previous work.
   %% AIM
-  We aim to introduce a set of criteria to address this problem and demonstrate their practicality.
+  We aim to introduce a set of criteria to address this problem and to demonstrate their practicality.
   %% METHOD
-  The criteria are: completeness (i.e., no dependency beyond a POSIX-compatible operating system, no administrator privileges, no network connection and primarily stored in plain-text); modular design; temporal provenance; scalability; and free-and-open-source software.
-  Their usefulness is tested through an implementation: "Maneage" (managing+lineage).
+  The criteria have been tested in several research publications and can be summarized as: completeness (no dependency beyond a POSIX-compatible operating system, no administrator privileges, no network connection and storage primarily in plain-text); modular design; linking analysis with narrative, temporal provenance; scalability; and free-and-open-source software.
   %% RESULTS
-  It is stored in machine-actionable and human-readable plain-text, enabling version-control, cheap archival, automatic parsing to extract data provenance, and peer-reviewable verification.
-  Furthermore, we show that these criteria are not limited to long-term reproducibility but also the immediate/fast regime.
-  It has been tested in several research publications including the present one, with snapshot \projectversion.
-  %% CONCLUSION
-  We conclude that requiring longevity from solutions is realistic, and discuss the benefits of these criteria for scientific progress.
+  Through an implementation, called "Maneage" (managing+lineage), we find that storing the project in machine-actionable and human-readable plain-text, enables version-control, cheap archiving, automatic parsing to extract data provenance, and peer-reviewable verification.
+  Furthermore, we show that these criteria are not limited to long-term reproducibility but also provide immediate, fast short-term reproducibility.
+  %%CONCLUSION
+  We conclude that requiring longevity from solutions is realistic.
+  We discuss the benefits of these criteria for scientific progress.
 \horizontalline