Diffstat (limited to 'tex/src')
-rw-r--r-- tex/src/appendix-existing-solutions.tex | 14
-rw-r--r-- tex/src/appendix-existing-tools.tex | 2
2 files changed, 10 insertions, 6 deletions
diff --git a/tex/src/appendix-existing-solutions.tex b/tex/src/appendix-existing-solutions.tex
index 5166703..2396b1b 100644
--- a/tex/src/appendix-existing-solutions.tex
+++ b/tex/src/appendix-existing-solutions.tex
@@ -101,10 +101,12 @@ Hence in 2006 SEP moved to a new Python-based framework called Madagascar, see A
-\subsection{Apache Taverna (2003)}
+\subsection{Taverna (2003)}
 \label{appendix:taverna}
-Apache Taverna\footnote{\inlinecode{\url{https://taverna.incubator.apache.org}}} \citeappendix{oinn04} is a workflow management system written in Java with a graphical user interface which is still being used and developed.
-A workflow is defined as a directed graph, where nodes are called ``processors''.
+Taverna\footnote{\inlinecode{\url{https://github.com/taverna}}} \citeappendix{oinn04} was a workflow management system written in Java with a graphical user interface.
+In 2014 it was sponsored by the Apache Incubator project and called ``Apache Taverna'', but its developers \href{https://lists.apache.org/thread.html/r559e0dd047103414fbf48a6ce1bac2e17e67504c546300f2751c067c\%40\%3Cdev.taverna.apache.org\%3E}{voted} to \emph{retire} it in 2020 because development has come to a standstill (as of April 2021, latest public Github commit was in 2016).
+
+In Taverna, a workflow is defined as a directed graph, where nodes are called ``processors''.
 Each Processor transforms a set of inputs into a set of outputs and they are defined in the Scufl language (an XML-based language, where each step is an atomic task).
 Other components of the workflow are ``Data links'' and ``Coordination constraints''.
 The main user interface is graphical, where users move processors in the given space and define links between their inputs and outputs (manually constructing a lineage, as in the
@@ -179,7 +181,7 @@ the lineage figure shown in the main paper).
 Figure \ref{fig:datalineage}).
 \fi
 Each actor is connected to others through Ptolemy II\footnote{\inlinecode{\url{https://ptolemy.berkeley.edu}}} \citeappendix{eker03}.
-In many aspects, the usage of Kepler and its issues for long-term reproducibility is like Apache Taverna (see Section \ref{appendix:taverna}).
+In many aspects, the usage of Kepler and its issues for long-term reproducibility is like Taverna (see Section \ref{appendix:taverna}).
@@ -334,7 +336,7 @@ the first figure in the main body of the paper,
 Figure \ref{fig:datalineage},
 \fi
 where you can click on the given Zenodo link and be taken to the raw data that created the plot.
-However, instead of a long and hard to read hash, we simply point to the plotted file's source as a Zenodo DOI (which has long term funding for logevity).
+However, instead of a long and hard to read hash, we simply point to the plotted file's source as a Zenodo DOI (which has long term funding for longevity).
 Unfortunately, most parts of the web page are not complete as of January 2021.
 The VCR web page contains an example PDF\footnote{\inlinecode{\url{http://vcr.stanford.edu/paper.pdf}}} that is generated with this system, but the linked VCR repository\footnote{\inlinecode{\url{http://vcr-stat.stanford.edu}}} did not exist (again, as of January 2021).
@@ -386,7 +388,7 @@ It just captures the environment, it does not store \emph{how} that environment
 The Research object\footnote{\inlinecode{\url{http://www.researchobject.org}}} is collection of meta-data ontologies, to describe aggregation of resources, or workflows, see \citeappendix{bechhofer13} and \citeappendix{belhajjame15}.
 It thus provides resources to link various workflow/analysis components (see Appendix \ref{appendix:existingtools}) into a final workflow.
-Ref.\/~\citeappendix{bechhofer13} describes how a workflow in Apache Taverna (Appendix \ref{appendix:taverna}) can be translated into research objects.
+Ref.\/~\citeappendix{bechhofer13} describes how a workflow in Taverna (Appendix \ref{appendix:taverna}) can be translated into research objects.
 The important thing is that the research object concept is not specific to any special workflow, it is just a metadata bundle/standard which is only as robust in reproducing the result as the running workflow.
 Therefore if implemented over a complete workflow like Maneage, it can be very useful in analysing/optimizing the workflow, finding common components between many Maneage'd workflows, or translating to other complete workflows.
diff --git a/tex/src/appendix-existing-tools.tex b/tex/src/appendix-existing-tools.tex
index 0c9a1c2..a773322 100644
--- a/tex/src/appendix-existing-tools.tex
+++ b/tex/src/appendix-existing-tools.tex
@@ -441,8 +441,10 @@ GWL has two high-level concepts called ``processes'' and ``workflows'' where the
 Nextflow\footnote{\inlinecode{\url{https://www.nextflow.io}}} \citeappendix{tommaso17} workflow language with a command-line interface that is written in Java.
 \subsubsection{Generic workflow specifications (CWL and WDL)}
+\label{appendix:genericworkflows}
 Due to the variety of custom workflows used in existing reproducibility solution (like those of Appendix \ref{appendix:existingsolutions}), some attempts have been made to define common workflow standards like the Common workflow language (CWL\footnote{\inlinecode{\url{https://www.commonwl.org}}}, with roots in Make, formatted in YAML or JSON) and Workflow Description Language (WDL\footnote{\inlinecode{\url{https://openwdl.org}}}, formatted in JSON).
 These are primarily specifications/standards rather than software.
+At an even higher level solutions like Canonical Workflow Frameworks for Research (CWFR) are being proposed\footnote{\inlinecode{\href{https://codata.org/wp-content/uploads/2021/01/CWFR-position-paper-v3.pdf}{https://codata.org/wp-content/uploads/2021/01/}}\\\inlinecode{\href{https://codata.org/wp-content/uploads/2021/01/CWFR-position-paper-v3.pdf}{CWFR-position-paper-v3.pdf}}}.
 With these standards, ideally, translators can be written between the various workflow systems to make them more interoperable.
 In conclusion, shell scripts and Make are very common and extensively used by users of Unix-based OSs (which are most commonly used for computations).