From 7bdbef3bf8998942bac86102438c1f32d2415bde Mon Sep 17 00:00:00 2001
From: Boud Roukema
Date: Thu, 23 Apr 2020 00:29:26 +0200
Subject: 4.3.4 Project analysis - the analysis itself

Reduction by about 20 words - minor rewording.
---
 paper.tex | 28 ++++++++++++++--------------
 1 file changed, 14 insertions(+), 14 deletions(-)

diff --git a/paper.tex b/paper.tex
index 9bc5a14..4e0d177 100644
--- a/paper.tex
+++ b/paper.tex
@@ -526,13 +526,13 @@ Other formats can easily be added.
 \subsubsection{The analysis}
 \label{sec:analysis}
-The basic concepts behind organizing the analysis into modular subMakefiles have already been discussed above.
-We will thus describe it here with the practical example of replicating Figure 1C of M20, with some enhancements in Figure \ref{fig:toolsperyear}.
-As shown in Figure \ref{fig:datalineage}, in this project we have broken this goal into two subMakefiles: \inlinecode{format.mk} and \inlinecode{demo-plot.mk}.
-The former is in charge of converting the Excel-formatted input into the simple comma-separated value (CSV) format, and the latter is in charge of generating the table to build Figure \ref{fig:toolsperyear}.
+The organizing of the analysis into modular subMakefiles is discussed above.
+We illustrate this here with the practical example of replicating Figure~1C of M20, with some enhancements, in Figure~\ref{fig:toolsperyear}.
+As shown in Figure~\ref{fig:datalineage}, for this example we split this goal into two subMakefiles: \inlinecode{format.mk} and \inlinecode{demo-plot.mk}.
+The former converts the Excel-formatted input into comma-separated value (CSV) format, and the latter generates the table to build Figure \ref{fig:toolsperyear}.
 In a real project, subMakefiles could, and will, be much more complex.
-Figure \ref{fig:topmake} shows how the two subMakefiles are placed as values to the \inlinecode{makesrc} variable of \inlinecode{top-make.mk}, without their suffix (see Section \ref{sec:valuesintext}).
-Note that their location after the standard starting subMakefiles (initialization and download) and before the standard ending subMakefiles (verification and final paper) is important, along with their order.
+Figure \ref{fig:topmake} shows how the two subMakefiles are placed as values in the \inlinecode{makesrc} variable of \inlinecode{top-make.mk}, without their suffixes (see Section \ref{sec:valuesintext}).
+Their location after the standard starting subMakefiles (initialization and download) and before the standard ending subMakefiles (verification and final paper) is important, along with their order.
 \begin{figure}[t]
 \begin{center}
@@ -554,22 +554,22 @@ Note that their location after the standard starting subMakefiles (initializatio
 To enhance the original plot, Figure \ref{fig:toolsperyear} also shows the number of papers that were studied each year.
 Furthermore, its horizontal axis shows the full range of the data (starting from \menkefirstyear), while the original starts from 1997.
-This was probably because they did not have sufficient data for older papers, for example, in \menkenumpapersdemoyear, they only had \menkenumpapersdemocount{} papers.
-Note that both the numbers of the previous sentence (\menkenumpapersdemoyear{} and \menkenumpapersdemocount), and the dataset's oldest year (mentioned above: \menkefirstyear) are automatically generated \LaTeX{} macros, see Section \ref{sec:valuesintext}.
-They are \emph{not} typeset manually in this narrative explanation.
+This was probably because the authors judged the earlier years' data to be too noisy. For example, in \menkenumpapersdemoyear, only \menkenumpapersdemocount{} papers were analysed.
+Both the numbers in the previous sentence (\menkenumpapersdemoyear{} and \menkenumpapersdemocount), and the dataset's oldest year (mentioned above: \menkefirstyear) are automatically generated \LaTeX{} macros, see Section \ref{sec:valuesintext}.
+These are \emph{not} typeset manually in this narrative explanation.
 This step (generating the macros) is shown schematically in Figure \ref{fig:datalineage} with the arrow from \inlinecode{tools-per-year.txt} to \inlinecode{demo-plot.tex}.
 To create Figure \ref{fig:toolsperyear}, we used the PGFPlots package within \LaTeX{}.
-Therefore, the necessary analysis output to feed into PGFPlots was a simple plain-text table with 3 columns (year, paper per year, tool fraction per year).
+Therefore, the necessary analysis output to feed into PGFPlots was a plain-text table with 3 columns (year, paper per year, tool fraction per year).
 This table is shown in the lineage graph of Figure \ref{fig:datalineage} as \inlinecode{tools-per-year.txt} and The PGFPlots source to generate this figure is located in \inlinecode{tex\-/src\-/figure\--tools\--per\--year\-.tex}.
 If another plotting tool was desired (for example \emph{Python}'s Matplotlib, or Gnuplot), the built graphic file (for example \inlinecode{tools-per-year.pdf}) could be the target instead of the raw table.
-Note that \inlinecode{tools-per-year.txt} is a value-added table with only \menkenumyears{} rows (one row for every year).
+The file \inlinecode{tools-per-year.txt} is a value-added table with only \menkenumyears{} rows (one row for every year).
 The original dataset had \menkenumorigrows{} rows (one row for each year of each journal).
 We see in Figure \ref{fig:datalineage} that it is defined as a Make \emph{target} in \inlinecode{demo-plot.mk} and that its prerequisite is \inlinecode{menke20-table-3.txt} (schematically shown by the arrow connecting them).
-Note that both the row numbers mentioned at the start of this paragraph are also macros.
-Again from Figure \ref{fig:datalineage}, we see that \inlinecode{menke20-table-3.txt} is a target in \inlinecode{format.mk} and its prerequisite is the input file \inlinecode{menke20.xlsx}.
-The input files (which come from outside the project) are all \emph{targets} in \inlinecode{download.mk} and futher discussed in Section \ref{sec:download}.
+Both the row counts mentioned at the start of this paragraph are again macros.
+In Figure \ref{fig:datalineage}, we see that \inlinecode{menke20-table-3.txt} is a target in \inlinecode{format.mk} and its prerequisite is the input file \inlinecode{menke20.xlsx}.
+The input files (which come from outside the project) are all \emph{targets} in \inlinecode{download.mk} and futher discussed in Section \ref{sec:download}.
--
cgit v1.2.1