-rw-r--r-- README-pipeline.md 55
-rwxr-xr-x configure 108
-rw-r--r-- paper.tex 86
-rw-r--r-- reproduce/config/gnuastro/astconvertt.conf 31
-rw-r--r-- reproduce/config/gnuastro/aststatistics.conf 34
-rw-r--r-- reproduce/config/pipeline/INPUTS.mk 9
-rw-r--r-- reproduce/config/pipeline/LOCAL.mk.in 4
-rw-r--r-- reproduce/config/pipeline/delete-me-wfpc2-quant.mk 2
-rw-r--r-- reproduce/config/pipeline/dependency-versions.mk 1
-rw-r--r-- reproduce/config/pipeline/web.mk 6
-rw-r--r-- reproduce/src/make/delete-me.mk 71
-rw-r--r-- reproduce/src/make/dependencies.mk 7
-rw-r--r-- reproduce/src/make/download.mk 57
-rw-r--r-- reproduce/src/make/initialize.mk 9
-rw-r--r-- tex/delete-me-wfpc2.tex 34
15 files changed, 414 insertions, 100 deletions
diff --git a/README-pipeline.md b/README-pipeline.md
index ff15094..6effa30 100644
--- a/README-pipeline.md
+++ b/README-pipeline.md
@@ -516,6 +516,7 @@ advanced in later stages of your work.
them.
- Delete marked part(s) in `configure`.
+ - Delete the `reproduce/config/gnuastro` directory.
- Delete `astnoisechisel` from the value of `top-level-programs` in `reproduce/src/make/dependencies.mk`. You can keep the rule to build `astnoisechisel`, since its not in the `top-level-programs` list, it (and all the dependencies that are only needed by Gnuastro) will be ignored.
- Delete marked parts in `reproduce/src/make/initialize.mk`.
- Delete `and Gnuastro \gnuastroversion` from `tex/preamble-style.tex`.
@@ -526,51 +527,31 @@ advanced in later stages of your work.
commented thoroughly and reading over the comments should guide you on
what to add/remove and where.
- - **Input dataset (can be done later)**: The user manages the top-level
- directory of the input data through the variables set in
- `reproduce/config/pipeline/LOCAL.mk.in` (the user actually edits a
- `LOCAL.mk` file that is created by `configure` from the `.mk.in` file,
- but the `.mk` file is not under version control). Datasets are usually
- large and the users might already have their copy don't need to
- download them). So you can define a variable (all in capital letters)
- in `reproduce/config/pipeline/LOCAL.mk.in`. For example if you are
- working on data from the XDF survey, use `XDF`. You can use this
- variable to identify the location of the raw inputs on the running
- system. Here, we'll assume its name is `SURVEY`. Afterwards, change
- any occurrence of `SURVEY` in the whole pipeline with the new
- name. You can find the occurrences with a simple command like the ones
- shown below. We follow the Make convention here that all
- `ONLY-CAPITAL` variables are those directly set by the user and all
- `small-caps` variables are set by the pipeline designer. All variables
- that also depend on this survey have a `survey` in their name. Hence,
- also correct all these occurrences to your new name in small-caps. Of
- course, ignore/delete those occurrences that are irrelevant, like
- those in this file. Note that in the raw version of this template no
- target depends on these files, so they are ignored. Afterwards, set
- the webpage and correct the filenames in
- `reproduce/src/make/download.mk` if necessary.
-
- ```shell
- $ grep -r SURVEY ./
- $ grep -r survey ./
- ```
-
- - **Other input datasets (can be done later)**: Add any other input
- datasets that may be necessary for your research to the pipeline based
- on the example above.
+ - **Input dataset (can be done later)**: The input datasets are managed
+ through the `reproduce/config/pipeline/INPUTS.mk` file. It is best to
+ gather all the information regarding all the input datasets into this
+ one central file. To ensure that the proper dataset is being
+ downloaded and used by the pipeline, it's best to also get an MD5
+ checksum (https://en.wikipedia.org/wiki/MD5) of the file and include
+ that in this file so you can check it in the pipeline. The preparation
+ of the input datasets is done in
+ `reproduce/src/make/download.mk`. Have a look there to see how these
+ values are to be used. This information about the input datasets is
+ also used in the initial `configure` script (to inform the users), so
+ also modify that file.
- **Delete dummy parts (can be done later)**: The template pipeline
- contains some parts that are only for the initial/test run, not for
- any real analysis. The respective files to remove and parts to fix are
- discussed here.
+ contains some parts that are only for the initial/test run, mainly as
+ a demonstration of important steps. They are not for any real
+ analysis. You can remove these parts in the files below:
- `paper.tex`: Delete the text of the abstract and the paper's main
body, *except* the "Acknowledgments" section. This reproduction
pipeline was designed by funding from many grants, so its necessary
to acknowledge them in your final research.
- - `Makefile`: Delete the two lines containing `delete-me` in the
- `foreach` loops. Just make sure the other lines that end in `\` are
+ - `Makefile`: Delete the lines containing `delete-me` in the `foreach`
+ loops. Just make sure the other lines that end in `\` are
immediately after each other.
- Delete all `delete-me*` files in the following directories:
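The README hunk above recommends recording an MD5 checksum for every input dataset in `reproduce/config/pipeline/INPUTS.mk`. The following is only a minimal sketch of how such entries could be produced, assuming GNU `md5sum` is installed. The helper name `inputs_entry` and the `MYDATA` prefix are hypothetical and are not part of the pipeline.

```shell
#!/bin/sh
# Hypothetical helper (not part of the pipeline): print Make-style
# lines for one dataset in the same `NAME = value' layout as INPUTS.mk.
#   $1: variable prefix (for example MYDATA), $2: path to the file.
inputs_entry() {
    prefix=$1
    file=$2
    # `md5sum' prints "<checksum>  <file>"; keep only the checksum.
    sum=$(md5sum "$file" | awk '{print $1}')
    echo "${prefix}IMAGE = $(basename "$file")"
    echo "${prefix}MD5 = $sum"
}

# Usage on a throw-away file.
tmp=$(mktemp)
printf 'hello\n' > "$tmp"
inputs_entry MYDATA "$tmp"
rm -f "$tmp"
```

The printed lines can then be pasted into `INPUTS.mk` next to the dataset's URL, so the recipe in `reproduce/src/make/download.mk` has a value to compare against.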
diff --git a/configure b/configure
index c33d646..2922365 100755
--- a/configure
+++ b/configure
@@ -42,6 +42,7 @@ topdir=$(pwd)
installedlink=.local
lbdir=reproduce/build
cdir=reproduce/config
+optionaldir="/optional/path"
pdir=$cdir/pipeline
pconf=$pdir/LOCAL.mk
@@ -100,7 +101,7 @@ function create_file_with_notice() {
# Since the build directory will go into a symbolic link, we want it to be
# an absolute address. With this function we can make sure of that.
function absolute_dir() {
- echo "$(cd "$(dirname "$inbdir")" && pwd )/$(basename "$inbdir")"
+ echo "$(cd "$(dirname "$1")" && pwd )/$(basename "$1")"
}
@@ -179,7 +180,8 @@ fi
# the web address.
if [ $rewritepconfig = yes ]; then
if type wget > /dev/null 2>/dev/null; then
- downloader="wget --no-use-server-timestamps -O";
+ wgetname=$(which wget)
+ downloader="$wgetname --no-use-server-timestamps -O";
else
cat <<EOF
@@ -256,11 +258,59 @@ fi
+# Input directory
+# ---------------
+indir=$optionaldir
+wfpc2name=$(awk '!/^#/ && $1=="WFPC2IMAGE" {print $3}' $pdir/INPUTS.mk)
+wfpc2md5=$(awk '!/^#/ && $1=="WFPC2MD5" {print $3}' $pdir/INPUTS.mk)
+wfpc2size=$(awk '!/^#/ && $1=="WFPC2SIZE" {print $3}' $pdir/INPUTS.mk)
+wfpc2url=$(awk '!/^#/ && $1=="WFPC2URL" {print $3}' $pdir/INPUTS.mk)
+if [ $rewritepconfig = yes ]; then
+ cat <<EOF
+
+----------------------------------
+(OPTIONAL) Input dataset directory
+----------------------------------
+
+This pipeline needs the dataset(s) listed below. If you already have them,
+please specify the directory hosting them on this system. If you don't,
+they will be downloaded automatically. Each file is shown with its total
+volume and its 128-bit MD5 checksum in parentheses.
+
+ $wfpc2name ($wfpc2size, $wfpc2md5):
+ A 100x100 Hubble Space Telescope WFPC II image used in the FITS
+ standard webpage as a demonstration of this file format.
+ URL: $wfpc2url/$wfpc2name
+
+ $uitname ($uitsize, $uitmd5):
+ A 512x512 Astro1 Ultraviolet Imaging Telescope image used in the FITS
+ standard webpage as a demonstration of this file format.
+ URL: $uiturl/$uitname
+
+NOTE: This directory, or the datasets above, are optional. If it doesn't
+exist, the files will be downloaded in the build directory and used.
+
+TIP: If you have these files in multiple directories on your system and
+don't want to download them or make duplicates, you can create symbolic
+links to them and put those symbolic links in the given top-level
+directory.
+
+EOF
+ read -p"(OPTIONAL) Input datasets directory ($indir): " inindir
+ if [ x$inindir != x ]; then
+ indir=$inindir
+ echo " -- Using '$indir'"
+ fi
+fi
+
+
+
+
+
# Dependency tarball directory
# ----------------------------
if [ $rewritepconfig = yes ]; then
- junkddir="/optional/path"
- ddir=$junkddir
+ ddir=$optionaldir
cat <<EOF
---------------------------------------
@@ -282,7 +332,6 @@ EOF
ddir=$tmpddir
echo " -- Using '$ddir'"
fi
- echo
fi
@@ -292,7 +341,7 @@ fi
# Memory mapping minimum size
# ---------------------------
if [ $rewritegconfig = yes ]; then
- defaultminmapsize=1000000000
+ defaultminmapsize=10000000000
minmapsize=$defaultminmapsize
cat <<EOF
@@ -329,18 +378,57 @@ fi
if [ $rewritepconfig = yes ]; then
create_file_with_notice $pconf
sed -e's|@bdir[@]|'"$bdir"'|' \
+ -e's|@indir[@]|'"$indir"'|' \
-e's|@ddir[@]|'"$ddir"'|' \
-e's|@downloader[@]|'"$downloader"'|' \
$pconf.in >> $pconf
else
# Read the values from existing configuration file.
- inbdir=$(awk '$1=="BDIR" {print $NF}' $pconf)
- ddir=$(awk '$1=="DEPENDENCIES-DIR" {print $NF}' $pconf)
- downloader=$(awk '$1=="DOWNLOADER" {print $NF}' $pconf)
+ inbdir=$(awk '$1=="BDIR" {print $3}' $pconf)
+ downloader=$(awk '$1=="DOWNLOADER" {print $3}' $pconf)
+
+ # Make sure all necessary variables have a value
+ err=0
+ verr=0
+ novalue=""
+ if [ x"$inbdir" = x ]; then novalue="BDIR, "; fi
+ if [ x"$downloader" = x ]; then novalue="$novalue"DOWNLOADER; fi
+ if [ x"$novalue" != x ]; then verr=1; err=1; fi
# Make sure `bdir' is an absolute path and it exists.
+ berr=0
+ ierr=0
bdir=$(absolute_dir $inbdir)
- if ! [ -d $bdir ]; then mkdir $bdir; fi
+
+ if ! [ -d $bdir ]; then if ! mkdir $bdir; then berr=1; err=1; fi; fi
+ if [ $err = 1 ]; then
+ cat <<EOF
+
+#################################################################
+######## ERROR reading existing configuration file ############
+#################################################################
+EOF
+ if [ $verr = 1 ]; then
+ cat <<EOF
+
+These variables have no value: $novalue.
+EOF
+ fi
+ if [ $berr = 1 ]; then
+ cat <<EOF
+
+Couldn't create the build directory '$bdir' (value to 'BDIR') in
+'$pconf'.
+EOF
+ fi
+
+ cat <<EOF
+
+Please run the configure script again (accepting to re-write existing
+configuration file) so all the values can be filled and checked.
+#################################################################
+EOF
+ fi
fi
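The `configure` hunks above read values out of Make-style `NAME = value` files with `awk`, matching the first field and printing the third. The sketch below isolates that pattern on a throw-away file. The function name `read_var` is hypothetical and is not defined in `configure`.

```shell
#!/bin/sh
# Hypothetical helper: read the value of a `NAME = value' line from a
# Make-style configuration file, skipping comment lines, in the same way
# as the awk calls in the hunks above.
#   $1: variable name, $2: configuration file.
read_var() {
    awk -v name="$1" '!/^#/ && $1==name {print $3}' "$2"
}

# Throw-away configuration file with two values copied from INPUTS.mk.
conf=$(mktemp)
cat > "$conf" <<'EOF'
# Input files necessary for this pipeline.
WFPC2IMAGE = WFPC2ASSNu5780205bx.fits
WFPC2SIZE = 62kb
EOF

read_var WFPC2IMAGE "$conf"
read_var WFPC2SIZE "$conf"
rm -f "$conf"
```

Because only the third whitespace-separated field is printed, a value that contains spaces (for example a downloader command followed by its options) is cut down to its first word. That is enough to test whether a variable has a value, but not to recover the full value.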
diff --git a/paper.tex b/paper.tex
index 53176cd..32a3465 100644
--- a/paper.tex
+++ b/paper.tex
@@ -53,7 +53,9 @@
\textsl{Keywords}: Add some keywords for your research here.
- \textsl{Reproducible paper}: Reproduction pipeline \pipelineversion{}
+ \textsl{Reproducible paper}: All quantitative results (numbers and plots)
+ in this paper are exactly reproducible with reproduction pipeline
+ \pipelineversion{}
(\url{https://gitlab.com/makhlaghi/reproducible-paper}).}
%% To add the first page's headers.
@@ -69,8 +71,8 @@ Congratulations on running the reproduction pipeline! You can now follow
the checklist in the \texttt{README.md} file to customize this pipeline to
your exciting research project.
-Just don't forget to \emph{never} use any numbers or fixed strings (for
-example database urls like \url{\websurvey}) directly within your \LaTeX{}
+Just don't forget to \emph{never} use numbers or fixed strings (for example
+database urls like \url{\wfpctwourl}) directly within your \LaTeX{}
source. Read them directly from your configuration files or outputs of the
programs as part of the reproduction pipeline and import them into \LaTeX{}
as macros through the \texttt{tex/pipeline.tex} file. See the several
@@ -83,14 +85,12 @@ or
in this way, will let you focus clearly on your science and not have to
worry about fixing this or that number/name in the text.
-Just as a demonstration of creating plots within \LaTeX{} (using the
-{\small PGFP}lots package), in Figure \ref{deleteme} we show a simple
-plot, where the Y axis is the square of the X axis. The minimum value
-in this distribution is $\deletememin$, and $\deletememax$ is the
-maximum. Take a look into the \LaTeX{} source and you'll see these
-numbers are actually macros that were calculated from the same dataset
-(they will change if the dataset, or function that produced it,
-changes).
+Figure \ref{deleteme} shows a simple plot as a demonstration of creating
+plots within \LaTeX{} (using the {\small PGFP}lots package). The minimum
+value in this distribution is $\deletememin$, and $\deletememax$ is the
+maximum. Take a look into the \LaTeX{} source and you'll see these numbers
+are actually macros that were calculated from the same dataset (they will
+change if the dataset, or function that produced it, changes).
The individual {\small PDF} file of Figure \ref{deleteme} is available
under the \texttt{tex/build/tikz/} directory of your build directory. You
@@ -100,15 +100,6 @@ progress or after publishing the work). If you want to directly use the
KZ} decide if it should be remade or not, you can also comment the
\texttt{makepdf} macro at the top of this \LaTeX{} source file.
-{\small PGFP}lots is a great tool to build the plots within \LaTeX{} and
-removes the necessity to add further dependencies (to create the plots) to
-your reproduction pipeline. High-level language libraries like Matplotlib
-do exist to also generate plots. However, bare in mind that they require
-many dependencies (Python, Numpy and etc). Installing these dependencies
-from source (after several years when the binaries are no longer available
-in common repositories), is not easy and will harm the reproducibility of
-your paper.
-
\begin{figure}[t]
\includetikz{delete-me}
@@ -116,10 +107,39 @@ your paper.
demonstration.}
\end{figure}
+Figure \ref{deleteme-wfpc2} is another demonstration of showing images
+(datasets) using PGFPlots. It shows a small crop of an image from the
+Wide-Field Planetary Camera 2, on board the Hubble Space Telescope from
+1993 to 2009. This cropped image is one of the sample FITS files from the
+FITS file standard
+webpage\footnote{\url{https://fits.gsfc.nasa.gov/fits_samples.html}}. Just
+as another basic reporting of measurements on this dataset within the paper
+without using numbers in the \LaTeX{} source, the mean is
+$\deletemewfpctwomean$ and the median is $\deletemewfpctwomedian$. The
+skewness in the histogram of Figure \ref{deleteme-wfpc2}(b) explains this
+difference between the mean and median. Also, the value of quantile
+$\deletemewfpcquantile$ (set in the pipeline configuration file
+\texttt{delete-me-wfpc2-quant.mk}) is $\deletemewfpctwoquantile$. The
+dataset was prepared for demonstration here with Gnuastro's
+\textsf{Convert\-Type} program and the histogram and basic statistics were
+generated with Gnuastro's \textsf{Statistics} program.
+
+{\small PGFP}lots\footnote{\url{https://ctan.org/pkg/pgfplots}} is a great
+tool to build the plots within \LaTeX{} and removes the necessity to add
+further dependencies (to create the plots) to your reproduction
+pipeline. There are high-level language libraries like Matplotlib which
+also generate plots. However, the problem is that they require many
+dependencies (Python, NumPy, etc.). Installing these dependencies from
+source is not easy and will harm the reproducibility of your paper. Note
+that after several years, the binary files of these high-level libraries,
+that you easily install today, will no longer be available in common
+repositories. Therefore building the libraries from source is the only
+option to reproduce your results.
+
Furthermore, since {\small PGFP}lots is built by \LaTeX{} it respects all
-the properties of your text (for example line width and fonts and etc), so
-the final plot blends in your paper much more nicely. It also has a
-wonderful
+the properties of your text (for example line width, fonts,
+etc.). Therefore the final plot blends in your paper much more nicely. It
+also has a wonderful
manual\footnote{\url{http://mirrors.ctan.org/graphics/pgf/contrib/pgfplots/doc/pgfplots.pdf}}.
This pipeline also defines two \LaTeX{} macros that allow you to mark text
@@ -135,7 +155,15 @@ existing coauthors (who are just interested in the new parts or notes) and
new co-authors (who don't want to be distracted by these issues in their
first time reading).
+\begin{figure}[t]
+ \includetikz{delete-me-wfpc2}
+ \captionof{figure}{\label{deleteme-wfpc2} (a) An example image of the
+ Wide-Field Planetary Camera 2, on board the Hubble Space Telescope from
+ 1993 to 2009. This is one of the sample images from the FITS standard
+ webpage, kept as examples for this file format. (b) Histogram of pixel
+ values in (a).}
+\end{figure}
@@ -177,12 +205,12 @@ SUNDIAL ITN, and from the Spanish Ministry of Economy and Competitiveness
The following free software tools were also critical component of this
research (in alphabetical order): Bzip2 \bziptwoversion, CFITSIO
-\cfitsioversion, CMake \cmakeversion, cURL \curlversion, Git \gitversion,
-GNU Bash \bashversion, GNU Coreutils \coreutilsversion, GNU AWK
-\gawkversion, GNU Grep \grepversion, GNU Libtool \libtoolversion, GNU Make
-\makeversion, GNU Sed \sedversion, GNU Scientific Library (GSL)
-\gslversion, GNU Tar \tarversion, GNU Which \whichversion, Lzip
-\lzipversion, GPL Ghostscript \ghostscriptversion, Libgit2
+\cfitsioversion, CMake \cmakeversion, cURL \curlversion, Discoteq flock
+\flockversion, Git \gitversion, GNU Bash \bashversion, GNU Coreutils
+\coreutilsversion, GNU AWK \gawkversion, GNU Grep \grepversion, GNU Libtool
+\libtoolversion, GNU Make \makeversion, GNU Sed \sedversion, GNU Scientific
+Library (GSL) \gslversion, GNU Tar \tarversion, GNU Which \whichversion,
+Lzip \lzipversion, GPL Ghostscript \ghostscriptversion, Libgit2
\libgitwoversion, Libtiff \libtiffversion, WCSLIB \wcslibversion, XZ Utils
\xzversion, and ZLib \zlibversion. The final paper was produced with \TeX{}
Live \texliveversion, using the following packages: \TeX{} \textexversion,
diff --git a/reproduce/config/gnuastro/astconvertt.conf b/reproduce/config/gnuastro/astconvertt.conf
new file mode 100644
index 0000000..fc3ba04
--- /dev/null
+++ b/reproduce/config/gnuastro/astconvertt.conf
@@ -0,0 +1,31 @@
+# Default parameters (System) for ConvertType.
+# ConvertType is part of GNU Astronomy Utilities.
+#
+# Use the long option name of each parameter followed by a value. The name
+# and value should be separated by at least one white-space character (for
+# example ` '[space], or tab). Lines starting with `#' are ignored.
+#
+# For more information, please run these commands:
+#
+# $ astconvertt --help # Full list of options, short doc.
+# $ astconvertt -P # Print all options and used values.
+# $ info astconvertt # All options and input/output.
+# $ info gnuastro "Configuration files" # How to use configuration files.
+#
+# Copying and distribution of this file, with or without modification, are
+# permitted in any medium without royalty provided the copyright notice and
+# this notice are preserved. This file is offered as-is, without any
+# warranty.
+
+# Input:
+
+# Output:
+ quality 100
+ widthincm 10.0
+ borderwidth 1
+ output jpg
+
+# Flux:
+ invert 0
+
+# Common options
diff --git a/reproduce/config/gnuastro/aststatistics.conf b/reproduce/config/gnuastro/aststatistics.conf
new file mode 100644
index 0000000..0bf3b83
--- /dev/null
+++ b/reproduce/config/gnuastro/aststatistics.conf
@@ -0,0 +1,34 @@
+# Default parameters (System) for Statistics.
+# Statistics is part of GNU Astronomy Utilities.
+#
+# Use the long option name of each parameter followed by a value. The name
+# and value should be separated by at least one white-space character (for
+# example ` '[space], or tab). Lines starting with `#' are ignored.
+#
+# For more information, please run these commands:
+#
+# $ aststatistics --help # Full list of options, short doc.
+# $ aststatistics -P # Print all options and used values.
+# $ info aststatistics # All options and input/output.
+# $ info gnuastro "Configuration files" # How to use configuration files.
+#
+# Copying and distribution of this file, with or without modification, are
+# permitted in any medium without royalty provided the copyright notice and
+# this notice are preserved. This file is offered as-is, without any
+# warranty.
+
+# Input image:
+
+# Sky and its STD settings
+ khdu 1
+ meanmedqdiff 0.005
+ outliersigma 10
+ outliersclip 3,0.2
+ smoothwidth 3
+ sclipparams 3,0.1
+
+# Histogram and CFP settings
+ numasciibins 70
+ asciiheight 10
+ numbins 100
+ mirrordist 1.5
diff --git a/reproduce/config/pipeline/INPUTS.mk b/reproduce/config/pipeline/INPUTS.mk
new file mode 100644
index 0000000..3522ecc
--- /dev/null
+++ b/reproduce/config/pipeline/INPUTS.mk
@@ -0,0 +1,9 @@
+# Input files necessary for this pipeline.
+#
+# This file is read by the configure script and running Makefiles.
+
+
+WFPC2IMAGE = WFPC2ASSNu5780205bx.fits
+WFPC2MD5 = a4791e42cd1045892f9c41f11b50bad8
+WFPC2SIZE = 62kb
+WFPC2URL = https://fits.gsfc.nasa.gov/samples
diff --git a/reproduce/config/pipeline/LOCAL.mk.in b/reproduce/config/pipeline/LOCAL.mk.in
index d6bf2c0..89e3e23 100644
--- a/reproduce/config/pipeline/LOCAL.mk.in
+++ b/reproduce/config/pipeline/LOCAL.mk.in
@@ -1,4 +1,8 @@
# Local pipeline configuration.
+#
+# This is just a template for the `./configure' script to fill in. Please
+# don't make any change to this file.
BDIR = @bdir@
+INDIR = @indir@
DEPENDENCIES-DIR = @ddir@
DOWNLOADER = @downloader@
diff --git a/reproduce/config/pipeline/delete-me-wfpc2-quant.mk b/reproduce/config/pipeline/delete-me-wfpc2-quant.mk
new file mode 100644
index 0000000..2ff7456
--- /dev/null
+++ b/reproduce/config/pipeline/delete-me-wfpc2-quant.mk
@@ -0,0 +1,2 @@
+# Quantile to measure in the WFPC2 image's pixel value distribution
+delete-me-wfpc2-quantile = 0.65
diff --git a/reproduce/config/pipeline/dependency-versions.mk b/reproduce/config/pipeline/dependency-versions.mk
index f85cdbf..dc45b81 100644
--- a/reproduce/config/pipeline/dependency-versions.mk
+++ b/reproduce/config/pipeline/dependency-versions.mk
@@ -5,6 +5,7 @@ bash-version = 4.4.18
bzip2-version = 1.0.6
cmake-version = 3.12.4
coreutils-version = 8.30
+flock-version = 0.2.3
gawk-version = 4.2.1
ghostscript-version = 9.26
git-version = 2.19.1
diff --git a/reproduce/config/pipeline/web.mk b/reproduce/config/pipeline/web.mk
deleted file mode 100644
index 5af11a7..0000000
--- a/reproduce/config/pipeline/web.mk
+++ /dev/null
@@ -1,6 +0,0 @@
-# Web server(s) hosting the input data for this pipeline.
-#
-# This is the web page containing the files that must be located in the
-# `SURVEY' directory of `reproduce/config/pipeline/LOCAL.mk' on the local
-# system.
-web-survey = https://some.webpage.com/example/server
diff --git a/reproduce/src/make/delete-me.mk b/reproduce/src/make/delete-me.mk
index 67f0440..9227fde 100644
--- a/reproduce/src/make/delete-me.mk
+++ b/reproduce/src/make/delete-me.mk
@@ -25,8 +25,7 @@
# Dummy dataset
# -------------
#
-# We will use AWK's random number generator to generate a random dataset to
-# be imported by PGFPlots for a plot in the paper.
+# We will use AWK to generate a table showing X and X^2 and draw its plot.
dmdir = $(texdir)/delete-me
dm = $(dmdir)/data.txt
$(dmdir): | $(texdir); mkdir $@
@@ -43,6 +42,60 @@ $(dm): $(pconfdir)/delete-me-num.mk | $(dmdir)
+# WFPC2 image PDF
+# -----------------
+#
+# For an example image, we'll make a PDF copy of the WFPC II image to
+# display in the paper.
+wfpc2dir = $(texdir)/delete-me-wfpc2
+$(wfpc2dir): | $(texdir); mkdir $@
+wfpc2 = $(wfpc2dir)/wfpc2.pdf
+$(wfpc2): $(indir)/$(WFPC2IMAGE) | $(wfpc2dir)
+
+ # When the plotted values are re-made, it is necessary to also
+ # delete the TiKZ externalized files so the plot is also re-made.
+ rm -f $(tikzdir)/delete-me-wfpc2.pdf
+
+ # Convert the dataset to a PDF.
+ astconvertt --fluxhigh=4 $< -h0 -o$@
+
+
+
+
+
+# Histogram of WFPC2 image
+# ------------------------
+#
+# For an example plot, we'll show the pixel value histogram also.
+wfpc2hist = $(wfpc2dir)/wfpc2-hist.txt
+$(wfpc2hist): $(indir)/$(WFPC2IMAGE) | $(wfpc2dir)
+
+ # When the plotted values are re-made, it is necessary to also
+ # delete the TiKZ externalized files so the plot is also re-made.
+ rm -f $(tikzdir)/delete-me-wfpc2.pdf
+
+ # Generate the pixel value distribution
+ aststatistics --lessthan=5 $< -h0 --histogram -o$@
+
+
+
+
+
+# Basic statistics
+# ----------------
+#
+# This is just a demonstration of how to get analysis configuration
+# parameters from variables defined in `reproduce/config/pipeline'.
+wfpc2stats = $(wfpc2dir)/wfpc2-stats.txt
+$(wfpc2stats): $(indir)/$(WFPC2IMAGE) $(pconfdir)/delete-me-wfpc2-quant.mk \
+ | $(wfpc2dir)
+ aststatistics $< -h0 --mean --median \
+ --quantile=$(delete-me-wfpc2-quantile) > $@
+
+
+
+
+
# TeX macros
# ----------
#
@@ -50,7 +103,7 @@ $(dm): $(pconfdir)/delete-me-num.mk | $(dmdir)
#
# NOTE: In LaTeX you cannot use any non-alphabetic character in a variable
# name.
-$(mtexdir)/delete-me.tex: $(dm)
+$(mtexdir)/delete-me.tex: $(dm) $(wfpc2) $(wfpc2hist) $(wfpc2stats)
# Write the number of random values used.
echo "\newcommand{\deletemenum}{$(delete-me-num)}" > $@
@@ -67,6 +120,16 @@ $(mtexdir)/delete-me.tex: $(dm)
{if($$2>max) max=$$2; if($$2<min) min=$$2;}
END{print min, max}' $(dm));
v=$$(echo "$$mm" | awk '{printf "%.3f", $$1}');
- echo "\newcommand{\deletememin}{$$v}" >> $@;
+ echo "\newcommand{\deletememin}{$$v}" >> $@
v=$$(echo "$$mm" | awk '{printf "%.3f", $$2}');
echo "\newcommand{\deletememax}{$$v}" >> $@
+
+ # Write the statistics of the WFPC2 image as a macro.
+ q=$(delete-me-wfpc2-quantile)
+ echo "\newcommand{\deletemewfpcquantile}{$$q}" >> $@
+ mean=$$(awk '{printf("%.2f", $$1)}' $(wfpc2stats))
+ echo "\newcommand{\deletemewfpctwomean}{$$mean}" >> $@
+ median=$$(awk '{printf("%.2f", $$2)}' $(wfpc2stats))
+ echo "\newcommand{\deletemewfpctwomedian}{$$median}" >> $@
+ quantile=$$(awk '{printf("%.2f", $$3)}' $(wfpc2stats))
+ echo "\newcommand{\deletemewfpctwoquantile}{$$quantile}" >> $@
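The recipe above turns the values written by `aststatistics` into LaTeX macros, one `awk` call per column. As a rough sketch of the same idea outside of Make, the hypothetical function `stats_to_macros` below does all three columns in a single `awk` call. It assumes the statistics file holds the mean, median and quantile on one line, separated by white space, and the numbers used are made up.

```shell
#!/bin/sh
# Hypothetical helper: turn one line of "mean median quantile" values
# into LaTeX macro definitions, formatted to two decimals like the
# recipe above. LaTeX macro names may only contain letters.
#   $1: file holding the three values on one line.
stats_to_macros() {
    awk '{
        printf("\\newcommand{\\deletemewfpctwomean}{%.2f}\n", $1)
        printf("\\newcommand{\\deletemewfpctwomedian}{%.2f}\n", $2)
        printf("\\newcommand{\\deletemewfpctwoquantile}{%.2f}\n", $3)
    }' "$1"
}

# Made-up numbers, only for illustration.
stats=$(mktemp)
echo "1.23456 0.98765 1.5" > "$stats"
stats_to_macros "$stats"
rm -f "$stats"
```

Doing the formatting inside `awk` keeps the backslashes away from the shell's `echo`, whose handling of sequences like `\n` differs between shells.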
diff --git a/reproduce/src/make/dependencies.mk b/reproduce/src/make/dependencies.mk
index 8ed359b..a784883 100644
--- a/reproduce/src/make/dependencies.mk
+++ b/reproduce/src/make/dependencies.mk
@@ -43,7 +43,7 @@ ildir = $(BDIR)/dependencies/installed/lib
ilidir = $(BDIR)/dependencies/installed/lib/built
# Define the top-level programs to build (installed in `.local/bin').
-top-level-programs = gawk gs grep sed git astnoisechisel texlive-ready
+top-level-programs = gawk gs grep sed git flock astnoisechisel texlive-ready
all: $(foreach p, $(top-level-programs), $(ibdir)/$(p))
# Other basic environment settings: We are only including the host
@@ -75,6 +75,7 @@ LD_LIBRARY_PATH := $(ildir)
tarballs = $(foreach t, cfitsio-$(cfitsio-version).tar.gz \
cmake-$(cmake-version).tar.gz \
curl-$(curl-version).tar.gz \
+ flock-$(flock-version).tar.xz \
gawk-$(gawk-version).tar.lz \
ghostscript-$(ghostscript-version).tar.gz \
git-$(git-version).tar.xz \
@@ -111,6 +112,7 @@ $(tarballs): $(tdir)/%:
w=https://heasarc.gsfc.nasa.gov/FTP/software/fitsio/c/cfitsio$$v.tar.gz
elif [ $$n = cmake ]; then w=https://cmake.org/files/v3.12
elif [ $$n = curl ]; then w=https://curl.haxx.se/download
+ elif [ $$n = flock ]; then w=https://github.com/discoteq/flock/releases/download/v$(flock-version)
elif [ $$n = gawk ]; then w=http://ftp.gnu.org/gnu/gawk
elif [ $$n = ghostscript ]; then w=https://github.com/ArtifexSoftware/ghostpdl-downloads/releases/download/gs926
elif [ $$n = git ]; then w=https://mirrors.edge.kernel.org/pub/software/scm/git
@@ -244,6 +246,9 @@ $(ibdir)/libtool: $(tdir)/libtool-$(libtool-version).tar.xz
$(ibdir)/gs: $(tdir)/ghostscript-$(ghostscript-version).tar.gz
$(call gbuild, $<, ghostscript-$(ghostscript-version))
+$(ibdir)/flock: $(tdir)/flock-$(flock-version).tar.xz
+ $(call gbuild, $<, flock-$(flock-version), static)
+
$(ibdir)/git: $(tdir)/git-$(git-version).tar.xz \
$(ilidir)/zlib
$(call gbuild, $<, git-$(git-version), static)
diff --git a/reproduce/src/make/download.mk b/reproduce/src/make/download.mk
index 9617a45..180d2cf 100644
--- a/reproduce/src/make/download.mk
+++ b/reproduce/src/make/download.mk
@@ -25,20 +25,51 @@
-# Download SURVEY data
+# Download input data
# --------------------
#
-# Data from a survey (for example an imaging survey) usually have a special
-# file-name format which should be set here in the `foreach' loop. Note
-# that the `foreach' function needs the backslash (`\') at the end of the
-# line when it is broken into multiple lines.
-all-survey = $(foreach f, $(filters-survey), \
- $(SURVEY)/a-special-format-$(f).fits \
- $(SURVEY)/a-possibly-additional-$(f)-format.fits )
-$(SURVEY):; mkdir $@
-$(all-survey): $(SURVEY)/%: | $(SURVEY) $(lockdir)
- flock $(lockdir)/download -c "$(DOWNLOADER) $@ $(web-survey)/$*"
+# The input dataset properties are defined in `$(pconfdir)/INPUTS.mk'. For
+# this template pipeline we only have one dataset to enable easy
+# processing, so all the extra checks in this rule may seem
+# redundant.
+#
+# However, in a real project, you will need more than one dataset. In that
+# case, just add them to the target list and add an `elif' statement to
+# define it in the recipe.
+#
+# Download lock file: Most systems have a single connection to the
+# internet, therefore downloading is inherently done in series. As a
+# result, when more than one dataset is necessary for download, if they are
+# done in parallel, the speed will be slower than downloading them in
+# series. We thus use the `flock' program to tie/lock the downloading
+# process with a file and make sure that only one downloading event is in
+# progress at every moment.
+$(indir):; mkdir $@
+inputdatasets = $(foreach i, $(WFPC2IMAGE), $(indir)/$(i))
+$(inputdatasets): $(indir)/%: | $(indir) $(lockdir)
+
+ # Set the necessary parameters for this input file.
+ if [ $* = $(WFPC2IMAGE) ]; then url=$(WFPC2URL); mdf=$(WFPC2MD5);
+ else
+ echo; echo; echo "Not recognized input dataset: '$*'."
+ echo; echo; exit 1
+ fi
+
+ # Download (or make the link to) the input dataset.
+ if [ -f $(INDIR)/$* ]; then
+ ln -s $(INDIR)/$* $@
+ else
+ flock $(lockdir)/download $(DOWNLOADER) $@ $$url/$*
+ fi
+ # Check the md5 sum to see if this is the proper dataset.
+ sum=$$(md5sum $@ | awk '{print $$1}')
+ if [ $$sum != $$mdf ]; then
+ wrongname=$(dir $@)/wrong-$(notdir $@)
+ mv $@ $$wrongname
+ echo; echo; echo "Wrong MD5 checksum for '$*' in $$wrongname"
+ echo; echo; exit 1
+ fi
@@ -49,5 +80,5 @@ $(all-survey): $(SURVEY)/%: | $(SURVEY) $(lockdir)
#
# It is very important to mention the address where the data were
# downloaded in the final report.
-$(mtexdir)/download.tex: $(pconfdir)/web.mk | $(mtexdir)
- @echo "\\newcommand{\\websurvey}{$(web-survey)}" > $@
+$(mtexdir)/download.tex: $(pconfdir)/INPUTS.mk | $(mtexdir)
+ echo "\\newcommand{\\wfpctwourl}{$(WFPC2URL)}" > $@
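The download recipe above compares the MD5 checksum of each input with the value from `INPUTS.mk` and moves a mismatching file aside with a `wrong-` prefix. The sketch below shows only that verification step as a stand-alone shell function, assuming GNU `md5sum` is available. The name `verify_md5` is hypothetical and does not exist in the pipeline.

```shell
#!/bin/sh
# Hypothetical helper: compare a file's MD5 checksum with the expected
# value. On a mismatch, rename the file with a `wrong-' prefix (as the
# recipe above does) and return non-zero.
#   $1: file to check, $2: expected checksum.
verify_md5() {
    file=$1
    expected=$2
    sum=$(md5sum "$file" | awk '{print $1}')
    if [ "$sum" != "$expected" ]; then
        wrongname="$(dirname "$file")/wrong-$(basename "$file")"
        mv "$file" "$wrongname"
        echo "Wrong MD5 checksum for '$file' in $wrongname" >&2
        return 1
    fi
}

# Usage: an empty file has a well-known MD5 checksum.
dir=$(mktemp -d)
: > "$dir/empty.fits"
verify_md5 "$dir/empty.fits" d41d8cd98f00b204e9800998ecf8427e && echo "checksum ok"
rm -rf "$dir"
```

Renaming the file instead of deleting it keeps the wrong download available for inspection, while making sure a later run does not mistake it for a valid input.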
diff --git a/reproduce/src/make/initialize.mk b/reproduce/src/make/initialize.mk
index 694aca0..41a5e05 100644
--- a/reproduce/src/make/initialize.mk
+++ b/reproduce/src/make/initialize.mk
@@ -34,6 +34,7 @@
# parallel. Also, some programs may not be thread-safe, therefore it will
# be necessary to put a lock on them. This pipeline uses the `flock'
# program to achieve this.
+indir = $(BDIR)/inputs
texdir = $(BDIR)/tex
srcdir = reproduce/src
lockdir = $(BDIR)/locks
@@ -224,6 +225,14 @@ $(mtexdir)/initialize.tex: | $(mtexdir)
fi; \
echo "\newcommand{\\bziptwoversion}{$(bzip2-version)}" >> $@
+ # Unfortunately we couldn't find a way to retrieve the version of
+ # the discoteq `flock' that we are using here. So we'll just report
+ # the version we downloaded and installed.
+ echo "\newcommand{\\flockversion}{$(flock-version)}" >> $@
+
+
+
+
# Versions of libraries.
$(call lvcheck, fitsio.h, $(cfitsio-version), CFITSIO, cfitsioversion)
diff --git a/tex/delete-me-wfpc2.tex b/tex/delete-me-wfpc2.tex
new file mode 100644
index 0000000..95b3105
--- /dev/null
+++ b/tex/delete-me-wfpc2.tex
@@ -0,0 +1,34 @@
+\begin{tikzpicture}
+
+ %% The displayed WFPC2 image.
+ \node[anchor=south west] (img) at (0,0)
+ {\includegraphics[width=0.5\linewidth]
+ {\bdir/tex/delete-me-wfpc2/wfpc2.pdf}};
+
+ %% Its label
+ \node[anchor=south west] at (0.45\linewidth,0.45\linewidth)
+ {\textcolor{white}{a}};
+
+ %% This histogram.
+ \begin{axis}[at={(0.52\linewidth,0.1\linewidth)},
+ no markers,
+ axis on top,
+ xmode=normal,
+ ymode=normal,
+ yticklabels={},
+ scale only axis,
+ xlabel=Pixel value,
+ width=0.5\linewidth,
+ height=0.412\linewidth,
+ enlarge y limits=false,
+ enlarge x limits=false,
+ ]
+ \addplot [const plot mark mid, fill=red]
+ table [x index=0, y index=1]
+ {\bdir/tex/delete-me-wfpc2/wfpc2-hist.txt}
+ \closedcycle;
+ \end{axis}
+
+ %% The histogram's label
+ \node[anchor=south west] at (0.95\linewidth,0.45\linewidth) {b};
+\end{tikzpicture}