| -rw-r--r-- | README-pipeline.md | 55 |
| -rwxr-xr-x | configure | 108 |
| -rw-r--r-- | paper.tex | 86 |
| -rw-r--r-- | reproduce/config/gnuastro/astconvertt.conf | 31 |
| -rw-r--r-- | reproduce/config/gnuastro/aststatistics.conf | 34 |
| -rw-r--r-- | reproduce/config/pipeline/INPUTS.mk | 9 |
| -rw-r--r-- | reproduce/config/pipeline/LOCAL.mk.in | 4 |
| -rw-r--r-- | reproduce/config/pipeline/delete-me-wfpc2-quant.mk | 2 |
| -rw-r--r-- | reproduce/config/pipeline/dependency-versions.mk | 1 |
| -rw-r--r-- | reproduce/config/pipeline/web.mk | 6 |
| -rw-r--r-- | reproduce/src/make/delete-me.mk | 71 |
| -rw-r--r-- | reproduce/src/make/dependencies.mk | 7 |
| -rw-r--r-- | reproduce/src/make/download.mk | 57 |
| -rw-r--r-- | reproduce/src/make/initialize.mk | 9 |
| -rw-r--r-- | tex/delete-me-wfpc2.tex | 34 |
15 files changed, 414 insertions, 100 deletions
| diff --git a/README-pipeline.md b/README-pipeline.md index ff15094..6effa30 100644 --- a/README-pipeline.md +++ b/README-pipeline.md @@ -516,6 +516,7 @@ advanced in later stages of your work.       them.     - Delete marked part(s) in `configure`. +   - Delete the `reproduce/config/gnuastro` directory.     - Delete `astnoisechisel` from the value of `top-level-programs` in `reproduce/src/make/dependencies.mk`. You can keep the rule to build `astnoisechisel`, since its not in the `top-level-programs` list, it (and all the dependencies that are only needed by Gnuastro) will be ignored.     - Delete marked parts in `reproduce/src/make/initialize.mk`.     - Delete `and Gnuastro \gnuastroversion` from `tex/preamble-style.tex`. @@ -526,51 +527,31 @@ advanced in later stages of your work.       commented thoroughly and reading over the comments should guide you on       what to add/remove and where. - - **Input dataset (can be done later)**: The user manages the top-level -     directory of the input data through the variables set in -     `reproduce/config/pipeline/LOCAL.mk.in` (the user actually edits a -     `LOCAL.mk` file that is created by `configure` from the `.mk.in` file, -     but the `.mk` file is not under version control). Datasets are usually -     large and the users might already have their copy don't need to -     download them). So you can define a variable (all in capital letters) -     in `reproduce/config/pipeline/LOCAL.mk.in`. For example if you are -     working on data from the XDF survey, use `XDF`. You can use this -     variable to identify the location of the raw inputs on the running -     system. Here, we'll assume its name is `SURVEY`. Afterwards, change -     any occurrence of `SURVEY` in the whole pipeline with the new -     name. You can find the occurrences with a simple command like the ones -     shown below. 
We follow the Make convention here that all -     `ONLY-CAPITAL` variables are those directly set by the user and all -     `small-caps` variables are set by the pipeline designer. All variables -     that also depend on this survey have a `survey` in their name. Hence, -     also correct all these occurrences to your new name in small-caps. Of -     course, ignore/delete those occurrences that are irrelevant, like -     those in this file. Note that in the raw version of this template no -     target depends on these files, so they are ignored. Afterwards, set -     the webpage and correct the filenames in -     `reproduce/src/make/download.mk` if necessary. - -     ```shell -     $ grep -r SURVEY ./ -     $ grep -r survey ./ -     ``` - - - **Other input datasets (can be done later)**: Add any other input -     datasets that may be necessary for your research to the pipeline based -     on the example above. + - **Input dataset (can be done later)**: The input datasets are managed +     through the `reproduce/config/pipeline/INPUTS.mk` file. It is best to +     gather all the information regarding all the input datasets into this +     one central file. To ensure that the proper dataset is being +     downloaded and used by the pipeline, its best to also get an MD5 +     checksum (https://en.wikipedia.org/wiki/MD5) of the file and include +     that in thsi file so you can check it in the pipeline. The preparation +     of the input datasets is done in +     `reproduce/src/make/download.mk`. Have a look there to see how these +     values are to be used. This information about the input datasets is +     also used in the initial `configure` script (to inform the users), so +     also modify that file.   - **Delete dummy parts (can be done later)**: The template pipeline -     contains some parts that are only for the initial/test run, not for -     any real analysis. The respective files to remove and parts to fix are -     discussed here. 
+     contains some parts that are only for the initial/test run, mainly as +     a demonstration of important steps. They not for any real +     analysis. You can remove these parts in the file below       - `paper.tex`: Delete the text of the abstract and the paper's main         body, *except* the "Acknowledgments" section. This reproduction         pipeline was designed by funding from many grants, so its necessary         to acknowledge them in your final research. -     - `Makefile`: Delete the two lines containing `delete-me` in the -       `foreach` loops. Just make sure the other lines that end in `\` are +     - `Makefile`: Delete the lines containing `delete-me` in the `foreach` +       loops. Just make sure the other lines that end in `\` are         immediately after each other.       - Delete all `delete-me*` files in the following directories: @@ -42,6 +42,7 @@ topdir=$(pwd)  installedlink=.local  lbdir=reproduce/build  cdir=reproduce/config +optionaldir="/optional/path"  pdir=$cdir/pipeline  pconf=$pdir/LOCAL.mk @@ -100,7 +101,7 @@ function create_file_with_notice() {  # Since the build directory will go into a symbolic link, we want it to be  # an absolute address. With this function we can make sure of that.  function absolute_dir() { -    echo "$(cd "$(dirname "$inbdir")" && pwd )/$(basename "$inbdir")" +    echo "$(cd "$(dirname "$1")" && pwd )/$(basename "$1")"  } @@ -179,7 +180,8 @@ fi  # the web address.  
if [ $rewritepconfig = yes ]; then      if type wget > /dev/null 2>/dev/null; then -        downloader="wget --no-use-server-timestamps -O"; +        wgetname=$(which wget) +        downloader="$wgetname --no-use-server-timestamps -O";      else          cat <<EOF @@ -256,11 +258,59 @@ fi +# Input directory +# --------------- +indir=$optionaldir +wfpc2name=$(awk '!/^#/ && $1=="WFPC2IMAGE" {print $3}' $pdir/INPUTS.mk) +wfpc2md5=$(awk  '!/^#/ && $1=="WFPC2MD5"   {print $3}' $pdir/INPUTS.mk) +wfpc2size=$(awk '!/^#/ && $1=="WFPC2SIZE"  {print $3}' $pdir/INPUTS.mk) +wfpc2url=$(awk  '!/^#/ && $1=="WFPC2URL"   {print $3}' $pdir/INPUTS.mk) +if [ $rewritepconfig = yes ]; then +    cat <<EOF + +---------------------------------- +(OPTIONAL) Input dataset directory +---------------------------------- + +This pipeline needs the dataset(s) listed below. If you already have them, +please specify the directory hosting them on this system. If you don't, +they will be downloaded automatically. Each file is shown with its total +volume and its 128-bit MD5 checksum in parenthesis. + +  $wfpc2name ($wfpc2size, $wfpc2md5): +    A 100x100 Hubble Space Telescope WFPC II image used in the FITS +    standard webpage as a demonstration of this file format. +    URL: $wfpc2url/$wfpc2name + +  $uitname ($uitsize, $uitmd5): +    A 512x512 Astro1 Ultraviolet Imaging Telescope image used in the FITS +    standard webpage as a demonstration of this file format. +    URL: $uiturl/$uitname + +NOTE: This directory, or the datasets above, are optional. If it doesn't +exist, the files will be downloaded in the build directory and used. + +TIP: If you have these files in multiple directories on your system and +don't want to download them or make duplicates, you can create symbolic +links to them and put those symbolic links in the given top-level +directory. 
+ +EOF +    read -p"(OPTIONAL) Input datasets directory ($indir): " inindir +    if [ x$inindir != x ]; then +        indir=$inindir +        echo " -- Using '$indir'" +    fi +fi + + + + +  # Dependency tarball directory  # ----------------------------  if [ $rewritepconfig = yes ]; then -    junkddir="/optional/path" -    ddir=$junkddir +    ddir=$optionaldir      cat <<EOF  --------------------------------------- @@ -282,7 +332,6 @@ EOF          ddir=$tmpddir          echo " -- Using '$ddir'"      fi -    echo  fi @@ -292,7 +341,7 @@ fi  # Memory mapping minimum size  # ---------------------------  if [ $rewritegconfig = yes ]; then -    defaultminmapsize=1000000000 +    defaultminmapsize=10000000000      minmapsize=$defaultminmapsize      cat <<EOF @@ -329,18 +378,57 @@ fi  if [ $rewritepconfig = yes ]; then      create_file_with_notice $pconf      sed -e's|@bdir[@]|'"$bdir"'|'              \ +        -e's|@indir[@]|'"$indir"'|'            \          -e's|@ddir[@]|'"$ddir"'|'              \          -e's|@downloader[@]|'"$downloader"'|'  \          $pconf.in >> $pconf  else      # Read the values from existing configuration file. -    inbdir=$(awk     '$1=="BDIR"             {print $NF}' $pconf) -    ddir=$(awk       '$1=="DEPENDENCIES-DIR" {print $NF}' $pconf) -    downloader=$(awk '$1=="DOWNLOADER"       {print $NF}' $pconf) +    inbdir=$(awk     '$1=="BDIR"             {print $3}' $pconf) +    downloader=$(awk '$1=="DOWNLOADER"       {print $3}' $pconf) + +    # Make sure all necessary variables have a value +    err=0 +    verr=0 +    novalue="" +    if [ x"$inbdir"     = x ]; then novalue="BDIR, ";              fi +    if [ x"$downloader" = x ]; then novalue="$novalue"DOWNLOADER;  fi +    if [ x"$novalue"   != x ]; then verr=1; err=1;                 fi      # Make sure `bdir' is an absolute path and it exists. +    berr=0 +    ierr=0      bdir=$(absolute_dir $inbdir) -    if ! [ -d $bdir ]; then mkdir $bdir; fi + +    if ! [ -d $bdir  ]; then if ! 
mkdir $bdir; then berr=1; err=1; fi; fi +    if [ $err = 1 ]; then +        cat <<EOF + +################################################################# +########  ERORR reading existing configuration file  ############ +################################################################# +EOF +        if [ $verr = 1 ]; then +            cat <<EOF + +These variables have no value: $novalue. +EOF +        fi +        if [ $berr = 1 ]; then +           cat <<EOF + +Couldn't create the build directory '$bdir' (value to 'BDIR') in +'$pconf'. +EOF +        fi + +        cat <<EOF + +Please run the configure script again (accepting to re-write existing +configuration file) so all the values can be filled and checked. +################################################################# +EOF +    fi  fi @@ -53,7 +53,9 @@    \textsl{Keywords}: Add some keywords for your research here. -  \textsl{Reproducible paper}: Reproduction pipeline \pipelineversion{} +  \textsl{Reproducible paper}: All quantitave results (numbers and plots) +  in this paper are exactly reproducible with reproduction pipeline +  \pipelineversion{}    (\url{https://gitlab.com/makhlaghi/reproducible-paper}).}  %% To add the first page's headers. @@ -69,8 +71,8 @@ Congratulations on running the reproduction pipeline! You can now follow  the checklist in the \texttt{README.md} file to customize this pipeline to  your exciting research project. -Just don't forget to \emph{never} use any numbers or fixed strings (for -example database urls like \url{\websurvey}) directly within your \LaTeX{} +Just don't forget to \emph{never} use numbers or fixed strings (for example +database urls like \url{\wfpctwourl}) directly within your \LaTeX{}  source. Read them directly from your configuration files or outputs of the  programs as part of the reproduction pipeline and import them into \LaTeX{}  as macros through the \texttt{tex/pipeline.tex} file. 
See the several @@ -83,14 +85,12 @@ or  in this way, will let you focus clearly on your science and not have to  worry about fixing this or that number/name in the text. -Just as a demonstration of creating plots within \LaTeX{} (using the -{\small PGFP}lots package), in Figure \ref{deleteme} we show a simple -plot, where the Y axis is the square of the X axis. The minimum value -in this distribution is $\deletememin$, and $\deletememax$ is the -maximum. Take a look into the \LaTeX{} source and you'll see these -numbers are actually macros that were calculated from the same dataset -(they will change if the dataset, or function that produced it, -changes). +Figure \ref{deleteme} shows a simple plot as a demonstration of creating +plots within \LaTeX{} (using the {\small PGFP}lots package). The minimum +value in this distribution is $\deletememin$, and $\deletememax$ is the +maximum. Take a look into the \LaTeX{} source and you'll see these numbers +are actually macros that were calculated from the same dataset (they will +change if the dataset, or function that produced it, changes).  The individual {\small PDF} file of Figure \ref{deleteme} is available  under the \texttt{tex/build/tikz/} directory of your build directory. You @@ -100,15 +100,6 @@ progress or after publishing the work). If you want to directly use the    KZ} decide if it should be remade or not, you can also comment the  \texttt{makepdf} macro at the top of this \LaTeX{} source file. -{\small PGFP}lots is a great tool to build the plots within \LaTeX{} and -removes the necessity to add further dependencies (to create the plots) to -your reproduction pipeline. High-level language libraries like Matplotlib -do exist to also generate plots. However, bare in mind that they require -many dependencies (Python, Numpy and etc). 
Installing these dependencies -from source (after several years when the binaries are no longer available -in common repositories), is not easy and will harm the reproducibility of -your paper. -  \begin{figure}[t]    \includetikz{delete-me} @@ -116,10 +107,39 @@ your paper.      demonstration.}  \end{figure} +Figure \ref{deleteme-wfpc2} is another demonstration of showing images +(datasets) using PGFPlots. It shows a small crop of an image from the +Wide-Field Planetary Camera 2, on board the Hubble Space Telescope from +1993 to 2009. This cropped image is one of the sample FITS files from the +FITS file standard +webpage\footnote{\url{https://fits.gsfc.nasa.gov/fits_samples.html}}. Just +as another basic reporting of measurements on this dataset within the paper +without using numbers in the \LaTeX{} source, the mean is +$\deletemewfpctwomean$ and the median is $\deletemewfpctwomedian$. The +skewness in the histogram of Figure \ref{deleteme-wfpc2}(b) explains this +difference between the mean and median. Also, the value of quantile +$\deletemewfpcquantile$ (set in the pipeline configuration file +\texttt{delete-me-wfpc2-quant.mk}) is $\deletemewfpctwoquantile$. The +dataset was prepared for demonstration here with Gnuastro's +\textsf{Convert\-Type} program and the histogram and basic statstics were +generated with Gnuastro's \textsf{Statistics} program. + +{\small PGFP}lots\footnote{\url{https://ctan.org/pkg/pgfplots}} is a great +tool to build the plots within \LaTeX{} and removes the necessity to add +further dependencies (to create the plots) to your reproduction +pipeline. There are high-level language libraries like Matplotlib which +also generate plots. However, the problem is that they require many +dependencies (Python, Numpy and etc). Installing these dependencies from +source, is not easy and will harm the reproducibility of your paper. 
Note +that after several years, the binary files of these high-level libraries, +that you easily install today, will no longer be available in common +repositories. Therefore building the libraries from source is the only +option to reproduce your results. +  Furthermore, since {\small PGFP}lots is built by \LaTeX{} it respects all -the properties of your text (for example line width and fonts and etc), so -the final plot blends in your paper much more nicely. It also has a -wonderful +the properties of your text (for example line width and fonts and +etc). Therefore the final plot blends in your paper much more nicely. It +also has a wonderful  manual\footnote{\url{http://mirrors.ctan.org/graphics/pgf/contrib/pgfplots/doc/pgfplots.pdf}}.  This pipeline also defines two \LaTeX{} macros that allow you to mark text @@ -135,7 +155,15 @@ existing coauthors (who are just interested in the new parts or notes) and  new co-authors (who don't want to be distracted by these issues in their  first time reading). +\begin{figure}[t] +  \includetikz{delete-me-wfpc2} +  \captionof{figure}{\label{deleteme-wfpc2} (a) An example image of the +    Wide-Field Planetary Camera 2, on board the Hubble Space Telescope from +    1993 to 2009. This is one of the sample images from the FITS standard +    webpage, kept as examples for this file format. 
(b) Histogram of pixel +    values in (a).} +\end{figure} @@ -177,12 +205,12 @@ SUNDIAL ITN, and from the Spanish Ministry of Economy and Competitiveness  The following free software tools were also critical component of this  research (in alphabetical order): Bzip2 \bziptwoversion, CFITSIO -\cfitsioversion, CMake \cmakeversion, cURL \curlversion, Git \gitversion, -GNU Bash \bashversion, GNU Coreutils \coreutilsversion, GNU AWK -\gawkversion, GNU Grep \grepversion, GNU Libtool \libtoolversion, GNU Make -\makeversion, GNU Sed \sedversion, GNU Scientific Library (GSL) -\gslversion, GNU Tar \tarversion, GNU Which \whichversion, Lzip -\lzipversion, GPL Ghostscript \ghostscriptversion, Libgit2 +\cfitsioversion, CMake \cmakeversion, cURL \curlversion, Discoteq flock +\flockversion, Git \gitversion, GNU Bash \bashversion, GNU Coreutils +\coreutilsversion, GNU AWK \gawkversion, GNU Grep \grepversion, GNU Libtool +\libtoolversion, GNU Make \makeversion, GNU Sed \sedversion, GNU Scientific +Library (GSL) \gslversion, GNU Tar \tarversion, GNU Which \whichversion, +Lzip \lzipversion, GPL Ghostscript \ghostscriptversion, Libgit2  \libgitwoversion, Libtiff \libtiffversion, WCSLIB \wcslibversion, XZ Utils  \xzversion, and ZLib \zlibversion. The final paper was produced with \TeX{}  Live \texliveversion, using the following packages: \TeX{} \textexversion, diff --git a/reproduce/config/gnuastro/astconvertt.conf b/reproduce/config/gnuastro/astconvertt.conf new file mode 100644 index 0000000..fc3ba04 --- /dev/null +++ b/reproduce/config/gnuastro/astconvertt.conf @@ -0,0 +1,31 @@ +# Default parameters (System) for ConvertType. +# ConvertType is part of GNU Astronomy Utitlies. +# +# Use the long option name of each parameter followed by a value. The name +# and value should be separated by atleast one white-space character (for +# example ` '[space], or tab). Lines starting with `#' are ignored. 
+# +# For more information, please run these commands: +# +#  $ astconvertt --help                  # Full list of options, short doc. +#  $ astconvertt -P                      # Print all options and used values. +#  $ info astconvertt                    # All options and input/output. +#  $ info gnuastro "Configuration files" # How to use configuration files. +# +# Copying and distribution of this file, with or without modification, are +# permitted in any medium without royalty provided the copyright notice and +# this notice are preserved.  This file is offered as-is, without any +# warranty. + +# Input: + +# Output: + quality              100 + widthincm            10.0 + borderwidth          1 + output               jpg + +# Flux: + invert               0 + +# Common options diff --git a/reproduce/config/gnuastro/aststatistics.conf b/reproduce/config/gnuastro/aststatistics.conf new file mode 100644 index 0000000..0bf3b83 --- /dev/null +++ b/reproduce/config/gnuastro/aststatistics.conf @@ -0,0 +1,34 @@ +# Default parameters (System) for Statistics. +# Statistics is part of GNU Astronomy Utitlies. +# +# Use the long option name of each parameter followed by a value. The name +# and value should be separated by atleast one white-space character (for +# example ` '[space], or tab). Lines starting with `#' are ignored. +# +# For more information, please run these commands: +# +#  $ aststatistics --help                # Full list of options, short doc. +#  $ aststatistics -P                    # Print all options and used values. +#  $ info aststatistics                  # All options and input/output. +#  $ info gnuastro "Configuration files" # How to use configuration files. +# +# Copying and distribution of this file, with or without modification, are +# permitted in any medium without royalty provided the copyright notice and +# this notice are preserved.  This file is offered as-is, without any +# warranty. 
+ +# Input image: + +# Sky and its STD settings + khdu                 1 + meanmedqdiff     0.005 + outliersigma        10 + outliersclip     3,0.2 + smoothwidth          3 + sclipparams      3,0.1 + +# Histogram and CFP settings + numasciibins        70 + asciiheight         10 + numbins            100 + mirrordist         1.5 diff --git a/reproduce/config/pipeline/INPUTS.mk b/reproduce/config/pipeline/INPUTS.mk new file mode 100644 index 0000000..3522ecc --- /dev/null +++ b/reproduce/config/pipeline/INPUTS.mk @@ -0,0 +1,9 @@ +# Input files necessary for this pipeline. +# +# This file is read by the configure script and running Makefiles. + + +WFPC2IMAGE = WFPC2ASSNu5780205bx.fits +WFPC2MD5   = a4791e42cd1045892f9c41f11b50bad8 +WFPC2SIZE  = 62kb +WFPC2URL   = https://fits.gsfc.nasa.gov/samples diff --git a/reproduce/config/pipeline/LOCAL.mk.in b/reproduce/config/pipeline/LOCAL.mk.in index d6bf2c0..89e3e23 100644 --- a/reproduce/config/pipeline/LOCAL.mk.in +++ b/reproduce/config/pipeline/LOCAL.mk.in @@ -1,4 +1,8 @@  # Local pipeline configuration. +# +# This is just a template for the `./configure' script to fill in. Please +# don't make any change to this file.  
BDIR             = @bdir@ +INDIR            = @indir@  DEPENDENCIES-DIR = @ddir@  DOWNLOADER       = @downloader@ diff --git a/reproduce/config/pipeline/delete-me-wfpc2-quant.mk b/reproduce/config/pipeline/delete-me-wfpc2-quant.mk new file mode 100644 index 0000000..2ff7456 --- /dev/null +++ b/reproduce/config/pipeline/delete-me-wfpc2-quant.mk @@ -0,0 +1,2 @@ +# Number of samples to create +delete-me-wfpc2-quantile = 0.65 diff --git a/reproduce/config/pipeline/dependency-versions.mk b/reproduce/config/pipeline/dependency-versions.mk index f85cdbf..dc45b81 100644 --- a/reproduce/config/pipeline/dependency-versions.mk +++ b/reproduce/config/pipeline/dependency-versions.mk @@ -5,6 +5,7 @@ bash-version        = 4.4.18  bzip2-version       = 1.0.6  cmake-version       = 3.12.4  coreutils-version   = 8.30 +flock-version       = 0.2.3  gawk-version        = 4.2.1  ghostscript-version = 9.26  git-version         = 2.19.1 diff --git a/reproduce/config/pipeline/web.mk b/reproduce/config/pipeline/web.mk deleted file mode 100644 index 5af11a7..0000000 --- a/reproduce/config/pipeline/web.mk +++ /dev/null @@ -1,6 +0,0 @@ -# Web server(s) hosting the input data for this pipeline. -# -# This is the web page containing the files that must be located in the -# `SURVEY' directory of `reproduce/config/pipeline/LOCAL.mk' on the local -# system. -web-survey = https://some.webpage.com/example/server diff --git a/reproduce/src/make/delete-me.mk b/reproduce/src/make/delete-me.mk index 67f0440..9227fde 100644 --- a/reproduce/src/make/delete-me.mk +++ b/reproduce/src/make/delete-me.mk @@ -25,8 +25,7 @@  # Dummy dataset  # -------------  # -# We will use AWK's random number generator to generate a random dataset to -# be imported by PGFPlots for a plot in the paper. +# We will use AWK to generate a table showing X and X^2 and draw its plot.  
dmdir = $(texdir)/delete-me  dm    = $(dmdir)/data.txt  $(dmdir): | $(texdir); mkdir $@ @@ -43,6 +42,60 @@ $(dm): $(pconfdir)/delete-me-num.mk | $(dmdir) +# WFPC2 image PDF +# ----------------- +# +# For an example image, we'll make a PDF copy of the WFPC II image to +# display in the paper. +wfpc2dir = $(texdir)/delete-me-wfpc2 +$(wfpc2dir): | $(texdir); mkdir $@ +wfpc2 = $(wfpc2dir)/wfpc2.pdf +$(wfpc2): $(indir)/$(WFPC2IMAGE) | $(wfpc2dir) + +        # When the plotted values are re-made, it is necessary to also +        # delete the TiKZ externalized files so the plot is also re-made. +	rm -f $(tikzdir)/delete-me-wfpc2.pdf + +        # Convert the dataset to a PDF. +	astconvertt --fluxhigh=4 $< -h0 -o$@ + + + + + +# Histogram of WFPC2 image +# ------------------------ +# +# For an example plot, we'll show the pixel value histogram also. +wfpc2hist = $(wfpc2dir)/wfpc2-hist.txt +$(wfpc2hist): $(indir)/$(WFPC2IMAGE) | $(wfpc2dir) + +        # When the plotted values are re-made, it is necessary to also +        # delete the TiKZ externalized files so the plot is also re-made. +	rm -f $(tikzdir)/delete-me-wfpc2.pdf + +        # Generate the pixel value distribution +	aststatistics --lessthan=5 $< -h0 --histogram -o$@ + + + + + +# Basic statistics +# ---------------- +# +# This is just as a demonstration on how to get analysic configuration +# parameters from variables defined in `reproduce/config/pipeline'. +wfpc2stats = $(wfpc2dir)/wfpc2-stats.txt +$(wfpc2stats): $(indir)/$(WFPC2IMAGE) $(pconfdir)/delete-me-wfpc2-quant.mk \ +              | $(wfpc2dir) +	aststatistics $< -h0 --mean --median                        \ +	              --quantile=$(delete-me-wfpc2-quantile) > $@ + + + + +  # TeX macros  # ----------  # @@ -50,7 +103,7 @@ $(dm): $(pconfdir)/delete-me-num.mk | $(dmdir)  #  # NOTE: In LaTeX you cannot use any non-alphabetic character in a variable  # name. 
-$(mtexdir)/delete-me.tex: $(dm) +$(mtexdir)/delete-me.tex: $(dm) $(wfpc2) $(wfpc2hist) $(wfpc2stats)          # Write the number of random values used.  	echo "\newcommand{\deletemenum}{$(delete-me-num)}" > $@ @@ -67,6 +120,16 @@ $(mtexdir)/delete-me.tex: $(dm)  	           {if($$2>max) max=$$2; if($$2<min) min=$$2;}  	           END{print min, max}' $(dm));  	v=$$(echo "$$mm" | awk '{printf "%.3f", $$1}'); -	echo "\newcommand{\deletememin}{$$v}"             >> $@; +	echo "\newcommand{\deletememin}{$$v}"             >> $@  	v=$$(echo "$$mm" | awk '{printf "%.3f", $$2}');  	echo "\newcommand{\deletememax}{$$v}"             >> $@ + +        # Write the statistics of the WFPC2 image as a macro. +	q=$(delete-me-wfpc2-quantile) +	echo "\newcommand{\deletemewfpcquantile}{$$q}"            >> $@ +	mean=$$(awk     '{printf("%.2f", $$1)}' $(wfpc2stats)) +	echo "\newcommand{\deletemewfpctwomean}{$$mean}"          >> $@ +	median=$$(awk   '{printf("%.2f", $$2)}' $(wfpc2stats)) +	echo "\newcommand{\deletemewfpctwomedian}{$$median}"      >> $@ +	quantile=$$(awk '{printf("%.2f", $$3)}' $(wfpc2stats)) +	echo "\newcommand{\deletemewfpctwoquantile}{$$quantile}"  >> $@ diff --git a/reproduce/src/make/dependencies.mk b/reproduce/src/make/dependencies.mk index 8ed359b..a784883 100644 --- a/reproduce/src/make/dependencies.mk +++ b/reproduce/src/make/dependencies.mk @@ -43,7 +43,7 @@ ildir  = $(BDIR)/dependencies/installed/lib  ilidir = $(BDIR)/dependencies/installed/lib/built  # Define the top-level programs to build (installed in `.local/bin'). 
-top-level-programs = gawk gs grep sed git astnoisechisel texlive-ready +top-level-programs = gawk gs grep sed git flock astnoisechisel texlive-ready  all: $(foreach p, $(top-level-programs), $(ibdir)/$(p))  # Other basic environment settings: We are only including the host @@ -75,6 +75,7 @@ LD_LIBRARY_PATH := $(ildir)  tarballs = $(foreach t, cfitsio-$(cfitsio-version).tar.gz             \                          cmake-$(cmake-version).tar.gz                 \                          curl-$(curl-version).tar.gz                   \ +	                flock-$(flock-version).tar.xz                 \  	                gawk-$(gawk-version).tar.lz                   \  	                ghostscript-$(ghostscript-version).tar.gz     \  	                git-$(git-version).tar.xz                     \ @@ -111,6 +112,7 @@ $(tarballs): $(tdir)/%:  	    w=https://heasarc.gsfc.nasa.gov/FTP/software/fitsio/c/cfitsio$$v.tar.gz  	  elif [ $$n = cmake       ]; then w=https://cmake.org/files/v3.12  	  elif [ $$n = curl        ]; then w=https://curl.haxx.se/download +	  elif [ $$n = flock       ]; then w=https://github.com/discoteq/flock/releases/download/v$(flock-version)  	  elif [ $$n = gawk        ]; then w=http://ftp.gnu.org/gnu/gawk  	  elif [ $$n = ghostscript ]; then w=https://github.com/ArtifexSoftware/ghostpdl-downloads/releases/download/gs926  	  elif [ $$n = git         ]; then w=https://mirrors.edge.kernel.org/pub/software/scm/git @@ -244,6 +246,9 @@ $(ibdir)/libtool: $(tdir)/libtool-$(libtool-version).tar.xz  $(ibdir)/gs: $(tdir)/ghostscript-$(ghostscript-version).tar.gz  	$(call gbuild, $<, ghostscript-$(ghostscript-version)) +$(ibdir)/flock: $(tdir)/flock-$(flock-version).tar.xz +	$(call gbuild, $<, flock-$(flock-version), static) +  $(ibdir)/git: $(tdir)/git-$(git-version).tar.xz \                $(ilidir)/zlib  	$(call gbuild, $<, git-$(git-version), static) diff --git a/reproduce/src/make/download.mk b/reproduce/src/make/download.mk index 9617a45..180d2cf 100644 
--- a/reproduce/src/make/download.mk +++ b/reproduce/src/make/download.mk @@ -25,20 +25,51 @@ -# Download SURVEY data +# Download input data  # --------------------  # -# Data from a survey (for example an imaging survey) usually have a special -# file-name format which should be set here in the `foreach' loop. Note -# that the `foreach' function needs the backslash (`\') at the end of the -# line when it is broken into multiple lines. -all-survey = $(foreach f, $(filters-survey),                                 \ -                          $(SURVEY)/a-special-format-$(f).fits               \ -                          $(SURVEY)/a-possibly-additional-$(f)-format.fits ) -$(SURVEY):; mkdir $@ -$(all-survey): $(SURVEY)/%: | $(SURVEY) $(lockdir) -	flock $(lockdir)/download -c "$(DOWNLOADER) $@ $(web-survey)/$*" +# The input dataset properties are defined in `$(pconfdir)/INPUTS.mk'. For +# this template pipeline we only have one dataset to enable easy +# processing, so all the extra checks in this rule may seem +# redundant. +# +# However, in a real project, you will need more than one dataset. In that +# case, just add them to the target list and add an `elif' statement to +# define it in the recipe. +# +# Download lock file: Most systems have a single connection to the +# internet, therefore downloading is inherently done in series. As a +# result, when more than one dataset is necessary for download, if they are +# done in parallel, the speed will be slower than downloading them in +# series. We thus use the `flock' program to tie/lock the downloading +# process with a file and make sure that only one downloading event is in +# progress at every moment. +$(indir):; mkdir $@ +inputdatasets = $(foreach i, $(WFPC2IMAGE), $(indir)/$(i)) +$(inputdatasets): $(indir)/%: | $(indir) $(lockdir) + +        # Set the necessary parameters for this input file. 
+	if   [ $* = $(WFPC2IMAGE) ]; then url=$(WFPC2URL); mdf=$(WFPC2MD5); +	else +	echo; echo; echo "Not recognized input dataset: '$*'." +	echo; echo; exit 1 +	fi + +        # Download (or make the link to) the input dataset. +	if [ -f $(INDIR)/$* ]; then +	  ln -s $(INDIR)/$* $@ +	else +	  flock $(lockdir)/download $(DOWNLOADER) $@ $$url/$* +	fi +        # Check the md5 sum to see if this is the proper dataset. +	sum=$$(md5sum $@ | awk '{print $$1}') +	if [ $$sum != $$mdf ]; then +	  wrongname=$(dir $@)/wrong-$(notdir $@) +	  mv $@ $$wrongname +	  echo; echo; echo "Wrong MD5 checksum for '$*' in $$wrongname" +	  echo; echo; exit 1 +	fi @@ -49,5 +80,5 @@ $(all-survey): $(SURVEY)/%: | $(SURVEY) $(lockdir)  #  # It is very important to mention the address where the data were  # downloaded in the final report. -$(mtexdir)/download.tex: $(pconfdir)/web.mk | $(mtexdir) -	@echo "\\newcommand{\\websurvey}{$(web-survey)}" > $@ +$(mtexdir)/download.tex: $(pconfdir)/INPUTS.mk | $(mtexdir) +	echo "\\newcommand{\\wfpctwourl}{$(WFPC2URL)}" > $@ diff --git a/reproduce/src/make/initialize.mk b/reproduce/src/make/initialize.mk index 694aca0..41a5e05 100644 --- a/reproduce/src/make/initialize.mk +++ b/reproduce/src/make/initialize.mk @@ -34,6 +34,7 @@  # parallel. Also, some programs may not be thread-safe, therefore it will  # be necessary to put a lock on them. This pipeline uses the `flock'  # program to achieve this. +indir       = $(BDIR)/inputs  texdir      = $(BDIR)/tex  srcdir      = reproduce/src  lockdir     = $(BDIR)/locks @@ -224,6 +225,14 @@ $(mtexdir)/initialize.tex: | $(mtexdir)  	fi;                                                                \  	echo "\newcommand{\\bziptwoversion}{$(bzip2-version)}" >> $@ +        # Unfortunately we couldn't find a way to retrieve the version of +        # the discoteq `flock' that we are using here. So we'll just repot +        # the version we downloaded and installed. 
+	echo "\newcommand{\\flockversion}{$(flock-version)}" >> $@
+
+
+
+
         # Versions of libraries.
 	$(call lvcheck, fitsio.h, $(cfitsio-version), CFITSIO, cfitsioversion)
 
diff --git a/tex/delete-me-wfpc2.tex b/tex/delete-me-wfpc2.tex
new file mode 100644
index 0000000..95b3105
--- /dev/null
+++ b/tex/delete-me-wfpc2.tex
@@ -0,0 +1,34 @@
+\begin{tikzpicture}
+
+  %% The displayed WFPC2 image.
+  \node[anchor=south west] (img) at (0,0)
+       {\includegraphics[width=0.5\linewidth]
+         {\bdir/tex/delete-me-wfpc2/wfpc2.pdf}};
+
+  %% Its label
+  \node[anchor=south west] at (0.45\linewidth,0.45\linewidth)
+       {\textcolor{white}{a}};
+
+  %% This histogram.
+  \begin{axis}[at={(0.52\linewidth,0.1\linewidth)},
+      no markers,
+      axis on top,
+      xmode=normal,
+      ymode=normal,
+      yticklabels={},
+      scale only axis,
+      xlabel=Pixel value,
+      width=0.5\linewidth,
+      height=0.412\linewidth,
+      enlarge y limits=false,
+      enlarge x limits=false,
+      ]
+    \addplot [const plot mark mid, fill=red]
+    table [x index=0, y index=1]
+    {\bdir/tex/delete-me-wfpc2/wfpc2-hist.txt}
+    \closedcycle;
+  \end{axis}
+
+  %% The histogram's label
+  \node[anchor=south west] at (0.95\linewidth,0.45\linewidth) {b};
+\end{tikzpicture}
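
Reviewer note: this commit uses one pattern twice. `configure` reads `NAME = value` lines from `reproduce/config/pipeline/INPUTS.mk` with AWK, and the recipe in `reproduce/src/make/download.mk` compares the file's `md5sum` with the recorded value, renaming the file with a `wrong-` prefix on mismatch. The following is a minimal standalone sketch of that pattern for experimenting outside Make. The function names `read_make_var` and `check_md5` are hypothetical and do not exist in the pipeline, and it assumes GNU Coreutils' `md5sum` is available.

```shell
#!/bin/bash
# Sketch only: these helper names are hypothetical, not part of the pipeline.

# Print the value of a `NAME = value' line in a Make-style configuration
# file, skipping comment lines (the same AWK test the configure script uses).
read_make_var() {
    awk -v name="$1" '!/^#/ && $1==name {print $3}' "$2"
}

# Compare a file's MD5 checksum with the expected value. On a mismatch,
# rename the file with a `wrong-' prefix (as the download rule does),
# report the problem on standard error and return a non-zero status.
check_md5() {
    local file="$1" expected="$2" sum
    sum=$(md5sum "$file" | awk '{print $1}')
    if [ "$sum" != "$expected" ]; then
        mv "$file" "$(dirname "$file")/wrong-$(basename "$file")"
        echo "Wrong MD5 checksum for '$file'" >&2
        return 1
    fi
}
```

For example, after sourcing the functions, `check_md5 "$indir/$(read_make_var WFPC2IMAGE INPUTS.mk)" "$(read_make_var WFPC2MD5 INPUTS.mk)"` would mirror the check in the download rule, with `$indir` standing in for the input directory on your system.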
