-rw-r--r--  .file-metadata  bin 5484 -> 5337 bytes
-rw-r--r--  README-hacking.md  480
-rw-r--r--  README.md  22
-rwxr-xr-x  configure  117
-rwxr-xr-x  for-group  5
-rw-r--r--  paper.tex  100
-rw-r--r--  reproduce/config/gnuastro/gnuastro.conf  15
-rw-r--r--  reproduce/config/pipeline/INPUTS.mk  2
-rw-r--r--  reproduce/config/pipeline/LOCAL.mk.in  2
-rw-r--r--  reproduce/config/pipeline/dependency-numpy-scipy.cfg  2
-rw-r--r--  reproduce/config/pipeline/pdf-build.mk  6
-rw-r--r--  reproduce/config/pipeline/texlive.conf  5
-rwxr-xr-x  reproduce/src/bash/download-multi-try  8
-rw-r--r--  reproduce/src/bash/git-post-checkout  2
-rw-r--r--  reproduce/src/bash/git-pre-commit  2
-rw-r--r--  reproduce/src/make/dependencies-basic.mk  30
-rw-r--r--  reproduce/src/make/dependencies-build-rules.mk  4
-rw-r--r--  reproduce/src/make/dependencies-python.mk  2
-rw-r--r--  reproduce/src/make/dependencies.mk  27
-rw-r--r--  reproduce/src/make/download.mk  6
-rw-r--r--  reproduce/src/make/initialize.mk  40
-rw-r--r--  reproduce/src/make/paper.mk  17
-rw-r--r--  reproduce/src/make/top.mk  31
-rw-r--r--  tex/src/preamble-pgfplots.tex  4
24 files changed, 457 insertions, 472 deletions
diff --git a/.file-metadata b/.file-metadata
index 557f8cb..7e8d8dd 100644
--- a/.file-metadata
+++ b/.file-metadata
Binary files differ
diff --git a/README-hacking.md b/README-hacking.md
index 56f613b..e663ee1 100644
--- a/README-hacking.md
+++ b/README-hacking.md
@@ -4,49 +4,48 @@ Reproducible paper template
Copyright (C) 2018-2019 Mohammad Akhlaghi <mohammad@akhlaghi.org>
See the end of the file for license conditions.
-This project contains a **fully working template** for a high-level
-research reproduction pipeline, or reproducible paper, as defined in the
-link below. If the link below is not accessible at the time of reading,
-please see the appendix at the end of this file for a portion of its
-introduction. Some [slides](http://akhlaghi.org/pdf/reproducible-paper.pdf)
-are also available to help demonstrate the concept implemented here.
+This project contains a **fully working template** for doing reproducible
+research (or writing a reproducible paper) as defined in the link below. If
+the link below is not accessible at the time of reading, please see the
+appendix at the end of this file for a portion of its introduction. Some
+[slides](http://akhlaghi.org/pdf/reproducible-paper.pdf) are also available
+to help demonstrate the concept implemented here.
http://akhlaghi.org/reproducible-science.html
This template is created with the aim of supporting reproducible research
by making it easy to start a project in this framework. As shown below, it
-is very easy to customize this template reproducible paper pipeline for any
-particular research/job and expand it as it starts and evolves. It can be
-run with no modification (as described in `README.md`) as a demonstration
-and customized for use in any project as fully described below.
-
-The pipeline will download and build all the necessary libraries and
-programs for working in a closed environment (highly independent of the
-host operating system) with fixed versions of the necessary
-dependencies. The tarballs for building the local environment are also
-collected in a [separate
+is very easy to customize this reproducible paper template for any
+particular (research) project and expand it as it starts and evolves. It
+can be run with no modification (as described in `README.md`) as a
+demonstration and customized for use in any project as fully described
+below.
+
+A project designed using this template will download and build all the
+necessary libraries and programs for working in a closed environment
+(highly independent of the host operating system) with fixed versions of
+the necessary dependencies. The tarballs for building the local environment
+are also collected in a [separate
repository](https://gitlab.com/makhlaghi/reproducible-paper-dependencies). The
-[final reproducible paper
-output](https://gitlab.com/makhlaghi/reproducible-paper-output/raw/master/paper.pdf)
-of this pipeline is also present in [a separate
-repository](https://gitlab.com/makhlaghi/reproducible-paper-output). Notice
-the last paragraph of the Acknowledgments where all the dependencies are
-mentioned with their versions.
+final output of the project is [a
+paper](https://gitlab.com/makhlaghi/reproducible-paper-output/raw/master/paper.pdf).
+Notice the last paragraph of the Acknowledgments where all the necessary
+software packages are mentioned with their versions.
Below, we start with a discussion of why Make was chosen as the high-level
-language/framework for this research reproduction pipeline and how to learn
-and master Make easily (and freely). The general architecture and design of
-the pipeline is then discussed to help you navigate the files and their
-contents. This is followed by a checklist for the easy/fast customization
-of this pipeline to your exciting research. We continue with some tips and
-guidelines on how to manage or extend the pipeline as your research grows
-based on our experiences with it so far. The main body concludes with a
-description of possible future improvements that are planned for the
-pipeline (but not yet implemented). As discussed above, we end with a short
-introduction on the necessity of reproducible science in the appendix.
+language/framework for project management and how to learn and master Make
+easily (and freely). The general architecture and design of the project is
+then discussed to help you navigate the files and their contents. This is
+followed by a checklist for the easy/fast customization of this template to
+your exciting research. We continue with some tips and guidelines on how to
+manage or extend your project as it grows based on our experiences with it
+so far. The main body concludes with a description of possible future
+improvements that are planned for the template (but not yet
+implemented). As discussed above, we end with a short introduction on the
+necessity of reproducible science in the appendix.
Please don't forget to share your thoughts, suggestions and criticisms on
-this pipeline. Maintaining and designing this pipeline is itself a separate
+this template. Maintaining and designing this template is itself a separate
project, so please join us if you are interested. Once it is mature enough,
we will describe it in a paper (written by all contributors) for a formal
introduction to the community.
@@ -59,7 +58,7 @@ Why Make?
---------
When batch processing is necessary (no manual intervention, as in a
-reproduction pipeline), shell scripts are usually the first solution that
+reproducible project), shell scripts are usually the first solution that
comes to mind. However, the inherent complexity and non-linearity of
progress in a scientific project (where experimentation is key) make it
hard to manage the script(s) as the project evolves. For example, a script
@@ -79,18 +78,18 @@ to find in the end.
The Make paradigm, on the other hand, starts from the end: the final
*target*. It builds a dependency tree internally, and finds where it should
-start each time the pipeline is run. Therefore, in the scenario above, a
+start each time the project is run. Therefore, in the scenario above, a
researcher that has just added the final 10% of steps of her research to
her Makefile, will only have to run those extra steps. With Make, it is
also trivial to change the processing of any intermediate (already written)
*rule* (or step) in the middle of an already written analysis: the next
time Make is run, only rules that are affected by the changes/additions
-will be re-run, not the whole analysis/pipeline.
+will be re-run, not the whole analysis/project.
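
As a minimal sketch of this behavior, consider three dependent steps. The
file names and commands are hypothetical and are not taken from this
template; each rule is written in the one-line `target: prerequisites ;
recipe` form so the example does not depend on literal TAB characters.

```make
# Hypothetical three-step analysis (all names are illustrative).
.PHONY: all
all: result.txt

# Step 1: drop the rows that contain NaN from the raw input.
selected.txt: input.txt ; grep -v NaN input.txt > selected.txt

# Step 2: sort the remaining rows numerically.
sorted.txt: selected.txt ; sort -n selected.txt > sorted.txt

# Step 3: keep the ten largest values as the final result.
result.txt: sorted.txt ; tail -n 10 sorted.txt > result.txt
```

After a first complete run, running `touch selected.txt` followed by `make`
re-runs only steps 2 and 3: Make compares file modification times along the
dependency tree, and `selected.txt` is still newer than `input.txt`.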
This greatly speeds up the processing (enabling creative changes), while
keeping all the dependencies clearly documented (as part of the Make
language), and most importantly, enabling full reproducibility from scratch
-with no changes in the pipeline code that was working during the
+with no changes in the project code that was working during the
research. This will allow robust results and let the scientists get to what
they do best: experiment and be critical to the methods/analysis without
having to waste energy and time on technical problems that come up as a
@@ -117,9 +116,9 @@ Make is a +40 year old software that is still evolving, therefore many
implementations of Make exist. The only difference in them is some extra
features over the [standard
definition](https://pubs.opengroup.org/onlinepubs/009695399/utilities/make.html)
-(which is shared in all of them). This pipeline has been created for GNU
+(which is shared in all of them). This template has been created for GNU
Make which is the most common, most actively developed, and most advanced
-implementation. Just note that this pipeline downloads, builds, internally
+implementation. Just note that this template downloads, builds, internally
installs, and uses its own dependencies (including GNU Make), so you don't
have to have it installed before you try it out.
@@ -168,41 +167,38 @@ your hands off the keyboard!).
-Published works using this pipeline
+Published works using this template
-----------------------------------
The links below will guide you to some of the works that have already been
-published using the method of this pipeline. Note that this pipeline is
-evolving, so some small details may be different in them, but they can be
-used as a good working model to build your own.
+published with (earlier versions of) this template. Note that this template
+is evolving, so some small details may be different in them, but they can
+be used as a good working model to build your own.
- Section 7.3 of Bacon et
al. ([2017](http://adsabs.harvard.edu/abs/2017A%26A...608A...1B), A&A
- 608, A1): The version controlled reproduction pipeline is available [on
+ 608, A1): The version controlled project source is available [on
GitLab](https://gitlab.com/makhlaghi/muse-udf-origin-only-hst-magnitudes)
- and a snapshot of the pipeline along with all the necessary input
+ and a snapshot of the project along with all the necessary input
datasets and outputs is available in
[zenodo.1164774](https://doi.org/10.5281/zenodo.1164774).
- Section 4 of Bacon et
al. ([2017](http://adsabs.harvard.edu/abs/2017A%26A...608A...1B), A&A,
- 608, A1): The version controlled reproduction pipeline is available [on
+ 608, A1): The version controlled project is available [on
GitLab](https://gitlab.com/makhlaghi/muse-udf-photometry-astrometry) and
- a snapshot of the pipeline along with all the necessary input datasets
- is available in
- [zenodo.1163746](https://doi.org/10.5281/zenodo.1163746).
+ a snapshot of the project along with all the necessary input datasets is
+ available in [zenodo.1163746](https://doi.org/10.5281/zenodo.1163746).
- Akhlaghi & Ichikawa
([2015](http://adsabs.harvard.edu/abs/2015ApJS..220....1A), ApJS, 220,
- 1): The version controlled reproduction pipeline is available [on
+ 1): The version controlled project is available [on
GitLab](https://gitlab.com/makhlaghi/NoiseChisel-paper). This is the
- very first (and much less mature) implementation of this pipeline: the
- history of this template pipeline started more than two years after that
- paper was published. It is a very rudimentary/initial implementation,
- thus it is only included here for historical reasons. However, the
- pipeline is complete and accurate and uploaded to arXiv along with the
- paper. See the more recent implementations if you want to get ideas for
- your version of this pipeline.
+ very first (and much less mature!) implementation of this template: the
+ history of this template started more than two years after this paper
+ was published. It is a very rudimentary/initial implementation, thus it
+ is only included here for historical reasons. However, the project
+ source is complete, accurate and uploaded to arXiv along with the paper.
@@ -211,22 +207,21 @@ used as a good working model to build your own.
Citation
--------
-A paper will be published to fully describe this reproduction
-pipeline. Until then, if this pipeline is useful in your work, please cite
-the paper that implemented the first version of this pipeline: Akhlaghi &
-Ichikawa ([2015](http://adsabs.harvard.edu/abs/2015ApJS..220....1A), ApJS,
-220, 1).
+A paper will be published to fully describe this reproducible paper
+template. Until then, if you used this template in your work, please cite
+the paper that implemented its first version: Akhlaghi & Ichikawa
+([2015](http://adsabs.harvard.edu/abs/2015ApJS..220....1A), ApJS, 220, 1).
The experience gained with this template after several more implementations
-will be used to make this pipeline robust enough for a complete and useful
-paper to introduce to the community afterwards.
+will be used to make it robust enough for a complete and useful paper to
+introduce to the community afterwards.
Also, when your paper is published, don't forget to add a notice in your
own paper (in coordination with the publishing editor) that the paper is
fully reproducible and possibly add a sentence or paragraph in the end of
the paper shortly describing the concept. This will help spread the word
-and encourage other scientists to also publish their reproduction
-pipelines.
+and encourage other scientists to also manage and publish their projects in
+a reproducible manner.
@@ -237,19 +232,19 @@ pipelines.
-Reproduction pipeline architecture
-==================================
+Project architecture
+====================
-In order to adopt this pipeline to your research, it is important to first
-understand its architecture so you can navigate your way in the directories
-and understand how to implement your research project within its
-framework. But before reading this theoretical discussion, please run the
-pipeline (described in `README.md`: first run `./configure`, then
+In order to customize this template to your research, it is important to
+first understand its architecture so you can navigate your way in the
+directories and understand how to implement your research project within
+its framework. But before reading this theoretical discussion, please run
+the template (described in `README.md`: first run `./configure`, then
`.local/bin/make -j8`) without any change, just to see how it works.
In order to obtain a reproducible result it is important to have an
identical environment (for example same versions of the programs that it
-will use). Therefore, the pipeline builds its own dependencies during the
+will use). Therefore, the project builds its own dependencies during the
`./configure` step. Building of the dependencies is managed by
`reproduce/src/make/dependencies-basic.mk` and
`reproduce/src/make/dependencies.mk`. These Makefiles are called by the
@@ -258,10 +253,9 @@ downloading and building the most basic tools like GNU Tar, GNU Bash, GNU
Make, and GNU Compiler Collection (GCC). Therefore it must only contain
very basic and portable Make and shell features. The second is called after
the first, thus enabling usage of the modern and advanced features of GNU
-Bash, GNU Make and other low-level GNU tools, similar to the rest of the
-pipeline. Later, if you add a new program/library for your research, you
-will need to include a rule on how to download and build it (in
-`reproduce/src/make/dependencies.mk`).
+Bash, GNU Make and other low-level GNU tools. Later, if you add a new
+program/library for your research, you will need to include a rule on how
+to download and build it (mostly in `reproduce/src/make/dependencies.mk`).
After it finishes, `./configure` will create the following symbolic links
in the project's top source directory: 1) `Makefile` in the top directory
@@ -294,11 +288,11 @@ To keep the source and (intermediate) built files separate, you _must_
define a top-level build directory variable (or `$(BDIR)`) to host all the
intermediate files (it was defined in `./configure`). This directory
doesn't need to be version controlled or even synchronized, or backed-up in
-other servers: its contents are all products of the pipeline, and can be
-easily re-created any time. As you define targets for your new rules, it is
-thus important to place them all under sub-directories of `$(BDIR)`. As
-mentioned above, you always have fast access to this "build"-directory with
-the `.build` symbolic link.
+other servers: its contents are all products, and can be easily re-created
+any time. As you define targets for your new rules, it is thus important to
+place them all under sub-directories of `$(BDIR)`. As mentioned above, you
+always have fast access to this "build"-directory with the `.build`
+symbolic link.
In this architecture, we have two types of Makefiles that are loaded into
the top `Makefile`: _configuration-Makefiles_ (only independent
@@ -309,11 +303,11 @@ The configuration-Makefiles are those that satisfy this wildcard:
`reproduce/config/pipeline/*.mk`. These Makefiles don't actually have any
rules, they just have values for various free parameters throughout the
analysis/processing. Open a few of them to see for yourself. These
-Makefiles must only contain raw Make variables (pipeline
-configurations). By "raw" we mean that the Make variables in these files
-must not depend on variables in any other configuration-Makefile. This is
-because we don't want to assume any order in reading them. It is also very
-important to *not* define any rule, or other Make construct, in these
+Makefiles must only contain raw Make variables (project configurations). By
+"raw" we mean that the Make variables in these files must not depend on
+variables in any other configuration-Makefile. This is because we don't
+want to assume any order in reading them. It is also very important to
+*not* define any rule, or other Make construct, in these
configuration-Makefiles.
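
The sketch below illustrates the idea under stated assumptions: the file
name `demo-params.mk`, both variable names and the target are hypothetical
and do not exist in the template. The first part is what a
configuration-Makefile might contain, the second part is a rule that a
workhorse-Makefile might define.

```make
# --- reproduce/config/pipeline/demo-params.mk (hypothetical file) ---
# Raw values only: no rules, and no references to variables that are
# defined in another configuration-Makefile.
demo-threshold = 3.5
demo-num-bins = 20

# --- in a workhorse-Makefile (hypothetical rule) ---
# The configuration file is a prerequisite, so the target is rebuilt
# whenever one of the parameter values above is edited.
$(BDIR)/demo/stats.txt: reproduce/config/pipeline/demo-params.mk ; mkdir -p $(@D) && echo "$(demo-threshold) $(demo-num-bins)" > $@
```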
This enables you to set these configuration-Makefiles as a prerequisite to any
@@ -342,13 +336,13 @@ aren't directly a prerequisite of other workhorse-Makefile targets, they
can be a pre-requisite of that intermediate LaTeX macro file and thus be
called when necessary. Otherwise, they will be ignored by Make.
-This pipeline also has a mode to share the build directory between several
+This template also has a mode to share the build directory between several
users of a Unix group (when working on large computer clusters). In this
-scenario, each user can have their own cloned pipeline source, but share
-the large built files between each other. To do this, it is necessary for
-all built files to give full permission to group members while not allowing
-any other users access to the contents. Therefore the `./configure` and
-Make steps must be called with special conditions which are managed in the
+scenario, each user can have their own cloned project source, but share the
+large built files between each other. To do this, it is necessary for all
+built files to give full permission to group members while not allowing any
+other users access to the contents. Therefore the `./configure` and Make
+steps must be called with special conditions which are managed in the
`for-group` script.
Let's see how this design is implemented. When `./configure` finishes: By
@@ -360,9 +354,9 @@ configuration-Makefile `reproduce/config/pipeline/LOCAL.mk` which was also
built by `./configure` (based on the `LOCAL.mk.in` template).
The next non-commented set of lines define the ultimate target of the whole
-pipeline (`paper.pdf`). But to avoid mistakes, a sanity check is necessary
+project (`paper.pdf`). But to avoid mistakes, a sanity check is necessary
to see if Make is being run with the same group settings as the configure
-script (for example when the pipeline is configured for group access using
+script (for example when the project is configured for group access using
the `./for-group` script, but Make isn't). Therefore we use a Make
conditional to define the `all` target based on the group permissions being
consistent between the initial configuration and the current run.
@@ -378,7 +372,7 @@ proper order.
Finally, we'll just import all the configuration-Makefiles with a wildcard
(while ignoring `LOCAL.mk` that was imported before). Also, all
workhorse-Makefiles are imported in the proper order using a Make `foreach`
-loop. This finishes the general view of the pipeline's implementation.
+loop. This finishes the general view of the template's implementation.
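
The two import steps could look roughly like the sketch below. It is not
copied from the top `Makefile`: the variable name `makesrc` and the exact
list of steps are assumptions made for illustration.

```make
# Import every configuration-Makefile except LOCAL.mk, which was
# already included earlier.
include $(filter-out %LOCAL.mk, $(wildcard reproduce/config/pipeline/*.mk))

# Import the workhorse-Makefiles in the given order ('makesrc' is an
# assumed name). 'paper' is last because paper.mk must be included last.
makesrc = initialize download delete-me paper
include $(foreach s, $(makesrc), reproduce/src/make/$(s).mk)
```

Keeping the ordered list in one variable means that adding a new analysis
step only requires adding its name before `paper`.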
In short, to keep things modular, readable and manageable, follow these
recommendations: 1) Set clear-to-understand names for the
@@ -393,15 +387,14 @@ possible.
The `reproduce/src/make/paper.mk` Makefile must be the final Makefile that
is included. This workhorse Makefile ends with the rule to build
-`paper.pdf` (final target of the whole reproduction pipeline). If you look
-in it, you will notice that it starts with a rule to create
-`$(mtexdir)/pipeline.tex` (`mtexdir` is just a shorthand name for
-`$(BDIR)/tex/macros` mentioned before). `$(mtexdir)/pipeline.tex` is the
-connection between the processing/analysis steps of the pipeline, and the
-steps to build the final PDF. As you see, `$(mtexdir)/pipeline.tex` only
-instructs LaTeX to import the LaTeX macros of each high-level processing
-step during the analysis (the separate work-horse Makefiles that you
-defined and included).
+`paper.pdf` (final target of the whole project). If you look in it, you
+will notice that it starts with a rule to create `$(mtexdir)/pipeline.tex`
+(`mtexdir` is just a shorthand name for `$(BDIR)/tex/macros` mentioned
+before). `$(mtexdir)/pipeline.tex` is the connection between the
+processing/analysis steps of the project, and the steps to build the final
+PDF. As you see, `$(mtexdir)/pipeline.tex` only instructs LaTeX to import
+the LaTeX macros of each high-level processing step during the analysis
+(the separate work-horse Makefiles that you defined and included).
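
A rule of this kind could be sketched as follows. This is an illustration
under assumptions, not the rule in `paper.mk`: the variable `steps` and its
contents are hypothetical, and `mtexdir` is assumed to be defined already.

```make
# Names of the analysis steps that each write one LaTeX macro file
# (hypothetical list).
steps = initialize download delete-me

# Write one \input line per macro file. Because the macro files are
# prerequisites, every analysis step runs before the paper is built.
$(mtexdir)/pipeline.tex: $(foreach s, $(steps), $(mtexdir)/$(s).tex) ; for f in $^; do printf '\\input{%s}\n' "$$f"; done > $@
```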
During the research, it often happens that you want to test a step that is
not a prerequisite of any higher-level operation. In such cases, you can
@@ -449,54 +442,54 @@ mind are listed below.
-Checklist to customize the pipeline
-===================================
+Customization checklist
+=======================
-Take the following steps to fully customize this pipeline for your research
+Take the following steps to fully customize this template for your research
project. After finishing the list, be sure to run `./configure` and `make`
to see if everything works correctly before expanding it. If you notice
anything missing or any incorrect part (probably a change that has not
been explained here), please let us know to correct it.
-As described above, the concept of a reproduction pipeline heavily relies
-on [version
+As described above, the concept of reproducibility (during a project)
+heavily relies on [version
control](https://en.wikipedia.org/wiki/Version_control). Currently this
-pipeline uses Git as its main version control system. If you are not already
-familiar with Git, please read the first three chapters of the [ProGit
-book](https://git-scm.com/book/en/v2) which provides a wonderful practical
-understanding of the basics. You can read later chapters as you get more
-advanced in later stages of your work.
+template uses Git as its main version control system. If you are not
+already familiar with Git, please read the first three chapters of the
+[ProGit book](https://git-scm.com/book/en/v2) which provides a wonderful
+practical understanding of the basics. You can read later chapters as you
+get more advanced in later stages of your work.
- **Get this repository and its history** (if you don't already have it):
Arguably the easiest way to start is to clone this repository as shown
- below. The main branch of this pipeline is called `pipeline`. This
+ below. The main branch of this template is called `template`. This
allows you to use the common branch name `master` for your own
- research, while keeping up to date with improvements in the pipeline.
+ research, while keeping up to date with improvements in the template.
```shell
- $ git clone https://gitlab.com/makhlaghi/reproducible-paper.git
+ $ git clone git://git.sv.gnu.org/reproduce
$ mv reproduce my-project-name # Your own directory name.
$ cd my-project-name # Go into the cloned directory.
- $ git tag | xargs git tag -d # Delete all pipeline tags.
- $ git config remote.origin.tagopt --no-tags # No tags in future fetch/pull from this pipeline.
- $ git remote rename origin pipeline-origin # Rename the pipeline's remote.
+ $ git tag | xargs git tag -d # Delete all template tags.
+ $ git config remote.origin.tagopt --no-tags # No tags in future fetch/pull from this template.
+ $ git remote rename origin template-origin # Rename the template's remote.
$ git checkout -b master # Create, enter master branch.
```
- - **Test the pipeline**: Before making any changes, it is important to
- test the pipeline and see if everything works properly with the
- commands below. If there is any problem in the `./configure` or `make`
- steps, please contact us to fix the problem before continuing. Since
- the building of dependencies in `./configure` can take long, you can
- take the next few steps (editing the files) while its working (they
- don't affect the configuration). After `make` is finished, open
- `paper.pdf` and if it looks fine, you are ready to start customizing
- the pipeline for your project. But before that, clean all the extra
- pipeline outputs with `make clean` as shown below.
+ - **Test the template**: Before making any changes, it is important to
+ test it and see if everything works properly with the commands
+ below. If there is any problem in the `./configure` or `make` steps,
+ please contact us to fix the problem before continuing. Since the
+ building of dependencies in `./configure` can take long, you can take
+ the next few steps (editing the files) while it's working (they don't
+ affect the configuration). After `make` is finished, open `paper.pdf`
+ and if it looks fine, you are ready to start customizing the template
+ for your project. But before that, clean all the extra template
+ outputs with `make clean` as shown below.
```shell
$ ./configure # Set top directories and build dependencies.
- $ .local/bin/make # Run the pipeline.
+ $ .local/bin/make # Do the (mainly symbolic) processing and build paper
# Open 'paper.pdf' and see if everything is ok.
$ .local/bin/make clean # Delete high-level outputs.
@@ -526,7 +519,7 @@ advanced in later stages of your work.
finishing this checklist and doing your first commit.
- **Gnuastro**: GNU Astronomy Utilities (Gnuastro) is currently a
- dependency of the pipeline which will be built and used. The main
+ dependency of the template which will be built and used. The main
reason for this is to demonstrate how critically important it is to
version your scientific tools. If you don't need Gnuastro for your
research, you can simply remove the parts enclosed in marked parts in
@@ -550,10 +543,10 @@ advanced in later stages of your work.
through the `reproduce/config/pipeline/INPUTS.mk` file. It is best to
gather all the information regarding all the input datasets into this
one central file. To ensure that the proper dataset is being
- downloaded and used by the pipeline, it is also recommended get an
- [MD5 checksum](https://en.wikipedia.org/wiki/MD5) of the file and
- include that in `INPUTS.mk` so you can check it in the pipeline. The
- preparation of the input datasets is done in
+ downloaded and used by the project, it is also recommended to get an [MD5
+ checksum](https://en.wikipedia.org/wiki/MD5) of the file and include
+ that in `INPUTS.mk` so the project can check it automatically. The
+ preparation/downloading of the input datasets is done in
`reproduce/src/make/download.mk`. Have a look there to see how these
values are to be used. This information about the input datasets is
also used in the initial `configure` script (to inform the users), so
@@ -565,15 +558,15 @@ advanced in later stages of your work.
$ grep -ir wfpc2 ./*
```
- - **Delete dummy parts (can be done later)**: The template pipeline
- contains some parts that are only for the initial/test run, mainly as
- a demonstration of important steps. They not for any real
- analysis. You can remove these parts in the file below
+ - **Delete dummy parts (can be done later)**: The template contains some
+ parts that are only for the initial/test run, mainly as a
+ demonstration of important steps. They are not for any real analysis. You
+ can remove these parts in the file below
- `paper.tex`: Delete the text of the abstract and the paper's main
- body, *except* the "Acknowledgments" section. This reproduction
- pipeline was designed by funding from many grants, so its necessary
- to acknowledge them in your final research.
+ body, *except* the "Acknowledgments" section. This template was
+ designed by funding from many grants, so it's necessary to
+ acknowledge them in your final research.
- `Makefile`: Delete the lines containing `delete-me` in the `foreach`
loop. Just make sure the other lines that end in `\` are immediately
@@ -588,14 +581,14 @@ advanced in later stages of your work.
```
- **`README.md`**: Correct all the `XXXXX` place holders (name of your
- project, your own name, address of pipeline's online/remote
+ project, your own name, address of the template's online/remote
repository, link to download dependencies and etc). Generally, read
over the text and update it where necessary to fit your project. Don't
forget that this is the first file that is displayed on your online
repository and also your colleagues will first be drawn to read this
file. Therefore, make it as easy as possible for them to start
with. Also check and update this file one last time when you are ready
- to publish your work (and its reproduction pipeline).
+ to publish your project's paper/source.
- **Copyright and License notice**: To be usable/modifiable by others
after publication, _all_ the "copyright-able" files in your project
@@ -620,16 +613,16 @@ advanced in later stages of your work.
changes in the steps above and you are in the `master` branch. So, you
can officially make your first commit in your project's history. But
before that you need to make sure that there are no problems in the
- pipeline (this is a good habit to always re-build the system before a
+ project (this is a good habit to always re-build the system before a
commit to be sure it works as expected).
```shell
$ .local/bin/make clean # Delete outputs ('make distclean' for everything)
- $ .local/bin/make # Build the pipeline to ensure everything is fine.
+ $ .local/bin/make # Build the project to ensure everything is fine.
$ git add -u # Stage all the changes.
$ git status # Make sure everything is fine.
$ git commit # Your first commit, add a nice description.
- $ git tag -a v0 # Tag this as the zero-th version of your pipeline.
+ $ git tag -a v0 # Tag this as the zero-th version of your project.
```
- **Push to the remote**: Push your first commit and its tag to the remote
@@ -648,46 +641,46 @@ advanced in later stages of your work.
questions. Any time you are ready to push your commits to the remote
repository, you can simply use `git push`.
- - **Feedback**: As you use the pipeline you will notice many things that
+ - **Feedback**: As you use the template you will notice many things that
if implemented from the start would have been very useful for your
work. This can be in the actual scripting and architecture of the
- pipeline or in useful implementation and usage tips, like those
+ template, or useful implementation and usage tips, like those
below. In any case, please share your thoughts and suggestions with
us, so we can add them here for everyone's benefit.
- - **Keep pipeline up-to-date**: In time, this pipeline is going to become
+ - **Keep template up-to-date**: In time, this template is going to become
more and more mature and robust (thanks to your feedback and the
feedback of other users). Bugs will be fixed and new/improved features
will be added. So every once in a while, you can run the commands
- below to pull new work that is done in this pipeline. If the changes
- are useful for your work, you can merge them with your own customized
- pipeline to benefit from them. Just pay **very close attention** to
- resolving possible **conflicts** which might happen in the merge
- (updated general pipeline settings that you have customized).
+ below to pull new work that is done in this template. If the changes
+ are useful for your work, you can merge them with your project to
+ benefit from them. Just pay **very close attention** to resolving
+ possible **conflicts** which might happen in the merge (updated
+ settings that you have customized in the template).
```shell
- $ git checkout pipeline
- $ git pull pipeline-origin pipeline # Get recent work in this pipeline.
+ $ git checkout template
+ $ git pull template-origin template # Get recent work in the template
$ git log XXXXXX..XXXXXX --reverse # Inspect new work (replace XXXXXXs with hashes mentioned in output of previous command).
$ git log --oneline --graph --decorate --all # General view of branches.
$ git checkout master # Go to your top working branch.
- $ git merge pipeline # Import all the work into master.
+ $ git merge template # Import all the work into master.
```
- - **Adding this project to a fork of your pipeline**: As you and your
- colleagues continue your project in this pipeline, it will be
- necessary to have separate forks/clones of it. But when you clone your
- own project on a different system, or a colleague clones it to
- collaborate with you, the clone won't have the `pipeline-origin`
- remote that you started the project with. As shown in the previous
- point, you need this remote to be able to pull recent updates from
- this pipeline. The steps below, will setup the `pipeline-origin`
- remote, and a `pipeline` branch to track it, on the new clone.
+ - **Adding this template to a fork of your project**: As you and your
+ colleagues continue your project, it will be necessary to have
+ separate forks/clones of it. But when you clone your own project on a
+ different system, or a colleague clones it to collaborate with you,
+ the clone won't have the `template-origin` remote that you started the
+ project with. As shown in the previous point, you need this remote to
+ be able to pull recent updates from this template. The steps below,
+ will set up the `template-origin` remote, and a `template` branch to
+ track it, on the new clone.
```shell
- $ git remote add pipeline-origin https://gitlab.com/makhlaghi/reproducible-paper.git
- $ git fetch pipeline-origin
- $ git checkout --track pipeline-origin/pipeline
+ $ git remote add template-origin git://git.sv.gnu.org/reproduce
+ $ git fetch template-origin
+ $ git checkout --track template-origin/template
```
- **Pre-publication: add notice on reproducibility**: Add a notice
@@ -704,13 +697,14 @@ advanced in later stages of your work.
-Usage tips: designing your pipeline/workflow
-============================================
+Tips for designing your project
+===============================
The following is a list of design points, tips, or recommendations that
-have been learned after some experience with this pipeline. Please don't
-hesitate to share any experience you gain after using this pipeline with
-us. In this way, we can add it here for the benefit of others.
+have been learned after some experience with this type of project
+management. Please don't hesitate to share any experience you gain after
+using it with us. In this way, we can add it here (giving full credit)
+for the benefit of others.
- **Modularity**: Modularity is the key to easy and clean growth of a
project. So it is always best to break up a job into as many
@@ -721,17 +715,17 @@ us. In this way, we can add it here for the benefit of others.
a good sign that you should break up the rule into its main
components. Try to only have one major processing step per rule.
- - *Context-based (many) Makefiles*: This pipeline is designed to allow
- the easy inclusion of many Makefiles (in `reproduce/src/make/*.mk`)
- for maximal modularity. So keep the rules for closely related parts
- of the processing in separate Makefiles.
+ - *Context-based (many) Makefiles*: This design allows easy inclusion of
+ many Makefiles (in `reproduce/src/make/*.mk`) for maximal
+ modularity. So keep the rules for closely related parts of the
+ processing in separate Makefiles.
- *Descriptive names*: Be very clear and descriptive with the naming of
the files and the variables because a few months after the
processing, it will be very hard to remember what each one was
for. Also this helps others (your collaborators or other people
- reading the pipeline after it is published) to more easily understand
- your work and find their way around.
+ reading the project source after it is published) to more easily
+ understand your work and find their way around.
- *Naming convention*: As the project grows, following a single standard
or convention in naming the files is very useful. Try best to use
@@ -773,7 +767,7 @@ us. In this way, we can add it here for the benefit of others.
doing something, how you are doing it, and what you expect the result
to be. Write the comments as if it was what you would say to describe
the variable, recipe or rule to a friend sitting beside you. When
- writing the pipeline it is very tempting to just steam ahead with
+ writing the project it is very tempting to just steam ahead with
commands and codes, but be patient and write comments before the
rules or recipes. This will also allow you to think more about what
you should be doing. Also, in several months when you come back to
@@ -825,8 +819,8 @@ us. In this way, we can add it here for the benefit of others.
multiple copies of them for intermediate steps is not possible), one
solution is the following strategy. Set a small plain text file as
the actual target and delete the large file when it is no longer
- needed by the pipeline (in the last rule that needs it). Below is a
- simple demonstration of doing this, where we use Gnuastro's
+ needed by the project (in the last rule that needs it). Below is a
+ simple demonstration of doing this. In it, we use Gnuastro's
Arithmetic program to add all pixels of the input image with 2 and
create `large1.fits`. We then subtract 2 from `large1.fits` to create
`large2.fits` and delete `large1.fits` in the same rule (when its no
@@ -846,35 +840,36 @@ us. In this way, we can add it here for the benefit of others.
to define a wrapper in `reproduce/src/make/initialize.mk`. This
wrapper will replace `$(subst .txt,,XXXXX)`. Therefore, it will be
possible to greatly simplify this repetitive statement and make the
- code even more readable throughout the whole pipeline.
-
-
- - **Dependencies**: It is critically important to exactly document, keep
- and check the versions of the programs you are using in the pipeline.
-
- - *Check versions*: In `reproduce/src/make/initialize.mk`, check the
- versions of the programs you are using.
-
- - *Keep the source tarball of dependencies*: keep a tarball of the
- necessary version of all your dependencies (and also a copy of the
- higher-level libraries they depend on). Software evolves very fast
- and only in a few years, a feature might be changed or removed from
- the mainstream version or the software server might go down. To be
- safe, keep a copy of the tarballs. Software tarballs are rarely over
- a few megabytes, very insignificant compared to the data. If you
- intend to release the pipeline in a place like Zenodo, then you can
- create your submission early (before public release) and upload/keep
- all the necessary tarballs (and data)
- there. [zenodo.1163746](https://doi.org/10.5281/zenodo.1163746) is
+ code even more readable throughout the whole project.
+
+
+ - **Software tarballs and raw inputs**: It is critically important to
+ document the raw inputs to your project (software tarballs and raw
+ input data):
+
+ - *Keep the source tarball of dependencies*: After configuration
+ finishes, the `.build/dependencies/tarballs` directory will contain
+ all the software tarballs that were necessary for your project. You
+ can mirror the contents of this directory to keep a backup of all the
+ software tarballs used in your project (possibly as another version
+ controlled repository) that is also published with your project. Note
+ that software webpages are not written in stone and can suddenly go
+ offline or not be accessible in some conditions. This backup is thus
+ very important. If you intend to release your project in a place like
+ Zenodo, you can upload/keep all the necessary tarballs (and data)
+ there with your
+ project. [zenodo.1163746](https://doi.org/10.5281/zenodo.1163746) is
one example of how the data, Gnuastro (main software used) and all
- major Gnuastro's dependencies have been uploaded with the pipeline.
+ major Gnuastro's dependencies have been uploaded with the project's
+ source. Just note that this is only possible for free and open-source
+ software.
- *Keep your input data*: The input data is also critical to the
- pipeline, so like the above for software, make sure you have a backup
- of them.
+ project's reproducibility, so like the above for software, make sure
+ you have a backup of them, or their persistent identifiers (PIDs).
- **Version control**: It is important (and extremely useful) to have the
- history of your pipeline under version control. So try to make commits
+ history of your project under version control. So try to make commits
regularly (after any meaningful change/step/result), while not
forgetting the following notes.
@@ -882,36 +877,38 @@ us. In this way, we can add it here for the benefit of others.
make a more human-friendly output of `git describe`: for example
`v1-4-gaafdb04` states that we are on commit `aafdb04` which is 4
commits after tag `v1`. The output of `git describe` is included in
- your final PDF as part of this pipeline. Also, if you use
+ your final PDF as part of this project. Also, if you use
reproducibility-friendly software like Gnuastro, this value will also
be included in all output files, see the description of `COMMIT` in
[Output
headers](https://www.gnu.org/software/gnuastro/manual/html_node/Output-headers.html).
- In the checklist above, we tagged the first commit of your pipeline
+ In the checklist above, you tagged the first commit of your project
with `v0`. Here is one suggestion on when to tag: when you have fully
- adopted the pipeline and have got the first (initial) results, you
+ adopted the template and have got the first (initial) results, you
can make a `v1` tag. Subsequently when you first start reporting the
- results to your colleagues, you can tag the commit as `v2`. Afterwards
- when you submit to a paper, it can be tagged `v3` and so on.
+ results to your colleagues, you can tag the commit as `v2` and
+ increment the version on every later circulation, or referee
+ submission.
- - *Pipeline outputs*: During your research, it is possible to checkout a
+ - *Project outputs*: During your research, it is possible to checkout a
specific commit and reproduce its results. However, the processing
can be time consuming. Therefore, it is useful to also keep track of
- the final outputs of your pipeline (at minimum, the paper's PDF) in
+ the final outputs of your project (at minimum, the paper's PDF) in
important points of history. However, keeping a snapshot of these
(most probably large volume) outputs in the main history of the
- pipeline can unreasonably bloat it. It is thus recommended to make a
- separate Git repo to keep those files and keep this pipeline's volume
- as small as possible. For example if your main pipeline is called
- `my-exciting-project`, the name of the outputs pipeline can be
+ project can unreasonably bloat it. It is thus recommended to make a
+ separate Git repo to keep those files and keep your project's source
+ as small as possible. For example if your project is called
+ `my-exciting-project`, the name of the outputs repository can be
`my-exciting-project-output`. This enables easy sharing of the output
files with your co-authors (with necessary permissions) and not
- having to bloat your email archive with extra attachments (you can
- just share the link to the online repo in your communications). After
- the research is published, you can also release the outputs pipeline,
- or you can just delete it if it is too large or un-necessary (it was
- just for convenience, and fully reproducible after all). This
- pipeline's output is available for demonstration in the separate
+ having to bloat your email archive with extra attachments also (you
+ can just share the link to the online repo in your
+ communications). After the research is published, you can also
+ release the outputs repository, or you can just delete it if it is
+ too large or un-necessary (it was just for convenience, and fully
+ reproducible after all). For example this template's output is
+ available for demonstration in the separate
[reproducible-paper-output](https://gitlab.com/makhlaghi/reproducible-paper-output)
repository.
@@ -934,15 +931,14 @@ future are listed below, please join us if you are interested.
Package management
------------------
-It is important to have control of the environment of the reproduction
-pipeline. The current reproducible paper template builds the higher-level
-programs (for example GNU Bash, GNU Make, GNU AWK and domain-specific
-software) it needs, then sets `PATH` so the analysis is done only with the
-pipeline's built software. But currently the configuration of each program
-is in the Makefile rules that build it. This is not good because a change
-in the build configuration does not automatically cause a re-build. Also,
-each separate project on a system needs to have its own built tools (that
-can waste a lot of space).
+It is important to have control of the environment of the project. The
+current template builds the higher-level programs (for example GNU Bash,
+GNU Make, GNU AWK and domain-specific software) it needs, then sets `PATH`
+so the analysis is done only with the project's built software. But
+currently the configuration of each program is in the Makefile rules that
+build it. This is not good because a change in the build configuration does
+not automatically cause a re-build. Also, each separate project on a system
+needs to have its own built tools (that can waste a lot of space).
A good solution is based on the [Nix package
manager](https://nixos.org/nix/about.html): a separate file is present for
@@ -961,9 +957,9 @@ webpage):
/nix/store/b6gvzjyb2pg0kjfwrjmg1vfhh54ad73z-firefox-33.1/
```
-The important thing is that the "store" is *not* in the pipeline's search
+The important thing is that the "store" is *not* in the project's search
path. After the complete installation of the software, symbolic links are
-made to populate the pipeline's program and library search paths without a
+made to populate each project's program and library search paths without a
hash. This hash will be unique to that particular software and its
particular configuration. So simply by searching for this hash in the
installed directory, we can find the installed files of that software to
@@ -985,8 +981,8 @@ Appendix: Necessity of exact reproduction in scientific research
In case [the link above](http://akhlaghi.org/reproducible-science.html) is
not accessible at the time of reading, here is a copy of the introduction
-of that link, describing the necessity for a reproduction pipeline like
-this (copied on February 7th, 2018):
+of that link, describing the necessity for a reproducible project like this
+(copied on February 7th, 2018):
The most important element of a "scientific" statement/result is the fact
that others should be able to falsify it. The Tsunami of data that has
@@ -1021,7 +1017,7 @@ order of operations: this is contrary to the scientific spirit.
Copyright information
---------------------
This file is part of the reproducible paper template
- https://gitlab.com/makhlaghi/reproducible-paper
+ http://savannah.nongnu.org/projects/reproduce
This template is free software: you can redistribute it and/or modify it
under the terms of the GNU General Public License as published by the Free
diff --git a/README.md b/README.md
index 46c7286..212e178 100644
--- a/README.md
+++ b/README.md
@@ -1,11 +1,11 @@
-Reproduction pipeline for paper XXXXXXX
-=======================================
+Reproducible source for paper XXXXXXX
+=====================================
Copyright (C) 2018-2019 Mohammad Akhlaghi <mohammad@akhlaghi.org>
See the end of the file for license conditions.
-This is the reproduction pipeline for the paper titled "**XXXXXX**", by
-XXXXXXXX et al. (**IN PREPARATION**). To learn more about the purpose,
+This is the reproducible project source for the paper titled "**XXXXXX**",
+by XXXXXXXX et al. (**IN PREPARATION**). To learn more about the purpose,
principles and technicalities of this reproducible paper, please see
`README-hacking.md`.
@@ -24,7 +24,7 @@ $ .local/bin/make -j8
```
For a general introduction to reproducible science as implemented in this
-pipeline, please see the [principles of reproducible
+project, please see the [principles of reproducible
science](http://akhlaghi.org/reproducible-science.html), and a
[reproducible paper
template](https://gitlab.com/makhlaghi/reproducible-paper) that is based on
@@ -34,24 +34,24 @@ it.
-Running the pipeline
+Building the project
--------------------
-This pipeline was designed to have as few dependencies as possible.
+This project was designed to have as few dependencies as possible.
1. Necessary dependencies:
1.1: Minimal software building tools like C compiler, Make, and other
tools found on any Unix-like operating system (GNU/Linux, BSD, Mac
OS, and others). All necessary dependencies will be built from
- source (for use only within this pipeline) by the `./configure'
+ source (for use only within this project) by the `./configure'
script (next step).
1.2: (OPTIONAL) Tarball of dependencies. If they are already present (in
a directory given at configuration time), they will be
used. Otherwise, a downloader (`wget` or `curl`) will be necessary
to download any necessary tarball. The necessary tarballs are also
- collected in the link below for easy download. [[TO PIPELINE
+ collected in the link below for easy download. [[TO PROJECT
DESIGNERS: it is STRONGLY RECOMMENDED to keep a backup of all the
necessary software tarballs you need for the project (possibly in
another Git repository). For example see [this template's
@@ -65,8 +65,8 @@ This pipeline was designed to have as few dependencies as possible.
recommended to set directories outside the current directory. Please
read the description of each necessary input clearly and set the best
value. Note that the configure script also downloads, builds and locally
- installs (only for this pipeline, no root privileges necessary) many
- programs (pipeline dependencies). So it may take a while to complete.
+ installs (only for this project, no root privileges necessary) many
+ programs (project dependencies). So it may take a while to complete.
```shell
$ ./configure
diff --git a/configure b/configure
index 19a5acd..8091b4e 100755
--- a/configure
+++ b/configure
@@ -1,6 +1,6 @@
#! /bin/bash
#
-# Necessary preparations/configurations for the reproduction pipeline.
+# Necessary preparations/configurations for the reproducible project.
#
# Copyright (C) 2018-2019 Mohammad Akhlaghi <mohammad@akhlaghi.org>
#
@@ -56,15 +56,15 @@ information printed before them). Alternatively, if you have already
configured this script for your system, you can use the '--existing-conf'
to use its values directly.
-RECOMMENDATION: If this is the first time you are running this pipeline,
+RECOMMENDATION: If this is the first time you are running this template,
please don't use the options and let the script explain each parameter in
full detail by simply running './configure'.
-The only mandatory value for this script is the local build directory. This
-is where all the pipeline's outputs will be stored. Optionally, you can
-also provide directories that host input data, or software source codes. If
-the necessary files don't exist there, the template will automatically
-download them.
+The only mandatory value is the local build directory. This is where all
+the (temporary) built files will be stored. Optionally, you can also
+provide directories that host input data, or software source codes. If the
+necessary files don't exist there, the template will automatically download
+them.
With the options below you can modify the default behavior. Just note that
you should not put an '=' sign between an option name and its value.
@@ -216,14 +216,13 @@ function create_file_with_notice() {
if echo "# IMPORTANT: file can be RE-WRITTEN after './configure'" > "$1"
then
echo "#" >> "$1"
- echo "# This file was created during the reproduction" >> "$1"
- echo "# pipeline's configuration ('./configure'). Therefore," >> "$1"
- echo "# it is not under version control and any manual " >> "$1"
- echo "# changes to it will be over-written if the pipeline " >> "$1"
- echo "# is re-configured." >> "$1"
+ echo "# This file was created during configuration" >> "$1"
+ echo "# ('./configure'). Therefore, it is not under version" >> "$1"
+ echo "# control and any manual changes to it will be" >> "$1"
+ echo "# over-written if the project is re-configured." >> "$1"
echo "#" >> "$1"
else
- echo; echo "Can't write to "$1""; echo;
+ echo; echo "Can't write to $1"; echo;
exit 1
fi
}
@@ -256,15 +255,13 @@ function absolute_dir() {
# on and is prepared on what will happen next.
cat <<EOF
------------------------------------------
-Reproduction pipeline local configuration
------------------------------------------
+-----------------------------
+Project's local configuration
+-----------------------------
Local configuration includes things like top-level directories, or
-processing steps.
-
-It is STRONGLY recommended to read the comments, and set the best values
-for your system (where necessary).
+processing steps. It is STRONGLY recommended to read the comments, and set
+the best values for your system (where necessary).
EOF
@@ -275,7 +272,7 @@ EOF
# What to do with possibly existing configuration file
# ----------------------------------------------------
#
-# `LOCAL.mk' is the top-most local configuration for the pipeline. If it
+# `LOCAL.mk' is the top-most local configuration for the project. If it
# already exists when this script is run, we'll make a copy of it as backup
# (for example the user might have run `./configure' by mistake).
printnotice=yes
@@ -309,7 +306,7 @@ if [ $rewritepconfig = no ]; then
status="configured for '$oldgroupname' group"
confcommand="./for-group $oldgroupname configure"
fi
- echo "Previous pipeline was $status!"
+ echo "Project was previously $status!"
echo "Either enable re-write of this configuration file,"
echo "or re-run this configuration like this:"
echo
@@ -353,7 +350,7 @@ if [ $rewritepconfig = yes ]; then
Couldn't find GNU Wget, or cURL on this system. These programs are used for
downloading necessary programs and data if they aren't already present (in
directories that you can specify with this configure script). Therefore if
-the necessary files are not present, the pipeline will crash.
+the necessary files are not present, the project will crash.
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
@@ -444,7 +441,7 @@ if [ $rewritepconfig = yes ] && [ x"$input_dir" = x ]; then
(OPTIONAL) Input dataset directory
----------------------------------
-This pipeline needs the dataset(s) listed below. If you already have them,
+This project needs the dataset(s) listed below. If you already have them,
please specify the directory hosting them on this system. If you don't,
they will be downloaded automatically. Each file is shown with its total
volume and its 128-bit MD5 checksum in parenthesis.
@@ -491,14 +488,14 @@ if [ $rewritepconfig = yes ] && [ x"$software_dir" = x ]; then
(OPTIONAL) Software tarball directory
---------------------------------------
-To ensure an identical build environment, the pipeline will use its own
+To ensure an identical build environment, the project will use its own
build of the programs it needs. Therefore the tarball of the relevant
-programs are necessary for this pipeline. If a tarball isn't present in the
-specified directory, *IT WILL BE DOWNLOADED* by the pipeline.
+programs are necessary. If a tarball isn't present in the specified
+directory, *IT WILL BE DOWNLOADED* automatically.
-Therefore, if you don't specify any directory here, or it doesn't contain
-the tarball of a dependency, it is necessary to have an internet
-connection. The pipeline will download the tarballs it needs automatically.
+If you don't specify any directory here, or it doesn't contain the tarball
+of a dependency, it is necessary to have an internet connection. The
+project will download the tarballs it needs automatically.
EOF
read -p"(OPTIONAL) Directory of dependency tarballs ($ddir): " tmpddir
@@ -515,7 +512,7 @@ fi
# Write the parameters into the local configuration file.
if [ $rewritepconfig = yes ]; then
- # Make the pipeline configuration's initial comments.
+ # Add commented notice.
create_file_with_notice $pconf
# Write the values.
@@ -601,16 +598,20 @@ if [ $rewritegconfig = yes ]; then
else
ingversion=$(awk '$1=="onlyversion" {print $NF}' $glconf)
if [ x$ingversion != x$gversion ]; then
- echo "______________________________________________________"
- echo "!!!!!!!!!!!!!!!!!!CONFIGURATION ERROR!!!!!!!!!!!!!!!!!"
- echo
- echo "Gnuastro's version in '$glconf' ($ingversion) doesn't match the tarball version that this pipeline was designed to use in '$depverfile' ($gversion). Please re-run after removing the former file:"
- echo
- echo " $ rm $glconf"
- echo " $ ./configure"
- echo
- echo "!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!"
- echo
+ cat <<EOF
+______________________________________________________
+!!!!!!!!!!!!!!!!!!CONFIGURATION ERROR!!!!!!!!!!!!!!!!!
+
+Gnuastro's version in '$glconf' ($ingversion) doesn't match the tarball
+version that this project was designed to use in '$depverfile'
+($gversion). Please re-run after removing the former file:
+
+ $ rm $glconf
+ $ ./configure
+
+!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
+
+EOF
exit 1
fi
fi
@@ -786,8 +787,8 @@ Building dependencies ...
Necessary dependency programs and libraries will be built in $tsec sec.
NOTE: the built software will NOT BE INSTALLED on your system (no root
-access is required). They are only for local usage by this reproduction
-pipeline. They will be installed in:
+access is required). They are only for local usage by this project. They
+will be installed in:
$depdir/installed
@@ -823,10 +824,10 @@ fi
# Build `flock' as first program
# ------------------------------
#
-# Flock (or file-lock) is a unique program in the pipeline that is
-# necessary to serialize the (generally parallel) processing of make when
-# necessary. GNU/Linux machines have it as part of their `util-linux'
-# programs. But to be consistent, we will be using our own build.
+# Flock (or file-lock) is a unique program that is necessary to serialize
+# the (generally parallel) processing of make when necessary. GNU/Linux
+# machines have it as part of their `util-linux' programs. But to be
+# consistent in non-GNU/Linux systems, we will be using our own build.
#
# The reason it's special is that we need it to serialize the download
# process of the dependency tarballs.
@@ -903,7 +904,7 @@ if [ $host_cc = 0 ]; then
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
This system's C compiler (called with 'gcc') can't include
-'sys/cdefs.h. Because of this, this pipeline can't build its custom GCC to
+'sys/cdefs.h'. Because of this, the project can't build its custom GCC to
ensure better reproducibility. We strongly recommend installing the proper
package (for your operating system) that installs this necessary file. For
example on some Debian-based GNU/Linux distros, you need these two
@@ -911,8 +912,8 @@ packages: 'gcc-multilib' and 'g++-multilib'.
However, since GCC is pretty low-level, this configuration script will
continue in 5 seconds and use your system's C compiler (it won't build a
-custom GCC). But please consider installing the necessary package to
-complete your C compiler, then re-run the pipeline.
+custom GCC). But please consider installing the necessary package(s) to
+complete your C compiler, then re-run './configure'.
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
EOF
@@ -980,10 +981,10 @@ numthreads=$($instdir/bin/nproc)
#
# TeX Live is managed over the internet, so if there isn't any, or it
# suddenly gets cut, it can't be built. However, when TeX Live isn't
-# installed, the pipeline and can do all its processing independent of
-# it. It will just stop at the stage when all the processing is complete
-# and it is only necessary to build the PDF. So we don't want to stop the
-# pipeline if its not present.
+# installed, the project can do all its processing independent of it. It
+# will just stop at the stage when all the processing is complete and it is
+# only necessary to build the PDF. So we don't want to stop the project's
+# configuration and building if it's not present.
texlive_result=$(cat $itidir/texlive-ready-tlmgr)
if [ x"$texlive_result" = x"NOT!" ]; then
cat <<EOF
@@ -1001,9 +1002,9 @@ Therefore, if you don't need the final PDF, and just want to do the
analysis, you can safely ignore this warning and continue.
If you later have internet access and would like to add TeX live to your
-pipeline, please delete the respective files, then re-run configure as
-shown below. Within configure, answer 'n' (for "no") when asked to re-write
-the configuration files.
+project, please delete the respective files, then re-run configure as shown
+below. Within configure, answer 'n' (for "no") when asked to re-write the
+configuration files.
rm .local/version-info/tex/texlive-ready-tlmgr
./configure
@@ -1125,7 +1126,7 @@ fi
cat <<EOF
----------------
-The reproduction pipeline and its environment are configured with no errors.
+The project and its environment are configured with no errors.
Please run the following command to start.
(Replace '8' with the number of CPU threads)
diff --git a/for-group b/for-group
index 835266d..6c82910 100755
--- a/for-group
+++ b/for-group
@@ -6,12 +6,13 @@
# $ ./for-group group_name make [OPTIONS]
#
# This is a wrapper for the configure and Make steps designed for a group
-# of users (sharing the same group name) using this pipeline on the same
+# of users (sharing the same group name) working on the project in the same
# build directory.
#
# When the configuration (normally done with `./configure') and build
# (normally done with `.local/bin/make') steps are done with this script,
-# all the files that are created within the pipeline have these properties:
+# all the files that are created within the project will have these
+# properties:
#
# 1) Group owner will be the group specified in the command-line.
# 2) The permission flags give write access to the group members.
diff --git a/paper.tex b/paper.tex
index 306c81c..6939284 100644
--- a/paper.tex
+++ b/paper.tex
@@ -37,27 +37,28 @@
%% Start writing.
\begin{document}
-%% Abstract, keywords and reproduction pipeline notice.
+%% Project abstract and keywords.
\includeabstract{
- You have completed the reproduction pipeline and are ready to configure
- and implement it for your own research. This template reproduction
- pipeline and document contains almost all the elements that you will need
- in a research project containing the downloading of raw data, processing
- it, including them in plots and report, including this abstract, figures
- and bibliography. If you use this pipeline in your work, don't forget to
- add a notice to clearly let the readers know that your work is
- reproducible. If this pipeline proves useful in your research, please
- cite \citet{gnuastro}.
+ You have completed the reproducible paper template and are ready to
+ configure and implement it for your own research. This template contains
+ almost all the elements that you will need in a research project
+ containing the downloading of raw data and necessary software, building
+ the software, and processing the data with the software in a
+ highly-controlled environment. It then allows including the results in
+ plots and producing the final report, including this abstract, figures
+ and bibliography. If you design your project with this template's
+ infrastructure in your work, don't forget to add a notice and clearly
+ let the readers know that your work is reproducible. If this template
+ proves useful in your research, please cite \citet{gnuastro}.
\vspace{0.25cm}
\textsl{Keywords}: Add some keywords for your research here.
\textsl{Reproducible paper}: All quantitative results (numbers and plots)
- in this paper are exactly reproducible with reproduction pipeline
- \pipelineversion{}
- (\url{https://gitlab.com/makhlaghi/reproducible-paper}).}
+ in this paper are exactly reproducible (version \pipelineversion{},
+ \url{https://gitlab.com/makhlaghi/reproducible-paper}).}
%% To add the first page's headers.
\thispagestyle{firststyle}
@@ -68,19 +69,18 @@
%% Start of the main body of text.
\section{Congratulations!}
-Congratulations on running the reproduction pipeline! You can now follow
-the checklist in the \texttt{README.md} file to customize this pipeline to
-your exciting research project.
+Congratulations on running the raw template project! You can now follow the
+checklist in the \texttt{README.md} file to customize this template to your
+exciting research project.
Just don't forget to \emph{never} use numbers or fixed strings (for example
database urls like \url{\wfpctwourl}) directly within your \LaTeX{}
-source. Read them directly from your configuration files or outputs of the
-programs as part of the reproduction pipeline and import them into \LaTeX{}
-as macros through the \texttt{tex/pipeline/macros/pipeline.tex} file
-(created after running the pipeline). See the several examples within the
-pipeline for a demonstration. For some recent real-world examples, the
-reproduction pipelines for Sections 4 and 7.3 of \citet{bacon17} are
-available at
+source. Read them directly from your configuration files, or processing
+outputs, and import them into \LaTeX{} as macros through the
+\texttt{tex/pipeline/macros/pipeline.tex} file (created after running the
+project). See the several existing examples within the template for a
+demonstration. For some recent real-world examples, the reproducible
+project sources for Sections 4 and 7.3 of \citet{bacon17} are available at
\href{https://doi.org/10.5281/zenodo.1164774}{zenodo.1164774}\footnote{\url{https://gitlab.com/makhlaghi/muse-udf-origin-only-hst-magnitudes}},
or
\href{https://doi.org/10.5281/zenodo.1163746}{zenodo.1163746}\footnote{\url{https://gitlab.com/makhlaghi/muse-udf-photometry-astrometry}}. Working
@@ -126,15 +126,15 @@ histogram and basic statstics were generated with Gnuastro's
{\small PGFP}lots\footnote{\url{https://ctan.org/pkg/pgfplots}} is a great
tool to build the plots within \LaTeX{} and removes the necessity to add
-further dependencies (to create the plots) to your reproduction
-pipeline. There are high-level language libraries like Matplotlib which
-also generate plots. However, the problem is that they require many
-dependencies (Python, Numpy and etc). Installing these dependencies from
-source, is not easy and will harm the reproducibility of your paper. Note
-that after several years, the binary files of these high-level libraries,
-that you easily install today, will no longer be available in common
-repositories. Therefore building the libraries from source is the only
-option to reproduce your results.
+further dependencies (to create the plots) to your project. There are
+high-level language libraries like Matplotlib which also generate
+plots. However, the problem is that they require many dependencies (Python,
+NumPy, etc.). Installing these dependencies from source is not easy and
+will harm the reproducibility of your paper. Note that after several years,
+the binary files of these high-level libraries, that you easily install
+today, will no longer be available in common repositories. Therefore
+building the libraries from source is the only option to reproduce your
+results.
Furthermore, since {\small PGFP}lots is built by \LaTeX{} it respects all
the properties of your text (for example line width and fonts and
@@ -142,7 +142,7 @@ etc). Therefore the final plot blends in your paper much more nicely. It
also has a wonderful
manual\footnote{\url{http://mirrors.ctan.org/graphics/pgf/contrib/pgfplots/doc/pgfplots.pdf}}.
-This pipeline also defines two \LaTeX{} macros that allow you to mark text
+This template also defines two \LaTeX{} macros that allow you to mark text
within your document as \emph{new} and \emph{notes}. For example, \new{this
text has been marked as \texttt{new}.} \tonote{While this one is marked
as \texttt{tonote}.} If you comment the line (by adding a `\texttt{\%}'
@@ -173,13 +173,12 @@ please add a notice close to the start of your paper or in the end of the
abstract clearly mentioning that your work is fully reproducible.
For the time being, we haven't written a specific paper only for this
-reproduction pipeline, so until then, we would be grateful if you could
-cite the first paper that used the first version of this pipeline:
-\citet{gnuastro}.
+template. Until then, we would be grateful if you could cite the first
+paper that used the early versions of this template: \citet{gnuastro}.
After publication, don't forget to upload all the necessary data, software
-source code and the reproduction pipeline to a long-lasting host like
-Zenodo (\url{https://zenodo.org}).
+source code and the project's source to a long-lasting host like Zenodo
+(\url{https://zenodo.org}).
@@ -187,21 +186,22 @@ Zenodo (\url{https://zenodo.org}).
\section{Acknowledgements}
\new{Please include the following two paragraphs in the Acknowledgement
- section of your paper. This reproduction pipeline was developed in
+ section of your paper. This reproducible paper template was developed in
parallel with Gnuastro, so it benefited from the same grants. If you
- don't use any of these packages in the final/customized pipeline, please
- remove them. }
+ don't use Gnuastro in your final/customized project, please remove it
+ from the paragraph below, only mentioning the reproducible paper
+ template.}
This research was partly done using GNU Astronomy Utilities (Gnuastro,
-ascl.net/1801.009), and reproduction pipeline \pipelineversion. Work on
-Gnuastro and the reproduction pipeline has been funded by the Japanese
-Ministry of Education, Culture, Sports, Science, and Technology (MEXT)
-scholarship and its Grant-in-Aid for Scientific Research (21244012,
-24253003), the European Research Council (ERC) advanced grant
-339659-MUSICOS, European Union’s Horizon 2020 research and innovation
-programme under Marie Sklodowska-Curie grant agreement No 721463 to the
-SUNDIAL ITN, and from the Spanish Ministry of Economy and Competitiveness
-(MINECO) under grant number AYA2016-76219-P.
+ascl.net/1801.009), and the reproducible paper template
+\pipelineversion. Work on Gnuastro and the reproducible paper template has
+been funded by the Japanese Ministry of Education, Culture, Sports,
+Science, and Technology (MEXT) scholarship and its Grant-in-Aid for
+Scientific Research (21244012, 24253003), the European Research Council
+(ERC) advanced grant 339659-MUSICOS, European Union’s Horizon 2020 research
+and innovation programme under Marie Sklodowska-Curie grant agreement No
+721463 to the SUNDIAL ITN, and from the Spanish Ministry of Economy and
+Competitiveness (MINECO) under grant number AYA2016-76219-P.
\input{tex/pipeline/macros/dependencies.tex}
diff --git a/reproduce/config/gnuastro/gnuastro.conf b/reproduce/config/gnuastro/gnuastro.conf
index fbdfa37..57fcadc 100644
--- a/reproduce/config/gnuastro/gnuastro.conf
+++ b/reproduce/config/gnuastro/gnuastro.conf
@@ -1,14 +1,13 @@
# Default values for the common options to all the programs in GNU
# Astronomy Utilities.
#
-# IMPORTANT NOTE FOR THE REPRODUCTION PIPELINE: The `lastconfig'
-# option is very important here, because we don't want any of
-# Gnuastro's programs to go into an un-controlled environment (user or
-# system wide configuration files).
+# IMPORTANT NOTE: The `lastconfig' option is very important in a
+# reproducible environment, because we don't want any of Gnuastro's
+# programs to go into an un-controlled environment (user or system wide
+# configuration files).
#
-# The rest of this configuration file in this template reproduction
-# pipeline is taken from the default Gnuastro configuration from its
-# source (`bin/gnuastro.conf').
+# The rest of this configuration file is taken from the default Gnuastro
+# configuration from its source (`bin/gnuastro.conf').
#
# Copyright (C) 2018-2019 Mohammad Akhlaghi <mohammad@akhlaghi.org>
#
@@ -17,7 +16,7 @@
# this notice are preserved. This file is offered as-is, without any
# warranty.
-# Reproduction pipeline (`config' has to be before `lastconfig').
+# Local project settings (`config' has to be before `lastconfig').
config .gnuastro/gnuastro-local.conf
lastconfig 1
diff --git a/reproduce/config/pipeline/INPUTS.mk b/reproduce/config/pipeline/INPUTS.mk
index dbcb5fe..eb38295 100644
--- a/reproduce/config/pipeline/INPUTS.mk
+++ b/reproduce/config/pipeline/INPUTS.mk
@@ -1,4 +1,4 @@
-# Input files necessary for this pipeline.
+# Input files necessary for this project.
#
# This file is read by the configure script and running Makefiles.
#
diff --git a/reproduce/config/pipeline/LOCAL.mk.in b/reproduce/config/pipeline/LOCAL.mk.in
index 7de88d3..785bb6a 100644
--- a/reproduce/config/pipeline/LOCAL.mk.in
+++ b/reproduce/config/pipeline/LOCAL.mk.in
@@ -1,4 +1,4 @@
-# Local pipeline configuration.
+# Local project configuration.
#
# This is just a template for the `./configure' script to fill in. Please
# don't make any change to this file.
diff --git a/reproduce/config/pipeline/dependency-numpy-scipy.cfg b/reproduce/config/pipeline/dependency-numpy-scipy.cfg
index 7590427..4b7a7b0 100644
--- a/reproduce/config/pipeline/dependency-numpy-scipy.cfg
+++ b/reproduce/config/pipeline/dependency-numpy-scipy.cfg
@@ -1,4 +1,4 @@
-# THIS IS A COPY OF NUMPY'S site.cfg.example, CUSTOMIZED FOR THIS PIPELINE
+# THIS IS A COPY OF NUMPY'S site.cfg.example, CUSTOMIZED FOR THIS TEMPLATE
# ------------------------------------------------------------------------
# This file provides configuration information about non-Python
diff --git a/reproduce/config/pipeline/pdf-build.mk b/reproduce/config/pipeline/pdf-build.mk
index 02af72d..3a86ff3 100644
--- a/reproduce/config/pipeline/pdf-build.mk
+++ b/reproduce/config/pipeline/pdf-build.mk
@@ -1,9 +1,9 @@
# Make the final PDF?
# -------------------
#
-# During the testing a pipeline, it is usually not necessary to build the
-# PDF file (which makes a lot of output lines on the command-line and can
-# make it hard to find the commands and possible errors (and their
+# During the project's early phases, it is usually not necessary to build
+# the PDF file (which makes a lot of output lines on the command-line and
+# can make it hard to find the commands and possible errors (and their
# outputs). Also, in some cases, only the produced results may be of
# interest and not the final PDF, so LaTeX (and its necessary packages) may
# not be installed.
diff --git a/reproduce/config/pipeline/texlive.conf b/reproduce/config/pipeline/texlive.conf
index 8a9fb8e..53054e1 100644
--- a/reproduce/config/pipeline/texlive.conf
+++ b/reproduce/config/pipeline/texlive.conf
@@ -1,7 +1,6 @@
# Basic profile for build. Values to set:
#
# installdir: Install directory
-# topdir: Top pipeline directory
#
# Copyright (C) 2018-2019 Mohammad Akhlaghi <mohammad@akhlaghi.org>
#
@@ -11,11 +10,11 @@
# warranty.
selected_scheme scheme-basic
TEXDIR @installdir@/texlive/2018
-TEXMFCONFIG @topdir@/.texlive2018/texmf-config
+TEXMFCONFIG @installdir@/texlive2018/texmf-config
TEXMFLOCAL @installdir@/texlive/texmf-local
TEXMFSYSCONFIG @installdir@/texlive/2018/texmf-config
TEXMFSYSVAR @installdir@/texlive/2018/texmf-var
-TEXMFVAR @topdir@/.texlive2018/texmf-var
+TEXMFVAR @installdir@/texlive2018/texmf-var
instopt_adjustpath 0
instopt_adjustrepo 1
instopt_letter 0
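
The `@installdir@` placeholders in the profile above are filled in before the TeX Live installer reads the file (the `sed` call in `dependencies.mk`, changed in this same commit, does this). A minimal shell sketch of that substitution follows. The function name `fill_profile` and the directory `/opt/build/installed` are assumptions made for illustration and are not part of the project.

```shell
#!/bin/sh
# Hypothetical helper: replace every @installdir@ placeholder read on
# standard input with the directory given as the first argument.
fill_profile() {
    # `[@]' is a bracket expression that matches a literal `@', written
    # the same way as in the project's own sed call.
    sed -e 's|@installdir[@]|'"$1"'|g'
}

# Two lines excerpted from the profile; the install directory is an
# assumed value used only for this demonstration.
profile_out=$(printf '%s\n' \
    'TEXDIR @installdir@/texlive/2018' \
    'TEXMFVAR @installdir@/texlive2018/texmf-var' \
    | fill_profile /opt/build/installed)

printf '%s\n' "$profile_out"
```

Because `|` is the `sed` delimiter here, this sketch assumes the directory name contains neither `|` nor `&`.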
diff --git a/reproduce/src/bash/download-multi-try b/reproduce/src/bash/download-multi-try
index 2399b5d..1fd7497 100755
--- a/reproduce/src/bash/download-multi-try
+++ b/reproduce/src/bash/download-multi-try
@@ -1,4 +1,4 @@
-# Attempt downloading multiple times before crashing whole pipeline. From
+# Attempt downloading multiple times before crashing whole project. From
# the top project directory (for the shebang above), this script must be
# run like this:
#
@@ -10,13 +10,13 @@
#
# Due to temporary network problems, a download may fail suddenly, but
# succeed in a second try a few seconds later. Without this script that
-# temporary glitch in the network will permanently crash the pipeline and
+# temporary glitch in the network will permanently crash the project and
# it can't continue. The job of this script is to be patient and try the
-# download multiple times before crashing the whole pipeline.
+# download multiple times before crashing the whole project.
#
# LOCK FILE: Since there is ultimately only one network port to the outside
# world, downloading is done much faster in serial, not in parallel. But
-# the pipeline's processing may be done in parallel (with multiple threads
+# the project's processing may be done in parallel (with multiple threads
# needing to download different files at the same time). Therefore, this
# script uses the `flock' program to only do one download at a time. To
# benefit from it, any call to this script must be given the same lock
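
The lock-file idea described in the comment above can be sketched in a few lines of shell. This is an illustration only, not the project's `download-multi-try` script: the wrapper name `run_serial` is hypothetical, the "downloads" are simulated with `sleep`, and the `flock` program from util-linux is assumed to be installed.

```shell
#!/bin/sh
# Hypothetical wrapper: run a command while holding an exclusive lock on
# the given lock file, so concurrent callers run one at a time.
run_serial() {
    lockfile=$1; shift
    flock "$lockfile" "$@"
}

lock=$(mktemp)
log=$(mktemp)

# Two simulated downloads started in parallel. Each one appends a begin
# line and an end line to the log while it holds the lock.
for name in first second; do
    run_serial "$lock" sh -c \
        'echo "begin $1" >> "$2"; sleep 1; echo "end $1" >> "$2"' \
        sh "$name" "$log" &
done
wait

cat "$log"
rm -f "$lock" "$log"
```

Since both jobs are given the same lock file, the begin and end lines of one job are never interleaved with those of the other, whichever job happens to start first.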
diff --git a/reproduce/src/bash/git-post-checkout b/reproduce/src/bash/git-post-checkout
index ef85c44..9552f01 100644
--- a/reproduce/src/bash/git-post-checkout
+++ b/reproduce/src/bash/git-post-checkout
@@ -7,7 +7,7 @@
# Copyright (C) 2018-2019 Mohammad Akhlaghi <mohammad@akhlaghi.org>
#
# This script is taken from the `examples/hooks/pre-commit' file of the
-# `metastore' package (installed within the pipeline, with an MIT license
+# `metastore' package (installed within the project, with an MIT license
# for copyright). We have just changed the name of the `MSFILE' and also
# set special characters for the installation location of meta-store so our
# own installation is found by Git.
diff --git a/reproduce/src/bash/git-pre-commit b/reproduce/src/bash/git-pre-commit
index 09abce7..dbe0ecc 100644
--- a/reproduce/src/bash/git-pre-commit
+++ b/reproduce/src/bash/git-pre-commit
@@ -18,7 +18,7 @@
# git checkout HEAD -- .metadata
#
# This script is taken from the `examples/hooks/pre-commit' file of the
-# `metastore' package (installed within the pipeline, with an MIT license
+# `metastore' package (installed within the project, with an MIT license
# for copyright). Here, the name of the `MSFILE' and also set special
# characters for the installation location of meta-store so our own
# installation is found by Git.
diff --git a/reproduce/src/make/dependencies-basic.mk b/reproduce/src/make/dependencies-basic.mk
index e3a5ab3..b56d01d 100644
--- a/reproduce/src/make/dependencies-basic.mk
+++ b/reproduce/src/make/dependencies-basic.mk
@@ -1,5 +1,5 @@
-# Build the VERY BASIC reproduction pipeline dependencies before everything
-# else using minimum Make and Shell.
+# Build the VERY BASIC project dependencies before everything else assuming
+# minimal/generic Make and Shell.
#
# ------------------------------------------------------------------------
# !!!!! IMPORTANT NOTES !!!!!
@@ -52,8 +52,8 @@ ilidir = $(BDIR)/dependencies/installed/version-info/lib
# won't be building ourselves.
syspath := $(PATH)
-# As we build more programs, we want to use our own pipeline's built
-# programs and libraries, not the host's.
+# As we build more programs, we want to use this project's built programs
+# and libraries, not the host's.
export CCACHE_DISABLE := 1
export PATH := $(ibdir):$(PATH)
export PKG_CONFIG_PATH := $(ildir)/pkgconfig
@@ -217,7 +217,7 @@ makelink = origpath="$$PATH"; \
if [ x$$a = x ]; then \
if [ "x$(strip $(2))" = xmandatory ]; then \
echo "'$(1)' is necessary for higher-level tools."; \
- echo "Please install it for the pipeline to continue."; \
+ echo "Please install it for the configuration to continue."; \
exit 1; \
fi; \
else \
@@ -231,7 +231,7 @@ $(ibidir)/low-level-links: | $(ibdir) $(ildir)
$(call makelink,as)
# Compiler (Cmake needs the clang compiler which we aren't building
- # yet in the pipeline).
+ # yet in the project).
$(call makelink,clang)
$(call makelink,clang++)
@@ -351,7 +351,7 @@ $(ibidir)/tar: $(tdir)/tar-$(tar-version).tar.gz \
$(ibidir)/lzip \
$(ibidir)/gzip \
$(ibidir)/xz
- # Since all later programs depend on Tar, the pipeline will be
+ # Since all later programs depend on Tar, the configuration will be
# stuck here, only making Tar. So it is more efficient to build it on
# multiple threads (when the user's Make doesn't pass down the
# number of threads).
@@ -394,8 +394,8 @@ $(ilidir)/ncurses: $(tdir)/ncurses-$(ncurses-version).tar.gz \
# Delete the (possibly existing) low-level programs that depend on
# `readline', and thus `ncurses'. Since these programs are actually
# used during the building of `ncurses', we need to delete them so
- # the build process doesn't use the pipeline's Bash and AWK, but
- # the host systems.
+ # the build process doesn't use the project's Bash and AWK, but the
+ # host's.
rm -f $(ibdir)/bash* $(ibdir)/awk* $(ibdir)/gawk*
# Standard build process.
@@ -489,8 +489,8 @@ $(ibidir)/patchelf: $(tdir)/patchelf-$(patchelf-version).tar.gz \
# of Readline, that we build below as a prerequisite or AWK, is used) and
# you run `ldd $(ibdir)/bash' on the resulting binary, it will say that it
# is linking with the system's `readline'. But if you run that same command
-# within a rule in this reproduction pipeline, you'll see that it is indeed
-# linking with our own built readline.
+# within a rule in this project, you'll see that it is indeed linking with
+# our own built readline.
ifeq ($(on_mac_os),yes)
needpatchelf =
else
@@ -570,7 +570,7 @@ $(ilidir)/zlib: $(tdir)/zlib-$(zlib-version).tar.gz \
# build libssl (and libcrypto) dynamically also.
#
# Until we find a nice and generic way to create an updated CA file in the
-# pipeline, the certificates will be available in a file for this pipeline
+# project, the certificates will be available in a file for this pipeline
# along with the other tarballs.
#
# In case you do want a static OpenSSL and libcrypto, then uncomment the
@@ -621,7 +621,7 @@ $(ilidir)/openssl: $(tdir)/openssl-$(openssl-version).tar.gz \
# gives a segmentation fault when built statically.
#
# There are many network related libraries that we are currently not
-# building as part of this pipeline. So to avoid too much dependency on the
+# building as part of this project. So to avoid too much dependency on the
# host system (especially a crash when these libraries are updated on the
# host), they are disabled here.
$(ibidir)/wget: $(tdir)/wget-$(wget-version).tar.lz \
@@ -795,7 +795,7 @@ $(ilidir)/mpc: $(tdir)/mpc-$(mpc-version).tar.gz \
# Objective C and Objective C++ is necessary for installing `matplotlib'.
#
# We are currently having problems installing GCC on macOS, so for the time
-# being, if the pipeline is being run on a macOS, we'll just set a link.
+# being, if the project is being run on a macOS, we'll just set a link.
ifeq ($(host_cc),1)
gcc-prerequisites =
else
@@ -816,7 +816,7 @@ $(ibidir)/gcc: $(gcc-prerequisites) \
$(ibidir)/findutils
# GCC builds its own libraries in '$(idir)/lib64'. But all other
- # libraries are in '$(idir)/lib'. Since this pipeline is only for a
+ # libraries are in '$(idir)/lib'. Since this project is only for a
# single architecture, we can trick GCC into building its libraries
# in '$(idir)/lib' by defining the '$(idir)/lib64' as a symbolic
# link to '$(idir)/lib'.
diff --git a/reproduce/src/make/dependencies-build-rules.mk b/reproduce/src/make/dependencies-build-rules.mk
index 2523f6a..a8c8731 100644
--- a/reproduce/src/make/dependencies-build-rules.mk
+++ b/reproduce/src/make/dependencies-build-rules.mk
@@ -110,8 +110,8 @@ cbuild = if [ x$(static_build) = xyes ] && [ $(3)x = staticx ]; then \
opts="-DBUILD_SHARED_LIBS=OFF"; \
fi; \
cd $(ddir) && rm -rf $(2) && tar xf $(1) && cd $(2) && \
- rm -rf pipeline-build && mkdir pipeline-build && \
- cd pipeline-build && \
+ rm -rf project-build && mkdir project-build && \
+ cd project-build && \
cmake .. -DCMAKE_LIBRARY_PATH=$(ildir) \
-DCMAKE_INSTALL_PREFIX=$(idir) \
-DCMAKE_VERBOSE_MAKEFILE:BOOL=ON $$opts $(4) && \
diff --git a/reproduce/src/make/dependencies-python.mk b/reproduce/src/make/dependencies-python.mk
index ce1cd38..837b0ad 100644
--- a/reproduce/src/make/dependencies-python.mk
+++ b/reproduce/src/make/dependencies-python.mk
@@ -1,4 +1,4 @@
-# Build the reproduction pipeline Python dependencies.
+# Build the project's Python dependencies.
#
# ------------------------------------------------------------------------
# !!!!! IMPORTANT NOTES !!!!!
diff --git a/reproduce/src/make/dependencies.mk b/reproduce/src/make/dependencies.mk
index 72cb7c4..fd9bffa 100644
--- a/reproduce/src/make/dependencies.mk
+++ b/reproduce/src/make/dependencies.mk
@@ -1,4 +1,4 @@
-# Build the reproduction pipeline dependencies (programs and libraries).
+# Build the project's dependencies (programs and libraries).
#
# ------------------------------------------------------------------------
# !!!!! IMPORTANT NOTES !!!!!
@@ -46,7 +46,7 @@ ipydir = $(BDIR)/dependencies/installed/version-info/python
# Define the top-level programs to build (installed in `.local/bin').
#
-# About ATLAS: currently the core pipeline does not depend on ATLAS but many
+# About ATLAS: currently the template does not depend on ATLAS but many
# high level software depend on it. The current rule for ATLAS is tested
# successfully on Mac (only static) and GNU/Linux (shared and static). But,
# since it takes a few hours to build, it is not currently a target.
@@ -486,12 +486,12 @@ $(ibidir)/cmake: $(tdir)/cmake-$(cmake-version).tar.gz \
# cURL (and its library, which is needed by several programs here) can
# optionally link with many different network-related libraries on the host
-# system that we are not yet building in the pipeline. Many of these are
+# system that we are not yet building in the template. Many of these are
# not relevant to most science projects, so we are explicitly using
# `--without-XXX' or `--disable-XXX' so cURL doesn't link with them. Note
-# that if it does link with them, the pipeline will crash when the library
-# is updated/changed by the host, and the whole purpose of this pipeline is
-# avoid dependency on the host as much as possible.
+# that if it does link with them, the configuration will crash when the
+# library is updated/changed by the host, and the whole purpose of this
+# project is to avoid dependency on the host as much as possible.
$(ibidir)/curl: $(tdir)/curl-$(curl-version).tar.gz
$(call gbuild, $<, curl-$(curl-version), , \
LIBS="-pthread" \
@@ -526,7 +526,7 @@ $(ibidir)/git: $(tdir)/git-$(git-version).tar.xz \
# Metastore is used (through a Git hook) to restore the source modification
# dates of files after a Git checkout. Another Git hook saves all file
# metadata just before a commit (to allow restoration after a
-# checkout). Since this pipeline is managed in Makefiles, file modification
+# checkout). Since this project is managed in Makefiles, file modification
# dates are critical to not having to redo the whole analysis after
# checking out between branches.
#
@@ -583,8 +583,8 @@ $(ibidir)/metastore: $(tdir)/metastore-$(metastore-version).tar.gz \
echo "metastore couldn't be installed!"
echo
echo "It is used for preserving timestamps on Git commits."
- echo "Its useful for development, not simple running of the pipeline."
- echo "So we won't stop the pipeline because it wasn't built."
+ echo "It is useful for development, not simple running of the project."
+ echo "So we won't stop the configuration because it wasn't built."
echo "*****************"
fi
@@ -634,8 +634,8 @@ $(ibidir)/zip: $(tdir)/zip-$(zip-version).tar.gz
# Since we want to avoid complicating the PATH, we are putting a symbolic
# link of all the TeX Live executables in $(ibdir). But symbolic links are
# hard to track for Make (as a target). Also, TeX in general is optional
-# for the pipeline (the processing is the main target, not the generation
-# of the final PDF). So we'll make a simple ASCII file called
+# for the project (the processing is the main target, not the generation of
+# the final PDF). So we'll make a simple ASCII file called
# `texlive-ready-tlmgr' and use its contents to mark if we can use it or
# not.
$(itidir)/texlive-ready-tlmgr: $(tdir)/install-tl-unx.tar.gz \
@@ -648,7 +648,7 @@ $(itidir)/texlive-ready-tlmgr: $(tdir)/install-tl-unx.tar.gz \
rm -rf install-tl-*
tar xf $(tdir)/install-tl-unx.tar.gz
cd install-tl-*
- sed -e's|@installdir[@]|$(idir)|g' -e's|@topdir[@]|'"$$topdir"'|g' \
+ sed -e's|@installdir[@]|$(idir)|g' \
$$topdir/reproduce/config/pipeline/texlive.conf > texlive.conf
# TeX Live's installation may fail due to any reason. But TeX Live
@@ -688,9 +688,6 @@ $(itidir)/texlive: reproduce/config/pipeline/dependency-texlive.mk \
if [ x"$$res" = x"NOT!" ]; then
echo "" > $@
else
- # The current directory is necessary later.
- topdir=$$(pwd)
-
# Install all the extra necessary packages. If LaTeX complains
# about not finding a command/file/what-ever/XXXXXX, simply run
# the following command to find which package its in, then add it
diff --git a/reproduce/src/make/download.mk b/reproduce/src/make/download.mk
index 28ee5ff..dfc49da 100644
--- a/reproduce/src/make/download.mk
+++ b/reproduce/src/make/download.mk
@@ -25,8 +25,8 @@
# --------------------
#
# The input dataset properties are defined in `$(pconfdir)/INPUTS.mk'. For
-# this template pipeline we only have one dataset to enable easy
-# processing, so all the extra checks in this rule may seem redundant.
+# this template we only have one dataset to enable easy processing, so all
+# the extra checks in this rule may seem redundant.
#
# In a real project, you will need more than one dataset. In that case,
# just add them to the target list and add an `elif' statement to define it
@@ -35,7 +35,7 @@
# Files in a server usually have very long names, which are mainly designed
# for helping in data-base management and being generic. Since Make uses
# file names to identify which rule to execute, and the scope of this
-# research pipeline is much less than the generic survey/dataset, it is
+# research project is much less than the generic survey/dataset, it is
# easier to have a simple/short name for the input dataset and work with
# that. In the first condition of the recipe below, we connect the short
# name with the raw database name of the dataset.
diff --git a/reproduce/src/make/initialize.mk b/reproduce/src/make/initialize.mk
index f9e054f..cd533f2 100644
--- a/reproduce/src/make/initialize.mk
+++ b/reproduce/src/make/initialize.mk
@@ -1,4 +1,4 @@
-# Initialize the reproduction pipeline.
+# Project initialization.
#
# Copyright (C) 2018-2019 Mohammad Akhlaghi <mohammad@akhlaghi.org>
#
@@ -22,14 +22,14 @@
# High-level directory definitions
# --------------------------------
#
-# Basic directories that are used throughout the whole pipeline.
+# Basic directories that are used throughout the project.
#
# Locks are used to make sure that an operation is done in series not in
# parallel (even if Make is run in parallel with the `-j' option). The most
# common case is downloads which are better done in series and not in
# parallel. Also, some programs may not be thread-safe, therefore it will
-# be necessary to put a lock on them. This pipeline uses the `flock'
-# program to achieve this.
+# be necessary to put a lock on them. This project uses the `flock' program
+# to achieve this.
texdir = $(BDIR)/tex
srcdir = reproduce/src
lockdir = $(BDIR)/locks
@@ -48,7 +48,7 @@ gconfdir = reproduce/config/gnuastro
# TeX build directory
# ------------------
#
-# In scenarios where multiple users are working on the pipeline
+# In scenarios where multiple users are working on the project
# simultaneously, they can't all build the final paper together, there will
# be conflicts! It is possible to manage the working on the analysis, so no
# conflict is caused in that phase, but it would be very slow to only let
@@ -99,7 +99,7 @@ curdir := $(shell echo $$(pwd))
# want Make to run the specific version of Bash that we have installed
# during `./configure' time.
#
-# Regarding the directories, this pipeline builds its major dependencies
+# Regarding the directories, this project builds its major dependencies
# itself and doesn't use the local system's default tools. With these
# environment variables, we are setting it to prefer the software we have
# built here.
@@ -143,18 +143,12 @@ export MPI_PYTHON3_SITEARCH :=
# directories (or possible sub-directories) for individual steps will be
# defined and added within their own Makefiles.
#
-# IMPORTANT NOTE for $(BDIR)'s dependency: it only depends on the existance
-# (not the time-stamp) of `$(pconfdir)/LOCAL.mk'. So the user can make any
-# changes within that file and if they don't affect the pipeline. For
-# example a change of the top $(BDIR) name, while the contents are the same
-# as before.
-#
# The `.SUFFIXES' rule with no prerequisite is defined to eliminate all the
# default implicit rules. The default implicit rules are to do with
# programming (for example converting `.c' files to `.o' files). The
# problem they cause is when you want to debug the make command with `-d'
# option: they add too many extra checks that make it hard to find what you
-# are looking for in this pipeline.
+# are looking for in the outputs.
.SUFFIXES:
$(lockdir): | $(BDIR); mkdir $@
$(texbdir): | $(texdir); mkdir $@
@@ -172,8 +166,8 @@ $(tikzdir): | $(texbdir); mkdir $@ && ln -fs $@ tex/tikz
#
# Only `$(mtexdir)/initialize.tex' corresponds to a file. This is because
# we want to ensure that the file is always built in every run: it contains
-# the pipeline version which may change between two separate runs, even
-# when no file actually differs.
+# the project version which may change between two separate runs, even when
+# no file actually differs.
packagebasename := $(shell echo paper-$$(git describe --dirty --always))
packagecontents = $(texdir)/$(packagebasename)
.PHONY: all clean dist dist-zip distclean clean-mmap $(packagecontents) \
@@ -260,7 +254,7 @@ $(packagecontents): | $(texdir)
rm $$dir/reproduce/config/pipeline/LOCAL.mk
rm $$dir/reproduce/config/gnuastro/gnuastro-local.conf
- # PIPELINE SPECIFIC: under this comment, copy any other file for
+ # PROJECT SPECIFIC: under this comment, copy any other file for
# packaging, or remove any of the copied files above to suit your
# project.
@@ -313,7 +307,7 @@ pvcheck = prog="$(strip $(1))"; \
if [ "x$$verop" = x ]; then V="--version"; else V=$$verop; fi; \
v=$$($$prog $$V | awk '/'$$ver'/{print "y"; exit 0}'); \
if [ x$$v != xy ]; then \
- echo; echo "PIPELINE ERROR: Not running $$name $$ver"; echo; \
+ echo; echo "PROJECT ERROR: Not running $$name $$ver"; echo; \
exit 1; \
fi; \
echo "\newcommand{\\$$macro}{$$ver}" >> $@
@@ -325,7 +319,7 @@ lvcheck = idir=$(BDIR)/dependencies/installed/include; \
macro="$(strip $(4))"; \
v=$$(awk '/^\#/&&/define/&&/'$$ver'/{print "y";exit 0}' $$f); \
if [ x$$v != xy ]; then \
- echo; echo "PIPELINE ERROR: Not linking with $$name $$ver"; \
+ echo; echo "PROJECT ERROR: Not linking with $$name $$ver"; \
echo; exit 1; \
fi; \
echo "\newcommand{\\$$macro}{$$ver}" >> $@
@@ -333,15 +327,15 @@ lvcheck = idir=$(BDIR)/dependencies/installed/include; \
-# Pipeline initialization results
-# -------------------------------
+# Project initialization results
+# ------------------------------
#
-# This file will store some basic info about the pipeline that is necessary
+# This file will store some basic info about the project that is necessary
# for the final PDF. Since these are not version controlled, it must be
-# calculated everytime the pipeline is run. So even though this file
+# calculated every time the project is run. So even though this file
# actually exists, it is also added as a `.PHONY' target above.
$(mtexdir)/initialize.tex: | $(mtexdir)
- # Version of the pipeline and build directory (for LaTeX inputs).
+ # Version of the project.
@v=$$(git describe --dirty --always);
echo "\newcommand{\pipelineversion}{$$v}" > $@
diff --git a/reproduce/src/make/paper.mk b/reproduce/src/make/paper.mk
index 86cf114..0c42bee 100644
--- a/reproduce/src/make/paper.mk
+++ b/reproduce/src/make/paper.mk
@@ -22,9 +22,8 @@
# ----------------------
#
# To report the input settings and results, the final report's PDF (final
-# target of this reproduction pipeline) uses macros generated from various
-# steps of the pipeline. All these macros are defined in
-# `$(mtexdir)/pipeline.tex'.
+# target of this project) uses macros generated from various steps of the
+# project. All these macros are defined in `$(mtexdir)/pipeline.tex'.
#
# `$(mtexdir)/pipeline.tex' is actually just a combination of separate
# files that keep the LaTeX macros related to each workhorse Makefile (in
@@ -32,7 +31,7 @@
# `$(mtexdir)/pipeline.tex'. The only workhorse Makefile that doesn't need
# to produce LaTeX macros is this Makefile (`reproduce/src/make/paper.mk').
#
-# This file is thus the interface between the pipeline scripts and the
+# This file is thus the interface between the processing scripts and the
# final PDF: when we get to this point, all the processing has been
# completed.
#
@@ -61,13 +60,13 @@ $(mtexdir)/pipeline.tex: $(foreach s, $(subst paper,,$(makesrc)), $(mtexdir)/$(s
echo "LaTeX-built PDF paper will not be built."
echo
if [ x$(more-on-building-pdf) = x1 ]; then
- echo "To do so, make sure you have LaTeX within the pipeline (you"
+ echo "To do so, make sure you have LaTeX within the project (you"
echo "can check by running './.local/bin/latex --version'), _AND_"
echo "make sure that the 'pdf-build-final' variable has a value."
echo "'pdf-build-final' is defined in: "
echo "'reproduce/config/pipeline/pdf-build.mk'."
echo
- echo "If you don't have LaTeX within the pipeline, please re-run"
+ echo "If you don't have LaTeX within the project, please re-run"
echo "'./configure' when you have internet access. To speed it up,"
echo "you can keep the previous configuration files (answer 'n'"
echo "when it asks about re-writing previous configuration files)."
@@ -120,8 +119,8 @@ $(texbdir)/paper.bbl: tex/src/references.tex \
# Run LaTeX in the `$(texbdir)' directory so all the intermediate and
# auxiliary files stay there and keep the top directory clean. To be able
# to run everything cleanly from there, it is necessary to add the current
-# directory (top reproduction pipeline directory) to the `TEXINPUTS'
-# environment variable.
+# directory (top project directory) to the `TEXINPUTS' environment
+# variable.
paper.pdf: $(mtexdir)/pipeline.tex paper.tex $(texbdir)/paper.bbl \
| $(tikzdir) $(texbdir)
@@ -135,7 +134,7 @@ paper.pdf: $(mtexdir)/pipeline.tex paper.tex $(texbdir)/paper.bbl \
cd $(texbdir)
pdflatex -shell-escape -halt-on-error $$p/paper.tex
- # Come back to the top pipeline directory and copy the built PDF
+ # Come back to the top project directory and copy the built PDF
# file here.
cd $$p
cp $(texbdir)/$@ $(final-paper)
diff --git a/reproduce/src/make/top.mk b/reproduce/src/make/top.mk
index 14bdbf3..763dbd7 100644
--- a/reproduce/src/make/top.mk
+++ b/reproduce/src/make/top.mk
@@ -1,4 +1,4 @@
-# A ONE-LINE DESCRIPTION OF THE WHOLE PIPELINE
+# Top-level Makefile (first to be loaded).
#
# Copyright (C) 2018-2019 Mohammad Akhlaghi <mohammad@akhlaghi.org>
#
@@ -26,22 +26,21 @@ include reproduce/config/pipeline/LOCAL.mk
-# Ultimate target of this pipeline
-# --------------------------------
+# Ultimate target of this project
+# -------------------------------
#
-# The final paper/report (`paper.pdf') is the main target of this whole
-# reproduction pipeline. So as defined in the Make paradigm, it is the
-# first target that we define (immediately after loading the local
-# configuration settings, necessary for a group building scenario mentioned
-# next).
+# The final paper/report (`paper.pdf') is the main target of this
+# project. As defined in the Make paradigm, it must be the first target
+# that Make encounters (immediately after loading the local configuration
+# settings, necessary for a group building scenario mentioned next).
#
#
# Group build
# -----------
#
-# This pipeline can also be configured to have a shared build directory
+# This project can also be configured to have a shared build directory
# between multiple users. In this scenario, many users (on a server) can
-# have their own/separate version controlled pipeline source, but share the
+# have their own/separate version controlled project source, but share the
# same build outputs (in a common directory). This will allow a group to
# work separately, on parallel parts of the analysis that don't
# interfere. It is thus very useful in cases where special storage
@@ -55,8 +54,8 @@ include reproduce/config/pipeline/LOCAL.mk
# was used to call Make).
#
# The analysis is only done when both have the same group name. Note that
-# when the pipeline isn't being built for a group, both variables will be
-# an empty string.
+# when the project isn't being built for a group, both variables will be an
+# empty string.
#
#
# Only processing, no LaTeX PDF
@@ -70,10 +69,10 @@ all: paper.pdf
else
all:
@if [ "x$(GROUP-NAME)" = x ]; then \
- echo "Pipeline is NOT configured for groups, please run"; \
+ echo "Project is NOT configured for groups, please run"; \
echo " $$ .local/bin/make"; \
else \
- echo "Pipeline is configured for groups, please run"; \
+ echo "Project is configured for groups, please run"; \
echo " $$ ./for-group $(GROUP-NAME) make -j8"; \
fi
endif
@@ -106,7 +105,7 @@ endif
# include Makefiles from any other Makefile.
#
# IMPORTANT NOTE: order matters in the inclusion of the processing
-# Makefiles. As the pipeline grows, some Makefiles will define
+# Makefiles. As the project grows, some Makefiles will define
# variables/dependencies that later Makefiles need. Therefore we are using
# a `foreach' loop in the next step to explicitly request loading them in
# the same order that they are defined here (we aren't just using a
@@ -131,6 +130,6 @@ makesrc = initialize \
# above.
#
# 2) Then, we'll import the workhorse-Makefiles which contain rules to
-# actually do the processing of this pipeline.
+# actually do this project's processing.
include $(filter-out %LOCAL.mk, reproduce/config/pipeline/*.mk)
include $(foreach s,$(makesrc), reproduce/src/make/$(s).mk)
diff --git a/tex/src/preamble-pgfplots.tex b/tex/src/preamble-pgfplots.tex
index bf6bbbd..705e897 100644
--- a/tex/src/preamble-pgfplots.tex
+++ b/tex/src/preamble-pgfplots.tex
@@ -2,7 +2,7 @@
%% -----------------
%
%% PGFPLOTS is a package in (La)TeX for making plots internally. It fits
-%% nicely with the purpose of a reproduction pipeline. But it isn't
+%% nicely with the purpose of a reproducible project. But it isn't
%% mandatory. Therefore if you don't need it, just comment/delete the line
%% that includes this file in the top LaTeX source (`paper.tex').
%
@@ -13,7 +13,7 @@
%% the papers. 2) It doesn't require any extra dependency (it is
%% distributed as part of TeX-live). Adding specific programs/libraries for
%% plots can greatly increase the number of dependencies for the
-%% pipeline. For example Python's Matplotlib library is indeed very good,
+%% project. For example Python's Matplotlib library is indeed very good,
%% but it requires Python and Numpy. The latter is not easy to build from
%% source, so after a few years, installing the required version can be
%% very frustrating.
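
The `pvcheck` recipe in `initialize.mk` (whose error message this commit changes to `PROJECT ERROR`) runs a program with its version option, searches the output for the expected version with `awk`, and aborts if it is absent. The following is a minimal stand-alone shell sketch of that idea, not the project's actual code: `check_version` and `fake_prog` are hypothetical names introduced only for illustration, and the sketch matches the version as a literal string with `index()`, whereas the recipe in the diff uses it as an `awk` regular expression.

```shell
#!/bin/sh
# Hypothetical helper (not part of the project): succeed only when the
# output of a command contains the expected version string.
#   $1    expected version
#   $2... command to run, including its version option if it needs one
check_version() {
    expected="$1"; shift
    # Print "y" and stop at the first line containing the literal version.
    found=$("$@" 2>/dev/null \
            | awk -v ver="$expected" 'index($0, ver) {print "y"; exit 0}')
    [ "x$found" = xy ]
}

# Stand-in for a real program, so that the sketch is self-contained.
fake_prog() { echo "fake_prog (Hypothetical Tools) 1.2.3"; }

if check_version 1.2.3 fake_prog; then
    echo "OK: running fake_prog 1.2.3"
else
    echo "PROJECT ERROR: Not running fake_prog 1.2.3"
fi
```

Literal matching is used here because, in a regular expression, the dots in `1.2.3` match any character, so a pattern-based check would also accept a string such as `1x2y3`.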