author    Pedram Ashofteh Ardakani <pedramardakani@gmail.com>  2020-04-29 15:27:13 +0430
committer Pedram Ashofteh Ardakani <pedramardakani@gmail.com>  2020-04-29 15:28:52 +0430
commit    3fd41afb4bf67d0b2b2aae76b133d97d024ddcbe (patch)
tree      08d3bac9e1ca26ceee3880565eec3a6ee4dd1f49
parent    804463c799d96802c466f10fd5aa9037430f5ae1 (diff)
about: Fix code blocks
Maybe we should create a side navigation toolbar.
-rw-r--r--  about.html  2508
1 file changed, 1278 insertions, 1230 deletions
diff --git a/about.html b/about.html
index fbfc83c..8a4a069 100644
--- a/about.html
+++ b/about.html
@@ -1,1240 +1,1288 @@
-<h1>Maneage: managing data lineage</h1>
-
-<p>Copyright (C) 2018-2020 Mohammad Akhlaghi <a href="&#109;&#x61;&#x69;&#x6C;&#x74;&#x6F;:&#x6D;&#111;&#104;&#97;&#x6D;&#109;a&#x64;&#64;&#x61;&#107;&#x68;&#x6C;&#x61;&#x67;&#104;&#x69;.&#x6F;&#x72;&#103;">&#x6D;&#111;&#104;&#97;&#x6D;&#109;a&#x64;&#64;&#x61;&#107;&#x68;&#x6C;&#x61;&#x67;&#104;&#x69;.&#x6F;&#x72;&#103;</a>\
-Copyright (C) 2020 Raul Infante-Sainz <a href="m&#x61;&#105;&#108;t&#111;:&#x69;&#x6E;&#x66;&#x61;&#x6E;&#116;&#101;&#115;&#97;&#x69;n&#122;&#64;&#103;&#x6D;&#x61;&#x69;&#x6C;&#x2E;&#x63;&#111;&#x6D;">&#x69;&#x6E;&#x66;&#x61;&#x6E;&#116;&#101;&#115;&#97;&#x69;n&#122;&#64;&#103;&#x6D;&#x61;&#x69;&#x6C;&#x2E;&#x63;&#111;&#x6D;</a>\
-See the end of the file for license conditions.</p>
-
-<p>Maneage is a <strong>fully working template</strong> for doing reproducible research (or
-writing a reproducible paper) as defined in the link below. If the link
-below is not accessible at the time of reading, please see the appendix at
-the end of this file for a portion of its introduction. Some
-<a href="http://akhlaghi.org/pdf/reproducible-paper.pdf">slides</a> are also available
-to help demonstrate the concept implemented here.</p>
-
-<p>http://akhlaghi.org/reproducible-science.html</p>
-
-<p>Maneage is created with the aim of supporting reproducible research by
-making it easy to start a project in this framework. As shown below, it is
-very easy to customize Maneage for any particular (research) project and
-expand it as it starts and evolves. It can be run with no modification (as
-described in <code>README.md</code>) as a demonstration and customized for use in any
-project as fully described below.</p>
-
-<p>A project designed using Maneage will download and build all the necessary
-libraries and programs for working in a closed environment (highly
-independent of the host operating system) with fixed versions of the
-necessary dependencies. The tarballs for building the local environment are
-also collected in a <a href="http://git.maneage.org/tarballs-software.git/tree/">separate
-repository</a>. The final
-output of the project is <a href="http://git.maneage.org/output-raw.git/plain/paper.pdf">a
-paper</a>. Notice the
-last paragraph of the Acknowledgments, where all the necessary software
-packages are mentioned with their versions.</p>
-
-<p>Below, we start with a discussion of why Make was chosen as the high-level
-language/framework for project management and how to learn and master Make
-easily (and freely). The general architecture and design of the project is
-then discussed to help you navigate the files and their contents. This is
-followed by a checklist for the easy/fast customization of Maneage to your
-exciting research. We continue with some tips and guidelines on how to
-manage or extend your project as it grows based on our experiences with it
-so far. The main body concludes with a description of possible future
-improvements that are planned for Maneage (but not yet implemented). As
-discussed above, we end with a short introduction on the necessity of
-reproducible science in the appendix.</p>
-
-<p>Please don't forget to share your thoughts, suggestions and
-criticisms. Maintaining and designing Maneage is itself a separate project,
-so please join us if you are interested. Once it is mature enough, we will
-describe it in a paper (written by all contributors) for a formal
-introduction to the community.</p>
-
-<h2>Why Make?</h2>
-
-<p>When batch processing is necessary (no manual intervention, as in a
-reproducible project), shell scripts are usually the first solution that
-comes to mind. However, the inherent complexity and non-linearity of
-progress in a scientific project (where experimentation is key) make it
-hard to manage the script(s) as the project evolves. For example, a script
-will start from the top every time it is run. So if you have already
-completed 90% of a research project and want to run the remaining 10% that
-you have newly added, you have to run the whole script from the start
-again. Only then will you see the effects of the last new steps (to find
-possible errors or better solutions, etc.).</p>
-
-<p>It is possible to manually ignore/comment parts of a script to only run a
-specific part. However, such checks/comments will only add to the complexity
-of the script and will discourage you from playing with or changing an already
-completed part of the project when an idea suddenly comes up. It is also
-prone to very serious bugs in the end (when trying to reproduce from
-scratch). Such bugs are very hard to notice during the work and frustrating
-to find in the end.</p>
-
-<p>The Make paradigm, on the other hand, starts from the end: the final
-<em>target</em>. It builds a dependency tree internally, and finds where it should
-start each time the project is run. Therefore, in the scenario above, a
-researcher that has just added the final 10% of steps of her research to
-her Makefile will only have to run those extra steps. With Make, it is
-also trivial to change the processing of any intermediate (already written)
-<em>rule</em> (or step) in the middle of an already written analysis: the next
-time Make is run, only rules that are affected by the changes/additions
-will be re-run, not the whole analysis/project.</p>
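-
-<p>For example, consider the minimal two-rule Makefile sketched below (a
-hypothetical illustration, not part of Maneage). The first rule's target
-(<code>top10.txt</code>) is the default target. After a first run, editing
-<code>input.txt</code> causes both rules to be re-run, while editing nothing at all
-leaves Make with no work to do: only the affected parts of the dependency
-tree are ever re-executed.</p>
-
-<p><code>make
- # Minimal hypothetical sketch of two chained rules (recipe lines
- # must start with a TAB character).
- top10.txt: sorted.txt
-         head -n 10 sorted.txt &gt; top10.txt
- sorted.txt: input.txt
-         sort input.txt &gt; sorted.txt
-</code></p>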
-
-<p>This greatly speeds up the processing (enabling creative changes), while
-keeping all the dependencies clearly documented (as part of the Make
-language), and most importantly, enabling full reproducibility from scratch
-with no changes in the project code that was working during the
-research. This will allow robust results and let the scientists get to what
-they do best: experiment and be critical of the methods/analysis without
-having to waste energy and time on technical problems that come up as a
-result of that experimentation in scripts.</p>
-
-<p>Since the dependencies are clearly demarcated in Make, it can identify
-independent steps and run them in parallel. This further speeds up the
-processing. Make was designed for this purpose. It is how huge projects
-like all Unix-like operating systems (including GNU/Linux or Mac OS
-operating systems) and their core components are built. Therefore, Make is
-a highly mature paradigm/system with robust and highly efficient
-implementations in various operating systems perfectly suited for a complex
-non-linear research project.</p>
-
-<p>Make is a small language with the aim of defining <em>rules</em> containing
-<em>targets</em>, <em>prerequisites</em> and <em>recipes</em>. It comes with some nice features
-like functions or automatic-variables to greatly facilitate the management
-of text (filenames for example) or any of those constructs. For a more
-detailed (yet still general) introduction see the article on Wikipedia:</p>
-
-<p>https://en.wikipedia.org/wiki/Make_(software)</p>
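-
-<p>As a small, hypothetical illustration of these constructs (not taken from
-Maneage), the snippet below uses the <code>patsubst</code> function to derive output
-filenames from input filenames, and the automatic variables <code>$@</code> (the
-target) and <code>$&lt;</code> (the first prerequisite) in a pattern rule's recipe:</p>
-
-<p><code>make
- # Hypothetical sketch: build a .csv file from every .txt table.
- inputs  = a.txt b.txt c.txt
- outputs = $(patsubst %.txt,%.csv,$(inputs))
- all-csv: $(outputs)
- %.csv: %.txt
-         tr ' ' ',' &lt; $&lt; &gt; $@
-</code></p>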
-
-<p>Make is more than 40 years old and still evolving, therefore many
-implementations of Make exist. The only difference between them is some extra
-features over the <a href="https://pubs.opengroup.org/onlinepubs/009695399/utilities/make.html">standard
-definition</a>
-(which is shared by all of them). Maneage is primarily written in GNU Make,
-which is the most common, most actively developed, and most advanced
-implementation. Just note that Maneage downloads, builds, internally
-installs, and uses its own dependencies (including GNU Make), so you don't
-have to have it installed before you try it out.</p>
-
-<h2>How can I learn Make?</h2>
-
-<p>The GNU Make book/manual (links below) is arguably the best place to learn
-Make. It is an excellent and non-technical book to help get started (it is
-only non-technical in its first few chapters to get you started easily). It
-is freely available and always up to date with the current GNU Make
-release. It also clearly explains which features are specific to GNU Make
-and which are general in all implementations. So the first few chapters
-regarding the generalities are useful for all implementations.</p>
-
-<p>The first link below points to the GNU Make manual in various formats and
-in the second, you can download it as a PDF (which may be easier for a
-first-time reading).</p>
-
-<p>https://www.gnu.org/software/make/manual/</p>
-
-<p>https://www.gnu.org/software/make/manual/make.pdf</p>
-
-<p>If you use GNU Make, you also have the whole GNU Make manual on the
-command-line with the following command (you can come out of the "Info"
-environment by pressing <code>q</code>).</p>
-
-<p><code>shell
- $ info make
-</code></p>
-
-<p>If you aren't familiar with the Info documentation format, we strongly
-recommend running <code>$ info info</code> and reading along. In less than an hour,
-you will become highly proficient in it (it is very simple and has a great
-manual for itself). Info greatly simplifies your access (without taking
-your hands off the keyboard!) to many manuals that are installed on your
-system, allowing you to be much more efficient as you work. If you use the
-GNU Emacs text editor (or any of its variants), you also have access to all
-Info manuals while you are writing your projects (again, without taking
-your hands off the keyboard!).</p>
-
-<h2>Published works using Maneage</h2>
-
-<p>The list below shows some of the works that have already been published
-with (earlier versions of) Maneage. Previously it was simply called
-"Reproducible paper template". Note that Maneage is evolving, so some
-details may be different in them. The more recent ones can be used as a
-good working example.</p>
-
-<ul>
-<li><p>Infante-Sainz et
-al. (<a href="https://ui.adsabs.harvard.edu/abs/2020MNRAS.491.5317I">2020</a>,
-MNRAS, 491, 5317): The version controlled project source is available
-<a href="https://gitlab.com/infantesainz/sdss-extended-psfs-paper">on GitLab</a>
-and is also archived on Zenodo with all the necessary software tarballs:
-<a href="https://zenodo.org/record/3524937">zenodo.3524937</a>.</p></li>
-<li><p>Akhlaghi (<a href="https://arxiv.org/abs/1909.11230">2019</a>, IAU Symposium
-355). The version controlled project source is available <a href="https://gitlab.com/makhlaghi/iau-symposium-355">on
-GitLab</a> and is also
-archived on Zenodo with all the necessary software tarballs:
-<a href="https://doi.org/10.5281/zenodo.3408481">zenodo.3408481</a>.</p></li>
-<li><p>Section 7.3 of Bacon et
-al. (<a href="http://adsabs.harvard.edu/abs/2017A%26A...608A...1B">2017</a>, A&amp;A
-608, A1): The version controlled project source is available <a href="https://gitlab.com/makhlaghi/muse-udf-origin-only-hst-magnitudes">on
-GitLab</a>
-and a snapshot of the project along with all the necessary input
-datasets and outputs is available in
-<a href="https://doi.org/10.5281/zenodo.1164774">zenodo.1164774</a>.</p></li>
-<li><p>Section 4 of Bacon et
-al. (<a href="http://adsabs.harvard.edu/abs/2017A%26A...608A...1B">2017</a>, A&amp;A,
-608, A1): The version controlled project is available <a href="https://gitlab.com/makhlaghi/muse-udf-photometry-astrometry">on
-GitLab</a> and
-a snapshot of the project along with all the necessary input datasets is
-available in <a href="https://doi.org/10.5281/zenodo.1163746">zenodo.1163746</a>.</p></li>
-<li><p>Akhlaghi &amp; Ichikawa
-(<a href="http://adsabs.harvard.edu/abs/2015ApJS..220....1A">2015</a>, ApJS, 220,
-1): The version controlled project is available <a href="https://gitlab.com/makhlaghi/NoiseChisel-paper">on
-GitLab</a>. This is the
-very first (and much less mature!) incarnation of Maneage: the history
-of Maneage started more than two years after this paper was
-published. It is a very rudimentary/initial implementation, thus it is
-only included here for historical reasons. However, the project source
-is complete, accurate and uploaded to arXiv along with the paper.</p></li>
-</ul>
-
-<h2>Citation</h2>
-
-<p>A paper to fully describe Maneage has been submitted. Until then, if you
-used it in your work, please cite the paper that implemented its first
-version: Akhlaghi &amp; Ichikawa
-(<a href="http://adsabs.harvard.edu/abs/2015ApJS..220....1A">2015</a>, ApJS, 220, 1).</p>
-
-<p>Also, when your paper is published, don't forget to add a notice in your
-own paper (in coordination with the publishing editor) that the paper is
-fully reproducible and possibly add a sentence or paragraph at the end of
-the paper shortly describing the concept. This will help spread the word
-and encourage other scientists to also manage and publish their projects in
-a reproducible manner.</p>
-
-<h1>Project architecture</h1>
-
-<p>In order to customize Maneage to your research, it is important to first
-understand its architecture so you can navigate your way in the directories
-and understand how to implement your research project within its framework:
-where to add new files and which existing files to modify for what
-purpose. But if this is the first time you are using Maneage, before reading
-this theoretical discussion, please run Maneage once from scratch without
-any changes (as described in <code>README.md</code>). You will see how it works (note that
-the configure step builds all the necessary software, so it can take a long
-time, but you can continue reading while it is working).</p>
-
-<p>The project has two top-level directories: <code>reproduce</code> and
-<code>tex</code>. <code>reproduce</code> hosts all the software building and analysis
-steps. <code>tex</code> contains all the final paper's components to be compiled into
-a PDF using LaTeX.</p>
-
-<p>The <code>reproduce</code> directory has two sub-directories: <code>software</code> and
-<code>analysis</code>. As the name says, the former contains all the instructions to
-download, build and install (independent of the host operating system) the
-necessary software (these are called by the <code>./project configure</code>
-command). The latter contains instructions on how to use that software to
-do your project's analysis.</p>
-
-<p>After it finishes, <code>./project configure</code> will create the following symbolic
-links in the project's top source directory: <code>.build</code> which points to the
-top build directory and <code>.local</code> for easy access to the custom built
-software installation directory. With these you can easily access the build
-directory and project-specific software from your top source directory. For
-example if you run <code>.local/bin/ls</code> you will be using the <code>ls</code> of Maneage,
-which is probably different from your system's <code>ls</code> (run them both with
-<code>--version</code> to check).</p>
-
-<p>Once the project is configured for your system, <code>./project make</code> will do
-the basic preparations and run the project's analysis with the custom
-version of software. The <code>project</code> script is just a wrapper, and with the
-<code>make</code> argument, it will first call <code>top-prepare.mk</code> and <code>top-make.mk</code>
-(both are in the <code>reproduce/analysis/make</code> directory).</p>
-
-<p>In terms of organization, <code>top-prepare.mk</code> and <code>top-make.mk</code> have an
-almost identical design, with only minor differences. So let's continue
-describing Maneage's architecture with <code>top-make.mk</code>; once you understand
-that, you will clearly understand <code>top-prepare.mk</code> too. These very high-level files are
-relatively short and heavily commented so hopefully the descriptions in
-each comment will be enough to understand the general details. As you read
-this section, please also look at the contents of the mentioned files and
-directories to fully understand what is going on.</p>
-
-<p>Before starting to look into <code>top-make.mk</code>, it is important to
-recall that Make defines dependencies by files. Therefore, the
-input/prerequisite and output of every step/rule must be a file. Also
-recall that Make will use the modification date of the prerequisite(s) and
-target files to see if the target must be re-built or not. Therefore during
-the processing, <em>many</em> intermediate files will be created (see the tips
-section below on a good strategy to deal with large/huge files).</p>
-
-<p>To keep the source and (intermediate) built files separate, the user <em>must</em>
-define a top-level build directory variable (or <code>$(BDIR)</code>) to host all the
-intermediate files (you defined it during <code>./project configure</code>). This
-directory doesn't need to be version controlled or even synchronized, or
-backed-up in other servers: its contents are all products, and can be
-easily re-created any time. As you define targets for your new rules, it is
-thus important to place them all under sub-directories of <code>$(BDIR)</code>. As
-mentioned above, you always have fast access to this "build"-directory with
-the <code>.build</code> symbolic link. Also, <em>never</em> make any manual change
-to the files in the build directory; just delete them (so they are
-re-built).</p>
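-
-<p>For example (a hypothetical sketch, the names are place-holders), a rule's
-target can be placed under a sub-directory of <code>$(BDIR)</code> like this:</p>
-
-<p><code>make
- # Hypothetical sketch: all outputs of this Makefile go under $(BDIR).
- # The '|' makes the directory an order-only prerequisite (its
- # modification date is ignored).
- mydir = $(BDIR)/my-analysis
- $(mydir):
-         mkdir -p $@
- $(mydir)/result.txt: | $(mydir)
-         echo "42" &gt; $@
-</code></p>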
-
-<p>In this architecture, we have two types of Makefiles that are loaded into
-the top <code>Makefile</code>: <em>configuration-Makefiles</em> (only independent
-variables/configurations) and <em>workhorse-Makefiles</em> (Makefiles that
-actually contain analysis/processing rules).</p>
-
-<p>The configuration-Makefiles are those that satisfy these two wildcards:
-<code>reproduce/software/config/*.conf</code> (for building the necessary software
-when you run <code>./project configure</code>) and <code>reproduce/analysis/config/*.conf</code>
-(for the high-level analysis, when you run <code>./project make</code>). These
-Makefiles don't actually have any rules; they just have values for various
-free parameters throughout the configuration or analysis. Open a few of
-them to see for yourself. These Makefiles must only contain raw Make
-variables (project configurations). By "raw" we mean that the Make
-variables in these files must not depend on variables in any other
-configuration-Makefile. This is because we don't want to assume any order
-in reading them. It is also very important to <em>not</em> define any rule, or
-other Make construct, in these configuration-Makefiles.</p>
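-
-<p>For example, a hypothetical <code>reproduce/analysis/config/sky-limits.conf</code>
-(the name and values below are only for illustration) would contain nothing
-but raw variable assignments:</p>
-
-<p><code>make
- # Hypothetical configuration-Makefile: raw values only, no rules, and
- # no dependence on the variables of other configuration-Makefiles.
- ra-min  = 182.5
- ra-max  = 186.0
- dec-min = -1.2
- dec-max = 1.2
-</code></p>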
-
-<p>Following this rule-of-thumb enables you to set these configuration-Makefiles
-as a prerequisite to any target that depends on their variable
-values. Therefore, if you change any of their values, all targets that
-depend on those values will be re-built. This is very convenient as your
-project scales up and gets more complex.</p>
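-
-<p>Continuing the hypothetical sketch above, a rule that uses <code>ra-min</code> and
-<code>ra-max</code> would list the configuration-Makefile itself as a prerequisite, so
-editing any value in it re-builds the target:</p>
-
-<p><code>make
- # Hypothetical sketch: the .conf file is a prerequisite, so changing
- # it updates its modification date and triggers a re-build.
- $(mydir)/selected.txt: input-catalog.txt \
-                        reproduce/analysis/config/sky-limits.conf
-         awk -v min=$(ra-min) -v max=$(ra-max) \
-             '$$1 &gt;= min &amp;&amp; $$1 &lt;= max' input-catalog.txt &gt; $@
-</code></p>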
-
-<p>The workhorse-Makefiles are those satisfying these two wildcards:
-<code>reproduce/software/make/*.mk</code> and <code>reproduce/analysis/make/*.mk</code>. They
-contain the details of the processing steps (Makefiles containing
-rules). Therefore, in this phase <em>order is important</em>, because the
-prerequisites of most rules will be the targets of other rules that will be
-defined prior to them (not a fixed name like <code>paper.pdf</code>). The lower-level
-rules must be imported into Make before the higher-level ones.</p>
-
-<p>All processing steps are assumed to ultimately (usually after many rules)
-end up in some number, image, figure, or table that will be included in the
-paper. The writing of these results into the final report/paper is managed
-through separate LaTeX files that only contain macros (a name given to a
-number/string to be used in the LaTeX source, which will be replaced when
-compiling it to the final PDF). So the last target in a workhorse-Makefile
-is a <code>.tex</code> file (with the same base-name as the Makefile, but in
-<code>$(BDIR)/tex/macros</code>). As a result, if the targets in a workhorse-Makefile
-aren't directly a prerequisite of other workhorse-Makefile targets, they
-can be a prerequisite of that intermediate LaTeX macro file and thus be
-called when necessary. Otherwise, they will be ignored by Make.</p>
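-
-<p>As a hedged sketch of this pattern (the file and macro names are
-hypothetical), the last rule of a workhorse-Makefile called
-<code>my-analysis.mk</code> could count the rows of its final table and write that
-number as a LaTeX macro:</p>
-
-<p><code>make
- # Hypothetical sketch: the final target of 'my-analysis.mk' is a TeX
- # macro file with the same base-name, inside $(BDIR)/tex/macros.
- $(BDIR)/tex/macros/my-analysis.tex: $(mydir)/selected.txt
-         n=$$(wc -l &lt; $&lt;); \
-         printf '\\newcommand{\\numselected}{%d}\n' $$n &gt; $@
-</code></p>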
-
-<p>Maneage also has a mode to share the build directory between several
-users of a Unix group (when working on large computer clusters). In this
-scenario, each user can have their own cloned project source, but share the
-large built files between each other. To do this, it is necessary for all
-built files to give full permission to group members while not allowing any
-other users access to the contents. Therefore the <code>./project configure</code> and
-<code>./project make</code> steps must be called with special conditions which are
-managed in the <code>--group</code> option.</p>
-
-<p>Let's see how this design is implemented. Please open and inspect
-<code>top-make.mk</code> as we go along here. The first step (un-commented line) is
-to import the local configuration (your answers to the questions of
-<code>./project configure</code>). They are defined in the configuration-Makefile
-<code>reproduce/software/config/LOCAL.conf</code> which was also built by <code>./project
-configure</code> (based on the <code>LOCAL.conf.in</code> template of the same directory).</p>
-
-<p>The next non-commented set of lines in the top <code>Makefile</code> defines the ultimate
-target of the whole project (<code>paper.pdf</code>). But to avoid mistakes, a sanity
-check is necessary to see if Make is being run with the same group settings
-as the configure script (for example when the project is configured for
-group access using the <code>./for-group</code> script, but Make isn't). Therefore we
-use a Make conditional to define the <code>all</code> target based on the group
-permissions.</p>
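-
-<p>The sketch below shows the general shape of such a conditional (the
-variable names <code>GROUP-config</code> and <code>GROUP-now</code> are hypothetical
-place-holders, not Maneage's actual names):</p>
-
-<p><code>make
- # Hedged sketch of a group-permission sanity check; the variable
- # names are hypothetical place-holders.
- ifeq ($(GROUP-config),$(GROUP-now))
- all: paper.pdf
- else
- all:
-         @echo "Configured for another group; re-run with the same --group."
-         @exit 1
- endif
-</code></p>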
-
-<p>Having defined the top/ultimate target, our next step is to include all the
-other necessary Makefiles. However, order matters in the importing of
-workhorse-Makefiles and each must also have a TeX macro file with the same
-base name (without a suffix). Therefore, the next step in the top-level
-Makefile is to define the <code>makesrc</code> variable to keep the base names
-(without a <code>.mk</code> suffix) of the workhorse-Makefiles that must be imported,
-in the proper order.</p>
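-
-<p>As an illustrative sketch (the base-names in your project will differ;
-<code>my-analysis</code> is a hypothetical place-holder), <code>makesrc</code> is simply an
-ordered list of base-names:</p>
-
-<p><code>make
- # Hypothetical sketch: order matters, and each base-name must also
- # have a TeX macro file with the same name.
- makesrc = download \
-           my-analysis \
-           verify \
-           paper
-</code></p>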
-
-<p>Finally, we import all the necessary remaining Makefiles: 1) All the
-analysis configuration-Makefiles with a wildcard. 2) The software
-configuration-Makefile that contains their versions (just in case it is
-necessary). 3) All workhorse-Makefiles in the proper order using a Make
-<code>foreach</code> loop.</p>
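-
-<p>A hedged sketch of what these three include steps can look like is given
-below (the exact lines in Maneage may differ slightly; the paths follow the
-directories named above):</p>
-
-<p><code>make
- # Sketch of the three include steps described above.
- include reproduce/analysis/config/*.conf
- include reproduce/software/config/versions.conf
- include $(foreach s,$(makesrc),reproduce/analysis/make/$(s).mk)
-</code></p>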
-
-<p>In short, to keep things modular, readable and manageable, follow these
-recommendations: 1) Set clear-to-understand names for the
-configuration-Makefiles and workhorse-Makefiles, 2) Only import other
-Makefiles from the top Makefile. These will let you know/remember generally
-which step you are taking before or after another. Projects will scale up
-very fast. Thus if you don't start and continue with a clean and robust
-convention like this, in the end it will become very dirty and hard to
-manage/understand (even for yourself). As a general rule of thumb, break
-your rules into as many logically-similar but independent steps as
-possible.</p>
-
-<p>The <code>reproduce/analysis/make/paper.mk</code> Makefile must be the final Makefile
-that is included. This workhorse Makefile ends with the rule to build
-<code>paper.pdf</code> (final target of the whole project). If you look in it, you
-will notice that this Makefile starts with a rule to create
-<code>$(mtexdir)/project.tex</code> (<code>mtexdir</code> is just a shorthand name for
-<code>$(BDIR)/tex/macros</code> mentioned before). As you see, the only dependency of
-<code>$(mtexdir)/project.tex</code> is <code>$(mtexdir)/verify.tex</code> (which is the last
-analysis step: it verifies all the generated results). Therefore,
-<code>$(mtexdir)/project.tex</code> is <em>the connection</em> between the
-processing/analysis steps of the project, and the steps to build the final
-PDF.</p>
-
-<p>During the research, it often happens that you want to test a step that is
-not a prerequisite of any higher-level operation. In such cases, you can
-(temporarily) define that processing as a rule in the most relevant
-workhorse-Makefile and set its target as a prerequisite of its TeX
-macro. If your test gives a promising result and you want to include it in
-your research, set it as a prerequisite of other rules and remove it from
-the list of prerequisites of the TeX macro file. In fact, this is how a
-project is designed to grow in this framework.</p>
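-
-<p>For instance (a hypothetical sketch continuing the file names used above),
-a new test step can be tried out by temporarily adding its target to the
-prerequisites of the Makefile's TeX macro file:</p>
-
-<p><code>make
- # Hypothetical sketch: 'test-idea.txt' is only built because it is
- # (temporarily) a prerequisite of the TeX macro file.
- $(mydir)/test-idea.txt: $(mydir)/selected.txt
-         awk '{s+=$$2} END {print s}' $&lt; &gt; $@
- $(mtexdir)/my-analysis.tex: $(mydir)/test-idea.txt
-</code></p>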
-
-<h2>File modification dates (meta data)</h2>
-
-<p>While Git does an excellent job at keeping a history of the contents of
-files, it makes no effort to keep the file metadata, in particular
-the dates of files. Therefore, when you check out a different branch,
-files that are re-written by Git will have a newer date than the other
-project files. However, file dates are important in the current design of
-Maneage: Make checks the dates of the prerequisite files and target files
-to see if the target should be re-built.</p>
-
-<p>To fix this problem, for Maneage we use a forked version of
-<a href="https://github.com/mohammad-akhlaghi/metastore">Metastore</a>. Metastore use
-a binary database file (which is called <code>.file-metadata</code>) to keep the
-modification dates of all the files under version control. This file is
-also under version control, but is hidden (because it shouldn't be modified
-by hand). During the project's configuration, Maneage installs two Git hooks
-to run Metastore: 1) before making a commit, to update its database with the
-file dates in the branch, and 2) after a checkout, to set the file dates
-back to what they were on that branch.</p>
-
-<p>In practice, Metastore should work almost fully invisibly within your
-project. The only place you might notice its presence is that you'll see
-<code>.file-metadata</code> in the list of modified/staged files (commonly after
-merging your branches). Since it is a binary file, Git also won't show you
-the changed contents. In a merge, you can simply accept any changes with
-<code>git add -u</code>. But if Git is telling you that it has changed without a merge
-(for example if you started a commit, but canceled it in the middle), you
-can just do <code>git checkout .file-metadata</code> and set it back to its original
-state.</p>
-
-<h2>Summary</h2>
-
-<p>Based on the explanation above, some major design points you should have in
-mind are listed below.</p>
-
-<ul>
-<li><p>Define new <code>reproduce/analysis/make/XXXXXX.mk</code> workhorse-Makefile(s)
-with good and human-friendly name(s) replacing <code>XXXXXX</code>.</p></li>
-<li><p>Add <code>XXXXXX</code>, as a new line, to the values in <code>makesrc</code> of the top-level
-<code>Makefile</code>.</p></li>
-<li><p>Do not use any constant numbers (or important names like filter names)
-in the workhorse-Makefiles or paper's LaTeX source. Define such
-constants as logically-grouped, separate configuration-Makefiles in
-<code>reproduce/analysis/config/XXXXX.conf</code>. Then set that
-configuration-Makefile as a prerequisite of any rule that uses the
-variables defined in it (see the sketch after this list).</p></li>
-<li><p>Through any number of intermediate prerequisites, all processing steps
-should end in (be a prerequisite of) <code>$(mtexdir)/verify.tex</code> (defined in
-<code>reproduce/analysis/make/verify.mk</code>). <code>$(mtexdir)/verify.tex</code> is the sole
-dependency of <code>$(mtexdir)/project.tex</code>, which is the bridge between the
-processing steps and PDF-building steps of the project.</p></li>
-</ul>
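-
-<p>Putting these points together, a minimal hypothetical
-<code>reproduce/analysis/make/my-step.mk</code> (all the names below are
-place-holders) would look something like this:</p>
-
-<p><code>make
- # Hypothetical skeleton of a new workhorse-Makefile; its base-name
- # 'my-step' must also be added to 'makesrc' in the top-level Makefile.
- $(BDIR)/my-step/out.txt: reproduce/analysis/config/my-step.conf
-         mkdir -p $(BDIR)/my-step
-         do-something --param=$(my-parameter) &gt; $@
- $(mtexdir)/my-step.tex: $(BDIR)/my-step/out.txt
-         printf '\\newcommand{\\mystepresult}{%s}\n' "$$(cat $&lt;)" &gt; $@
-</code></p>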
-
-<h1>Customization checklist</h1>
-
-<p>Take the following steps to fully customize Maneage for your research
-project. After finishing the list, be sure to run <code>./project configure</code> and
-<code>./project make</code> to see if everything works correctly. If you notice anything
-missing or any incorrect part (probably a change that has not been
-explained here), please let us know to correct it.</p>
-
-<p>As described above, the concept of reproducibility (during a project)
-heavily relies on <a href="https://en.wikipedia.org/wiki/Version_control">version
-control</a>. Currently Maneage
-uses Git as its main version control system. If you are not already
-familiar with Git, please read the first three chapters of the <a href="https://git-scm.com/book/en/v2">ProGit
-book</a> which provides a wonderful practical
-understanding of the basics. You can read later chapters as you get more
-advanced in later stages of your work.</p>
-
-<h2>First custom commit</h2>
-
-<ol>
-<li><p><strong>Get this repository and its history</strong> (if you don't already have it):
- Arguably the easiest way to start is to clone Maneage and prepare for
- your customizations as shown below. After cloning, you first rename
- the default <code>origin</code> remote server to specify that this is Maneage's
- remote server. This will allow you to use the conventional <code>origin</code>
- name for your own project as shown in the next steps. Second, you will
- create and go into the conventional <code>master</code> branch to start
- committing in your project later.</p>
-
-<p><code>shell
- $ git clone https://git.maneage.org/project.git # Clone/copy the project and its history.
- $ mv project my-project # Change the name to your project's name.
- $ cd my-project # Go into the cloned directory.
- $ git remote rename origin origin-maneage # Rename current/only remote to "origin-maneage".
- $ git checkout -b master # Create and enter your own "master" branch.
- $ pwd # Just to confirm where you are.
-</code></p></li>
-<li><p><strong>Prepare to build project</strong>: The <code>./project configure</code> command of the
- next step will build the different software packages within the
- "build" directory (that you will specify). Nothing else on your system
- will be touched. However, since it takes a long time, it is useful to see
- what is being built at every instant (it is almost impossible to tell
- from the torrent of commands that are produced!). So open another
- terminal on your desktop and navigate to the same project directory
- that you cloned (output of last command above). Then run the following
- command. Once every second, this command will just print the date
- (possibly followed by a non-existent directory notice). But as soon as
- the next step starts building software, you'll see the names of
- software get printed as they are being built. Once a software package is
- installed in the project's build directory, its name will disappear from
- this list. Again,
- don't worry, nothing will be installed outside the build directory.</p>
-
-<p><code>shell
- # On another terminal (go to top project source directory, last command above)
- $ ./project --check-config
-</code></p></li>
-<li><p><strong>Test Maneage</strong>: Before making any changes, it is important to test it
- and see if everything works properly with the commands below. If there
- is any problem in the <code>./project configure</code> or <code>./project make</code> steps,
- please contact us to fix the problem before continuing. Since the
- building of dependencies during configuration can take a long time, you can
- take the next few steps (editing the files) while it is working (they don't
- affect the configuration). After <code>./project make</code> is finished, open
- <code>paper.pdf</code>. If it looks fine, you are ready to start customizing
- Maneage for your project. But before that, clean all the extra Maneage
- outputs with <code>./project make clean</code> as shown below.</p>
-
-<p>```shell
- $ ./project configure # Build the project's software environment (can take an hour or so).
- $ ./project make # Do the processing and build paper (just a simple demo).</p>
-
-<p># Open 'paper.pdf' and see if everything is ok.
- ```</p></li>
-<li><p><strong>Setup the remote</strong>: You can use any <a href="https://en.wikipedia.org/wiki/Comparison_of_source_code_hosting_facilities">hosting
- facility</a>
- that supports Git to keep an online copy of your project's version
- controlled history. We recommend <a href="https://gitlab.com">GitLab</a> because
- it is <a href="https://www.gnu.org/software/repo-criteria-evaluation.html">more ethical (although not
- perfect)</a>,
- and later you can also host GitLab on your own server. Anyway, create
- an account in your favorite hosting facility (if you don't already
- have one), and define a new project there. Please make sure <em>the newly
- created project is empty</em> (some services ask to include a <code>README</code> in
- a new project which is bad in this scenario, and will not allow you to
- push to it). It will give you a URL (usually starting with <code>git@</code> and
- ending in <code>.git</code>); put this URL in place of <code>XXXXXXXXXX</code> in the first
- command below. With the second command, "push" your <code>master</code> branch to
- your <code>origin</code> remote, and (with the <code>--set-upstream</code> option) set them
- to track/follow each other. However, the <code>maneage</code> branch is currently
- tracking/following your <code>origin-maneage</code> remote (automatically set
- when you cloned Maneage). So when pushing the <code>maneage</code> branch to your
- <code>origin</code> remote, you <em>shouldn't</em> use <code>--set-upstream</code>. With the last
- command, you can actually check this (which local and remote branches
- are tracking each other).</p>
-
-<p><code>shell
- git remote add origin XXXXXXXXXX # Newly created repo is now called 'origin'.
- git push --set-upstream origin master # Push 'master' branch to 'origin' (with tracking).
- git push origin maneage # Push 'maneage' branch to 'origin' (no tracking).
-</code></p></li>
-<li><p><strong>Title</strong>, <strong>short description</strong> and <strong>author</strong>: The title and basic
- information of your project's output PDF paper should be added in
- <code>paper.tex</code>. You should see the relevant place in the preamble (prior
- to <code>\begin{document}</code>). After you are done, run the <code>./project make</code>
- command again to see your changes in the final PDF, and make sure that
- your changes don't cause a crash in LaTeX. Of course, if you use a
- different LaTeX package/style for managing the title and authors (in
- particular a specific journal's style), please feel free to use
- your own methods after finishing this checklist and doing your first
- commit.</p></li>
-<li><p><strong>Delete dummy parts</strong>: Maneage contains some parts that are only for
- the initial/test run, mainly as a demonstration of important steps,
- which you can use as a reference in your own project. But they are
- not meant for any real analysis, so you should remove these parts as
- described below:</p>
-
-<ul>
-<li><p><code>paper.tex</code>: 1) Delete the text of the abstract (from
-<code>\includeabstract{</code> to <code>\vspace{0.25cm}</code>) and write your own (a
-single sentence can be enough now, you can complete it later). 2)
-Add some keywords under it in the keywords part. 3) Delete
-everything between <code>%% Start of main body.</code> and <code>%% End of main
-body.</code>. 4) Remove the notice in the "Acknowledgments" section (in
-<code>\new{}</code>) and acknowledge your funding sources (this can also be
-done later). Just don't delete the existing acknowledgment
-statement: Maneage is possible thanks to funding from several
-grants. Since Maneage is being used in your work, it is necessary to
-acknowledge them in your work also.</p></li>
-<li><p><code>reproduce/analysis/make/top-make.mk</code>: Delete the <code>delete-me</code> line
-in the <code>makesrc</code> definition. Just make sure there is no empty line
-between the <code>download \</code> and <code>verify \</code> lines (they should be
-directly under each other).</p></li>
-<li><p><code>reproduce/analysis/make/verify.mk</code>: In the final recipe, under the
-commented line <code>Verify TeX macros</code>, remove the full line that
-contains <code>delete-me</code>, and set the value of <code>s</code> in the line for
-<code>download</code> to <code>XXXXX</code> (any temporary string; you'll fix it at the
-end of your project, when it is complete).</p></li>
-<li><p>Delete all <code>delete-me*</code> files in the following directories:</p>
-
-<p><code>shell
-$ rm tex/src/delete-me*
-$ rm reproduce/analysis/make/delete-me*
-$ rm reproduce/analysis/config/delete-me*
-</code></p></li>
-<li><p>Disable verification of outputs by removing the <code>yes</code> from
-<code>reproduce/analysis/config/verify-outputs.conf</code>. Later, when you are
-ready to submit your paper, or publish the dataset, activate
-verification and make the proper corrections in this file (described
-under the "Other basic customizations" section below). This is a
-critical step and only takes a few minutes when your project is
-finished. So DON'T FORGET to activate it in the end.</p></li>
-<li><p>Re-make the project (after a cleaning) to see if you haven't
-introduced any errors.</p>
-
-<p><code>shell
-$ ./project make clean
-$ ./project make
-</code></p></li>
-</ul></li>
-<li><p><strong>Don't merge some files in future updates</strong>: As described below, you
- can later update your infra-structure (for example to fix bugs) by
- merging your <code>master</code> branch with <code>maneage</code>. For files that you have
- created in your own branch, there will be no problem. However, if you
- modify an existing Maneage file for your project, the next time it is
- updated on <code>maneage</code> you'll have an annoying conflict. The commands
- below show how to fix this future problem. With them, you can
- configure Git to ignore the changes in <code>maneage</code> for some of the files
- you have already edited and deleted above (and will edit below). Note
- that only the first <code>echo</code> command has a <code>&gt;</code> (to write over the file),
- the rest are <code>&gt;&gt;</code> (to append to it). If you want to prevent any other
- set of files from being imported from Maneage into your project's branch,
- you can follow a similar strategy. We recommend only doing it when you
- encounter the same conflict in more than one merge and there is no
- other change in that file. Also, don't add core Maneage Makefiles,
- otherwise Maneage can break on the next run.</p>
-
-<p><code>shell
- $ echo "paper.tex merge=ours" &gt; .gitattributes
- $ echo "tex/src/delete-me.mk merge=ours" &gt;&gt; .gitattributes
- $ echo "tex/src/delete-me-demo.mk merge=ours" &gt;&gt; .gitattributes
- $ echo "reproduce/analysis/make/delete-me.mk merge=ours" &gt;&gt; .gitattributes
- $ echo "reproduce/software/config/TARGETS.conf merge=ours" &gt;&gt; .gitattributes
- $ echo "reproduce/analysis/config/delete-me-num.conf merge=ours" &gt;&gt; .gitattributes
- $ git add .gitattributes
-</code></p></li>
-<li><p><strong>Copyright and License notice</strong>: It is necessary that <em>all</em> the
- "copyright-able" files in your project (those larger than 10 lines)
- have a copyright and license notice. Please take a moment to look at
- several existing files to see a few examples. The copyright notice is
- usually close to the start of the file, it is the line starting with
- <code>Copyright (C)</code> and containing a year and the author's name (like the
- examples below). The License notice is a short description of the
- copyright license, usually one or two paragraphs with a URL to the
- full license. Don't forget to add these <em>two</em> notices to <em>any new
- file</em> you add in your project (you can just copy-and-paste). When you
- modify an existing Maneage file (which already has the notices), just
- add a copyright notice in your name under the existing one(s), like
- the line with capital letters below. To start with, add this line with
- your name and email address to <code>paper.tex</code>,
- <code>tex/src/preamble-header.tex</code>, <code>reproduce/analysis/make/top-make.mk</code>,
- and generally, all the files you modified in the previous step.</p>
-
-<p><code>
- Copyright (C) 2018-2020 Existing Name &lt;existing@email.address&gt;
- Copyright (C) 2020 YOUR NAME &lt;YOUR@EMAIL.ADDRESS&gt;
-</code></p></li>
-<li><p><strong>Configure Git for the first time</strong>: If this is the first time you are
- running Git on this system, then you have to configure it with some
- basic information in order to have essential information in the commit
- messages (ignore this step if you have already done it). Git will
- include your name and e-mail address information in each commit. You
- can also specify your favorite text editor for making the commit
- (<code>emacs</code>, <code>vim</code>, <code>nano</code>, etc.).</p>
-
-<p><code>shell
- $ git config --global user.name "YourName YourSurname"
- $ git config --global user.email your-email@example.com
- $ git config --global core.editor nano
-</code></p></li>
-<li><p><strong>Your first commit</strong>: You have already made some small and basic
- changes in the steps above and you are in your project's <code>master</code>
- branch. So, you can officially make your first commit in your
- project's history and push it. But before that, you need to make sure
- that there are no problems in the project. It is a good habit to
- always re-build the project before a commit to be sure it works as
- expected.</p>
-
-<p><code>shell
- $ git status # See which files you have changed.
- $ git diff # Check the lines you have added/changed.
- $ ./project make # Make sure everything builds successfully.
- $ git add -u # Put all tracked changes in staging area.
- $ git status # Make sure everything is fine.
- $ git diff --cached # Confirm all the changes that will be committed.
- $ git commit # Your first commit: put a good description!
- $ git push # Push your commit to your remote.
-</code></p></li>
-<li><p><strong>Start your exciting research</strong>: You are now ready to add flesh and
- blood to this raw skeleton by further modifying and adding your
- exciting research steps. You can use the "published works" section in
- the introduction (above) as some fully working models to learn
- from. Also, don't hesitate to contact us if you have any
- questions.</p></li>
-</ol>
-
-<h2>Other basic customizations</h2>
-
-<ul>
-<li><p><strong>High-level software</strong>: Maneage installs all the software that your
- project needs. You can specify which software your project needs in
- <code>reproduce/software/config/TARGETS.conf</code>. The necessary software are
- classified into two classes: 1) programs or libraries (usually written
- in C/C++) which are run directly by the operating system. 2) Python
- modules/libraries that are run within Python. By default
- <code>TARGETS.conf</code> only has GNU Astronomy Utilities (Gnuastro) as one
- scientific program and Astropy as one scientific Python module. Both
- have many dependencies which will be installed into your project
- during the configuration step. To see a list of software that is
- currently ready to be built in Maneage, see
- <code>reproduce/software/config/versions.conf</code> (which also has their
- versions); the comments in <code>TARGETS.conf</code> describe how to use the software
- names from <code>versions.conf</code>. Currently the raw pipeline just uses
- Gnuastro to make the demonstration plots. Therefore if you don't need
- Gnuastro, go through the analysis steps in <code>reproduce/analysis</code> and
- remove all its use cases (clearly marked).</p></li>
-<li><p><strong>Input dataset</strong>: The input datasets are managed through the
- <code>reproduce/analysis/config/INPUTS.conf</code> file. It is best to gather all
- the information regarding all the input datasets into this one central
- file. To ensure that the proper dataset is being downloaded and used
- by the project, it is also recommended to get an <a href="https://en.wikipedia.org/wiki/MD5">MD5
- checksum</a> of the file and include
- that in <code>INPUTS.conf</code> so the project can check it automatically. The
- preparation/downloading of the input datasets is done in
- <code>reproduce/analysis/make/download.mk</code>. Have a look there to see how
- these values are to be used. This information about the input datasets
- is also used in the initial <code>configure</code> script (to inform the users),
- so also modify that file. You can find all occurrences of the demo
- dataset with the command below and replace them with your own input
- dataset.</p>
-
-<p><code>shell
- $ grep -ir wfpc2 ./*
-</code></p></li>
-<li><p><strong><code>README.md</code></strong>: Correct all the <code>XXXXX</code> place holders (name of your
- project, your own name, address of your project's online/remote
- repository, link to download dependencies, etc.). Generally, read
- over the text and update it where necessary to fit your project. Don't
- forget that this is the first file that is displayed on your online
- repository and also your colleagues will first be drawn to read this
- file. Therefore, make it as easy as possible for them to start
- with. Also check and update this file one last time when you are ready
- to publish your project's paper/source.</p></li>
-<li><p><strong>Verify outputs</strong>: During the initial customization checklist, you
- disabled verification. This is natural because during the project you
- need to make changes all the time and it is a waste of time to enable
- verification every time. But at significant moments of the project
- (for example before submission to a journal, or publication) it is
- necessary. When you activate verification, before building the paper,
- all the specified datasets will be compared with their respective
- checksum and if any file's checksum is different from the one recorded
- in the project, it will stop and print the problematic file and its
- expected and calculated checksums. First set the value of
- <code>verify-outputs</code> variable in
- <code>reproduce/analysis/config/verify-outputs.conf</code> to <code>yes</code>. Then go to
- <code>reproduce/analysis/make/verify.mk</code>. The verification of all the files
- is only done in one recipe. First the files that go into the
- plots/figures are checked, then the LaTeX macros. Validation of the
- former (inputs to plots/figures) should be done manually. If it is the
- first time you are doing this, you can see two examples of the dummy
- steps (with <code>delete-me</code>, you can use them if you like). These two
- examples should be removed before you can run the project. For the
- latter, you just have to update the checksums. The important thing to
- consider is that a simple checksum can be problematic because some
- file generators print their run-time date in the file (for example as
- commented lines in a text table). When checking text files, this
- Makefile already has this function:
- <code>verify-txt-no-comments-leading-space</code>. As the name suggests, it will
- remove comment lines and empty lines before calculating the MD5
- checksum. For the FITS format (common in astronomy), fortunately there is
- a <code>DATASUM</code> keyword which will return the checksum independent of
- the headers. You can use the provided function(s), or define one for
- your special formats.</p></li>
-<li><p><strong>Feedback</strong>: As you use Maneage you will notice many things that if
- implemented from the start would have been very useful for your
- work. This can be in the actual scripting and architecture of Maneage,
- or useful implementation and usage tips, like those below. In any
- case, please share your thoughts and suggestions with us, so we can
- add them here for everyone's benefit.</p></li>
-<li><p><strong>Re-preparation</strong>: Automatic preparation is only run in the first run
- of the project on a system; to re-do the preparation you have to use
- the option below. Here is the reason for this: the preparation process
- can be slow and will unnecessarily slow down the
- whole project while the project is under development (focus is on the
- analysis that is done after preparation). Because of this, preparation
- will be done automatically for the first time that the project is run
- (when <code>.build/software/preparation-done.mk</code> doesn't exist). After the
- preparation process completes once, future runs of <code>./project make</code>
- will not do the preparation process anymore (will not call
- <code>top-prepare.mk</code>). They will only call <code>top-make.mk</code> for the
- analysis. To manually invoke the preparation process after the first
- attempt, the <code>./project make</code> script should be run with the
- <code>--prepare-redo</code> option, or you can delete the special file above.</p>
-
-<p><code>shell
- $ ./project make --prepare-redo
-</code></p></li>
-<li><p><strong>Pre-publication: add notice on reproducibility</strong>: Add a notice
- somewhere prominent on the first page of your paper, informing the
- reader that your research is fully reproducible. For example at the
- end of the abstract, or under the keywords with a title like
- "reproducible paper". This will encourage them to publish their own
- works in this manner as well, and will also help spread the word.</p></li>
-</ul>
-
-<h1>Tips for designing your project</h1>
-
-<p>The following is a list of design points, tips, or recommendations that
-have been learned after some experience with this type of project
-management. Please don't hesitate to share with us any experience you gain
-after using it. In this way, we can add it here (giving full credit)
-for the benefit of others.</p>
-
-<ul>
-<li><p><strong>Modularity</strong>: Modularity is the key to easy and clean growth of a
- project. So it is always best to break up a job into as many
- sub-components as reasonable. Here are some tips to stay modular.</p>
-
-<ul>
-<li><p><em>Short recipes</em>: if you see the recipe of a rule becoming more than a
-handful of lines which involve significant processing, it is probably
-a good sign that you should break up the rule into its main
-components. Try to only have one major processing step per rule.</p></li>
-<li><p><em>Context-based (many) Makefiles</em>: For maximum modularity, this design
-allows easy inclusion of many Makefiles: in
-<code>reproduce/analysis/make/*.mk</code> for analysis steps, and
-<code>reproduce/software/make/*.mk</code> for building software. So keep the
-rules for closely related parts of the processing in separate
-Makefiles.</p></li>
-<li><p><em>Descriptive names</em>: Be very clear and descriptive with the naming of
-the files and the variables because a few months after the
-processing, it will be very hard to remember what each one was
-for. Also this helps others (your collaborators or other people
-reading the project source after it is published) to more easily
-understand your work and find their way around.</p></li>
-<li><p><em>Naming convention</em>: As the project grows, following a single standard
-or convention in naming the files is very useful. Try your best to use
-multiple word filenames for anything that is non-trivial (separating
-the words with a <code>-</code>). For example if you have a Makefile for
-creating a catalog and another two for processing it under models A
-and B, you can name them like this: <code>catalog-create.mk</code>,
-<code>catalog-model-a.mk</code> and <code>catalog-model-b.mk</code>. In this way, when
-listing the contents of <code>reproduce/analysis/make</code> to see all the
-Makefiles, those related to the catalog will all be close to each
-other and thus easily found. This also helps in auto-completions by
-the shell or text editors like Emacs.</p></li>
-<li><p><em>Source directories</em>: If you need to add files in other languages for
-example in shell, Python, AWK or C, keep the files in the same
-language in a separate directory under <code>reproduce/analysis</code>, with the
-appropriate name.</p></li>
-<li><p><em>Configuration files</em>: If your research uses special programs as part
-of the processing, put all their configuration files in a devoted
-directory (with the program's name) within
-<code>reproduce/software/config</code>. Similar to the
-<code>reproduce/software/config/gnuastro</code> directory (which is put in
-Maneage as a demo in case you use GNU Astronomy Utilities). It is
-much cleaner and readable (thus less buggy) to avoid mixing the
-configuration files, even if there is no technical necessity.</p></li>
-</ul></li>
-<li><p><strong>Contents</strong>: It is good practice to follow the following
- recommendations on the contents of your files, whether they are source
- code for a program, Makefiles, scripts or configuration files
- (copyrights aren't necessary for the latter).</p>
-
-<ul>
-<li><p><em>Copyright</em>: Always start a file containing programming constructs
-with a copyright statement like the ones that Maneage starts with
-(for example in the top level <code>Makefile</code>).</p></li>
-<li><p><em>Comments</em>: Comments are vital for readability (by yourself in two
-months, or others). Describe everything you can about why you are
-doing something, how you are doing it, and what you expect the result
-to be. Write the comments as if they were what you would say to describe
-the variable, recipe or rule to a friend sitting beside you. When
-writing the project it is very tempting to just steam ahead with
-commands and codes, but be patient and write comments before the
-rules or recipes. This will also allow you to think more about what
-you should be doing. Also, in several months when you come back to
-the code, you will appreciate the effort of writing them. Just don't
-forget to also read and update the comment first if you later want to
-make changes to the code (variable, recipe or rule). As a general
-rule of thumb: first the comments, then the code.</p></li>
-<li><p><em>File title</em>: In general, it is good practice to start all files with
-a single line description of what that particular file does. If
-further information about the totality of the file is necessary, add
-it after a blank line. This will help a fast inspection where you
-don't care about the details, but just want to remember/see what that
-file is (generally) for. This information must of course be commented
-(its for a human), but this is kept separate from the general
-recommendation on comments, because this is a comment for the whole
-file, not each step within it.</p></li>
-</ul></li>
-<li><p><strong>Make programming</strong>: Here are some experiences that we have come to
- learn over the years in using Make and are useful/handy in research
- contexts.</p>
-
-<ul>
-<li><p><em>Environment of each recipe</em>: If you need to define a special
-environment (or aliases, or scripts to run) for all the recipes in
-your Makefiles, you can use a Bash startup file
-<code>reproduce/software/shell/bashrc.sh</code>. This file is loaded before every
-Make recipe is run, just like the <code>.bashrc</code> in your home directory is
-loaded every time you start a new interactive, non-login terminal. See
-the comments in that file for more.</p></li>
-<li><p><em>Automatic variables</em>: These are wonderful and very useful Make
-constructs that greatly shrink the text, while helping in
-readability, robustness (fewer bugs from typos, for example) and
-generalization. For example even when a rule only has one target or
-one prerequisite, always use <code>$@</code> instead of the target's name, <code>$&lt;</code>
-instead of the first prerequisite, <code>$^</code> instead of the full list of
-prerequisites, etc. You can see the full list of automatic
-variables
-<a href="https://www.gnu.org/software/make/manual/html_node/Automatic-Variables.html">here</a>. If
-you use GNU Make, you can also see this page on your command-line:</p>
-
-<p><code>shell
-$ info make "automatic variables"
-</code></p></li>
-<li><p><em>Debug</em>: Since Make doesn't follow the common top-down paradigm, it
-can be a little hard to get accustomed to why you get an error or
-unexpected behavior. In such cases, run Make with the <code>-d</code>
-option. With this option, Make prints a full list of exactly which
-prerequisites are being checked for which targets. Looking
-(patiently) through this output and searching for the faulty
-file/step will clearly show you any mistake you might have made in
-defining the targets or prerequisites.</p></li>
-<li><p><em>Large files</em>: If you are dealing with very large files (thus having
-multiple copies of them for intermediate steps is not possible), one
-solution is the following strategy (Also see the next item on "Fast
-access to temporary files"). Set a small plain text file as the
-actual target and delete the large file when it is no longer needed
-by the project (in the last rule that needs it). Below is a simple
-demonstration of doing this. In it, we use Gnuastro's Arithmetic
-program to add all pixels of the input image with 2 and create
-<code>large1.fits</code>. We then subtract 2 from <code>large1.fits</code> to create
-<code>large2.fits</code> and delete <code>large1.fits</code> in the same rule (when its no
-longer needed). We can later do the same with <code>large2.fits</code> when it
-is no longer needed and so on.
-<code>
+<!DOCTYPE html>
+<!--
+ Webpage of Maneage: a framework for managing data lineage
+
+ Copyright (C) 2020, Mohammad Akhlaghi <mohammad@akhlaghi.org>
+
+ This file is part of Maneage. Maneage is free software: you can
+ redistribute it and/or modify it under the terms of the GNU General
+ Public License as published by the Free Software Foundation, either
+ version 3 of the License, or (at your option) any later version.
+
+ Maneage is distributed in the hope that it will be useful, but
+ WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ General Public License for more details. See
+ <http://www.gnu.org/licenses/>. -->
+
+ <html lang="en-US">
+
+ <!-- HTML Header -->
+ <head>
+ <!-- Title of the page. -->
+ <title>Maneage -- Managing data lineage</title>
+
+    <!-- Enable UTF-8 encoding to easily use non-ASCII characters -->
+ <meta charset="UTF-8">
+ <meta http-equiv="Content-type" content="text/html; charset=UTF-8">
+
+ <!-- Put logo beside the address bar -->
+ <link rel="shortcut icon" href="./img/favicon.svg" />
+
+    <!-- The viewport meta tag is placed mainly for mobile browsers
+      that are pre-configured in different ways (for example setting a
+      different width for the page than the actual width of the device,
+      or zooming to different values). Without this, the CSS media
+      queries might not work properly on all mobile browsers. -->
+ <meta name="viewport"
+ content="width=device-width, initial-scale=1">
+
+ <!-- Basic styles -->
+ <link rel="stylesheet" href="css/base.css" />
+ </head>
+
+
+
+
+ <!-- Start the main body. -->
+ <body>
+ <div id="container">
+
+ <h1>Maneage: managing data lineage</h1>
+
+ <p>Copyright (C) 2018-2020 Mohammad Akhlaghi <a href="&#109;&#x61;&#x69;&#x6C;&#x74;&#x6F;:&#x6D;&#111;&#104;&#97;&#x6D;&#109;a&#x64;&#64;&#x61;&#107;&#x68;&#x6C;&#x61;&#x67;&#104;&#x69;.&#x6F;&#x72;&#103;">&#x6D;&#111;&#104;&#97;&#x6D;&#109;a&#x64;&#64;&#x61;&#107;&#x68;&#x6C;&#x61;&#x67;&#104;&#x69;.&#x6F;&#x72;&#103;</a><br />
+ Copyright (C) 2020 Raul Infante-Sainz <a href="m&#x61;&#105;&#108;t&#111;:&#x69;&#x6E;&#x66;&#x61;&#x6E;&#116;&#101;&#115;&#97;&#x69;n&#122;&#64;&#103;&#x6D;&#x61;&#x69;&#x6C;&#x2E;&#x63;&#111;&#x6D;">&#x69;&#x6E;&#x66;&#x61;&#x6E;&#116;&#101;&#115;&#97;&#x69;n&#122;&#64;&#103;&#x6D;&#x61;&#x69;&#x6C;&#x2E;&#x63;&#111;&#x6D;</a><br />
+ See the end of the file for license conditions.</p>
+
+ <p>Maneage is a <strong>fully working template</strong> for doing reproducible research (or
+ writing a reproducible paper) as defined in the link below. If the link
+ below is not accessible at the time of reading, please see the appendix at
+ the end of this file for a portion of its introduction. Some
+ <a href="http://akhlaghi.org/pdf/reproducible-paper.pdf">slides</a> are also available
+ to help demonstrate the concept implemented here.</p>
+
+ <p>http://akhlaghi.org/reproducible-science.html</p>
+
+ <p>Maneage is created with the aim of supporting reproducible research by
+ making it easy to start a project in this framework. As shown below, it is
+ very easy to customize Maneage for any particular (research) project and
+ expand it as it starts and evolves. It can be run with no modification (as
+ described in <code>README.md</code>) as a demonstration and customized for use in any
+ project as fully described below.</p>
+
+ <p>A project designed using Maneage will download and build all the necessary
+ libraries and programs for working in a closed environment (highly
+ independent of the host operating system) with fixed versions of the
+ necessary dependencies. The tarballs for building the local environment are
+ also collected in a <a href="http://git.maneage.org/tarballs-software.git/tree/">separate
+ repository</a>. The final
+ output of the project is <a href="http://git.maneage.org/output-raw.git/plain/paper.pdf">a
+ paper</a>. Notice the
+ last paragraph of the Acknowledgments where all the necessary software are
+ mentioned with their versions.</p>
+
+ <p>Below, we start with a discussion of why Make was chosen as the high-level
+ language/framework for project management and how to learn and master Make
+ easily (and freely). The general architecture and design of the project is
+ then discussed to help you navigate the files and their contents. This is
+ followed by a checklist for the easy/fast customization of Maneage to your
+ exciting research. We continue with some tips and guidelines on how to
+ manage or extend your project as it grows based on our experiences with it
+ so far. The main body concludes with a description of possible future
+ improvements that are planned for Maneage (but not yet implemented). As
+ discussed above, we end with a short introduction on the necessity of
+ reproducible science in the appendix.</p>
+
+ <p>Please don't forget to share your thoughts, suggestions and
+ criticisms. Maintaining and designing Maneage is itself a separate project,
+ so please join us if you are interested. Once it is mature enough, we will
+ describe it in a paper (written by all contributors) for a formal
+ introduction to the community.</p>
+
+ <h2>Why Make?</h2>
+
+ <p>When batch processing is necessary (no manual intervention, as in a
+      reproducible project), shell scripts are usually the first solution that
+      comes to mind. However, the inherent complexity and non-linearity of
+ progress in a scientific project (where experimentation is key) make it
+ hard to manage the script(s) as the project evolves. For example, a script
+ will start from the top/start every time it is run. So if you have already
+ completed 90% of a research project and want to run the remaining 10% that
+ you have newly added, you have to run the whole script from the start
+      again. Only then will you see the effects of the newly added steps (to
+      find possible errors, better solutions, and so on).</p>
+
+      <p>It is possible to manually ignore/comment parts of a script to only run a
+      specific part. However, such checks/comments will only add to the complexity
+      of the script and will discourage you from playing with/changing an already
+      completed part of the project when an idea suddenly comes up. It is also
+      prone to very serious bugs when you finally try to reproduce everything from
+      scratch. Such bugs are very hard to notice during the work and frustrating
+      to find in the end.</p>
+
+ <p>The Make paradigm, on the other hand, starts from the end: the final
+ <em>target</em>. It builds a dependency tree internally, and finds where it should
+ start each time the project is run. Therefore, in the scenario above, a
+      researcher who has just added the final 10% of steps of her research to
+      her Makefile will only have to run those extra steps. With Make, it is
+      also trivial to change the processing of any intermediate <em>rule</em> (or
+      step) in the middle of an already written analysis: the next
+      time Make is run, only the rules that are affected by the changes/additions
+      will be re-run, not the whole analysis/project.</p>
+
+ <p>This greatly speeds up the processing (enabling creative changes), while
+ keeping all the dependencies clearly documented (as part of the Make
+ language), and most importantly, enabling full reproducibility from scratch
+ with no changes in the project code that was working during the
+      research. This allows robust results and lets the scientists get to what
+      they do best: experiment and be critical of the methods/analysis without
+ having to waste energy and time on technical problems that come up as a
+ result of that experimentation in scripts.</p>
+
+ <p>Since the dependencies are clearly demarcated in Make, it can identify
+ independent steps and run them in parallel. This further speeds up the
+      processing. Make was designed for this purpose. It is how huge projects
+      like all Unix-like operating systems (including GNU/Linux and macOS)
+      and their core components are built. Therefore, Make is
+      a highly mature paradigm/system with robust and highly efficient
+      implementations on various operating systems, perfectly suited for a complex,
+      non-linear research project.</p>
+
+ <p>Make is a small language with the aim of defining <em>rules</em> containing
+ <em>targets</em>, <em>prerequisites</em> and <em>recipes</em>. It comes with some nice features
+      like functions and automatic variables to greatly facilitate the management
+ of text (filenames for example) or any of those constructs. For a more
+ detailed (yet still general) introduction see the article on Wikipedia:</p>
+
+ <p>https://en.wikipedia.org/wiki/Make_(software)</p>
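+
+      <p>For example, the following minimal sketch (with hypothetical file names,
+      not taken from Maneage itself) shows all three constructs: <code>stats.txt</code> is
+      the <em>target</em>, <code>input.txt</code> is its <em>prerequisite</em>, and the indented command is
+      the <em>recipe</em> (recipe lines must start with a TAB character):</p>
+
+      <pre><code>
+<span class="comment"># Hypothetical rule: re-run only when 'input.txt' is newer than 'stats.txt'.</span>
+stats.txt: input.txt
+	wc -l input.txt &gt; stats.txt
+      </code></pre>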
+
+      <p>Make is more than 40 years old and still evolving; therefore, many
+      implementations of Make exist. The only difference between them is some extra
+      features over the <a href="https://pubs.opengroup.org/onlinepubs/009695399/utilities/make.html">standard
+      definition</a>
+      (which is shared by all of them). Maneage is primarily written in GNU Make,
+      which is the most common, most actively developed, and most advanced
+      implementation. Just note that Maneage downloads, builds, internally
+      installs, and uses its own dependencies (including GNU Make), so you don't
+      have to have it installed before you try it out.</p>
+
+ <h2>How can I learn Make?</h2>
+
+      <p>The GNU Make book/manual (links below) is arguably the best place to learn
+      Make. It is an excellent book whose first few chapters are deliberately
+      non-technical, to help you get started easily. It
+      is freely available and always up to date with the current GNU Make
+      release. It also clearly explains which features are specific to GNU Make
+      and which are general to all implementations, so the first few chapters
+      on the generalities are useful no matter which implementation you use.</p>
+
+      <p>The first link below points to the GNU Make manual in various formats, and
+      from the second you can download it as a PDF (which may be easier for a
+      first reading).</p>
+
+ <p>https://www.gnu.org/software/make/manual/</p>
+
+ <p>https://www.gnu.org/software/make/manual/make.pdf</p>
+
+ <p>If you use GNU Make, you also have the whole GNU Make manual on the
+ command-line with the following command (you can come out of the "Info"
+ environment by pressing <code>q</code>).</p>
+
+ <pre><code>
+info make
+ </code></pre>
+
+ <p>If you aren't familiar with the Info documentation format, we strongly
+ recommend running <code>$ info info</code> and reading along. In less than an hour,
+ you will become highly proficient in it (it is very simple and has a great
+ manual for itself). Info greatly simplifies your access (without taking
+ your hands off the keyboard!) to many manuals that are installed on your
+ system, allowing you to be much more efficient as you work. If you use the
+ GNU Emacs text editor (or any of its variants), you also have access to all
+ Info manuals while you are writing your projects (again, without taking
+ your hands off the keyboard!).</p>
+
+ <h2>Published works using Maneage</h2>
+
+ <p>The list below shows some of the works that have already been published
+ with (earlier versions of) Maneage. Previously it was simply called
+ "Reproducible paper template". Note that Maneage is evolving, so some
+      details may be different in them. The more recent ones can be used as
+      good working examples.</p>
+
+ <ul>
+ <li><p>Infante-Sainz et
+ al. (<a href="https://ui.adsabs.harvard.edu/abs/2020MNRAS.491.5317I">2020</a>,
+ MNRAS, 491, 5317): The version controlled project source is available
+ <a href="https://gitlab.com/infantesainz/sdss-extended-psfs-paper">on GitLab</a>
+ and is also archived on Zenodo with all the necessary software tarballs:
+ <a href="https://zenodo.org/record/3524937">zenodo.3524937</a>.</p></li>
+ <li><p>Akhlaghi (<a href="https://arxiv.org/abs/1909.11230">2019</a>, IAU Symposium
+ 355). The version controlled project source is available <a href="https://gitlab.com/makhlaghi/iau-symposium-355">on
+ GitLab</a> and is also
+ archived on Zenodo with all the necessary software tarballs:
+ <a href="https://doi.org/10.5281/zenodo.3408481">zenodo.3408481</a>.</p></li>
+ <li><p>Section 7.3 of Bacon et
+ al. (<a href="http://adsabs.harvard.edu/abs/2017A%26A...608A...1B">2017</a>, A&amp;A
+ 608, A1): The version controlled project source is available <a href="https://gitlab.com/makhlaghi/muse-udf-origin-only-hst-magnitudes">on
+ GitLab</a>
+ and a snapshot of the project along with all the necessary input
+ datasets and outputs is available in
+ <a href="https://doi.org/10.5281/zenodo.1164774">zenodo.1164774</a>.</p></li>
+ <li><p>Section 4 of Bacon et
+ al. (<a href="http://adsabs.harvard.edu/abs/2017A%26A...608A...1B">2017</a>, A&amp;A,
+ 608, A1): The version controlled project is available <a href="https://gitlab.com/makhlaghi/muse-udf-photometry-astrometry">on
+ GitLab</a> and
+ a snapshot of the project along with all the necessary input datasets is
+ available in <a href="https://doi.org/10.5281/zenodo.1163746">zenodo.1163746</a>.</p></li>
+ <li><p>Akhlaghi &amp; Ichikawa
+ (<a href="http://adsabs.harvard.edu/abs/2015ApJS..220....1A">2015</a>, ApJS, 220,
+ 1): The version controlled project is available <a href="https://gitlab.com/makhlaghi/NoiseChisel-paper">on
+ GitLab</a>. This is the
+ very first (and much less mature!) incarnation of Maneage: the history
+ of Maneage started more than two years after this paper was
+ published. It is a very rudimentary/initial implementation, thus it is
+ only included here for historical reasons. However, the project source
+ is complete, accurate and uploaded to arXiv along with the paper.</p></li>
+ </ul>
+
+ <h2>Citation</h2>
+
+ <p>A paper to fully describe Maneage has been submitted. Until then, if you
+ used it in your work, please cite the paper that implemented its first
+ version: Akhlaghi &amp; Ichikawa
+ (<a href="http://adsabs.harvard.edu/abs/2015ApJS..220....1A">2015</a>, ApJS, 220, 1).</p>
+
+ <p>Also, when your paper is published, don't forget to add a notice in your
+ own paper (in coordination with the publishing editor) that the paper is
+      fully reproducible and possibly add a sentence or paragraph at the end of
+      the paper briefly describing the concept. This will help spread the word
+ and encourage other scientists to also manage and publish their projects in
+ a reproducible manner.</p>
+
+ <h1>Project architecture</h1>
+
+ <p>In order to customize Maneage to your research, it is important to first
+ understand its architecture so you can navigate your way in the directories
+ and understand how to implement your research project within its framework:
+ where to add new files and which existing files to modify for what
+      purpose. But if this is the first time you are using Maneage, before reading
+      this theoretical discussion, please run Maneage once from scratch without
+      any changes (described in <code>README.md</code>). You will see how it works (note that
+      the configure step builds all the necessary software, so it can take a long
+      time, but you can continue reading while it is working).</p>
+
+ <p>The project has two top-level directories: <code>reproduce</code> and
+ <code>tex</code>. <code>reproduce</code> hosts all the software building and analysis
+ steps. <code>tex</code> contains all the final paper's components to be compiled into
+ a PDF using LaTeX.</p>
+
+ <p>The <code>reproduce</code> directory has two sub-directories: <code>software</code> and
+ <code>analysis</code>. As the name says, the former contains all the instructions to
+ download, build and install (independent of the host operating system) the
+ necessary software (these are called by the <code>./project configure</code>
+ command). The latter contains instructions on how to use those software to
+ do your project's analysis.</p>
+
+ <p>After it finishes, <code>./project configure</code> will create the following symbolic
+ links in the project's top source directory: <code>.build</code> which points to the
+ top build directory and <code>.local</code> for easy access to the custom built
+ software installation directory. With these you can easily access the build
+ directory and project-specific software from your top source directory. For
+ example if you run <code>.local/bin/ls</code> you will be using the <code>ls</code> of Maneage,
+ which is probably different from your system's <code>ls</code> (run them both with
+ <code>--version</code> to check).</p>
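+
+      <p>For example, after configuration finishes, you can compare the two like
+      this:</p>
+
+      <pre><code>
+.local/bin/ls --version    <span class="comment"># The 'ls' that Maneage built for this project.</span>
+ls --version               <span class="comment"># Your operating system's own 'ls'.</span>
+      </code></pre>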
+
+ <p>Once the project is configured for your system, <code>./project make</code> will do
+ the basic preparations and run the project's analysis with the custom
+ version of software. The <code>project</code> script is just a wrapper, and with the
+ <code>make</code> argument, it will first call <code>top-prepare.mk</code> and <code>top-make.mk</code>
+ (both are in the <code>reproduce/analysis/make</code> directory).</p>
+
+      <p>In terms of organization, <code>top-prepare.mk</code> and <code>top-make.mk</code> have an
+      almost identical design, with only minor differences. So, let's continue
+      describing Maneage's architecture with <code>top-make.mk</code>. Once you understand that, you'll clearly
+ understand <code>top-prepare.mk</code> also. These very high-level files are
+ relatively short and heavily commented so hopefully the descriptions in
+ each comment will be enough to understand the general details. As you read
+ this section, please also look at the contents of the mentioned files and
+ directories to fully understand what is going on.</p>
+
+      <p>Before starting to look into <code>top-make.mk</code>, it is important to
+ recall that Make defines dependencies by files. Therefore, the
+ input/prerequisite and output of every step/rule must be a file. Also
+ recall that Make will use the modification date of the prerequisite(s) and
+ target files to see if the target must be re-built or not. Therefore during
+ the processing, <em>many</em> intermediate files will be created (see the tips
+ section below on a good strategy to deal with large/huge files).</p>
+
+ <p>To keep the source and (intermediate) built files separate, the user <em>must</em>
+ define a top-level build directory variable (or <code>$(BDIR)</code>) to host all the
+ intermediate files (you defined it during <code>./project configure</code>). This
+      directory doesn't need to be version controlled, or even synchronized or
+      backed up on other servers: its contents are all products, and can be
+ easily re-created any time. As you define targets for your new rules, it is
+ thus important to place them all under sub-directories of <code>$(BDIR)</code>. As
+ mentioned above, you always have fast access to this "build"-directory with
+      the <code>.build</code> symbolic link. Also, be careful to <em>never</em> manually change
+      the files in the build directory; just delete them (so they are
+      re-built).</p>
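+
+      <p>As a minimal sketch (the variable and directory names below are
+      hypothetical, only to illustrate the convention), a workhorse-Makefile can
+      define its own sub-directory of <code>$(BDIR)</code> and create it with a simple rule;
+      other rules can then place their targets inside it:</p>
+
+      <pre><code>
+<span class="comment"># Hypothetical example: keep this Makefile's outputs under the build directory.</span>
+a1dir = $(BDIR)/analysis1
+
+$(a1dir):
+	mkdir -p $@
+      </code></pre>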
+
+ <p>In this architecture, we have two types of Makefiles that are loaded into
+ the top <code>Makefile</code>: <em>configuration-Makefiles</em> (only independent
+ variables/configurations) and <em>workhorse-Makefiles</em> (Makefiles that
+ actually contain analysis/processing rules).</p>
+
+ <p>The configuration-Makefiles are those that satisfy these two wildcards:
+ <code>reproduce/software/config/*.conf</code> (for building the necessary software
+ when you run <code>./project configure</code>) and <code>reproduce/analysis/config/*.conf</code>
+ (for the high-level analysis, when you run <code>./project make</code>). These
+      Makefiles don't actually have any rules; they just have values for various
+ free parameters throughout the configuration or analysis. Open a few of
+ them to see for yourself. These Makefiles must only contain raw Make
+ variables (project configurations). By "raw" we mean that the Make
+ variables in these files must not depend on variables in any other
+ configuration-Makefile. This is because we don't want to assume any order
+ in reading them. It is also very important to <em>not</em> define any rule, or
+ other Make construct, in these configuration-Makefiles.</p>
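+
+      <p>For example, a hypothetical configuration-Makefile (the file name and
+      values below are only for illustration) contains nothing but raw variable
+      assignments like these:</p>
+
+      <pre><code>
+<span class="comment"># reproduce/analysis/config/detection.conf (hypothetical example)</span>
+detection-snr-threshold = 5
+detection-min-area      = 10
+      </code></pre>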
+
+      <p>Following this rule of thumb enables you to set these configuration-Makefiles
+      as prerequisites to any target that depends on their variable
+ values. Therefore, if you change any of their values, all targets that
+ depend on those values will be re-built. This is very convenient as your
+ project scales up and gets more complex.</p>
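+
+      <p>Continuing the hypothetical sketches above, a rule that uses
+      <code>detection-snr-threshold</code> simply lists its configuration-Makefile as a
+      prerequisite, so its target is re-built whenever that value is changed
+      (<code>your-detection-program</code> is a stand-in for whatever software you actually
+      use):</p>
+
+      <pre><code>
+<span class="comment"># Hypothetical rule: re-built if the input OR the configuration changes.</span>
+$(a1dir)/catalog.txt: input.fits reproduce/analysis/config/detection.conf
+	your-detection-program --snr=$(detection-snr-threshold) input.fits &gt; $@
+      </code></pre>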
+
+      <p>The workhorse-Makefiles are those satisfying these two wildcards:
+ <code>reproduce/software/make/*.mk</code> and <code>reproduce/analysis/make/*.mk</code>. They
+ contain the details of the processing steps (Makefiles containing
+ rules). Therefore, in this phase <em>order is important</em>, because the
+ prerequisites of most rules will be the targets of other rules that will be
+ defined prior to them (not a fixed name like <code>paper.pdf</code>). The lower-level
+ rules must be imported into Make before the higher-level ones.</p>
+
+ <p>All processing steps are assumed to ultimately (usually after many rules)
+ end up in some number, image, figure, or table that will be included in the
+ paper. The writing of these results into the final report/paper is managed
+ through separate LaTeX files that only contain macros (a name given to a
+ number/string to be used in the LaTeX source, which will be replaced when
+ compiling it to the final PDF). So the last target in a workhorse-Makefile
+ is a <code>.tex</code> file (with the same base-name as the Makefile, but in
+ <code>$(BDIR)/tex/macros</code>). As a result, if the targets in a workhorse-Makefile
+ aren't directly a prerequisite of other workhorse-Makefile targets, they
+ can be a prerequisite of that intermediate LaTeX macro file and thus be
+ called when necessary. Otherwise, they will be ignored by Make.</p>
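+
+      <p>As a rough sketch of this convention (the real rules in Maneage's
+      workhorse-Makefiles are more complete; the names below continue the
+      hypothetical examples above), the final rule of a Makefile called
+      <code>XXXXXX.mk</code> writes one LaTeX macro per result into
+      <code>$(BDIR)/tex/macros/XXXXXX.tex</code>:</p>
+
+      <pre><code>
+<span class="comment"># Sketch of the final rule of a hypothetical 'XXXXXX.mk'.</span>
+$(BDIR)/tex/macros/XXXXXX.tex: $(a1dir)/catalog.txt
+	n=$$(wc -l &lt; $(a1dir)/catalog.txt); \
+	echo "\newcommand{\catalognumrows}{$$n}" &gt; $@
+      </code></pre>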
+
+ <p>Maneage also has a mode to share the build directory between several
+ users of a Unix group (when working on large computer clusters). In this
+ scenario, each user can have their own cloned project source, but share the
+ large built files between each other. To do this, it is necessary for all
+      built files to give full permissions to group members while not allowing any
+      other users access to the contents. Therefore the <code>./project configure</code> and
+      <code>./project make</code> steps must be called with special conditions, which are
+      managed by the <code>--group</code> option.</p>
+
+ <p>Let's see how this design is implemented. Please open and inspect
+      <code>top-make.mk</code> as we go along here. The first step (un-commented line) is
+ to import the local configuration (your answers to the questions of
+ <code>./project configure</code>). They are defined in the configuration-Makefile
+ <code>reproduce/software/config/LOCAL.conf</code> which was also built by <code>./project
+ configure</code> (based on the <code>LOCAL.conf.in</code> template of the same directory).</p>
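+
+      <p>In Make syntax, that first step is essentially an <code>include</code> line like the
+      following (shown here only as a sketch of the concept):</p>
+
+      <pre><code>
+include reproduce/software/config/LOCAL.conf
+      </code></pre>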
+
+      <p>The next non-commented set of lines in the top <code>Makefile</code> defines the ultimate
+ target of the whole project (<code>paper.pdf</code>). But to avoid mistakes, a sanity
+ check is necessary to see if Make is being run with the same group settings
+ as the configure script (for example when the project is configured for
+ group access using the <code>./for-group</code> script, but Make isn't). Therefore we
+ use a Make conditional to define the <code>all</code> target based on the group
+ permissions.</p>
+
+ <p>Having defined the top/ultimate target, our next step is to include all the
+ other necessary Makefiles. However, order matters in the importing of
+ workhorse-Makefiles and each must also have a TeX macro file with the same
+ base name (without a suffix). Therefore, the next step in the top-level
+ Makefile is to define the <code>makesrc</code> variable to keep the base names
+ (without a <code>.mk</code> suffix) of the workhorse-Makefiles that must be imported,
+ in the proper order.</p>
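+
+      <p>As a sketch, in the unmodified (demo) project the definition looks roughly
+      like the following (the exact list in your version may differ; <code>delete-me</code>
+      is the demonstration Makefile that you will remove during customization):</p>
+
+      <pre><code>
+<span class="comment"># Sketch of 'makesrc' (order matters; 'paper' must be last).</span>
+makesrc = download \
+          delete-me \
+          verify \
+          paper
+      </code></pre>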
+
+ <p>Finally, we import all the necessary remaining Makefiles: 1) All the
+ analysis configuration-Makefiles with a wildcard. 2) The software
+      configuration-Makefile that contains their versions (just in case it is
+ necessary). 3) All workhorse-Makefiles in the proper order using a Make
+ <code>foreach</code> loop.</p>
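+
+      <p>Conceptually, these imports boil down to <code>include</code> statements like the
+      sketch below (the real lines in <code>top-make.mk</code> also import the software
+      version configuration and may differ slightly):</p>
+
+      <pre><code>
+<span class="comment"># Analysis configuration-Makefiles (no order necessary).</span>
+include reproduce/analysis/config/*.conf
+
+<span class="comment"># Workhorse-Makefiles, in the order given by 'makesrc'.</span>
+include $(foreach s,$(makesrc),reproduce/analysis/make/$(s).mk)
+      </code></pre>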
+
+ <p>In short, to keep things modular, readable and manageable, follow these
+ recommendations: 1) Set clear-to-understand names for the
+ configuration-Makefiles, and workhorse-Makefiles, 2) Only import other
+ Makefiles from top Makefile. These will let you know/remember generally
+ which step you are taking before or after another. Projects will scale up
+ very fast. Thus if you don't start and continue with a clean and robust
+ convention like this, in the end it will become very dirty and hard to
+ manage/understand (even for yourself). As a general rule of thumb, break
+ your rules into as many logically-similar but independent steps as
+ possible.</p>
+
+ <p>The <code>reproduce/analysis/make/paper.mk</code> Makefile must be the final Makefile
+ that is included. This workhorse Makefile ends with the rule to build
+ <code>paper.pdf</code> (final target of the whole project). If you look in it, you
+ will notice that this Makefile starts with a rule to create
+ <code>$(mtexdir)/project.tex</code> (<code>mtexdir</code> is just a shorthand name for
+ <code>$(BDIR)/tex/macros</code> mentioned before). As you see, the only dependency of
+ <code>$(mtexdir)/project.tex</code> is <code>$(mtexdir)/verify.tex</code> (which is the last
+ analysis step: it verifies all the generated results). Therefore,
+ <code>$(mtexdir)/project.tex</code> is <em>the connection</em> between the
+ processing/analysis steps of the project, and the steps to build the final
+ PDF.</p>
+
+ <p>During the research, it often happens that you want to test a step that is
+ not a prerequisite of any higher-level operation. In such cases, you can
+ (temporarily) define that processing as a rule in the most relevant
+ workhorse-Makefile and set its target as a prerequisite of its TeX
+ macro. If your test gives a promising result and you want to include it in
+      your research, set it as a prerequisite to other rules and remove it from
+      the list of prerequisites of the TeX macro file. In fact, this is how a
+ project is designed to grow in this framework.</p>
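+
+      <p>For example (again a hypothetical sketch, re-using the names from the
+      architecture section above), a quick test can be wired in like this; note
+      that a rule with no recipe simply adds a prerequisite to an existing
+      target:</p>
+
+      <pre><code>
+<span class="comment"># Temporary test, placed in the most relevant workhorse-Makefile.</span>
+$(a1dir)/test.txt: $(a1dir)/catalog.txt
+	sort -n $&lt; &gt; $@
+
+<span class="comment"># Make it a prerequisite of this Makefile's TeX macro file so it gets built.</span>
+$(mtexdir)/XXXXXX.tex: $(a1dir)/test.txt
+      </code></pre>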
+
+ <h2>File modification dates (meta data)</h2>
+
+ <p>While Git does an excellent job at keeping a history of the contents of
+      files, it makes no effort to keep the file metadata, in particular
+      the dates of files. Therefore, when you check out a different branch,
+ files that are re-written by Git will have a newer date than the other
+ project files. However, file dates are important in the current design of
+ Maneage: Make checks the dates of the prerequisite files and target files
+ to see if the target should be re-built.</p>
+
+      <p>To fix this problem, for Maneage we use a forked version of
+      <a href="https://github.com/mohammad-akhlaghi/metastore">Metastore</a>. Metastore uses
+      a binary database file (which is called <code>.file-metadata</code>) to keep the
+      modification dates of all the files under version control. This file is
+      also under version control, but is hidden (because it shouldn't be modified
+      by hand). During the project's configuration, Maneage installs two Git hooks
+      to run Metastore: 1) before making a commit, to update its database with the
+      file dates in a branch, and 2) after doing a checkout, to set the
+      file dates back to what they were on that branch once the checkout is
+      complete.</p>
+
+ <p>In practice, Metastore should work almost fully invisibly within your
+ project. The only place you might notice its presence is that you'll see
+ <code>.file-metadata</code> in the list of modified/staged files (commonly after
+ merging your branches). Since its a binary file, Git also won't show you
+ the changed contents. In a merge, you can simply accept any changes with
+ <code>git add -u</code>. But if Git is telling you that it has changed without a merge
+ (for example if you started a commit, but canceled it in the middle), you
+ can just do <code>git checkout .file-metadata</code> and set it back to its original
+ state.</p>
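+
+      <p>In short, these are the two ordinary Git commands mentioned above:</p>
+
+      <pre><code>
+git add -u                     <span class="comment"># Accept '.file-metadata' changes when merging.</span>
+git checkout .file-metadata    <span class="comment"># Reset it if it changed outside of a merge.</span>
+      </code></pre>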
+
+ <h2>Summary</h2>
+
+ <p>Based on the explanation above, some major design points you should have in
+ mind are listed below.</p>
+
+ <ul>
+ <li><p>Define new <code>reproduce/analysis/make/XXXXXX.mk</code> workhorse-Makefile(s)
+ with good and human-friendly name(s) replacing <code>XXXXXX</code>.</p></li>
+ <li><p>Add <code>XXXXXX</code>, as a new line, to the values in <code>makesrc</code> of the top-level
+ <code>Makefile</code>.</p></li>
+ <li><p>Do not use any constant numbers (or important names like filter names)
+ in the workhorse-Makefiles or paper's LaTeX source. Define such
+ constants as logically-grouped, separate configuration-Makefiles in
+ <code>reproduce/analysis/config/XXXXX.conf</code>. Then set this
+        configuration-Makefile as a prerequisite to any rule that uses
+ the variable defined in it.</p></li>
+ <li><p>Through any number of intermediate prerequisites, all processing steps
+ should end in (be a prerequisite of) <code>$(mtexdir)/verify.tex</code> (defined in
+ <code>reproduce/analysis/make/verify.mk</code>). <code>$(mtexdir)/verify.tex</code> is the sole
+ dependency of <code>$(mtexdir)/project.tex</code>, which is the bridge between the
+ processing steps and PDF-building steps of the project.</p></li>
+ </ul>
+
+ <h1>Customization checklist</h1>
+
+ <p>Take the following steps to fully customize Maneage for your research
+ project. After finishing the list, be sure to run <code>./project configure</code> and
+      <code>./project make</code> to see if everything works correctly. If you notice anything
+      missing or any incorrect part (probably a change that has not been
+      explained here), please let us know so we can correct it.</p>
+
+ <p>As described above, the concept of reproducibility (during a project)
+ heavily relies on <a href="https://en.wikipedia.org/wiki/Version_control">version
+ control</a>. Currently Maneage
+ uses Git as its main version control system. If you are not already
+ familiar with Git, please read the first three chapters of the <a href="https://git-scm.com/book/en/v2">ProGit
+ book</a> which provides a wonderful practical
+ understanding of the basics. You can read later chapters as you get more
+ advanced in later stages of your work.</p>
+
+ <h2>First custom commit</h2>
+
+ <ol>
+ <li><p><strong>Get this repository and its history</strong> (if you don't already have it):
+ Arguably the easiest way to start is to clone Maneage and prepare for
+        your customizations as shown below. After the cloning, you first rename
+ the default <code>origin</code> remote server to specify that this is Maneage's
+ remote server. This will allow you to use the conventional <code>origin</code>
+ name for your own project as shown in the next steps. Second, you will
+ create and go into the conventional <code>master</code> branch to start
+ committing in your project later.</p>
+
+ <pre><code>
+git clone https://git.maneage.org/project.git <span class="comment"># Clone/copy the project and its history.</span>
+mv project my-project <span class="comment"># Change the name to your project's name.</span>
+cd my-project <span class="comment"># Go into the cloned directory.</span>
+git remote rename origin origin-maneage <span class="comment"># Rename current/only remote to "origin-maneage".</span>
+git checkout -b master <span class="comment"># Create and enter your own "master" branch.</span>
+pwd <span class="comment"># Just to confirm where you are.</span>
+ </code></pre></li>
+ <li><p><strong>Prepare to build project</strong>: The <code>./project configure</code> command of the
+ next step will build the different software packages within the
+ "build" directory (that you will specify). Nothing else on your system
+        will be touched. However, since it takes a long time, it is useful to see
+        what is being built at every instant (it is almost impossible to tell
+        from the torrent of commands that are produced!). So open another
+        terminal on your desktop and navigate to the same project directory
+        that you cloned (output of last command above). Then run the following
+        command. Once every second, this command will just print the date
+        (possibly followed by a non-existent directory notice). But as soon as
+        the next step starts building software, you'll see the names of the
+        software packages get printed as they are being built, and once a
+        package has been installed in the project's build directory, its name
+        will be removed from this list. Again,
+        don't worry, nothing will be installed outside the build directory.</p>
+
+ <pre><code>
+<span class="comment"># On another terminal (go to top project source directory, last command above)</span>
+./project --check-config
+ </code></pre></li>
+ <li><p><strong>Test Maneage</strong>: Before making any changes, it is important to test it
+ and see if everything works properly with the commands below. If there
+ is any problem in the <code>./project configure</code> or <code>./project make</code> steps,
+        please contact us to fix the problem before continuing. Since the
+        building of dependencies during configuration can take a long time, you can
+        take the next few steps (editing the files) while it is working (they don't
+        affect the configuration). After <code>./project make</code> is finished, open
+        <code>paper.pdf</code>. If it looks fine, you are ready to start customizing
+        Maneage for your project. But before that, clean all the extra Maneage
+        outputs with <code>./project make clean</code> as shown further below.</p>
+
+ <pre><code>
+./project configure <span class="comment"># Build the project's software environment (can take an hour or so).</span>
+./project make               <span class="comment"># Do the processing and build paper (just a simple demo).</span>
+                             <span class="comment"># Open 'paper.pdf' and see if everything is ok.</span>
+ </code></pre></li>
+ <li><p><strong>Setup the remote</strong>: You can use any <a href="https://en.wikipedia.org/wiki/Comparison_of_source_code_hosting_facilities">hosting
+ facility</a>
+ that supports Git to keep an online copy of your project's version
+ controlled history. We recommend <a href="https://gitlab.com">GitLab</a> because
+ it is <a href="https://www.gnu.org/software/repo-criteria-evaluation.html">more ethical (although not
+ perfect)</a>,
+ and later you can also host GitLab on your own server. Anyway, create
+ an account in your favorite hosting facility (if you don't already
+ have one), and define a new project there. Please make sure <em>the newly
+ created project is empty</em> (some services ask to include a <code>README</code> in
+ a new project which is bad in this scenario, and will not allow you to
+ push to it). It will give you a URL (usually starting with <code>git@</code> and
+        ending in <code>.git</code>); put this URL in place of <code>XXXXXXXXXX</code> in the first
+ command below. With the second command, "push" your <code>master</code> branch to
+ your <code>origin</code> remote, and (with the <code>--set-upstream</code> option) set them
+ to track/follow each other. However, the <code>maneage</code> branch is currently
+ tracking/following your <code>origin-maneage</code> remote (automatically set
+ when you cloned Maneage). So when pushing the <code>maneage</code> branch to your
+ <code>origin</code> remote, you <em>shouldn't</em> use <code>--set-upstream</code>. With the last
+ command, you can actually check this (which local and remote branches
+ are tracking each other).</p>
+
+ <pre><code>
+git remote add origin XXXXXXXXXX <span class="comment"># Newly created repo is now called 'origin'.</span>
+git push --set-upstream origin master <span class="comment"># Push 'master' branch to 'origin' (with tracking).</span>
+git push origin maneage                <span class="comment"># Push 'maneage' branch to 'origin' (no tracking).</span>
+git branch -vv                         <span class="comment"># Check which local/remote branches track each other.</span>
+ </code></pre></li>
+ <li><p><strong>Title</strong>, <strong>short description</strong> and <strong>author</strong>: The title and basic
+ information of your project's output PDF paper should be added in
+ <code>paper.tex</code>. You should see the relevant place in the preamble (prior
+        to <code>\begin{document}</code>). After you are done, run the <code>./project make</code>
+ command again to see your changes in the final PDF, and make sure that
+ your changes don't cause a crash in LaTeX. Of course, if you use a
+ different LaTeX package/style for managing the title and authors (in
+        particular a specific journal's style), please feel free to use
+        your own methods after finishing this checklist and doing your first
+ commit.</p></li>
+ <li><p><strong>Delete dummy parts</strong>: Maneage contains some parts that are only for
+ the initial/test run, mainly as a demonstration of important steps,
+ which you can use as a reference to use in your own project. But they
+ not for any real analysis, so you should remove these parts as
+ described below:</p>
+
+ <ul>
+ <li><p><code>paper.tex</code>: 1) Delete the text of the abstract (from
+ <code>\includeabstract{</code> to <code>\vspace{0.25cm}</code>) and write your own (a
+ single sentence can be enough now, you can complete it later). 2)
+ Add some keywords under it in the keywords part. 3) Delete
+ everything between <code>%% Start of main body.</code> and <code>%% End of main
+ body.</code>. 4) Remove the notice in the "Acknowledgments" section (in
+ <code>\new{}</code>) and Acknowledge your funding sources (this can also be
+ done later). Just don't delete the existing acknowledgment
+ statement: Maneage is possible thanks to funding from several
+ grants. Since Maneage is being used in your work, it is necessary to
+ acknowledge them in your work also.</p></li>
+ <li><p><code>reproduce/analysis/make/top-make.mk</code>: Delete the <code>delete-me</code> line
+ in the <code>makesrc</code> definition. Just make sure there is no empty line
+ between the <code>download \</code> and <code>verify \</code> lines (they should be
+ directly under each other).</p></li>
+ <li><p><code>reproduce/analysis/make/verify.mk</code>: In the final recipe, under the
+ commented line <code>Verify TeX macros</code>, remove the full line that
+ contains <code>delete-me</code>, and set the value of <code>s</code> in the line for
+          <code>download</code> to <code>XXXXX</code> (any temporary string, you'll fix it at the
+          end of your project, when it is complete).</p></li>
+ <li><p>Delete all <code>delete-me*</code> files in the following directories:</p>
+ <pre><code>
+rm tex/src/delete-me*
+rm reproduce/analysis/make/delete-me*
+rm reproduce/analysis/config/delete-me*
+ </code></pre></li>
+ <li><p>Disable verification of outputs by removing the <code>yes</code> from
+ <code>reproduce/analysis/config/verify-outputs.conf</code>. Later, when you are
+ ready to submit your paper, or publish the dataset, activate
+ verification and make the proper corrections in this file (described
+ under the "Other basic customizations" section below). This is a
+ critical step and only takes a few minutes when your project is
+ finished. So DON'T FORGET to activate it in the end.</p></li>
+ <li><p>Re-make the project (after a cleaning) to see if you haven't
+ introduced any errors.</p>
+
+ <pre><code>
+./project make clean
+./project make
+ </code></pre></li>
+ </ul></li>
+ <li><p><strong>Don't merge some files in future updates</strong>: As described below, you
+ can later update your infra-structure (for example to fix bugs) by
+ merging your <code>master</code> branch with <code>maneage</code>. For files that you have
+ created in your own branch, there will be no problem. However if you
+        modify an existing Maneage file for your project, the next time it is
+        updated on <code>maneage</code> you'll get an annoying conflict. The commands
+ below show how to fix this future problem. With them, you can
+ configure Git to ignore the changes in <code>maneage</code> for some of the files
+ you have already edited and deleted above (and will edit below). Note
+ that only the first <code>echo</code> command has a <code>&gt;</code> (to write over the file),
+        the rest are <code>&gt;&gt;</code> (to append to it). If you want to prevent any other
+        set of files from being imported from Maneage into your project's branch,
+        you can follow a similar strategy. We recommend only doing it when you
+ encounter the same conflict in more than one merge and there is no
+ other change in that file. Also, don't add core Maneage Makefiles,
+ otherwise Maneage can break on the next run.</p>
+
+ <pre><code>
+echo "paper.tex merge=ours" &gt; .gitattributes
+echo "tex/src/delete-me.mk merge=ours" &gt;&gt; .gitattributes
+echo "tex/src/delete-me-demo.mk merge=ours" &gt;&gt; .gitattributes
+echo "reproduce/analysis/make/delete-me.mk merge=ours" &gt;&gt; .gitattributes
+echo "reproduce/software/config/TARGETS.conf merge=ours" &gt;&gt; .gitattributes
+echo "reproduce/analysis/config/delete-me-num.conf merge=ours" &gt;&gt; .gitattributes
+git add .gitattributes
+ </code></pre></li>
+ <li><p><strong>Copyright and License notice</strong>: It is necessary that <em>all</em> the
+ "copyright-able" files in your project (those larger than 10 lines)
+ have a copyright and license notice. Please take a moment to look at
+ several existing files to see a few examples. The copyright notice is
+ usually close to the start of the file, it is the line starting with
+ <code>Copyright (C)</code> and containing a year and the author's name (like the
+ examples below). The License notice is a short description of the
+ copyright license, usually one or two paragraphs with a URL to the
+ full license. Don't forget to add these <em>two</em> notices to <em>any new
+ file</em> you add in your project (you can just copy-and-paste). When you
+ modify an existing Maneage file (which already has the notices), just
+ add a copyright notice in your name under the existing one(s), like
+ the line with capital letters below. To start with, add this line with
+ your name and email address to <code>paper.tex</code>,
+ <code>tex/src/preamble-header.tex</code>, <code>reproduce/analysis/make/top-make.mk</code>,
+ and generally, all the files you modified in the previous step.</p>
+
+ <pre><code>
+Copyright (C) 2018-2020 Existing Name &lt;existing@email.address&gt;
+Copyright (C) 2020 YOUR NAME &lt;YOUR@EMAIL.ADDRESS&gt;
+ </code></pre></li>
+        <li><p><strong>Configure Git for the first time</strong>: If this is the first time you are
+ running Git on this system, then you have to configure it with some
+ basic information in order to have essential information in the commit
+ messages (ignore this step if you have already done it). Git will
+ include your name and e-mail address information in each commit. You
+ can also specify your favorite text editor for making the commit
+        (<code>emacs</code>, <code>vim</code>, <code>nano</code>, etc.).</p>
+
+ <pre><code>
+git config --global user.name "YourName YourSurname"
+git config --global user.email your-email@example.com
+git config --global core.editor nano
+ </code></pre></li>
+ <li><p><strong>Your first commit</strong>: You have already made some small and basic
+ changes in the steps above and you are in your project's <code>master</code>
+ branch. So, you can officially make your first commit in your
+ project's history and push it. But before that, you need to make sure
+        that there are no problems in the project. It is a good habit to
+ always re-build the system before a commit to be sure it works as
+ expected.</p>
+
+ <pre><code>
+git status <span class="comment"># See which files you have changed.</span>
+git diff <span class="comment"># Check the lines you have added/changed.</span>
+./project make <span class="comment"># Make sure everything builds successfully.</span>
+git add -u <span class="comment"># Put all tracked changes in staging area.</span>
+git status <span class="comment"># Make sure everything is fine.</span>
+git diff --cached <span class="comment"># Confirm all the changes that will be committed.</span>
+git commit <span class="comment"># Your first commit: put a good description!</span>
+git push <span class="comment"># Push your commit to your remote.</span>
+ </code></pre></li>
+ <li><p><strong>Start your exciting research</strong>: You are now ready to add flesh and
+ blood to this raw skeleton by further modifying and adding your
+ exciting research steps. You can use the "published works" section in
+ the introduction (above) as some fully working models to learn
+ from. Also, don't hesitate to contact us if you have any
+ questions.</p></li>
+ </ol>
+
+ <h2>Other basic customizations</h2>
+
+ <ul>
+ <li><p><strong>High-level software</strong>: Maneage installs all the software that your
+ project needs. You can specify which software your project needs in
+ <code>reproduce/software/config/TARGETS.conf</code>. The necessary software are
+ classified into two classes: 1) programs or libraries (usually written
+ in C/C++) which are run directly by the operating system. 2) Python
+ modules/libraries that are run within Python. By default
+ <code>TARGETS.conf</code> only has GNU Astronomy Utilities (Gnuastro) as one
+ scientific program and Astropy as one scientific Python module. Both
+ have many dependencies which will be installed into your project
+ during the configuration step. To see a list of software that are
+ currently ready to be built in Maneage, see
+        <code>reproduce/software/config/versions.conf</code> (which also has their
+        versions); the comments in <code>TARGETS.conf</code> describe how to use the software
+        names from <code>versions.conf</code>. Currently the raw pipeline just uses
+ Gnuastro to make the demonstration plots. Therefore if you don't need
+ Gnuastro, go through the analysis steps in <code>reproduce/analysis</code> and
+ remove all its use cases (clearly marked).</p></li>
+ <li><p><strong>Input dataset</strong>: The input datasets are managed through the
+ <code>reproduce/analysis/config/INPUTS.conf</code> file. It is best to gather all
+ the information regarding all the input datasets into this one central
+ file. To ensure that the proper dataset is being downloaded and used
+        by the project, it is also recommended to get an <a href="https://en.wikipedia.org/wiki/MD5">MD5
+ checksum</a> of the file and include
+ that in <code>INPUTS.conf</code> so the project can check it automatically. The
+ preparation/downloading of the input datasets is done in
+ <code>reproduce/analysis/make/download.mk</code>. Have a look there to see how
+ these values are to be used. This information about the input datasets
+ is also used in the initial <code>configure</code> script (to inform the users),
+ so also modify that file. You can find all occurrences of the demo
+        dataset with the command below and replace them with your own input
+        dataset.</p>
+
+ <pre><code>
+grep -ir wfpc2 ./*
+ </code></pre></li>
+        <li><p><strong><code>README.md</code></strong>: Correct all the <code>XXXXX</code> placeholders (name of your
+        project, your own name, address of your project's online/remote
+        repository, link to download dependencies, etc.). Generally, read
+ over the text and update it where necessary to fit your project. Don't
+ forget that this is the first file that is displayed on your online
+ repository and also your colleagues will first be drawn to read this
+ file. Therefore, make it as easy as possible for them to start
+ with. Also check and update this file one last time when you are ready
+ to publish your project's paper/source.</p></li>
+ <li><p><strong>Verify outputs</strong>: During the initial customization checklist, you
+ disabled verification. This is natural because during the project you
+        need to make changes all the time and it is a waste of time to enable
+ verification every time. But at significant moments of the project
+ (for example before submission to a journal, or publication) it is
+ necessary. When you activate verification, before building the paper,
+ all the specified datasets will be compared with their respective
+ checksum and if any file's checksum is different from the one recorded
+ in the project, it will stop and print the problematic file and its
+ expected and calculated checksums. First set the value of
+ <code>verify-outputs</code> variable in
+ <code>reproduce/analysis/config/verify-outputs.conf</code> to <code>yes</code>. Then go to
+ <code>reproduce/analysis/make/verify.mk</code>. The verification of all the files
+ is only done in one recipe. First the files that go into the
+ plots/figures are checked, then the LaTeX macros. Validation of the
+        former (inputs to plots/figures) should be done manually. If it is the
+ first time you are doing this, you can see two examples of the dummy
+ steps (with <code>delete-me</code>, you can use them if you like). These two
+ examples should be removed before you can run the project. For the
+ latter, you just have to update the checksums. The important thing to
+ consider is that a simple checksum can be problematic because some
+ file generators print their run-time date in the file (for example as
+ commented lines in a text table). When checking text files, this
+ Makefile already has this function:
+ <code>verify-txt-no-comments-leading-space</code>. As the name suggests, it will
+ remove comment lines and empty lines before calculating the MD5
+        checksum. For the FITS format (common in astronomy), there is fortunately
+        a <code>DATASUM</code> keyword which returns a checksum that is independent of
+        the headers. You can use the provided function(s), or define one for
+ your special formats.</p></li>
+ <li><p><strong>Feedback</strong>: As you use Maneage you will notice many things that if
+ implemented from the start would have been very useful for your
+ work. This can be in the actual scripting and architecture of Maneage,
+ or useful implementation and usage tips, like those below. In any
+ case, please share your thoughts and suggestions with us, so we can
+ add them here for everyone's benefit.</p></li>
+ <li><p><strong>Re-preparation</strong>: Automatic preparation is only run in the first run
+        of the project on a system; to re-do the preparation, you have to use
+        the option below. Here is the reason for this: when it is necessary, the
+ preparation process can be slow and will unnecessarily slow down the
+ whole project while the project is under development (focus is on the
+ analysis that is done after preparation). Because of this, preparation
+ will be done automatically for the first time that the project is run
+ (when <code>.build/software/preparation-done.mk</code> doesn't exist). After the
+ preparation process completes once, future runs of <code>./project make</code>
+ will not do the preparation process anymore (will not call
+ <code>top-prepare.mk</code>). They will only call <code>top-make.mk</code> for the
+ analysis. To manually invoke the preparation process after the first
+ attempt, the <code>./project make</code> script should be run with the
+ <code>--prepare-redo</code> option, or you can delete the special file above.</p>
+
+ <pre><code>
+./project make --prepare-redo
+ </code></pre></li>
+        <li><p><strong>Pre-publication: add notice on reproducibility</strong>: Add a notice
+        somewhere prominent on the first page of your paper, informing the
+        reader that your research is fully reproducible. For example at the
+        end of the abstract, or under the keywords with a title like
+        "reproducible paper". This will encourage others to publish their own
+        works in this manner too, and will also help spread the word.</p></li>
+ </ul>
+
+ <h1>Tips for designing your project</h1>
+
+ <p>The following is a list of design points, tips, or recommendations that
+ have been learned after some experience with this type of project
+ management. Please don't hesitate to share any experience you gain after
+ using it with us. In this way, we can add it here (with full giving credit)
+ for the benefit of others.</p>
+
+ <ul>
+ <li><p><strong>Modularity</strong>: Modularity is the key to easy and clean growth of a
+ project. So it is always best to break up a job into as many
+ sub-components as reasonable. Here are some tips to stay modular.</p>
+
+ <ul>
+ <li><p><em>Short recipes</em>: if you see the recipe of a rule becoming more than a
+ handful of lines which involve significant processing, it is probably
+ a good sign that you should break up the rule into its main
+ components. Try to only have one major processing step per rule.</p></li>
+ <li><p><em>Context-based (many) Makefiles</em>: For maximum modularity, this design
+ allows easy inclusion of many Makefiles: in
+ <code>reproduce/analysis/make/*.mk</code> for analysis steps, and
+ <code>reproduce/software/make/*.mk</code> for building software. So keep the
+ rules for closely related parts of the processing in separate
+ Makefiles.</p></li>
+ <li><p><em>Descriptive names</em>: Be very clear and descriptive with the naming of
+ the files and the variables because a few months after the
+ processing, it will be very hard to remember what each one was
+ for. Also this helps others (your collaborators or other people
+ reading the project source after it is published) to more easily
+ understand your work and find their way around.</p></li>
+ <li><p><em>Naming convention</em>: As the project grows, following a single standard
+        or convention in naming the files is very useful. Try your best to use
+ multiple word filenames for anything that is non-trivial (separating
+ the words with a <code>-</code>). For example if you have a Makefile for
+ creating a catalog and another two for processing it under models A
+ and B, you can name them like this: <code>catalog-create.mk</code>,
+ <code>catalog-model-a.mk</code> and <code>catalog-model-b.mk</code>. In this way, when
+ listing the contents of <code>reproduce/analysis/make</code> to see all the
+ Makefiles, those related to the catalog will all be close to each
+ other and thus easily found. This also helps in auto-completions by
+ the shell or text editors like Emacs.</p></li>
+ <li><p><em>Source directories</em>: If you need to add files in other languages for
+ example in shell, Python, AWK or C, keep the files in the same
+ language in a separate directory under <code>reproduce/analysis</code>, with the
+ appropriate name.</p></li>
+ <li><p><em>Configuration files</em>: If your research uses special programs as part
+ of the processing, put all their configuration files in a devoted
+ directory (with the program's name) within
+ <code>reproduce/software/config</code>. Similar to the
+ <code>reproduce/software/config/gnuastro</code> directory (which is put in
+ Maneage as a demo in case you use GNU Astronomy Utilities). It is
+ much cleaner and readable (thus less buggy) to avoid mixing the
+ configuration files, even if there is no technical necessity.</p></li>
+ </ul></li>
+ <li><p><strong>Contents</strong>: It is good practice to follow the following
+ recommendations on the contents of your files, whether they are source
+ code for a program, Makefiles, scripts or configuration files
+ (copyrights aren't necessary for the latter).</p>
+
+ <ul>
+ <li><p><em>Copyright</em>: Always start a file containing programming constructs
+ with a copyright statement like the ones that Maneage starts with
+ (for example in the top level <code>Makefile</code>).</p></li>
+ <li><p><em>Comments</em>: Comments are vital for readability (by yourself in two
+ months, or by others). Describe everything you can about why you are
+ doing something, how you are doing it, and what you expect the result
+ to be. Write the comments as if they were what you would say to
+ describe the variable, recipe or rule to a friend sitting beside
+ you. When writing the project it is very tempting to just steam ahead
+ with commands and code, but be patient and write comments before the
+ rules or recipes. This will also force you to think more about what
+ you should be doing. And several months later, when you come back to
+ the code, you will appreciate the effort of writing them. Just don't
+ forget to read and update the comment first if you later want to
+ change the code (variable, recipe or rule). As a general rule of
+ thumb: first the comments, then the code.</p></li>
+ <li><p><em>File title</em>: In general, it is good practice to start all files with
+ a single-line description of what that particular file does. If
+ further information about the file as a whole is necessary, add it
+ after a blank line. This helps a fast inspection where you don't care
+ about the details, but just want to remember/see what that file is
+ (generally) for. This information must of course be commented (it's
+ for a human), but it is kept separate from the general recommendation
+ on comments above, because it is a comment for the whole file, not
+ for each step within it.</p>
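+
+ <p>As a generic sketch (not copied from any particular Maneage file), the
+ first lines of a Makefile following this and the copyright
+ recommendation above could look like this:</p>
+
+ <pre><code>
+# Create the input catalog used by the model Makefiles.
+#
+# Any further details about the whole file can come here, after a
+# blank line.
+#
+# Copyright (C) YYYY Your Name &lt;yourname@example.org&gt;
+#
+# This Makefile is free software: you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation, either version 3 of the License, or (at
+# your option) any later version.
+ </code></pre></li>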
+ </ul></li>
+ <li><p><strong>Make programming</strong>: Here are some lessons we have learned over the
+ years of using Make that are useful/handy in research contexts.</p>
+
+ <ul>
+ <li><p><em>Environment of each recipe</em>: If you need to define a special
+ environment (or aliases, or scripts to run) for all the recipes in
+ your Makefiles, you can use the Bash startup file
+ <code>reproduce/software/shell/bashrc.sh</code>. This file is loaded before every
+ Make recipe is run, just like the <code>.bashrc</code> in your home directory is
+ loaded every time you start a new interactive, non-login terminal. See
+ the comments in that file for more.</p>
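+
+ <p>As a minimal sketch (these particular lines are only illustrations,
+ not something Maneage requires), such a file could contain definitions
+ like:</p>
+
+ <pre><code>
+# Loaded before every Make recipe: keep it fast and free of side effects.
+export LANG=C      # consistent sorting/number formatting in all recipes
+
+# A small helper function that any recipe can then call.
+imgstats () { aststatistics "$1" --mean --std; }
+ </code></pre></li>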
+ <li><p><em>Automatic variables</em>: These are wonderful and very useful Make
+ constructs that greatly shrink the text, while helping with
+ readability, robustness (fewer bugs from typos, for example) and
+ generalization. For example, even when a rule only has one target or
+ one prerequisite, always use <code>$@</code> instead of the target's name, <code>$&lt;</code>
+ instead of the first prerequisite, <code>$^</code> instead of the full list of
+ prerequisites, and so on. You can see the full list of automatic
+ variables
+ <a href="https://www.gnu.org/software/make/manual/html_node/Automatic-Variables.html">here</a>. If
+ you use GNU Make, you can also see this page on your command-line:</p>
+
+ <pre><code>
+info make "automatic variables"
+ </code></pre>
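+
+ <p>For example, in the hypothetical rule below (not from Maneage),
+ renaming the target or the prerequisite only needs a change on the
+ first line, because the recipe only uses automatic variables:</p>
+
+ <pre><code>
+stats.txt: catalog.fits
+	aststatistics $&lt; --mean --std &gt; $@
+ </code></pre></li>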
+ <li><p><em>Debug</em>: Since Make doesn't follow the common top-down paradigm, it
+ can be a little hard at first to understand why you get an error or
+ unexpected behavior. In such cases, run Make with the <code>-d</code>
+ option. With this option, Make prints a full list of exactly which
+ prerequisites are being checked for which targets. Looking
+ (patiently) through this output and searching for the faulty
+ file/step will clearly show you any mistake you might have made in
+ defining the targets or prerequisites.</p>
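+
+ <p>For example (a sketch, assuming you are calling Make directly on your
+ Makefile), you can save the trace to a file and then search it for the
+ target that is misbehaving:</p>
+
+ <pre><code>
+make -d &gt; debug.log 2&gt;&amp;1
+grep -n "Considering target file" debug.log
+ </code></pre></li>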
+ <li><p><em>Large files</em>: If you are dealing with very large files (so that
+ keeping multiple copies of them for intermediate steps is not
+ possible), one solution is the following strategy (also see the next
+ item on "Fast access to temporary files"). Set a small plain-text
+ file as the actual target and delete the large file when it is no
+ longer needed by the project (in the last rule that needs it). Below
+ is a simple demonstration: we use Gnuastro's Arithmetic program to
+ add 2 to all pixels of the input image and create <code>large1.fits</code>. We
+ then subtract 2 from <code>large1.fits</code> to create <code>large2.fits</code> and delete
+ <code>large1.fits</code> in the same rule (when it is no longer needed). We can
+ later do the same with <code>large2.fits</code> when it is no longer needed and
+ so on (as always, the recipe lines below must start with a TAB
+ character).</p>
+ <pre><code>
large1.fits.txt: input.fits
- astarithmetic $&lt; 2 + --output=$(subst .txt,,$@)
- echo "done" &gt; $@
+	astarithmetic $&lt; 2 + --output=$(subst .txt,,$@)
+	echo "done" &gt; $@
large2.fits.txt: large1.fits.txt
- astarithmetic $(subst .txt,,$&lt;) 2 - --output=$(subst .txt,,$@)
- rm $(subst .txt,,$&lt;)
- echo "done" &gt; $@
-</code>
-A more advanced Make programmer will use Make's <a href="https://www.gnu.org/software/make/manual/html_node/Call-Function.html">call
-function</a>
-to define a wrapper in <code>reproduce/analysis/make/initialize.mk</code>. This
-wrapper will replace <code>$(subst .txt,,XXXXX)</code>. Therefore, it will be
-possible to greatly simplify this repetitive statement and make the
-code even more readable throughout the whole project.</p></li>
-<li><p><em>Fast access to temporary files</em>: Most Unix-like operating systems
-will give you a special shared-memory device (directory): on systems
-using the GNU C Library (all GNU/Linux system), it is <code>/dev/shm</code>. The
-contents of this directory are actually in your RAM, not in your
-persistence storage like the HDD or SSD. Reading and writing from/to
-the RAM is much faster than persistent storage, so if you have enough
-RAM available, it can be very beneficial for large temporary files to
-be put there. You can use the <code>mktemp</code> program to give the temporary
-files a randomly-set name, and use text files as targets to keep that
-name (as described in the item above under "Large files") for later
-deletion. For example, see the minimal working example Makefile below
-(which you can actually put in a <code>Makefile</code> and run if you have an
-<code>input.fits</code> in the same directory, and Gnuastro is installed).
-<code>
+	astarithmetic $(subst .txt,,$&lt;) 2 - --output=$(subst .txt,,$@)
+	rm $(subst .txt,,$&lt;)
+	echo "done" &gt; $@
+ </code></pre>
+ <p>A more advanced Make programmer can use Make's <a href="https://www.gnu.org/software/make/manual/html_node/Call-Function.html">call function</a>
+ to define a wrapper in <code>reproduce/analysis/make/initialize.mk</code> that
+ replaces the repeated <code>$(subst .txt,,XXXXX)</code>, greatly simplifying this
+ statement and making the code more readable throughout the whole
+ project, as in the sketch below.</p>
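+
+ <p>A sketch of such a wrapper (the name <code>remove-txt</code> is hypothetical, not
+ something Maneage already defines):</p>
+
+ <pre><code>
+# In reproduce/analysis/make/initialize.mk:
+remove-txt = $(subst .txt,,$(1))
+
+# The second rule of the example above then becomes:
+large2.fits.txt: large1.fits.txt
+	astarithmetic $(call remove-txt,$&lt;) 2 - --output=$(call remove-txt,$@)
+	rm $(call remove-txt,$&lt;)
+	echo "done" &gt; $@
+ </code></pre></li>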
+ <li><p><em>Fast access to temporary files</em>: Most Unix-like operating systems
+ provide a special shared-memory device (directory): on systems using
+ the GNU C Library (all GNU/Linux systems), it is <code>/dev/shm</code>. The
+ contents of this directory are actually in your RAM, not in your
+ persistent storage like the HDD or SSD. Reading and writing from/to
+ RAM is much faster than persistent storage, so if you have enough RAM
+ available, it can be very beneficial to put large temporary files
+ there. You can use the <code>mktemp</code> program to give the temporary files a
+ randomly-set name, and use text files as targets to keep that name
+ (as described in the item above under "Large files") for later
+ deletion. For example, see the minimal working example Makefile below
+ (which you can actually put in a <code>Makefile</code> and run if you have an
+ <code>input.fits</code> in the same directory, and Gnuastro is installed).</p>
+ <pre><code>
.ONESHELL:
.SHELLFLAGS = -ec
all: mean-std.txt
shm-maneage := /dev/shm/$(shell whoami)-maneage-XXXXXXXXXX
large1.txt: input.fits
- out=$$(mktemp $(shm-maneage))
- astarithmetic $&lt; 2 + --output=$$out.fits
- echo "$$out" &gt; $@
+	out=$$(mktemp $(shm-maneage))
+	astarithmetic $&lt; 2 + --output=$$out.fits
+	echo "$$out" &gt; $@
large2.txt: large1.txt
- input=$$(cat $&lt;)
- out=$$(mktemp $(shm-maneage))
- astarithmetic $$input.fits 2 - --output=$$out.fits
- rm $$input.fits $$input
- echo "$$out" &gt; $@
+	input=$$(cat $&lt;)
+	out=$$(mktemp $(shm-maneage))
+	astarithmetic $$input.fits 2 - --output=$$out.fits
+	rm $$input.fits $$input
+	echo "$$out" &gt; $@
mean-std.txt: large2.txt
- input=$$(cat $&lt;)
- aststatistics $$input.fits --mean --std &gt; $@
- rm $$input.fits $$input
-</code>
-The important point here is that the temporary name template
-(<code>shm-maneage</code>) has no suffix. So you can add the suffix
-corresponding to your desired format afterwards (for example
-<code>$$out.fits</code>, or <code>$$out.txt</code>). But more importantly, when <code>mktemp</code>
-sets the random name, it also checks if no file exists with that name
-and creates a file with that exact name at that moment. So at the end
-of each recipe above, you'll have two files in your <code>/dev/shm</code>, one
-empty file with no suffix one with a suffix. The role of the file
-without a suffix is just to ensure that the randomly set name will
-not be used by other calls to <code>mktemp</code> (when running in parallel) and
-it should be deleted with the file containing a suffix. This is the
-reason behind the <code>rm $$input.fits $$input</code> command above: to make
-sure that first the file with a suffix is deleted, then the core
-random file (note that when working in parallel on powerful systems,
-in the time between deleting two files of a single <code>rm</code> command, many
-things can happen!). When using Maneage, you can put the definition
-of <code>shm-maneage</code> in <code>reproduce/analysis/make/initialize.mk</code> to be
-usable in all the different Makefiles of your analysis, and you won't
-need the three lines above it. <strong>Finally, BE RESPONSIBLE:</strong> after you
-are finished, be sure to clean up any possibly remaining files (due
-to crashes in the processing while you are working), otherwise your
-RAM may fill up very fast. You can do it easily with a command like
-this on your command-line: <code>rm -f /dev/shm/$(whoami)-*</code>.</p></li>
-</ul></li>
-<li><p><strong>Software tarballs and raw inputs</strong>: It is critically important to
- document the raw inputs to your project (software tarballs and raw
- input data):</p>
-
-<ul>
-<li><p><em>Keep the source tarball of dependencies</em>: After configuration
-finishes, the <code>.build/software/tarballs</code> directory will contain all
-the software tarballs that were necessary for your project. You can
-mirror the contents of this directory to keep a backup of all the
-software tarballs used in your project (possibly as another version
-controlled repository) that is also published with your project. Note
-that software web-pages are not written in stone and can suddenly go
-offline or not be accessible in some conditions. This backup is thus
-very important. If you intend to release your project in a place like
-Zenodo, you can upload/keep all the necessary tarballs (and data)
-there with your
-project. <a href="https://doi.org/10.5281/zenodo.1163746">zenodo.1163746</a> is
-one example of how the data, Gnuastro (main software used) and all
-major Gnuastro's dependencies have been uploaded with the project's
-source. Just note that this is only possible for free and open-source
-software.</p></li>
-<li><p><em>Keep your input data</em>: The input data is also critical to the
-project's reproducibility, so like the above for software, make sure
-you have a backup of them, or their persistent identifiers (PIDs).</p></li>
-</ul></li>
-<li><p><strong>Version control</strong>: Version control is a critical component of
- Maneage. Here are some tips to help in effectively using it.</p>
-
-<ul>
-<li><p><em>Regular commits</em>: It is important (and extremely useful) to have the
-history of your project under version control. So try to make commits
-regularly (after any meaningful change/step/result).</p></li>
-<li><p><em>Keep Maneage up-to-date</em>: In time, Maneage is going to become more
-and more mature and robust (thanks to your feedback and the feedback
-of other users). Bugs will be fixed and new/improved features will be
-added. So every once and a while, you can run the commands below to
-pull new work that is done in Maneage. If the changes are useful for
-your work, you can merge them with your project to benefit from
-them. Just pay <strong>very close attention</strong> to resolving possible
-<strong>conflicts</strong> which might happen in the merge (updated settings that
-you have customized in Maneage).</p>
-
-<p><code>shell
-$ git checkout maneage
-$ git pull # Get recent work in Maneage
-$ git log XXXXXX..XXXXXX --reverse # Inspect new work (replace XXXXXXs with hashs mentioned in output of previous command).
-$ git log --oneline --graph --decorate --all # General view of branches.
-$ git checkout master # Go to your top working branch.
-$ git merge maneage # Import all the work into master.
-</code></p></li>
-<li><p><em>Adding Maneage to a fork of your project</em>: As you and your colleagues
-continue your project, it will be necessary to have separate
-forks/clones of it. But when you clone your own project on a
-different system, or a colleague clones it to collaborate with you,
-the clone won't have the <code>origin-maneage</code> remote that you started the
-project with. As shown in the previous item above, you need this
-remote to be able to pull recent updates from Maneage. The steps
-below will setup the <code>origin-maneage</code> remote, and a local <code>maneage</code>
-branch to track it, on the new clone.</p>
-
-<p><code>shell
-$ git remote add origin-maneage https://git.maneage.org/project.git
-$ git fetch origin-maneage
-$ git checkout -b maneage --track origin-maneage/maneage
-</code></p></li>
-<li><p><em>Commit message</em>: The commit message is a very important and useful
-aspect of version control. To make the commit message useful for
-others (or yourself, one year later), it is good to follow a
-consistent style. Maneage already has a consistent formatting
-(described below), which you can also follow in your project if you
-like. You can see many examples by running <code>git log</code> in the <code>maneage</code>
-branch. If you intend to push commits to Maneage, for the consistency
-of Maneage, it is necessary to follow these guidelines. 1) No line
-should be more than 75 characters (to enable easy reading of the
-message when you run <code>git log</code> on the standard 80-character
-terminal). 2) The first line is the title of the commit and should
-summarize it (so <code>git log --oneline</code> can be useful). The title should
-also not end with a point (<code>.</code>, because its a short single sentence,
-so a point is not necessary and only wastes space). 3) After the
-title, leave an empty line and start the body of your message
-(possibly containing many paragraphs). 4) Describe the context of
-your commit (the problem it is trying to solve) as much as possible,
-then go onto how you solved it. One suggestion is to start the main
-body of your commit with "Until now ...", and continue describing the
-problem in the first paragraph(s). Afterwards, start the next
-paragraph with "With this commit ...".</p></li>
-<li><p><em>Project outputs</em>: During your research, it is possible to checkout a
-specific commit and reproduce its results. However, the processing
-can be time consuming. Therefore, it is useful to also keep track of
-the final outputs of your project (at minimum, the paper's PDF) in
-important points of history. However, keeping a snapshot of these
-(most probably large volume) outputs in the main history of the
-project can unreasonably bloat it. It is thus recommended to make a
-separate Git repo to keep those files and keep your project's source
-as small as possible. For example if your project is called
-<code>my-exciting-project</code>, the name of the outputs repository can be
-<code>my-exciting-project-output</code>. This enables easy sharing of the output
-files with your co-authors (with necessary permissions) and not
-having to bloat your email archive with extra attachments also (you
-can just share the link to the online repo in your
-communications). After the research is published, you can also
-release the outputs repository, or you can just delete it if it is
-too large or un-necessary (it was just for convenience, and fully
-reproducible after all). For example Maneage's output is available
-for demonstration in <a href="http://git.maneage.org/output-raw.git/">a
-separate</a> repository.</p></li>
-<li><p><em>Full Git history in one file</em>: When you are publishing your project
-(for example to Zenodo for long term preservation), it is more
-convenient to have the whole project's Git history into one file to
-save with your datasets. After all, you can't be sure that your
-current Git server (for example GitLab, Github, or Bitbucket) will be
-active forever. While they are good for the immediate future, you
-can't rely on them for archival purposes. Fortunately keeping your
-whole history in one file is easy with Git using the following
-commands. To learn more about it, run <code>git help bundle</code>.</p>
-
-<ul>
-<li>"bundle" your project's history into one file (just don't forget to
-change <code>my-project-git.bundle</code> to a descriptive name of your
-project):</li>
-</ul>
-
-<p><code>shell
-$ git bundle create my-project-git.bundle --all
-</code></p>
-
-<ul>
-<li>You can easily upload <code>my-project-git.bundle</code> anywhere. Later, if
-you need to un-bundle it, you can use the following command.</li>
-</ul>
-
-<p><p><p><code>shell
-$ git clone my-project-git.bundle
-</code></p></li>
-</ul></p></li>
-</ul></p>
-
-<h1>Future improvements</h1>
-
-<p>This is an evolving project and as time goes on, it will evolve and become
-more robust. Some of the most prominent issues we plan to implement in the
-future are listed below, please join us if you are interested.</p>
-
-<h2>Package management</h2>
-
-<p>It is important to have control of the environment of the project. Maneage
-currently builds the higher-level programs (for example GNU Bash, GNU Make,
-GNU AWK and domain-specific software) it needs, then sets <code>PATH</code> so the
-analysis is done only with the project's built software. But currently the
-configuration of each program is in the Makefile rules that build it. This
-is not good because a change in the build configuration does not
-automatically cause a re-build. Also, each separate project on a system
-needs to have its own built tools (that can waste a lot of space).</p>
-
-<p>A good solution is based on the <a href="https://nixos.org/nix/about.html">Nix package
-manager</a>: a separate file is present for
-each software, containing all the necessary info to build it (including its
-URL, its tarball MD5 hash, dependencies, configuration parameters, build
-steps and etc). Using this file, a script can automatically generate the
-Make rules to download, build and install program and its dependencies
-(along with the dependencies of those dependencies and etc).</p>
-
-<p>All the software are installed in a "store". Each installed file (library
-or executable) is prefixed by a hash of this configuration (and the OS
-architecture) and the standard program name. For example (from the Nix
-webpage):</p>
-
-<p><code>
+	input=$$(cat $&lt;)
+	aststatistics $$input.fits --mean --std &gt; $@
+	rm $$input.fits $$input
+ </code></pre>
+ <p>The important point here is that the temporary name template
+ (<code>shm-maneage</code>) has no suffix. So you can add the suffix
+ corresponding to your desired format afterwards (for example
+ <code>$$out.fits</code>, or <code>$$out.txt</code>). More importantly, when <code>mktemp</code>
+ sets the random name, it checks that no file already exists with that
+ name and immediately creates an empty file with that exact name. So at
+ the end of each recipe above, you'll have two files in your
+ <code>/dev/shm</code>: one empty file with no suffix and one with a suffix. The
+ role of the file without a suffix is just to ensure that the randomly
+ set name will not be used by other calls to <code>mktemp</code> (when running in
+ parallel); it should be deleted together with the file that has the
+ suffix. This is the reason behind the <code>rm $$input.fits $$input</code>
+ command above: to make sure that first the file with a suffix is
+ deleted, then the core random file (note that when working in parallel
+ on powerful systems, many things can happen in the time between
+ deleting the two files of a single <code>rm</code> command!). When using Maneage,
+ you can put the definition of <code>shm-maneage</code> in
+ <code>reproduce/analysis/make/initialize.mk</code> so it is usable in all the
+ different Makefiles of your analysis, and you won't need the three
+ lines above it. <strong>Finally, BE RESPONSIBLE:</strong> after you are finished,
+ be sure to clean up any possibly remaining files (due to crashes while
+ you were working), otherwise your RAM may fill up very fast. You can
+ do it easily with a command like this on your command-line:
+ <code>rm -f /dev/shm/$(whoami)-*</code>.</p></li>
+ </ul></li>
+ <li><p><strong>Software tarballs and raw inputs</strong>: It is critically important to
+ document the raw inputs to your project (software tarballs and raw
+ input data):</p>
+
+ <ul>
+ <li><p><em>Keep the source tarballs of dependencies</em>: After configuration
+ finishes, the <code>.build/software/tarballs</code> directory will contain all
+ the software tarballs that were necessary for your project. You can
+ mirror the contents of this directory to keep a backup of all the
+ software tarballs used in your project (possibly as another
+ version-controlled repository) that is also published with your
+ project, for example as sketched below. Note that software webpages
+ are not set in stone and can suddenly go offline or become
+ inaccessible under some conditions. This backup is thus very
+ important. If you intend to release your project in a place like
+ Zenodo, you can upload/keep all the necessary tarballs (and data)
+ there with your
+ project. <a href="https://doi.org/10.5281/zenodo.1163746">zenodo.1163746</a> is
+ one example of how the data, Gnuastro (the main software used) and all
+ of Gnuastro's major dependencies have been uploaded with the project's
+ source. Just note that this is only possible for free and open-source
+ software.</p>
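+
+ <p>A sketch of such a mirror (the destination directory name is just an
+ example, pick whatever fits your project):</p>
+
+ <pre><code>
+mkdir -p ../my-project-software-tarballs
+cp .build/software/tarballs/* ../my-project-software-tarballs/
+ </code></pre></li>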
+ <li><p><em>Keep your input data</em>: The input data is also critical to the
+ project's reproducibility, so like the above for software, make sure
+ you have a backup of them, or their persistent identifiers (PIDs).</p></li>
+ </ul></li>
+ <li><p><strong>Version control</strong>: Version control is a critical component of
+ Maneage. Here are some tips to help in effectively using it.</p>
+
+ <ul>
+ <li><p><em>Regular commits</em>: It is important (and extremely useful) to have the
+ history of your project under version control. So try to make commits
+ regularly (after any meaningful change/step/result).</p></li>
+ <li><p><em>Keep Maneage up-to-date</em>: In time, Maneage is going to become more
+ and more mature and robust (thanks to your feedback and the feedback
+ of other users). Bugs will be fixed and new/improved features will be
+ added. So every once in a while, you can run the commands below to
+ pull new work that has been done in Maneage. If the changes are useful
+ for your work, you can merge them with your project to benefit from
+ them. Just pay <strong>very close attention</strong> to resolving possible
+ <strong>conflicts</strong> which might happen in the merge (for example in settings
+ that Maneage updated but you had already customized).</p>
+
+ <pre><code>
+git checkout maneage
+git pull <span class="comment"># Get recent work in Maneage</span>
+git log XXXXXX..XXXXXX --reverse <span class="comment"># Inspect new work (replace the XXXXXXs with hashes from the output of the previous command).</span>
+git log --oneline --graph --decorate --all <span class="comment"># General view of branches.</span>
+git checkout master <span class="comment"># Go to your top working branch.</span>
+git merge maneage <span class="comment"># Import all the work into master.</span>
+ </code></pre></li>
+ <li><p><em>Adding Maneage to a fork of your project</em>: As you and your colleagues
+ continue your project, it will be necessary to have separate
+ forks/clones of it. But when you clone your own project on a
+ different system, or a colleague clones it to collaborate with you,
+ the clone won't have the <code>origin-maneage</code> remote that you started the
+ project with. As shown in the previous item, you need this remote to
+ be able to pull recent updates from Maneage. The steps below will set
+ up the <code>origin-maneage</code> remote, and a local <code>maneage</code> branch to track
+ it, on the new clone.</p>
+
+ <pre><code>
+git remote add origin-maneage https://git.maneage.org/project.git
+git fetch origin-maneage
+git checkout -b maneage --track origin-maneage/maneage
+ </code></pre></li>
+ <li><p><em>Commit message</em>: The commit message is a very important and useful
+ aspect of version control. To make the commit message useful for
+ others (or yourself, one year later), it is good to follow a
+ consistent style. Maneage already has a consistent format (described
+ below and shown in the example after this item), which you can also
+ follow in your project if you like. You can see many examples by
+ running <code>git log</code> in the <code>maneage</code> branch. If you intend to push
+ commits to Maneage, following these guidelines is necessary for
+ consistency. 1) No line should be more than 75 characters (to enable
+ easy reading of the message when you run <code>git log</code> on the standard
+ 80-character terminal). 2) The first line is the title of the commit
+ and should summarize it (so <code>git log --oneline</code> can be useful). The
+ title should not end with a point (<code>.</code>): it is a short single
+ sentence, so a point is not necessary and only wastes space. 3) After
+ the title, leave an empty line and start the body of your message
+ (possibly containing many paragraphs). 4) Describe the context of
+ your commit (the problem it is trying to solve) as much as possible,
+ then go on to how you solved it. One suggestion is to start the main
+ body of your commit with "Until now ...", and continue describing the
+ problem in the first paragraph(s). Afterwards, start the next
+ paragraph with "With this commit ...".</p>
+ <li><p><em>Project outputs</em>: During your research, it is possible to check out a
+ specific commit and reproduce its results. However, the processing
+ can be time consuming. Therefore, it is useful to also keep track of
+ the final outputs of your project (at minimum, the paper's PDF) at
+ important points in its history. However, keeping a snapshot of these
+ (most probably large-volume) outputs in the main history of the
+ project can unreasonably bloat it. It is thus recommended to make a
+ separate Git repository to keep those files and keep your project's
+ source as small as possible. For example, if your project is called
+ <code>my-exciting-project</code>, the name of the outputs repository can be
+ <code>my-exciting-project-output</code>. This enables easy sharing of the output
+ files with your co-authors (with the necessary permissions) and avoids
+ bloating your email archive with extra attachments (you can just share
+ the link to the online repository in your communications). After the
+ research is published, you can also release the outputs repository, or
+ you can just delete it if it is too large or unnecessary (it was just
+ for convenience, and is fully reproducible after all). For example,
+ Maneage's output is available for demonstration in <a href="http://git.maneage.org/output-raw.git/">a
+ separate</a> repository.</p></li>
+ <li><p><em>Full Git history in one file</em>: When you are publishing your project
+ (for example to Zenodo for long-term preservation), it is more
+ convenient to have the whole project's Git history in one file to
+ save with your datasets. After all, you can't be sure that your
+ current Git server (for example GitLab, GitHub, or Bitbucket) will be
+ active forever. While they are good for the immediate future, you
+ can't rely on them for archival purposes. Fortunately, keeping your
+ whole history in one file is easy with Git, using the following
+ commands. To learn more about it, run <code>git help bundle</code>.</p>
+
+ <ul>
+ <li>"bundle" your project's history into one file (just don't forget to
+ change <code>my-project-git.bundle</code> to a descriptive name of your
+ project):</li>
+ </ul>
+
+ <pre><code>
+git bundle create my-project-git.bundle --all
+ </code></pre>
+
+ <ul>
+ <li>You can easily upload <code>my-project-git.bundle</code> anywhere. Later, if
+ you need to un-bundle it, you can use the following command.</li>
+ </ul>
+
+ <pre><code>
+git clone my-project-git.bundle
+ </code></pre></li>
+ </ul></li>
+ </ul>
+
+ <h1>Future improvements</h1>
+
+ <p>This is an evolving project that will become more mature and robust
+ over time. Some of the most prominent improvements we plan to implement
+ in the future are listed below; please join us if you are interested.</p>
+
+ <h2>Package management</h2>
+
+ <p>It is important to have control over the environment of the project.
+ Maneage currently builds the higher-level programs it needs (for example
+ GNU Bash, GNU Make, GNU AWK and domain-specific software), then sets
+ <code>PATH</code> so the analysis is done only with the project's built
+ software. But currently the configuration of each program is hard-wired
+ into the Makefile rules that build it. This is not good, because a
+ change in the build configuration does not automatically cause a
+ re-build. Also, each separate project on a system needs its own built
+ tools, which can waste a lot of space.</p>
+
+ <p>A good solution is based on the <a href="https://nixos.org/nix/about.html">Nix package manager</a>: a separate file is present for
+ each piece of software, containing all the necessary information to
+ build it (including its URL, its tarball MD5 hash, dependencies,
+ configuration parameters, build steps, and so on). Using this file, a
+ script can automatically generate the Make rules to download, build and
+ install the program and its dependencies (along with the dependencies
+ of those dependencies, and so on), for example as sketched below.</p>
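+
+ <p>A purely hypothetical sketch of such a per-software file (Maneage does
+ not have these files yet; every field name below is invented for
+ illustration and the values are placeholders):</p>
+
+ <pre><code>
+# e.g. a file named 'gzip.conf' (hypothetical):
+version   = X.Y
+url       = https://ftp.gnu.org/gnu/gzip/gzip-$(version).tar.gz
+md5       = &lt;checksum of the tarball&gt;
+deps      =
+configure = ./configure --prefix=$(installdir)
+build     = make &amp;&amp; make install
+ </code></pre>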
+
+ <p>All the software is installed in a "store". Each installed file
+ (library or executable) is prefixed by a hash of this configuration
+ (and the OS architecture) followed by the standard program name. For
+ example (from the Nix webpage):</p>
+
+ <pre><code>
/nix/store/b6gvzjyb2pg0kjfwrjmg1vfhh54ad73z-firefox-33.1/
-</code></p>
-
-<p>The important thing is that the "store" is <em>not</em> in the project's search
-path. After the complete installation of the software, symbolic links are
-made to populate each project's program and library search paths without a
-hash. This hash will be unique to that particular software and its
-particular configuration. So simply by searching for this hash in the
-installed directory, we can find the installed files of that software to
-generate the links.</p>
-
-<p>This scenario has several advantages: 1) a change in a software's build
-configuration triggers a rebuild. 2) a single "store" can be used in many
-projects, thus saving space and configuration time for new projects (that
-commonly have large overlaps in lower-level programs).</p>
-
-<h1>Appendix: Necessity of exact reproduction in scientific research</h1>
-
-<p>In case <a href="http://akhlaghi.org/reproducible-science.html">the link above</a> is
-not accessible at the time of reading, here is a copy of the introduction
-of that link, describing the necessity for a reproducible project like this
-(copied on February 7th, 2018):</p>
-
-<p>The most important element of a "scientific" statement/result is the fact
-that others should be able to falsify it. The Tsunami of data that has
-engulfed astronomers in the last two decades, combined with faster
-processors and faster internet connections has made it much more easier to
-obtain a result. However, these factors have also increased the complexity
-of a scientific analysis, such that it is no longer possible to describe
-all the steps of an analysis in the published paper. Citing this
-difficulty, many authors suffice to describing the generalities of their
-analysis in their papers.</p>
-
-<p>However, It is impossible to falsify (or even study) a result if you can't
-exactly reproduce it. The complexity of modern science makes it vitally
-important to exactly reproduce the final result. Because even a small
-deviation can be due to many different parts of an analysis. Nature is
-already a black box which we are trying so hard to comprehend. Not letting
-other scientists see the exact steps taken to reach a result, or not
-allowing them to modify it (do experiments on it) is a self-imposed black
-box, which only exacerbates our ignorance.</p>
-
-<p>Other scientists should be able to reproduce, check and experiment on the
-results of anything that is to carry the "scientific" label. Any result
-that is not reproducible (due to incomplete information by the author) is
-not scientific: the readers have to have faith in the subjective experience
-of the authors in the very important choice of configuration values and
-order of operations: this is contrary to the scientific spirit.</p>
-
-<h2>Copyright information</h2>
-
-<p>This file is part of Maneage's core: https://git.maneage.org/project.git</p>
-
-<p>Maneage is free software: you can redistribute it and/or modify it under
-the terms of the GNU General Public License as published by the Free
-Software Foundation, either version 3 of the License, or (at your option)
-any later version.</p>
-
-<p>Maneage is distributed in the hope that it will be useful, but WITHOUT ANY
-WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
-FOR A PARTICULAR PURPOSE. See the GNU General Public License for more
-details.</p>
-
-<p>You should have received a copy of the GNU General Public License along
-with Maneage. If not, see <a href="https://www.gnu.org/licenses/">https://www.gnu.org/licenses/</a>.</p>
+ </code></pre>
+
+ <p>The important thing is that the "store" is <em>not</em> in the project's search
+ path. After the complete installation of the software, symbolic links
+ are made to populate each project's program and library search paths
+ without a hash. The hash is unique to that particular software and its
+ particular configuration, so simply by searching for this hash in the
+ store, we can find the installed files of that software and generate
+ the links, as in the sketch below.</p>
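+
+ <p>A sketch of the idea in shell (not an actual Nix or Maneage command;
+ <code>project-bin</code> stands for whatever directory is in the project's
+ <code>PATH</code>):</p>
+
+ <pre><code>
+store=/nix/store
+hash=b6gvzjyb2pg0kjfwrjmg1vfhh54ad73z
+for f in "$store/$hash-"*/bin/*; do
+    ln -sf "$f" project-bin/"$(basename "$f")"
+done
+ </code></pre>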
+
+ <p>This scenario has several advantages: 1) a change in a software's build
+ configuration triggers a rebuild. 2) a single "store" can be used in many
+ projects, thus saving space and configuration time for new projects (that
+ commonly have large overlaps in lower-level programs).</p>
+
+ <h1>Appendix: Necessity of exact reproduction in scientific research</h1>
+
+ <p>In case <a href="http://akhlaghi.org/reproducible-science.html">the link above</a> is
+ not accessible at the time of reading, here is a copy of the introduction
+ of that link, describing the necessity for a reproducible project like this
+ (copied on February 7th, 2018):</p>
+
+ <p>The most important element of a "scientific" statement/result is the fact
+ that others should be able to falsify it. The tsunami of data that has
+ engulfed astronomers in the last two decades, combined with faster
+ processors and faster internet connections, has made it much easier to
+ obtain a result. However, these factors have also increased the
+ complexity of a scientific analysis, such that it is no longer possible
+ to describe all the steps of an analysis in the published paper. Citing
+ this difficulty, many authors only describe the generalities of their
+ analysis in their papers.</p>
+
+ <p>However, it is impossible to falsify (or even study) a result if you
+ can't exactly reproduce it. The complexity of modern science makes it
+ vitally important to exactly reproduce the final result, because even a
+ small deviation can be due to many different parts of an analysis.
+ Nature is already a black box which we are trying so hard to
+ comprehend. Not letting other scientists see the exact steps taken to
+ reach a result, or not allowing them to modify it (do experiments on
+ it), is a self-imposed black box, which only exacerbates our
+ ignorance.</p>
+
+ <p>Other scientists should be able to reproduce, check and experiment on
+ the results of anything that is to carry the "scientific" label. Any
+ result that is not reproducible (due to incomplete information by the
+ author) is not scientific: the readers would have to have faith in the
+ subjective experience of the authors regarding the very important
+ choice of configuration values and order of operations, which is
+ contrary to the scientific spirit.</p>
+
+ <h2>Copyright information</h2>
+
+ <p>This file is part of Maneage's core: https://git.maneage.org/project.git</p>
+
+ <p>Maneage is free software: you can redistribute it and/or modify it under
+ the terms of the GNU General Public License as published by the Free
+ Software Foundation, either version 3 of the License, or (at your option)
+ any later version.</p>
+
+ <p>Maneage is distributed in the hope that it will be useful, but WITHOUT ANY
+ WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
+ FOR A PARTICULAR PURPOSE. See the GNU General Public License for more
+ details.</p>
+
+ <p>You should have received a copy of the GNU General Public License along
+ with Maneage. If not, see <a href="https://www.gnu.org/licenses/">https://www.gnu.org/licenses/</a>.</p>
+ </div>
+ </body>