| Age | Commit message (Collapse) | Author | Lines | 
|---|
|  | Given the referee reports, after discussing with the editors of CiSE, we
decided that it is important to include the complete appendix we had before
that included a thorough review of existing tools and methods. However, the
appendix will not be published in the paper (due to the strict word-count
limit). It will only be used in the arXiv/Zenodo versions of the paper.
This actually created a technical problem: we want the commit hash of the
project source to remain the same when the paper is built with an appendix
or without it.
To fix this problem the choice of including an appendix has gone into the
'project' script as a run-time option called '--no-appendix'. So by default
(when someone just runs './project make'), the PDF will have an appendix,
but when we want to submit to the journal, or when the appendix isn't
needed for a certain reason, we can use this new option. The appendix also
has its own separate bibliography.
Some other corrections made in this commit:
 1. Some new references were added that had an '_' in their source, they
    were corrected in 'references.tex'.
 2. I noticed that 'preamble-style.tex' is not actually used in this paper,
    so it has been deleted. | 
|  | There weren't any conflicts in this merge. | 
|  | Tcl/Tk are a set of tools to provide Graphic User Interface (GUI) support
in some software. But they are not yet natively built within Maneage,
primarily because we have higher-priority work right now. GUI tools in
general aren't high on our priority list right now because GUI tools are
generally good for human interaction (which is contrary to the reproducible
philosophy), not automatic analysis (a core concept in reproducibility). So
even later, when we do include Tcl/Tk in Maneage, their direct usage will
be discouraged.
Until this commit, because we don't yet build Tcl/Tk, the default maneage
install of the statistical package R failed on a Debian Stretch, with 6227
repeats of the line:
'/usr/lib//tcl8.5/tclConfig.sh: line 2: dpkg-architecture:
command not found'
To fix this problem (atleast until Tcl/Tk is installed within Maneage), R
is now configured with the '--without-tcltk' option which fixed the
problem. Please see the description above the R installation instructions
in 'reproduce/software/make/high-level.mk' for more. | 
|  | Following the previous commit, we recognized that the 'IFS' terms are not
necessary and can be even cause problems. So all their occurances in the
scripts of Maneage have been removed with this commit. | 
|  | Until a recent commit, the IFS='"' was added at the start of the variables
in this shell script and as a result, the SPACE character wasn't being used
as a delimiter. This caused a major problem when downloading the tarballs
(all the backup servers were considered as the top link).
With this commit we removed these 'IFS' statements). Because we now check
for the existance of meta-characters in the build directory name, there is
no more problem, and also generally both the calling command and
internally, we have double-qutations around the variable names. So removal
of IFS will not affect the result in this scenario.
This bug was found by Mohammadreza Khellat. | 
|  | Only two small conflicts came up:
 * The addition of the hardware architecture macro in 'paper.tex' (which
   was removed for now, but will be added as the referee has requested
   within the text).
 * The usage of "" around directory variables in 'paper.mk'. | 
|  | While a project is under development, the raw analysis software are not the
only necessary software in a project. We also need tools to all the edit
plain-text files within the Maneaged project. Usually people use their
operating system's plain-text editor. However, when working on the project
on a new computer, or in a container, the plain-text editors will have
different versions, or may not be present at all! This can be very annoying
and frustrating!
With this commit, Maneage now installs GNU Nano as part of the basic
tools. GNU Nano is a very simple and small plain text editor (the installed
size is only ~3.5MB, and it is friendly to new users). Therefore, any
Maneaged project can assume atleast Nano will be present (in particular
when no editor is available on the running system!). GNU Emacs and VIM
(both without extra dependencies, in particular without GUI support) are
also optionally available in 'high-level.mk' (by adding them to
'TARGETS.conf').
The basic idea for the more advanced editors (Emacs and VIM) is that
project authors can add their favorite editor while they are working on the
project, but upon publication they can remove them from 'TARGETS.conf'.
A few other minor things came up during this work and are now also fixed:
 - The 'file' program and its libraries like 'libmagic' were linking to
   system's 'libseccomp'! This dependency then leaked into Nano (which
   depends on 'libmagic'). But this is just an extra feature of 'file',
   only for the Linux kernel. Also, we have no dependency on it so far. So
   'file' is not configured to not build with 'libseccomp'.
 - A typo was fixed in the line where the physical core information is
   being read on macOS.
 - The top-level directories when running './project shell' are now quoted
   (in case they have special characters). | 
|  | Until now, no machine-related specifications were being documented in the
workflow. This information can become helpful when observing differences in
the outcome of both software and analysis segments of the workflow by
others (some software may behave differently based on host machine).
With this commit, the host machine's 'hardware class' and 'byte-order' are
collected and now available as LaTeX macros for the authors to use in the
paper. Currently it is placed in the acknowledgments, right after
mentioning the Maneage commit.
Furthermore, the project and configuration scripts are now capable of
dealing with input directory names that have SPACE (and other special
characters) by putting them inside double-quotes. However, having spaces
and metacharacters in the address of the build directory could cause
build/install failure for some software source files which are beyond the
control of Maneage. So we now check the user's given build directory
string, and if the string has any '@', '#', '$', '%', '^', '&', '*', '(',
')', '+', ';', and ' ' (SPACE), it will ask the user to provide a different
directory. | 
|  | Some very minor conflicts came up and were easily corrected. They were
mostly in parts that are also shared with the demonstration in the core
Maneage branch. | 
|  | Until now, if the software source tarballs already existed on the system
they would be copied inside the project. However, the software source
tarballs are sometimes/mostly larger than their actual product and can
consume significant space (~375 MB in the core branch!).
With this commit, when the software are present on the system, their
symbolic link will be placed in 'BDIR/software/tarballs', not a full
copy. Also, because the tarballs in software tarball directory may
themselves be links, we use 'realpath' to find the final place of the
actual file and link to that location. Therefore if 'realpath' can't be
found (prior to installing Coreutils in Maneage), we will copy the tarballs
from the given software tarball directory. After Maneage has installed
Coreutils, the project's own 'realpath' will be used. Of course, if the
software are downloaded, their full downloaded copy will be kept in
'BDIR/software/tarballs', nothing has changed in the downloading scenario. | 
|  | It was a long time that the Maneage software versions hadn't been updated.
With this commit, the versions of all basic software were checked and 17 of
that had newer versions were updated. Also, 16 high-level programs and
libraries were updated as well as 7 Python modules. The full list is
available below.
Basic Software (affecting all projects)
---------------------------------------
bash            5.0.11 -> 5.0.18
binutils        2.32 -> 2.35
coreutils       8.31 -> 8.32
curl            7.65.3 -> 7.71.1
file            5.36 -> 5.39
gawk            5.0.1 -> 5.1.0
gcc             9.2.0 -> 10.2.0
gettext         0.20.2 -> 0.21
git             2.26.2 -> 2.28.0
gmp             6.1.2 -> 6.2.0
grep            3.3 -> 3.4
libbsd          0.9.1 -> 0.10.0
ncurses         6.1 -> 6.2
perl            5.30.0 -> 5.32.0
sed             4.7 -> 4.8
texinfo         6.6 -> 6.7
xz              5.2.4 -> 5.2.5
Custom programs/libraries
-------------------------
astrometrynet   0.77 -> 0.80
automake        0.16.1 -> 0.16.2
bison           3.6 -> 3.7
cfitsio         3.47 -> 3.48
cmake           3.17.0 -> 3.18.1
freetype        2.9 -> 2.10.2
gdb             8.3 -> 9.2
ghostscript     9.50 -> 9.52
gnuastro        0.11 -> 0.12
libgit2         0.28.2 -> 1.0.1
libidn          1.35 -> 1.36
openmpi         4.0.1 -> 4.0.4
R               3.6.2 -> 4.0.2
python          3.7.4 -> 3.8.5
wcslib          6.4 -> 7.3
yaml            0.2.2 -> 0.2.5
Python modules
--------------
cython          0.29.6 -> 0.29.21
h5py            2.9.0 -> 2.10.0
matplotlib      3.1.1 -> 3.3.0
mpi4py          3.0.2 -> 3.0.3
numpy           1.17.2 -> 1.19.1
pybind11        2.4.3 -> 2.5.0
scipy           1.3.1 -> 1.5.2 | 
|  | When the host C compiler is used (either by calling '--host-cc' or on OSs
that we can't build the GNU C Compiler), Maneage will also not build the
Fortran compiler 'gfortran'. Until now, the './project configure' script
would give a big warning about the need for 'gfortran' and the fact that it
is missing, and would for 5 seconds, but it would continue anyway.
For projects that don't need 'gfortran', this can be confusing to the users
and for those that need 'gfortran', it means that a lot of time and cpu
cycles are wasted compiling non-fortran software that are unusable in the
end.
With this commit, the 'need_gfortarn' variable has been added
'reproduce/software/shell/configure.sh', in a new part that is devoted to
project-specific features. If it equals '0', then the 'gfortran' test (and
message!) isn't done at all, but if it is set to '1', then the configure
stage will halt immediately gfortran is not found and not built.
The default operations of the core Maneage branch don't need 'gfortran', so
by default it is set to 0. But 'gfortran' is necessary for all projects
that use Numpy (Python's numeric library) for example. So if your project
needs 'gfortran', please set this new variable to 1. As mentioned in the
comments of 'configure.sh', ideally we should detect this automatically,
but we haven't had the time to implement it yet. | 
|  | Prior to this commit, compilation of OpenMPI used the default OpenMPI
choices of deciding which libraries should be used in relating to a job
scheduler [1] (such as Slurm [2]). Given that the user on a multi-user
cluster has to accept the sysadmin's choice of a job scheduler, the
question of whether to (1) link with OpenMPI's own libraries (and increase
the reproducibility of the science project) or rather (2) link with the
sysadmin managed libraries (more likely to be compatible with the host's
job scheduler), is an open question of which the best strategy for
reproducibility needs to be debated and studied.
In this commit, strategy (1) is adopted. The options '--withpmix=internal'
and '--with-hwloc=internal' are added to the configure command. The working
assumption is that the Maneage version of OpenMPI is likely to be modern
enough to be compatible with the native job scheduler such as
Slurm. Compilation without any 'pmix' option gave a fail in at least one
case; it appears that an external pmix library was sought by the configure
script.
As of OpenMPI 4.0.1, the internal libevent library is used by default, so
there appears to be no option to force it to be chosen internally.
This commit also includes the option '--without-verbs'.  This option
removes a library related to "infiniband", "verbs", "openib" and "BTL";
this library appears to be deprecated. See [3], [4] for discussion.
Please add feedback and discussion to the Maneage task about openmpi
linking strategies (1) (internal) and (2) (external) at Savannah [5].
[1] https://en.wikipedia.org/wiki/Job_scheduler#Batch_queuing_for_HPC_clusters
[2] https://en.wikipedia.org/wiki/Slurm_Workload_Manager - To avoid a name
    clash, 'slurm-wlm' is the metapackage in Debian for the client
    commands, the compute node daemon, and the central node daemon. An
    unrelated package 'slurm' also exists.
[3] https://www-lb.open-mpi.org/faq/?category=openfabrics#ofa-device-error
[4] https://www-lb.open-mpi.org/faq/?category=building
[5] https://savannah.nongnu.org/task/index.php?15737 | 
|  | Until now, if a project needed the healpy software package, Maneage would
crash with the following error message (abridged for full name in build
directory). This was caused by a typo in the version of 'healpix' (the
dependency of 'healpy').
  make: *** No rule to make target '.../version-info/proglib/healpix-'
With this commit, the typo in line 334 of 'python.mk' is fixed, so that
when '$(ipydir)/healpy-$(healpy-version)' gets called it correctly searches
for a rule to make '$(ibidir)/healpix-$(healpix-version)'. | 
|  | In the previous commit (Commit 1bc00c9: Only using clang in macOS systems
that also have GCC) we set the used C compiler for high-level programs to
be 'clang' on macOS systems. But I forgot to do the same kind of change in
the configure script (to prefer 'clang' when we are testing for a C
compiler on the host).
With this commit, the compiler checking phases of the configure script have
been improved, so on macOS systems, we now first search for 'clang', then
search for 'gcc'.
While doing this, I also noticed that the 'rpath' checking command was done
before we actually define 'instdir'!!! So in effect, the 'rpath' directory
was being set to '/lib'! So with this commit, this test has been taken to
after defining 'instdir'. | 
|  | Until now, when Maneage was built on a macOS that had both a clang and GCC,
we would make links to both. But this cause many conflicts in some
high-level programs (for example Numpy and etc, all the programs where we
have explicity set 'export CC=clang' before the build recipe). This happens
because the GCC that is built on a macOS isn't complete for some
operations.
To fix this problem, when we are on a macOS, we explicity set 'gcc' to
point to 'clang' and 'g++' to point to 'clang++'. We also don't link to the
host's C-preprocessor ('cpp') on macOS systems because this is only a GNU
feature and using the GNU CPP is also known to have some basic
problems. For example this was reported by Mahdieh Nabavi (which was the
main trigger for this work):
  ld: Symbol not found: ___keymgr_global
    Referenced from: /Users/Mahdieh/build/software/installed/bin/cpp
    Expected in: /usr/lib/libSystem.B.dylib
Also, to avoid linking to another link on the host tools (in the 'makelink'
function of 'basic.mk'), we are now using 'realpath'. | 
|  | Until now, when reading the host's PATH environment variable we weren't
accounting for directory names with a space character. This was most
prominently visible in the 'low-level-links' step where we put links to
some core system components into the project's build directory (mainly for
prorietary systems like macOS).
To address the problem, double quotations have been placed around the part
that we extract 'ccache' from the PATH, and the part where we make the
symbolic link. In the process the comments above 'makelink' were made more
clear and 'low-level-links' now depends on 'grep' (which is the
highest-level program it uses).
This bug was reported by Mahdieh Navabi. | 
|  | Until this commit, once Libidn was installed, insted of its own name and
version, the name and version of Libjpeg were saved (in the target if
Libidn). This robably come from a copy/paste of the rule.
With this commit, this minor bug has been corrected. I also added my name
as an author of `reproduce/software/make/xorg.mk' Makefile since I added
some code there. | 
|  | There weren't any conflicts in this merge. | 
|  | After recently adding util-linux to Maneage build-tree, we had forgot to
delete the unpacked and built source directory after it was installed! This
has been corrected with this commit. | 
|  | Until now, when the user specified an input and software directory, the raw
string they entered was used. But when this string was a relative location,
this could be problematic in general scenarios.
With this commit, the same function that finds the absolute location of the
build directory is used to find the absolute address of the data and
software directories. | 
|  | There weren't any conflicts in this merge. | 
|  | Until now, in order to build Ghostscript, the project used the host's Xorg
libraries. This was because we hadn't yet added the necessary build rules
for them.
With this commit, the instructions to build the necessary Xorg libraries
for Ghostscript have also been added. Also, the shared Ghostscript library
has been built with this commit and two sets of standard fonts are also
included, setting us on the path to build TeXLive from source later.
This task was done with the help and support of Raul Infante-Sainz. | 
|  | Until this commit, there was a problem when building Bison in parallel in
macOS systems. With this commit, this problem has been fixed by updating
Bison to its most recent version (3.6). | 
|  | Only two conflicts came up in the newly added comments of 'paper.mk' in the
Maneage branch. It happened because in this project we don't use
'pdflatex', but 'latex' alone. | 
|  | POSSIBLE EFFECT ON YOUR PROJECT: The changes in this commit may only cause
conflicts to your project if you have changed the software building
Makefiles in your project's branch (e.g., 'basic.mk', 'high-level.mk' and
'python.mk'). If your project has only added analysis, it shouldn't be
affected.
This is a large commit, involving a long series of corrections in a
differnt branch which is now finally being merged into the core Maneage
branch. All changes were related and came up naturally as the low-level
infrastructure was improved. So separating them in the end for the final
merge would have been very time consuming and we are merging them as one
commit.
In general, the software building Makefiles are now much more easier to
read, modify and use, along with several new features that have been
added. See below for the full list.
 - Until now, Maneage needed the host to have a 'make' implementation
   because Make was necessary to build Lzip. Lzip is then used to
   uncompress the source of our own GNU Make. However, in the
   minimalist/slim versions of operating systems (for example used to build
   Docker images) Make isn't included by default. Since Lzip was the only
   program before our own GNU Make was installed, we consulting Antonio
   Diaz Diaz (creator of Lzip) and he kindly added the necessary
   functionality to a new version of Lzip, which we are using now. Hence we
   don't need to assume a Make implementation on the host any more. With
   this commit, Lzip and GNU Make are built without Make, allowing
   everything else to be safely built with our own custom version of GNU
   Make and not using the host's 'make' at all.
 - Until recently (Commit 3d8aa5953c4) GNU Make was built in
   'basic.mk'. Therefore 'basic.mk' was written in a way that it can be
   used with other 'make' implementations also (i.e., important shell
   commands starting with '&&' and ending in '\' without any comments
   between them!). Furthermore, to help in style uniformity, the rules in
   'high-level.mk' and 'python.mk' also followed a similar structure. But
   due to the point above, we can now guarantee that GNU Make is used from
   the very first Makefile, so this hard-to-read structure has been removed
   in the software build recipes and they are much more readable and
   edit-friendly now.
 - Until now, the default backup servers where at some fixed URLs, on our
   own pages or on Gitlab. But recently we uploaded all the necessary
   software to Zenodo (https://doi.org/10.5281/zenodo.3883409) which is
   more suitable for this task (it promises longevity, has a fixed DOI,
   while allowing us to add new content, or new software tarball
   versions). With this commit, a small script has been written to extract
   the most recent Zenodo upload link from the Zenodo DOI and use it for
   downloading the software source codes.
 - Until now, we primarily used the webpage of each software for
   downloading its tarball. But this caused many problems: 1) Some of them
   needed Javascript before the download, 2) Some URLs had a complex
   dependency on the version number, 3) some servers would be randomly down
   for maintenance and etc. So thanks to the point above, we now use the
   Zenodo server as the primary download location. However, if a user wants
   to use a custom software that is not (yet!) in Zenodo, the download
   script gives priority to a custom URL that the users can give as Make
   variables. If that variable is defined, then the script will use that
   URL before going onto Zenodo. We now have a special place for such URLs:
   'reproduce/software/config/urls.conf'. The old URLs (which are a good
   documentation themselves) are preserved here, but are commented by
   default.
 - The software source code downloading and checksum verification step has
   been moved into a Make function called 'import-source' (defined in the
   'build-rules.mk' and loaded in all software Makefiles). Having taken all
   the low-level steps there, I noticed that there is no more need for
   having the tarball as a separate target! So with this commit, a single
   rule is the only place that needs to be edited/added (greatly
   simplifying the software building Makefiles).
 - Following task #15272, A new option has been added to the './project'
   script called '--all-highlevel'. When this option is given, the contents
   of 'TARGETS.conf' are ignored and all the software in Maneage are built
   (selected by parsing the 'versions.conf' file). This new option was
   added to confirm the extensive changes made in all the software building
   recipes and is great for development/testing purposes.
 - Many of the software hadn't been tested for a long time! So after using
   the newly added '--all-highlevel', we noticed that some need to be
   updated. In general, with this commit, 'libpaper' and 'pcre' were added
   as new software, and the versions of the following software was updated:
   'boost', 'flex', 'libtirpc', 'openblas' and 'lzip'. A 'run-parts.in'
   shell script was added in 'reproduce/software/shell/' which is installed
   with 'libpaper'.
 - Even though we intentionally add the necessary flags to add RPATH inside
   the built executable at compilation time, some software don't do it
   (different software on different operating systems!). Until now, for
   historical reasons this check was done in different ways for different
   software on GNU/Linux sytems. But now it is unified: if 'patchelf' is
   present we apply it. Because of this, 'patchelf' has been put as a
   top-level prerequisite, right after Tar and is installed before anything
   else.
 - In 'versions.conf', GNU Libtool is recognized as 'libtool', but in
   'basic.mk', it was 'glibtool'! This caused many confusions and is
   corrected with this commit (in 'basic.mk', it is also 'libtool').
 - A new argument is added to the './project' script to allow easy loading
   of the project's shell and environment for fast/temporary testing of
   things in the same environment as the project. Before activating the
   project's shell, we completely remove all host environment variables to
   simulate the project's environment. It can be called with this command:
   './project shell'. A simple prompt has also been added to highlight that
   the user is using the Maneage shell! | 
|  | Until now, Maneage would accept the given build directory, regardless of
the free memory available there. This could cause confusing situations for
new users who don't know about the minimum storage requirement.
With this commit, after all other checks on the given build directory are
completed, the configure script will check the available space and warns
the user if there is less than almost 5GB free space available in the build
directory (with a 5 second delay).
It won't cause a crash because some projects may require roughly smaller
than this space (the default only needs roughly 2GB). But we also don't
want the host's partition to get too close to being full, causing them
problems elsewhere. We can change the behavior as desired in future
commits. | 
|  | In Commit 105467fe6402 (Software tarballs are downloaded even if not
built), we introduced tests to download the tarballs of software even if
they don't need to be built on the respective host. However some small
typos in the checks existed that could cause a crash on macOS. In
particular in the building of PatchELF and libbsd we had forgot to add the
necessary 'x' before the 'yes' in the conditional to check if a we are on
macOS or not.
With this commit these two checks have been corrected. Also, in the
building of 'isl' and 'mpc', we now check for 'host_cc' (signifying that
the user wants to use their host C compiler for the high-level step)
instead of 'on_mac_os'. The reason is that even on non-macOS systems, a
user may not want to build the C compiler from scratch and use the
'--host-cc' option. In such cases, they don't need to compile 'isl' and
'mpc'. | 
|  | Until now, the English texts that embeds the list of software to
acknowledge in the paper was hard-wired into the low-level coding
('reproduce/software/shell/configure.sh' to be more specific). But this
file is very low-level, thus discouraging users to modify this surrounding
text.
While the list of software packages can be considered to be 'data' and is
fixed, the surrounding text to describe the lists is something the authors
should decide on. Authors of a scientific research paper take
responsibility for the full paper, including for the style of the
acknowledgments, even if these may well evolve into some standard text.
With this commit, authors who do *not* modify
'reproduce/software/config/acknowledge_software.sh' will have a default
text, with only a minor English correction from earlier versions of
Maneage. However, Authors choosing to use their own wording should be able
to modify the text parameters in
`reproduce/software/config/acknowledge_software.sh` in the obvious
way. This is much more modular than asking project authors to go looking
into the long and technical 'configure.sh' script.
Systematic issues: the file
`reproduce/software/config/acknowledge_software.sh` is an executable shell
script, because it has to be called by
`reproduce/software/shell/configure.sh`, which, in principle, does not yet
have access to `GNU make` (if I understand the bootstrap sequence
correctly). It is placed in `config/` rather than `shell/`, because the
user will expect to find configuration files in `config/`, not in `shell/`.
A possible alternative to avoid having a shell script as a configure file
would be to let `reproduce/software/config/acknowledge_software.sh` appear
to be a `make` file, but analyse it in `configure.sh` using `sed` to remove
whitespace around `=`, and adding other hacks to switch from `make` syntax
to `shell` syntax. However, this risks misleading the user, who will not
know whether s/he should follow `make` conventions or `shell` conventions. | 
|  | Some low-level software aren't necessary on some operating systems, for
example GCC can't be built on macOS, hence we don't build it and the
GCC-only dependencies. Also, on GNU/Linux systems users could configure
with '--host-cc' to avoid all the time it takes to build GCC when doing a
fast test.
Until now, in such cases not only was the software not installed, but the
tarballs of the software were also not downloaded. Hence making the output
of '--dist-software' incomplete (as in bug #58561).
With this commit, we now import all the necessary tarballs, when the
software isn't necessary for the particular system, it won't be built or
cited, but its tarball will be present anyway, thus allowing the output of
'--dist-software' to be complete. | 
|  | Until now, when making the link to Gnuastro's configuration files, the
'configure.sh' script would incorrectly link to the old configuration
directory under the 'reproduce/software' directory. With this commit, it is
moved to the proper directory under 'reproduce/analysis'. | 
|  | Some minor conflicts came up in 'download.mk' and 'verify.mk' (as expected
in the "IMPORTANT" commits) and were easily fixed by choosing this branch's
values. | 
|  | Until now, when adding the necessary library flags to the build of XLSX
I/O, we were effectively over-writing the 'LDFLAGS' variables. So the
compiler was effectively not being told where to look for the necessary
libraries.
With this commit, to fix the problem, we now append the new linking flags
to LDFLAGS in XLSX I/O's build, not over-write it. | 
|  | After trying a clean build of Maneage in a Docker image (with a minimal
debian:stable-20200607-slim OS), I noticed that the building of OpenSSL is
failing because it doesn't find the proper Perl functionality. To fix it,
with this commit, Perl is set as a prerequisite of OpenSSL and this fixed
the problem. | 
|  | The project configuration requires a build-directory at configuration time,
two other directories can optionally be given to avoid downloading the
project's necessary data and software. It is possible to give these three
directories as command-line options, or by interactively giving them after
running the configure script.
Until now, when these directories weren't given as command-line options,
and the running shell was non-interactive, the configure script would crash
on the line trying to interactively read the user's given directories (the
'read' command).
With this commit, all the 'read' commands for these three directories are
now put within an 'if' statement. Therefore, when 'read' fails (the shell
is non-interactive), instead of a quiet crash, a descriptive message is
printed, telling the user that cause of the problem, and suggesting a fix.
This bug was found by Michael R. Crusoe. | 
|  | Until now, the description of the input-data directory at configure time
included a description of the input data (created by reading the values of
'INPUTS.conf'). Maintaining this is easy for a single dataset, but it
becomes hard for a general project which may need many input datasets.
To avoid extra complexity (for maintaining this list), the description now
points a user of the project to the 'INPUTS.conf' file and asks them to
look inside of it for seeing the necessary data. This infact helps with the
users becoming familiar with the internal structure of Maneage and will
allow the authors to focus on not having to worry about updating the
low-level 'configure.sh' script. | 
|  | When './project configure' is run, after the basic checks of the compiler,
a small statement is printed telling the user that some configuration
questions will now be asked to start building Maneage on the system. Until
now this description was confusing: it lead the reader to think that the
local configuration (which was recommended to read before continuing) is in
another file.
With this commit, the text has been edited to explictly mention that the
description of the steps following this notice should be read
carefully. Thus avoiding that confusion.
This issue was mentioned by Michael R. Crusoe. | 
|  | Some minor conflicts came up in 'initialize.mk' and 'verify.mk'. For the
former, I chose the version on Maneage, for the latter, I kept the 'master'
version on the checksums of this project, but kept the Maneage version for
the rest of the improvements there (like printing the verified files as
LaTeX comments in 'verify.tex'.
While testing the conflicts, I noticed a bug (in the LaTeX macro for the
number of years in the Menke+20 paper) in the previous build, thanks to the
verification step :-)! Fortunately it wasn't actually printed in the PDF,
so a normal reader won't recognize.
The bug was caused by the recently added meta-data/commented lines in the
'tools-per-year.txt' file: when calculating the number of years studied in
that paper, we were simply counting all the lines and we had forgot to
correct this after adding comments. As a result, the un-used LaTeX macro
file was saying that they have studied 47 years instead of the real 31
years! This element was actually used in the very first (+40 page!) draft
of the paper that was summarized to fit into the journal limits. | 
|  | Possible semantic conflicts (that may not show up as Git conflicts but may
cause a crash in your project after the merge):
   1) The project title (and other basic metadata) should be set in
      'reproduce/analysis/conf/metadata.conf'. Please include this file in
      your merge (if it is ignored because of '.gitattributes'!).
   2) Consider importing the changes in 'initialize.mk' and 'verify.mk' (if
      you have added all analysis Makefiles to the '.gitattributes' file
      (thus not merging any change in them with your branch). For example
      with this command:
        git diff master...maneage -- reproduce/analysis/make/initialize.mk
   3) The old 'verify-txt-no-comments-leading-space' function has been
      replaced by 'verify-txt-no-comments-no-space'. The new function will
      also remove all white-space characters between the columns (not just
      white space characters at the start of the line). Thus the resulting
      check won't involve spacing between columns.
A common set of steps are always necessary to prepare a project for
publication. Until now, we would simply look at previous submissions and
try to follow them, but that was prone to errors and could cause
confusion. The internal infrastructure also didn't have some useful
features to make good publication possible. Now that the submission of a
paper fully devoted to the founding criteria of Maneage is complete
(arXiv:2006.03018), it was time to formalize the necessary steps for easier
submission of a project using Maneage and implement some low-level features
that can make things easier.
With this commit a first draft of the publication checklist has been added
to 'README-hacking.md', it was tested in the submission of arXiv:2006.03018
and zenodo.3872248. To help guide users on implementing the good practices
for output datasets, the outputs of the default project shown in the paper
now use the new features). After reading the checklist, please inspect
these.
Some other relevant changes in this commit:
  - The publication involves a copy of the necessary software
    tarballs. Hence a new target ('dist-software') was also added to
    package all the project's software tarballs in one tarball for easy
    distribution.
  - A new 'dist-lzip' target has been defined for those who want to
    distribute an Lzip-compressed tarball.
  - The '\includetikz' LaTeX macro now has a second argument to allow
    configuring the '\includegraphics' call when the plot should not be
    built, but just imported. | 
|  | When the project is being re-built from the tarball (not the Git
repository), the 'tex/build' and 'tex/tikz' addresses are actual
directories, not symbolic links. In this case, when someone runs './project
configure', it will complain about not being able to delete them (it
assumes they are symbolic links!).
So with this commit, we first check if they are deletable without '-r'. If
so, then they are full directories and we rename them to a backup directory
to allow the rest of the project to continue building a link there. | 
|  | The minor conflict was with 'reproduce/software/make/high-level.mk', and in
particular because we implemented the fix to Maneage's Task #15664 in this
project first. After it was moved to the main Maneage branch some minor
stylistic corrections were done to it, thus causing the conflict. To
resolve the conflict, I simply imported the full Maneage version of the
file with this command:
  git checkout maneage -- reproduce/software/make/high-level.mk
The other conflicts were due to the deleted files (that were resolved as
described in 'README-hacking.md') and the LaTeX files that I had told
'.gitattributes' to ignore from the Maneage branch. | 
|  | When some files should not be merged, until now we were suggesting to also
add deleted files to the '.gitattributes' file. However, this feature of
Git doesn't work for deleted files and they would still show up in the
'master' branch after a merge.
So with this commit, we have added a simple AWK command to run after a
merge that will automatically detect and delete such files (using the
output of 'git status --porcelain').
Also, two minor typos were corrected in the newly added
'servers-backup.conf' file: the copyright year was wrong and there was no
new-line at the end of the file (a good convention!). | 
|  | Until now, Maneage would only build Flock before building everything else
using Make (calling 'basic.mk') in parallel. Flock was necessary to avoid
parallel downloads during the building of software (which could cause
network problems). But after recently trying Maneage on FreeBSD (which is
not yet complete, see bug #58465), we noticed that the BSD implemenation of
Make couldn't parse 'basic.mk' (in particular, complaining with the 'ifeq'
parts) and its shell also had some peculiarities.
It was thus decided to also install our own minimalist shell, Make and
compressor program before calling 'basic.mk'. In this way, 'basic.mk' can
now assume the same GNU Make features that high-level.mk and python.mk
assume. The pre-make building of software is now organized in
'reproduce/software/shell/pre-make-build.sh'.
Another nice feature of this commit is for macOS users: until now the
default macOS Make had problems for parallel building of software, so
'basic.mk' was built in one thread. But now that we can build the core
tools with GNU Make on macOS too, it uses all threads. Furthermore, since
we now run 'basic.mk' with GNU Make, we can use '.ONESHELL' and don't have
to finish every line of a long rule with a backslash to keep variables and
such.
Generally, the pre-make software are now organized like this: first we
build Lzip before anything else: it is downloaded as a simple '.tar' file
that is not compressed (only ~400kb). Once Lzip is built, the pre-make
phase continues with building GNU Make, Dash (a minimalist shell) and
Flock. All of their tarballs are in '.tar.lz'. Maneage then enters
'basic.mk' and the first program it builds is GNU Gzip (itself packaged as
'.tar.lz'). Once Gzip is built, we build all the other compression software
(all downloaded as '.tar.gz'). Afterwards, any compression standard for
other software is fine because we have it.
In the process, a bug related to using backup servers was found in
'reproduce/analysis/bash/download-multi-try' for calling outside of
'basic.mk' and removed Bash-specific features. As a result of that bug-fix,
because we now have multiple servers for software tarballs, the backup
servers now have their own configuration file in
'reproduce/software/config/servers-backup.conf'. This makes it much easier
to maintain the backup server list across the multiple places that we need
it.
Some other minor fixes:
 - In building Bzip2, we need to specify 'CC' so it doesn't use 'gcc'.
 - In building Zip, the 'generic_gcc' Make option caused a crash on FreeBSD
   (which doesn't have GCC).
 - We are now using 'uname -s' to specify if we are on a Linux kernel or
   not, if not, we are still using the old 'on_mac_os' variable.
 - While I was trying to build on FreeBSD, I noticed some further
   corrections that could help. For example the 'makelink' Make-function
   now takes a third argument which can be a different name compared to the
   actual program (used for examle to make a link to '/usr/bin/cc' from
   'gcc'.
 - Until now we didn't know if the host's Make implementation supports
   placing a '@' at the start of the recipe (to avoid printing the actual
   commands to standard output). Especially in the tarball download phase,
   there are many lines that are printed for each download which was really
   annoying. We already used '@' in 'high-level.mk' and 'python.mk' before,
   but now that we also know that 'basic.mk' is called with our custom GNU
   Make, we can use it at the start for a cleaner stdout.
 - Until now, WCSLIB assumed a Fortran compiler, but when the user is on a
   system where we can't install GCC (or has activated the '--host-cc'
   option), it may not be present and the project shouldn't break because
   of this. So with this commit, when a Fortran compiler isn't present,
   WCSLIB will be built with the '--disable-fortran' configuration option.
This commit (task #15667) was completed with help/checks by Raul
Infante-Sainz and Boud Roukema. | 
|  | Until this commit, when the user have previous TeX tarball already
present, the project crashed when trying to re-configure, if there was a
newer version of TeX. This is because TeX are updated yearly. With this
commit, this bug has been fixed. Now, during the installation of TeX, it
checks if this problem happens. If this is the case, then it moves the
old tarball, download the new one and install it. If not, it will just
install the already present tarball or crash because of any other
reason. This probem was recurrent, and each time TeX was updated, the
previous tarball had to be removed manually. But now, with this commit,
it is done automatically. The detection and fix of this bug has been
possible with the help of Mohammad Akhlaghi, thanks! | 
|  | Until this commit, when the user had a previous TeXLive tarball already
present (in their software-tarball directory) compared to the CTAN server,
the project crashed in the configure phase. This was because TeXLive is
updated yearly and we don't yet install TeXLive from source (currently we
use its own package manager, but we plan to fix this in task #15267).
With this commit, we fix the problem by checking the cause of the crash
during the installation of TeX. If the crash is due to this particular
error, we ignore the old tarball and download the new one and install it
(the old one is still kept in '.build/software/tarballs', but will get a
'-OLD' in its name. This probem was recurrent, and every year that TeXLive
is updated, the previous tarball had to be removed manually! But with this
commit, this is done automatically. The detection and fix of this bug has
been possible with the help of Mohammad Akhlaghi, thanks! | 
|  | One of the main reasons to building Maneage is to properly
acknowledge/attribute the authors of software in research. So we have
adopted a standard of never referring to the GNU-based operating systems
running the Linux kernel simply as "Linux", we avoid terms like "Open
Sourse" and use Free Software instead (in the same spirit).
With this commit, a few instances of the cases above have been corrected,
they had slipped through our fingers when we initially imported them into
the project. In the special case of the "Journal for Open Source Software",
we simply replaced it with its abbreviation (JOSS). This was done because
in effect we were generally using journal name abbreviations in almost all
the citations already. To avoid any inconsistancies, the names of the three
other journals that weren't abbreviated are also abbreviated. | 
|  | With this commit, Maneage now includes instructions to build the memory
tracing tool Valgrind and the program 'patch' (to apply corrections/patches
in text files and in particular the sources of programs).
For this version of Valgrind, some patches were necessary for an interface
with OpenMPI 2.x (which is the case now). Also note that this version of
Valgrind's checks can fail with GCC 10.1.x (when using '--host-cc'), and
the failures aren't due to internal problems but due to how the tests are
designed (https://bugs.gentoo.org/707598). So currently if any of
Valgrind's checks fail, Maneage still assumes that Valgrind was built and
installed successfully.
While testing on macOS, we noticed that it needs the macOS-specific 'mig'
program which we can't build in Maneage. DESCRIPTION: The mig command
invokes the Mach Interface Generator to generate Remote Procedure Call
(RPC) code for client-server style Mach IPC from specification files. So a
symbolic link to the system's 'mig' is now added to the project's programs
on macOS systems.
This commit's build of Patch and Valgrind has been tested on two GNU/Linux
distributions (Debian and ArchLinux) as well as macOS.
Work on this commit started by Boud Roukema, but also involved tests and
corrections by Mohammad Akhlaghi and Raul Infante-Sainz. | 
|  | David reported this problem, it happened right after importing IEEEtran,
but for some reason, it didn't happen for me. | 
|  | When entering the name of the "listings" package, I had forgot to add the
final 's', so it wasn't being installed on a clean system! I didn't have a
problem until now, because it remained from previous builds. | 
|  | Until now, two of the software BibTeX sources (Matplolib and Sympy) had an
"abstract" entry that was long, not similar to the rest, and not relevant
in this context, so they are removed with this commit. |