| Age | Commit message | Author | Lines | 
|---|
|  | Until now, the 'less' software package (used to view large files easily on
the command-line and used by Git for things like 'git diff' or 'git log')
only depended on 'patchelf' (which is very low-level software).
However, as Boud reported in bug #59811 [1], building less would crash with
an error saying "Cannot find terminal libraries" on some systems (including
the proposed Docker image of 'README.md', which I confirmed
afterwards). Looking into the 'configure' script of 'less', I noticed that
'less' is actually just checking for some functions provided by the ncurses
library!
With this commit, 'less' depends on 'ncurses'. I was able to confirm that
with this change, 'less' successfully builds within the Docker image.
[1] https://savannah.nongnu.org/bugs/?59811 | 
|  | Having entered 2021, it was necessary to update the copyright years at the
top of the source files. We recommend that you do this for all your
project-specific source files also. | 
|  | Until now, when building the high-level (optional) software, we would give
both 'CPPFLAGS' and 'C_INCLUDE_PATH' the same value/directory in
'high-level.mk'. But we recently found that on macOS's C compiler
('clang'), if a directory is included in both 'CPPFLAGS' and
'C_INCLUDE_PATH', then that directory is ignored in 'CPPFLAGS' (which has
higher priority). This caused linking problems when the version of a
software on the host was different from the Maneage version.
With this commit, 'C_INCLUDE_PATH' is not set on macOS any more and this
fixed the problem on the reported systems.
This bug was fixed with the help of Mohammad Akhlaghi and Mahdieh Navabi. | 
|  | Less is rarely used in non-interactive mode and is primarily intended for
interactively viewing large files. So its need within Maneage (for batch
processing) wasn't often felt until now. However, when running './project
shell' (which completely closes-off the outside environment), or building a
Maneage'd project within a minimal container that doesn't have less, it
becomes hard to use Git (and in particular its 'diff' output which depends
on 'less').
With this commit, Less has been added as a dependency of Git in
'basic.mk'. In total its built product is roughly 800KB and builds within a
second or two. So it isn't a burden on any project. But it can be very
useful when the projects are being developed within the Maneage environment
itself. | 
|  | In a recent build on a macOS, we recognized that Texinfo needs the
'libintl.h' headers of Gettext. However, Gettext depends on M4, and until
now we had set M4 to depend on Texinfo. Therefore adding Gettext as a
dependency of Texinfo would cause a circular dependency.
On the macOS, we temporarily disabled M4's Texinfo dependency, and the
build went through. I also checked on my GNU/Linux system: I temporarily
renamed all Texinfo built files on my system and did a clean build of M4,
and it succeeded. To be further safe, I built Maneage from this commit
(where M4 doesn't depend on Texinfo) in a Docker container, and it went
through with no problems. So the current M4 version indeed doesn't need
Texinfo. I think adding Texinfo as a dependency of M4 was a historic issue
from the early days.
In the process, I also cleaned 'basic.mk' a little:
 - A "# Level N" comment was added on top of each group of software that
   can be built in parallel (generally).
 - GNU Nano was moved to the end of the file (to be "Level 6").
 - Some comments were edited in some places. | 
|  | After a fresh build of Maneage with a newly downloaded TeXLive, I noticed
that it is complaining about not finding 'xstring.sty'. Apparently some
package that depended on it is no longer including it itself!
It is thus now added to the packages that are built by Maneage's TeXLive. | 
|  | Until now, the core Maneage branch included some configuration files for
Gnuastro's programs. This was actually a remnant of the distant past when
Maneage didn't actually build its own software and we had to rely on the
host's software versions. This file contained the configuration files
specific to Gnuastro for this project and also had a feature to avoid
checking the host's own configuration files.
However, we now build all our software ourselves with fixed configuration
files (for the version that is being installed, and whose version is
stored). So those extra configuration files were redundant and caused
confusion and problems in some scenarios. With this commit, those extra
files are now removed.
Also, two small issues are addressed in parallel with this commit:
 - When running './project make clean', the 'hardware-parameters.tex' macro
   file (which is created by './project configure') is no longer deleted.
 - The project title is now written into the default output's PDF's
   properties (through 'hypersetup' in 'tex/src/preamble-header.tex')
   through the LaTeX macro.
All these issues were found and fixed with the help of Samane Raji. | 
|  | Until now, during the configure step we checked if the host operating
system was GNU/Linux, and if not, we assumed it was macOS. However, it can
be any other OS! With this commit, we now explicitly check if the system is
GNU/Linux or Darwin (macOS). If it is neither, a warning message tells the
user that the host system is different from those we have checked so far
(and invites them to contact us if there is any problem).
In addition to this, if the system is macOS, the configure step now checks
if Xcode is already installed on the host system. If it is not installed, a
warning message asks the user to install it in case a problem/crash occurs
in the configure step. We have found that it is convenient to have Xcode
installed in order to avoid some problems. | 
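
The operating system test described in the commit above can be sketched roughly as follows. This is only an illustration, not the code of Maneage's configure script: the function name 'classify_os' and the warning text are assumptions, and the Xcode check is left out.

```sh
#!/bin/sh
# Map a kernel name (as printed by 'uname -s') to one of the cases that
# the commit message describes. 'classify_os' is a hypothetical name.
classify_os() {
    case "$1" in
        Linux)  echo "GNU/Linux" ;;
        Darwin) echo "macOS" ;;
        *)      echo "unknown" ;;
    esac
}

kernel=$(uname -s)
os=$(classify_os "$kernel")

# Only warn (never abort) when the system is neither of the tested ones.
if [ "$os" = unknown ]; then
    echo "Warning: '$kernel' has not been tested so far." >&2
fi
```

Keeping the classification in a small function makes the "neither GNU/Linux nor macOS" case explicit instead of falling through to an assumed macOS.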
|  | Tcl/Tk are a set of tools to provide Graphical User Interface (GUI) support
in some software. But they are not yet natively built within Maneage,
primarily because we have higher-priority work right now. GUI tools in
general aren't high on our priority list right now because GUI tools are
generally good for human interaction (which is contrary to the reproducible
philosophy), not automatic analysis (a core concept in reproducibility). So
even later, when we do include Tcl/Tk in Maneage, their direct usage will
be discouraged.
Until this commit, because we don't yet build Tcl/Tk, the default Maneage
install of the statistical package R failed on Debian Stretch, with 6227
repeats of the line:
'/usr/lib//tcl8.5/tclConfig.sh: line 2: dpkg-architecture:
command not found'
To fix this problem (at least until Tcl/Tk is installed within Maneage), R
is now configured with the '--without-tcltk' option, which fixed the
problem. Please see the description above the R installation instructions
in 'reproduce/software/make/high-level.mk' for more. | 
|  | Following the previous commit, we recognized that the 'IFS' terms are not
necessary and can even cause problems. So all their occurrences in the
scripts of Maneage have been removed with this commit. | 
|  | Until a recent commit, the IFS='"' was added at the start of the variables
in this shell script and as a result, the SPACE character wasn't being used
as a delimiter. This caused a major problem when downloading the tarballs
(all the backup servers were considered as the top link).
With this commit we removed these 'IFS' statements. Because we now check
for the existence of meta-characters in the build directory name, there is
no more problem. Also, both in the calling command and internally, we have
double quotations around the variable names. So removal of IFS will not
affect the result in this scenario.
This bug was found by Mohammadreza Khellat. | 
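
The effect described in the commit above can be reproduced with a few lines of shell. This is a minimal sketch, not Maneage's download script: the helper 'count_fields' and the example server URLs are made up for illustration.

```sh
#!/bin/sh
# Count the fields that an unquoted expansion of $1 produces when the
# field separator is $2. The body runs in a subshell, so the caller's
# IFS is never modified. 'count_fields' is a hypothetical helper.
count_fields() (
    IFS=$2
    set -f          # disable globbing; only field splitting matters here
    set -- $1       # unquoted on purpose
    echo "$#"
)

# Hypothetical space-separated list of backup servers.
servers='http://a.example/src http://b.example/src http://c.example/src'

count_fields "$servers" '"'    # prints 1: the whole list is one "URL"
count_fields "$servers" ' '    # prints 3: one field per server
```

With the double-quote character as the separator, the space-separated list is never split, which matches the symptom of all backup servers being treated as a single link.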
|  | While a project is under development, the raw analysis software are not the
only necessary software in a project. We also need tools to edit all the
plain-text files within the Maneaged project. Usually people use their
operating system's plain-text editor. However, when working on the project
on a new computer, or in a container, the plain-text editors will have
different versions, or may not be present at all! This can be very annoying
and frustrating!
With this commit, Maneage now installs GNU Nano as part of the basic
tools. GNU Nano is a very simple and small plain text editor (the installed
size is only ~3.5MB, and it is friendly to new users). Therefore, any
Maneaged project can assume at least Nano will be present (in particular
when no editor is available on the running system!). GNU Emacs and VIM
(both without extra dependencies, in particular without GUI support) are
also optionally available in 'high-level.mk' (by adding them to
'TARGETS.conf').
The basic idea for the more advanced editors (Emacs and VIM) is that
project authors can add their favorite editor while they are working on the
project, but upon publication they can remove them from 'TARGETS.conf'.
A few other minor things came up during this work and are now also fixed:
 - The 'file' program and its libraries like 'libmagic' were linking to
   system's 'libseccomp'! This dependency then leaked into Nano (which
   depends on 'libmagic'). But this is just an extra feature of 'file',
   only for the Linux kernel. Also, we have no dependency on it so far. So
   'file' is now configured to not build with 'libseccomp'.
 - A typo was fixed in the line where the physical core information is
   being read on macOS.
 - The top-level directories when running './project shell' are now quoted
   (in case they have special characters). | 
|  | Until now, no machine-related specifications were being documented in the
workflow. This information can become helpful when observing differences in
the outcome of both software and analysis segments of the workflow by
others (some software may behave differently based on host machine).
With this commit, the host machine's 'hardware class' and 'byte-order' are
collected and now available as LaTeX macros for the authors to use in the
paper. Currently it is placed in the acknowledgments, right after
mentioning the Maneage commit.
Furthermore, the project and configuration scripts are now capable of
dealing with input directory names that have SPACE (and other special
characters) by putting them inside double-quotes. However, having spaces
and metacharacters in the address of the build directory could cause
build/install failure for some software source files which are beyond the
control of Maneage. So we now check the user's given build directory
string, and if the string contains any of '@', '#', '$', '%', '^', '&', '*',
'(', ')', '+', ';', or ' ' (SPACE), the user is asked to provide a different
directory. | 
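
A check of this kind could look like the sketch below. It is an illustration under stated assumptions, not the actual test in the configure script: the function name 'builddir_is_safe' is hypothetical, and only the characters listed in the commit message are tested.

```sh
#!/bin/sh
# Succeed (status 0) when the directory name contains none of the
# characters listed in the commit message, fail (status 1) otherwise.
# 'builddir_is_safe' is a hypothetical name used only in this sketch.
builddir_is_safe() {
    case "$1" in
        *@* | *\#* | *\$* | *%* | *^* | *\&* | *\** | *\(* | *\)* \
            | *+* | *\;* | *' '*)
            return 1 ;;
        *)
            return 0 ;;
    esac
}
```

A caller would then repeat its prompt, for example with 'until builddir_is_safe "$bdir"; do ...; done', until an acceptable name is given.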
|  | Until now, if the software source tarballs already existed on the system
they would be copied inside the project. However, the software source
tarballs are sometimes/mostly larger than their actual product and can
consume significant space (~375 MB in the core branch!).
With this commit, when the software tarballs are present on the system, a
symbolic link to each will be placed in 'BDIR/software/tarballs', not a full
copy. Also, because the tarballs in the software tarball directory may
themselves be links, we use 'realpath' to find the final place of the
actual file and link to that location. If 'realpath' can't be found (prior
to installing Coreutils in Maneage), we will copy the tarballs from the
given software tarball directory. After Maneage has installed Coreutils,
the project's own 'realpath' will be used. Of course, if the software are
downloaded, their full downloaded copy will be kept in
'BDIR/software/tarballs'; nothing has changed in the downloading scenario. | 
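
The link-or-copy logic described above can be sketched as follows. The function name 'import_tarball' and its arguments are assumptions made for this illustration; the actual implementation in Maneage may differ.

```sh
#!/bin/sh
# Place tarball $1 inside directory $2: as a symbolic link to the real
# file when 'realpath' is available, otherwise as a full copy.
# 'import_tarball' is a hypothetical name used only in this sketch.
import_tarball() {
    src=$1
    destdir=$2
    name=$(basename "$src")
    if command -v realpath >/dev/null 2>&1; then
        # Resolve any chain of links so we point at the actual file.
        ln -sf "$(realpath "$src")" "$destdir/$name"
    else
        cp "$src" "$destdir/$name"
    fi
}
```

Either way the caller finds a usable file under the same name in the destination directory, so later build steps do not need to know which branch was taken.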
|  | The Maneage software versions had not been updated for a long time.
With this commit, the versions of all basic software were checked and the
17 that had newer versions were updated. Also, 16 high-level programs and
libraries were updated, as well as 7 Python modules. The full list is
available below.
Basic Software (affecting all projects)
---------------------------------------
bash            5.0.11 -> 5.0.18
binutils        2.32 -> 2.35
coreutils       8.31 -> 8.32
curl            7.65.3 -> 7.71.1
file            5.36 -> 5.39
gawk            5.0.1 -> 5.1.0
gcc             9.2.0 -> 10.2.0
gettext         0.20.2 -> 0.21
git             2.26.2 -> 2.28.0
gmp             6.1.2 -> 6.2.0
grep            3.3 -> 3.4
libbsd          0.9.1 -> 0.10.0
ncurses         6.1 -> 6.2
perl            5.30.0 -> 5.32.0
sed             4.7 -> 4.8
texinfo         6.6 -> 6.7
xz              5.2.4 -> 5.2.5
Custom programs/libraries
-------------------------
astrometrynet   0.77 -> 0.80
automake        1.16.1 -> 1.16.2
bison           3.6 -> 3.7
cfitsio         3.47 -> 3.48
cmake           3.17.0 -> 3.18.1
freetype        2.9 -> 2.10.2
gdb             8.3 -> 9.2
ghostscript     9.50 -> 9.52
gnuastro        0.11 -> 0.12
libgit2         0.28.2 -> 1.0.1
libidn          1.35 -> 1.36
openmpi         4.0.1 -> 4.0.4
R               3.6.2 -> 4.0.2
python          3.7.4 -> 3.8.5
wcslib          6.4 -> 7.3
yaml            0.2.2 -> 0.2.5
Python modules
--------------
cython          0.29.6 -> 0.29.21
h5py            2.9.0 -> 2.10.0
matplotlib      3.1.1 -> 3.3.0
mpi4py          3.0.2 -> 3.0.3
numpy           1.17.2 -> 1.19.1
pybind11        2.4.3 -> 2.5.0
scipy           1.3.1 -> 1.5.2 | 
|  | When the host C compiler is used (either by calling '--host-cc' or on OSs
that we can't build the GNU C Compiler), Maneage will also not build the
Fortran compiler 'gfortran'. Until now, the './project configure' script
would give a big warning about the need for 'gfortran' and the fact that it
is missing, and would pause for 5 seconds, but it would continue anyway.
For projects that don't need 'gfortran', this can be confusing to the users
and for those that need 'gfortran', it means that a lot of time and CPU
cycles are wasted compiling non-Fortran software that are unusable in the
end.
With this commit, the 'need_gfortran' variable has been added to
'reproduce/software/shell/configure.sh', in a new part that is devoted to
project-specific features. If it equals '0', then the 'gfortran' test (and
message!) isn't done at all, but if it is set to '1', then the configure
stage will halt immediately if gfortran is not found and not built.
The default operations of the core Maneage branch don't need 'gfortran', so
by default it is set to 0. But 'gfortran' is necessary for all projects
that use Numpy (Python's numeric library) for example. So if your project
needs 'gfortran', please set this new variable to 1. As mentioned in the
comments of 'configure.sh', ideally we should detect this automatically,
but we haven't had the time to implement it yet. | 
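
The decision that this variable controls can be summarized in a small sketch. The function 'gfortran_action' and its second argument are hypothetical; only the meaning of the values 0 and 1 is taken from the commit message.

```sh
#!/bin/sh
# Decide what the configure step should do about the Fortran compiler.
#   $1: value of 'need_gfortran' (0 or 1)
#   $2: 1 when a usable 'gfortran' is found or will be built, else 0
# Prints 'skip', 'ok' or 'halt'. 'gfortran_action' is a hypothetical name.
gfortran_action() {
    if [ "$1" = 0 ]; then
        echo skip      # project doesn't need Fortran: no test, no message
    elif [ "$2" = 1 ]; then
        echo ok        # needed and available
    else
        echo halt      # needed but missing: stop before wasting time
    fi
}
```

The point of the design is that the expensive build only starts when it can actually succeed for the project at hand.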
|  | Prior to this commit, compilation of OpenMPI used the default OpenMPI
choices for deciding which libraries should be used in relation to a job
scheduler [1] (such as Slurm [2]). Given that the user on a multi-user
cluster has to accept the sysadmin's choice of a job scheduler, the
question of whether to (1) link with OpenMPI's own libraries (and increase
the reproducibility of the science project) or rather (2) link with the
sysadmin managed libraries (more likely to be compatible with the host's
job scheduler), is an open question of which the best strategy for
reproducibility needs to be debated and studied.
In this commit, strategy (1) is adopted. The options '--with-pmix=internal'
and '--with-hwloc=internal' are added to the configure command. The working
assumption is that the Maneage version of OpenMPI is likely to be modern
enough to be compatible with the native job scheduler such as
Slurm. Compilation without any 'pmix' option failed in at least one
case; it appears that an external pmix library was sought by the configure
script.
As of OpenMPI 4.0.1, the internal libevent library is used by default, so
there appears to be no option to force it to be chosen internally.
This commit also includes the option '--without-verbs'.  This option
removes a library related to "infiniband", "verbs", "openib" and "BTL";
this library appears to be deprecated. See [3], [4] for discussion.
Please add feedback and discussion to the Maneage task about openmpi
linking strategies (1) (internal) and (2) (external) at Savannah [5].
[1] https://en.wikipedia.org/wiki/Job_scheduler#Batch_queuing_for_HPC_clusters
[2] https://en.wikipedia.org/wiki/Slurm_Workload_Manager - To avoid a name
    clash, 'slurm-wlm' is the metapackage in Debian for the client
    commands, the compute node daemon, and the central node daemon. An
    unrelated package 'slurm' also exists.
[3] https://www-lb.open-mpi.org/faq/?category=openfabrics#ofa-device-error
[4] https://www-lb.open-mpi.org/faq/?category=building
[5] https://savannah.nongnu.org/task/index.php?15737 | 
|  | Until now, if a project needed the healpy software package, Maneage would
crash with the following error message (abridged for full name in build
directory). This was caused by a typo in the version of 'healpix' (the
dependency of 'healpy').
  make: *** No rule to make target '.../version-info/proglib/healpix-'
With this commit, the typo in line 334 of 'python.mk' is fixed, so that
when '$(ipydir)/healpy-$(healpy-version)' gets called it correctly searches
for a rule to make '$(ibidir)/healpix-$(healpix-version)'. | 
|  | In the previous commit (Commit 1bc00c9: Only using clang in macOS systems
that also have GCC) we set the used C compiler for high-level programs to
be 'clang' on macOS systems. But I forgot to do the same kind of change in
the configure script (to prefer 'clang' when we are testing for a C
compiler on the host).
With this commit, the compiler checking phases of the configure script have
been improved, so on macOS systems, we now first search for 'clang', then
search for 'gcc'.
While doing this, I also noticed that the 'rpath' checking command was done
before we actually define 'instdir'!!! So in effect, the 'rpath' directory
was being set to '/lib'! So with this commit, this test has been taken to
after defining 'instdir'. | 
|  | Until now, when Maneage was built on a macOS that had both clang and GCC,
we would make links to both. But this caused many conflicts in some
high-level programs (for example Numpy, and all the other programs where we
have explicitly set 'export CC=clang' before the build recipe). This happens
because the GCC that is built on a macOS isn't complete for some
operations.
To fix this problem, when we are on a macOS, we explicitly set 'gcc' to
point to 'clang' and 'g++' to point to 'clang++'. We also don't link to the
host's C-preprocessor ('cpp') on macOS systems because this is only a GNU
feature and using the GNU CPP is also known to have some basic
problems. For example this was reported by Mahdieh Nabavi (which was the
main trigger for this work):
  ld: Symbol not found: ___keymgr_global
    Referenced from: /Users/Mahdieh/build/software/installed/bin/cpp
    Expected in: /usr/lib/libSystem.B.dylib
Also, to avoid linking to another link on the host tools (in the 'makelink'
function of 'basic.mk'), we are now using 'realpath'. | 
|  | Until now, when reading the host's PATH environment variable we weren't
accounting for directory names with a space character. This was most
prominently visible in the 'low-level-links' step where we put links to
some core system components into the project's build directory (mainly for
proprietary systems like macOS).
To address the problem, double quotations have been placed around the part
that we extract 'ccache' from the PATH, and the part where we make the
symbolic link. In the process the comments above 'makelink' were made more
clear and 'low-level-links' now depends on 'grep' (which is the
highest-level program it uses).
This bug was reported by Mahdieh Navabi. | 
|  | Until this commit, once Libidn was installed, instead of its own name and
version, the name and version of Libjpeg were saved (in the target of
Libidn). This probably came from a copy/paste of the rule.
With this commit, this minor bug has been corrected. I also added my name
as an author of the 'reproduce/software/make/xorg.mk' Makefile since I added
some code there. | 
|  | After recently adding util-linux to the Maneage build-tree, we had
forgotten to delete the unpacked and built source directory after it was
installed! This
has been corrected with this commit. | 
|  | Until now, when the user specified an input and software directory, the raw
string they entered was used. But when this string was a relative location,
this could be problematic in general scenarios.
With this commit, the same function that finds the absolute location of the
build directory is used to find the absolute address of the data and
software directories. | 
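
One common way to turn a possibly relative directory into an absolute one is sketched below. The commit does not show the function Maneage actually uses, so the name 'absolute_dir' and this particular approach (changing directory in a subshell) are assumptions.

```sh
#!/bin/sh
# Print the absolute location of the existing directory $1.
# The parentheses start a subshell, so the caller's working directory
# is left unchanged. 'absolute_dir' is a hypothetical name.
absolute_dir() {
    ( cd "$1" 2>/dev/null && pwd )
}
```

The function fails with a non-zero status when the directory does not exist, which lets the caller print its own error message.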
|  | Until now, in order to build Ghostscript, the project used the host's Xorg
libraries. This was because we hadn't yet added the necessary build rules
for them.
With this commit, the instructions to build the necessary Xorg libraries
for Ghostscript have also been added. Also, the shared Ghostscript library
has been built with this commit and two sets of standard fonts are also
included, setting us on the path to build TeXLive from source later.
This task was done with the help and support of Raul Infante-Sainz. | 
|  | Until this commit, there was a problem when building Bison in parallel in
macOS systems. With this commit, this problem has been fixed by updating
Bison to its most recent version (3.6). | 
|  | POSSIBLE EFFECT ON YOUR PROJECT: The changes in this commit may only cause
conflicts to your project if you have changed the software building
Makefiles in your project's branch (e.g., 'basic.mk', 'high-level.mk' and
'python.mk'). If your project has only added analysis, it shouldn't be
affected.
This is a large commit, involving a long series of corrections in a
different branch which is now finally being merged into the core Maneage
branch. All changes were related and came up naturally as the low-level
infrastructure was improved. So separating them in the end for the final
merge would have been very time consuming and we are merging them as one
commit.
In general, the software building Makefiles are now much easier to
read, modify and use, along with several new features that have been
added. See below for the full list.
 - Until now, Maneage needed the host to have a 'make' implementation
   because Make was necessary to build Lzip. Lzip is then used to
   uncompress the source of our own GNU Make. However, in the
   minimalist/slim versions of operating systems (for example used to build
   Docker images) Make isn't included by default. Since Lzip was the only
   program built before our own GNU Make was installed, we consulted Antonio
   Diaz Diaz (creator of Lzip) and he kindly added the necessary
   functionality to a new version of Lzip, which we are using now. Hence we
   don't need to assume a Make implementation on the host any more. With
   this commit, Lzip and GNU Make are built without Make, allowing
   everything else to be safely built with our own custom version of GNU
   Make and not using the host's 'make' at all.
 - Until recently (Commit 3d8aa5953c4) GNU Make was built in
   'basic.mk'. Therefore 'basic.mk' was written in a way that it can be
   used with other 'make' implementations also (i.e., important shell
   commands starting with '&&' and ending in '\' without any comments
   between them!). Furthermore, to help in style uniformity, the rules in
   'high-level.mk' and 'python.mk' also followed a similar structure. But
   due to the point above, we can now guarantee that GNU Make is used from
   the very first Makefile, so this hard-to-read structure has been removed
   in the software build recipes and they are much more readable and
   edit-friendly now.
 - Until now, the default backup servers were at some fixed URLs, on our
   own pages or on Gitlab. But recently we uploaded all the necessary
   software to Zenodo (https://doi.org/10.5281/zenodo.3883409) which is
   more suitable for this task (it promises longevity, has a fixed DOI,
   while allowing us to add new content, or new software tarball
   versions). With this commit, a small script has been written to extract
   the most recent Zenodo upload link from the Zenodo DOI and use it for
   downloading the software source codes.
 - Until now, we primarily used the webpage of each software for
   downloading its tarball. But this caused many problems: 1) Some of them
   needed Javascript before the download, 2) Some URLs had a complex
   dependency on the version number, 3) Some servers would be randomly down
   for maintenance, etc. So thanks to the point above, we now use the
   Zenodo server as the primary download location. However, if a user wants
   to use a custom software that is not (yet!) in Zenodo, the download
   script gives priority to a custom URL that the users can give as Make
   variables. If that variable is defined, then the script will use that
   URL before going onto Zenodo. We now have a special place for such URLs:
   'reproduce/software/config/urls.conf'. The old URLs (which are a good
   documentation themselves) are preserved here, but are commented by
   default.
 - The software source code downloading and checksum verification step has
   been moved into a Make function called 'import-source' (defined in the
   'build-rules.mk' and loaded in all software Makefiles). Having taken all
   the low-level steps there, I noticed that there is no more need for
   having the tarball as a separate target! So with this commit, a single
   rule is the only place that needs to be edited/added (greatly
   simplifying the software building Makefiles).
 - Following task #15272, a new option has been added to the './project'
   script called '--all-highlevel'. When this option is given, the contents
   of 'TARGETS.conf' are ignored and all the software in Maneage are built
   (selected by parsing the 'versions.conf' file). This new option was
   added to confirm the extensive changes made in all the software building
   recipes and is great for development/testing purposes.
 - Many of the software hadn't been tested for a long time! So after using
   the newly added '--all-highlevel', we noticed that some need to be
   updated. In general, with this commit, 'libpaper' and 'pcre' were added
   as new software, and the versions of the following software were updated:
   'boost', 'flex', 'libtirpc', 'openblas' and 'lzip'. A 'run-parts.in'
   shell script was added in 'reproduce/software/shell/' which is installed
   with 'libpaper'.
 - Even though we intentionally add the necessary flags to add RPATH inside
   the built executable at compilation time, some software don't do it
   (different software on different operating systems!). Until now, for
   historical reasons this check was done in different ways for different
   software on GNU/Linux systems. But now it is unified: if 'patchelf' is
   present we apply it. Because of this, 'patchelf' has been put as a
   top-level prerequisite, right after Tar and is installed before anything
   else.
 - In 'versions.conf', GNU Libtool is recognized as 'libtool', but in
   'basic.mk', it was 'glibtool'! This caused much confusion and is
   corrected with this commit (in 'basic.mk', it is also 'libtool').
 - A new argument is added to the './project' script to allow easy loading
   of the project's shell and environment for fast/temporary testing of
   things in the same environment as the project. Before activating the
   project's shell, we completely remove all host environment variables to
   simulate the project's environment. It can be called with this command:
   './project shell'. A simple prompt has also been added to highlight that
   the user is using the Maneage shell! | 
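
The download priority described in the list above (a user-given URL first, then the default servers) can be sketched as below. The function name 'candidate_urls', the example URLs, and the assumption that the custom value is a complete URL while the servers are base URLs are all hypothetical; the real logic lives in Maneage's download script.

```sh
#!/bin/sh
# Print the candidate download URLs for one tarball, in the order in
# which they would be tried. 'candidate_urls' is a hypothetical name.
#   $1: tarball file name
#   $2: custom URL for this software (empty when none is defined)
#   $3 and later: base URLs of the default servers
candidate_urls() {
    tarball=$1
    custom=$2
    shift 2
    if [ -n "$custom" ]; then
        printf '%s\n' "$custom"
    fi
    for base in "$@"; do
        printf '%s/%s\n' "$base" "$tarball"
    done
}
```

A caller would try each printed line in turn and stop at the first download whose checksum matches.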
|  | Until now, Maneage would accept the given build directory, regardless of
the free space available there. This could cause confusing situations for
new users who don't know about the minimum storage requirement.
With this commit, after all other checks on the given build directory are
completed, the configure script will check the available space and warn
the user if there is less than roughly 5GB of free space available in the
build directory (with a 5 second delay).
It won't cause a crash because some projects may require less than this
space (the default only needs roughly 2GB). But we also don't
want the host's partition to get too close to being full, causing them
problems elsewhere. We can change the behavior as desired in future
commits. | 
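
A free-space test of this kind can be written with the POSIX output format of 'df'. The sketch below is an assumption about how such a check may look, not the code in the configure script: the function names are hypothetical, and the parsing assumes the filesystem name contains no spaces.

```sh
#!/bin/sh
# Print the available space (in KiB) of the filesystem holding $1.
# 'df -P -k' gives one header line, then one line per filesystem, with
# the available space in the fourth column.
free_kib() {
    df -P -k "$1" | awk 'NR == 2 { print $4 }'
}

# Succeed when the available space ($1, KiB) is below the minimum ($2).
# Both function names are hypothetical.
is_low_space() {
    [ "$1" -lt "$2" ]
}
```

A caller could then do 'is_low_space "$(free_kib "$bdir")" "$minimum" && echo "Warning ..."', followed by a short sleep, so that the user sees the message but the configuration still continues.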
|  | In Commit 105467fe6402 (Software tarballs are downloaded even if not
built), we introduced tests to download the tarballs of software even if
they don't need to be built on the respective host. However some small
typos in the checks existed that could cause a crash on macOS. In
particular in the building of PatchELF and libbsd we had forgotten to add
the necessary 'x' before the 'yes' in the conditional to check if we are on
macOS or not.
With this commit these two checks have been corrected. Also, in the
building of 'isl' and 'mpc', we now check for 'host_cc' (signifying that
the user wants to use their host C compiler for the high-level step)
instead of 'on_mac_os'. The reason is that even on non-macOS systems, a
user may not want to build the C compiler from scratch and use the
'--host-cc' option. In such cases, they don't need to compile 'isl' and
'mpc'. | 
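
The 'x' prefix mentioned above is a classic shell idiom for comparing a possibly empty variable. The sketch below shows the correct form next to the kind of typo the commit describes; the function names are hypothetical and the real conditionals are in the software-building Makefiles.

```sh
#!/bin/sh
# Correct form: both sides of the comparison carry the 'x' prefix.
is_yes() {
    [ x"$1" = xyes ]
}

# Typo form: the prefix is missing on the right-hand side, so the test
# compares "xyes" with "yes" and can never succeed.
is_yes_buggy() {
    [ x"$1" = yes ]
}
```

Because the buggy test is always false, the branch guarded by it is silently skipped, which is why such typos tend to show up only on the affected operating system.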
|  | Until now, the English text that embeds the list of software to
acknowledge in the paper was hard-wired into the low-level coding
('reproduce/software/shell/configure.sh' to be more specific). But this
file is very low-level, thus discouraging users from modifying this
surrounding text.
While the list of software packages can be considered to be 'data' and is
fixed, the surrounding text to describe the lists is something the authors
should decide on. Authors of a scientific research paper take
responsibility for the full paper, including for the style of the
acknowledgments, even if these may well evolve into some standard text.
With this commit, authors who do *not* modify
'reproduce/software/config/acknowledge_software.sh' will have a default
text, with only a minor English correction from earlier versions of
Maneage. However, authors choosing to use their own wording should be able
to modify the text parameters in
`reproduce/software/config/acknowledge_software.sh` in the obvious
way. This is much more modular than asking project authors to go looking
into the long and technical 'configure.sh' script.
Systematic issues: the file
`reproduce/software/config/acknowledge_software.sh` is an executable shell
script, because it has to be called by
`reproduce/software/shell/configure.sh`, which, in principle, does not yet
have access to `GNU make` (if I understand the bootstrap sequence
correctly). It is placed in `config/` rather than `shell/`, because the
user will expect to find configuration files in `config/`, not in `shell/`.
A possible alternative to avoid having a shell script as a configure file
would be to let `reproduce/software/config/acknowledge_software.sh` appear
to be a `make` file, but analyse it in `configure.sh` using `sed` to remove
whitespace around `=`, and adding other hacks to switch from `make` syntax
to `shell` syntax. However, this risks misleading the user, who will not
know whether s/he should follow `make` conventions or `shell` conventions. | 
|  | Some low-level software aren't necessary on some operating systems, for
example GCC can't be built on macOS, hence we don't build it and the
GCC-only dependencies. Also, on GNU/Linux systems users could configure
with '--host-cc' to avoid all the time it takes to build GCC when doing a
fast test.
Until now, in such cases not only was the software not installed, but the
tarballs of the software were also not downloaded, making the output
of '--dist-software' incomplete (as in bug #58561).
With this commit, we now import all the necessary tarballs. When the
software isn't necessary for the particular system, it won't be built or
cited, but its tarball will be present anyway, thus allowing the output of
'--dist-software' to be complete. | 
|  | Until now, when making the link to Gnuastro's configuration files, the
'configure.sh' script would incorrectly link to the old configuration
directory under the 'reproduce/software' directory. With this commit, it is
moved to the proper directory under 'reproduce/analysis'. | 
|  | Until now, when adding the necessary library flags to the build of XLSX
I/O, we were effectively over-writing the 'LDFLAGS' variables. So the
compiler was effectively not being told where to look for the necessary
libraries.
With this commit, to fix the problem, we now append the new linking flags
to LDFLAGS in XLSX I/O's build, not over-write it. | 
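The difference between over-writing and appending in the XLSX I/O fix above can be sketched as follows. This is only a minimal illustration: the helper name `append_ldflags` and the directory names are hypothetical and are not taken from the Maneage build rules.

```shell
# Hypothetical helper (not part of Maneage): print the existing linker
# flags with one extra '-L' directory appended at the end.
append_ldflags() {
    existing="$1"      # current value of LDFLAGS (may be empty)
    libdir="$2"        # extra library directory (assumed name)
    if [ -z "$existing" ]; then
        printf '%s\n' "-L$libdir"
    else
        printf '%s\n' "$existing -L$libdir"
    fi
}

# Over-writing would lose the directories that were already set:
#   LDFLAGS="-L/extra/lib"
# Appending keeps them, so the linker still searches the earlier ones:
LDFLAGS="-L/project/lib"
LDFLAGS=$(append_ldflags "$LDFLAGS" "/extra/lib")
echo "$LDFLAGS"
```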
|  | After trying a clean build of Maneage in a Docker image (with a minimal
debian:stable-20200607-slim OS), I noticed that the building of OpenSSL is
failing because it doesn't find the proper Perl functionality. To fix it,
with this commit, Perl is set as a prerequisite of OpenSSL and this fixed
the problem. | 
|  | The project configuration requires a build-directory at configuration time,
two other directories can optionally be given to avoid downloading the
project's necessary data and software. It is possible to give these three
directories as command-line options, or by interactively giving them after
running the configure script.
Until now, when these directories weren't given as command-line options,
and the running shell was non-interactive, the configure script would crash
on the line trying to interactively read the user's given directories (the
'read' command).
With this commit, all the 'read' commands for these three directories are
now put within an 'if' statement. Therefore, when 'read' fails (the shell
is non-interactive), instead of a quiet crash, a descriptive message is
printed, telling the user the cause of the problem, and suggesting a fix.
This bug was found by Michael R. Crusoe. | 
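A minimal sketch of wrapping 'read' in an 'if' statement, as described in the commit above, might look like this. The function name `ask_dir`, the wording of the message and the option name passed to it are illustrative assumptions, not the actual text of the configure script.

```shell
# Hypothetical helper: read one directory name from standard input.
# On success the directory is printed; when 'read' fails (for example
# because standard input is closed in a non-interactive shell), a
# descriptive message is printed on standard error instead.
ask_dir() {
    optname="$1"    # command-line option to suggest (assumed name)
    if read -r dir; then
        printf '%s\n' "$dir"
    else
        echo "ERROR: could not read a directory from standard input" >&2
        echo "(the shell is probably non-interactive). Please give it" >&2
        echo "with the '$optname' command-line option instead." >&2
        return 1
    fi
}

# Example use (commented out so this sketch never waits for input):
#   bdir=$(ask_dir --build-dir) || exit 1
```

The point of the design is that the failure of 'read' is tested directly, so the caller can stop with an explanation rather than continuing with an empty value.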
|  | Until now, the description of the input-data directory at configure time
included a description of the input data (created by reading the values of
'INPUTS.conf'). Maintaining this is easy for a single dataset, but it
becomes hard for a general project which may need many input datasets.
To avoid extra complexity (for maintaining this list), the description now
points a user of the project to the 'INPUTS.conf' file and asks them to
look inside it to see the necessary data. This in fact helps users become
familiar with the internal structure of Maneage and frees the authors from
having to worry about updating the low-level 'configure.sh' script. | 
|  | When './project configure' is run, after the basic checks of the compiler,
a small statement is printed telling the user that some configuration
questions will now be asked to start building Maneage on the system. Until
now this description was confusing: it led the reader to think that the
local configuration (which was recommended to read before continuing) is in
another file.
With this commit, the text has been edited to explicitly mention that the
description of the steps following this notice should be read
carefully, thus avoiding that confusion.
This issue was mentioned by Michael R. Crusoe. | 
|  | Possible semantic conflicts (that may not show up as Git conflicts but may
cause a crash in your project after the merge):
   1) The project title (and other basic metadata) should be set in
      'reproduce/analysis/conf/metadata.conf'. Please include this file in
      your merge (if it is ignored because of '.gitattributes'!).
   2) Consider importing the changes in 'initialize.mk' and 'verify.mk' if
      you have added all analysis Makefiles to the '.gitattributes' file
      (thus not merging any change in them with your branch). For example
      with this command:
        git diff master...maneage -- reproduce/analysis/make/initialize.mk
   3) The old 'verify-txt-no-comments-leading-space' function has been
      replaced by 'verify-txt-no-comments-no-space'. The new function will
      also remove all white-space characters between the columns (not just
      white space characters at the start of the line). Thus the resulting
      check won't involve spacing between columns.
A common set of steps are always necessary to prepare a project for
publication. Until now, we would simply look at previous submissions and
try to follow them, but that was prone to errors and could cause
confusion. The internal infrastructure also didn't have some useful
features to make good publication possible. Now that the submission of a
paper fully devoted to the founding criteria of Maneage is complete
(arXiv:2006.03018), it was time to formalize the necessary steps for easier
submission of a project using Maneage and implement some low-level features
that can make things easier.
With this commit a first draft of the publication checklist has been added
to 'README-hacking.md'; it was tested in the submission of arXiv:2006.03018
and zenodo.3872248. To help guide users on implementing the good practices
for output datasets, the outputs of the default project shown in the paper
now use the new features. After reading the checklist, please inspect
these.
Some other relevant changes in this commit:
  - The publication involves a copy of the necessary software
    tarballs. Hence a new target ('dist-software') was also added to
    package all the project's software tarballs in one tarball for easy
    distribution.
  - A new 'dist-lzip' target has been defined for those who want to
    distribute an Lzip-compressed tarball.
  - The '\includetikz' LaTeX macro now has a second argument to allow
    configuring the '\includegraphics' call when the plot should not be
    built, but just imported. | 
|  | When some files should not be merged, until now we were suggesting to also
add deleted files to the '.gitattributes' file. However, this feature of
Git doesn't work for deleted files and they would still show up in the
'master' branch after a merge.
So with this commit, we have added a simple AWK command to run after a
merge that will automatically detect and delete such files (using the
output of 'git status --porcelain').
Also, two minor typos were corrected in the newly added
'servers-backup.conf' file: the copyright year was wrong and there was no
new-line at the end of the file (a good convention!). | 
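The idea of detecting such files from the output of 'git status --porcelain' can be sketched as below. This is not necessarily the AWK command added in the commit above: the function name `list_deleted_by_us` is hypothetical, and the sketch assumes that the files in question are reported with the two-letter status "DU" (unmerged, deleted by us) and that their names contain no characters that Git would quote.

```shell
# Hypothetical helper: from 'git status --porcelain' style lines on
# standard input, print only the names of files with the "DU" status.
# In this format the status is in the first two characters, followed
# by one space, so the file name starts at the fourth character.
list_deleted_by_us() {
    awk '/^DU /{print substr($0, 4)}'
}

# Example use after a merge (commented out: it needs a Git repository):
#   git status --porcelain | list_deleted_by_us \
#       | while IFS= read -r f; do git rm -- "$f"; done
```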
|  | Until now, Maneage would only build Flock before building everything else
using Make (calling 'basic.mk') in parallel. Flock was necessary to avoid
parallel downloads during the building of software (which could cause
network problems). But after recently trying Maneage on FreeBSD (which is
not yet complete, see bug #58465), we noticed that the BSD implementation of
Make couldn't parse 'basic.mk' (in particular, complaining about the 'ifeq'
parts) and its shell also had some peculiarities.
It was thus decided to also install our own minimalist shell, Make and
compressor program before calling 'basic.mk'. In this way, 'basic.mk' can
now assume the same GNU Make features that high-level.mk and python.mk
assume. The pre-make building of software is now organized in
'reproduce/software/shell/pre-make-build.sh'.
Another nice feature of this commit is for macOS users: until now the
default macOS Make had problems for parallel building of software, so
'basic.mk' was built in one thread. But now that we can build the core
tools with GNU Make on macOS too, it uses all threads. Furthermore, since
we now run 'basic.mk' with GNU Make, we can use '.ONESHELL' and don't have
to finish every line of a long rule with a backslash to keep variables and
such.
Generally, the pre-make software are now organized like this: first we
build Lzip before anything else: it is downloaded as a simple '.tar' file
that is not compressed (only ~400kb). Once Lzip is built, the pre-make
phase continues with building GNU Make, Dash (a minimalist shell) and
Flock. All of their tarballs are in '.tar.lz'. Maneage then enters
'basic.mk' and the first program it builds is GNU Gzip (itself packaged as
'.tar.lz'). Once Gzip is built, we build all the other compression software
(all downloaded as '.tar.gz'). Afterwards, any compression standard for
other software is fine because we have it.
In the process, a bug related to using backup servers was found and fixed
in 'reproduce/analysis/bash/download-multi-try'. So that it can be called
outside of 'basic.mk', its Bash-specific features were also removed. As a
result of that bug-fix, because we now have multiple servers for software
tarballs, the backup servers now have their own configuration file in
'reproduce/software/config/servers-backup.conf'. This makes it much easier
to maintain the backup server list across the multiple places that we need
it.
Some other minor fixes:
 - In building Bzip2, we need to specify 'CC' so it doesn't use 'gcc'.
 - In building Zip, the 'generic_gcc' Make option caused a crash on FreeBSD
   (which doesn't have GCC).
 - We are now using 'uname -s' to specify if we are on a Linux kernel or
   not; if not, we are still using the old 'on_mac_os' variable.
 - While I was trying to build on FreeBSD, I noticed some further
   corrections that could help. For example the 'makelink' Make-function
   now takes a third argument which can be a different name compared to the
   actual program (used for example to make a link to '/usr/bin/cc' from
   'gcc').
 - Until now we didn't know if the host's Make implementation supports
   placing a '@' at the start of the recipe (to avoid printing the actual
   commands to standard output). Especially in the tarball download phase,
   there are many lines that are printed for each download which was really
   annoying. We already used '@' in 'high-level.mk' and 'python.mk' before,
   but now that we also know that 'basic.mk' is called with our custom GNU
   Make, we can use it at the start for a cleaner stdout.
 - Until now, WCSLIB assumed a Fortran compiler, but when the user is on a
   system where we can't install GCC (or has activated the '--host-cc'
   option), it may not be present and the project shouldn't break because
   of this. So with this commit, when a Fortran compiler isn't present,
   WCSLIB will be built with the '--disable-fortran' configuration option.
This commit (task #15667) was completed with help/checks by Raul
Infante-Sainz and Boud Roukema. | 
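The WCSLIB item in the list above (only disabling Fortran when no Fortran compiler is present) can be sketched in shell as follows. The function name `wcslib_fortran_option` and the way the compiler is detected (with 'command -v') are assumptions for illustration; only the '--disable-fortran' option itself comes from the commit message.

```shell
# Hypothetical helper: print the extra configure option for WCSLIB,
# depending on whether the given Fortran compiler command exists in
# PATH. An empty line is printed when the compiler is available.
wcslib_fortran_option() {
    fc="$1"    # name of the Fortran compiler command to look for
    if command -v "$fc" > /dev/null 2>&1; then
        printf '%s\n' ""
    else
        printf '%s\n' "--disable-fortran"
    fi
}

# Example use (commented out: it needs the WCSLIB source directory):
#   ./configure $(wcslib_fortran_option gfortran)
```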
|  | Until this commit, when the user already had a TeXLive tarball (in their
software-tarball directory) that was older than the one on the CTAN server,
the project crashed in the configure phase. This was because TeXLive is
updated yearly and we don't yet install TeXLive from source (currently we
use its own package manager, but we plan to fix this in task #15267).
With this commit, we fix the problem by checking the cause of the crash
during the installation of TeX. If the crash is due to this particular
error, we ignore the old tarball and download the new one and install it
(the old one is still kept in '.build/software/tarballs', but will get a
'-OLD' in its name). This problem was recurrent, and every year that TeXLive
is updated, the previous tarball had to be removed manually! But with this
commit, this is done automatically. The detection and fix of this bug has
been possible with the help of Mohammad Akhlaghi, thanks! | 
|  | One of the main reasons for building Maneage is to properly
acknowledge/attribute the authors of software in research. So we have
adopted a standard of never referring to the GNU-based operating systems
running the Linux kernel simply as "Linux". We also avoid terms like "Open
Source" and use Free Software instead (in the same spirit).
With this commit, a few instances of the cases above have been corrected;
they had slipped through our fingers when we initially imported them into
the project. In the special case of the "Journal for Open Source Software",
we simply replaced it with its abbreviation (JOSS). This was done because
in effect we were generally using journal name abbreviations in almost all
the citations already. To avoid any inconsistencies, the names of the three
other journals that weren't abbreviated are also abbreviated. | 
|  | With this commit, Maneage now includes instructions to build the memory
tracing tool Valgrind and the program 'patch' (to apply corrections/patches
in text files and in particular the sources of programs).
For this version of Valgrind, some patches were necessary for an interface
with OpenMPI 2.x (which is the case now). Also note that this version of
Valgrind's checks can fail with GCC 10.1.x (when using '--host-cc'), and
the failures aren't due to internal problems but due to how the tests are
designed (https://bugs.gentoo.org/707598). So currently if any of
Valgrind's checks fail, Maneage still assumes that Valgrind was built and
installed successfully.
While testing on macOS, we noticed that it needs the macOS-specific 'mig'
program which we can't build in Maneage. DESCRIPTION: The mig command
invokes the Mach Interface Generator to generate Remote Procedure Call
(RPC) code for client-server style Mach IPC from specification files. So a
symbolic link to the system's 'mig' is now added to the project's programs
on macOS systems.
This commit's build of Patch and Valgrind has been tested on two GNU/Linux
distributions (Debian and ArchLinux) as well as macOS.
Work on this commit started by Boud Roukema, but also involved tests and
corrections by Mohammad Akhlaghi and Raul Infante-Sainz. | 
|  | Until now, two of the software BibTeX sources (Matplotlib and Sympy) had an
"abstract" entry that was long, not similar to the rest, and not relevant
in this context, so they are removed with this commit. | 
|  | In time, some of the copyright license descriptions had been mistakenly
shortened to two paragraphs instead of the original three that are
recommended in the GPL. With this commit, they are corrected to be exactly
in the same three-paragraph format suggested by the GPL.
The following files also didn't have a copyright notice, so one was added
for them:
    reproduce/software/make/README.md
    reproduce/software/bibtex/healpix.tex
    reproduce/analysis/config/delete-me-num.conf
    reproduce/analysis/config/verify-outputs.conf | 
|  | Until this commit, when the version of Gnuastro doesn't match with the
version that the project was designed to use, the warning message saying
how to run the configure step was not showing the option `-e'.  This
situation is normal when updating the version of Gnuastro to the most
recent one (with the project already configured). However, the use of this
option is more convenient than giving the top-build directory, etc, every
time. With this commit, the warning message has been changed to also show
the option `-e' for re-configuring the project. | 
|  | Until this commit, Scamp was installed with the option
`--enable-plplot=yes' (the default). However, Maneage does not have PLplot
included. As it is possible to install Scamp without PLplot (in that case
it won't generate plots), with this commit this option has been set to
`no'. As a consequence, Scamp will be installed even if the host system
does not have PLplot without crashing (but it won't make any plot). | 
|  | Until now Maneage used the host's GNU Gettext if it was present. Gettext is
a relatively low-level software that enables programs to print messages in
different languages based on the host environment. Even though it has no
direct effect on the running of the software for Maneage and the language
environment in Maneage is pre-determined, it is necessary to have it
because if the basic programs see it in the host they will link with it and
will have problems if/when the host's Gettext is updated.
With this commit (which is actually a squashed rebase of 9 commits by Raul
and Mohammad), Gettext and its two extra dependencies (libxml2 and
libunistring) are now installed within Maneage as a basic software and
built before GNU Bash. As a result, all programs built afterwards will
successfully link with our own internal version of Gettext and
libraries. To get this working, some of the basic software dependencies had
to be updated and re-ordered, and it has been tested on both GNU/Linux and
macOS.
Some other minor issues that are fixed with this commit:
 - Until this commit, when TeX was not installed, the warning message
   saying how to run the configure step in order to re-configure the
   project was not showing the option `-e'. However, the use of this option
   is more convenient than entering the top-build directory, etc, every
   time. So with this commit, the warning message has been changed to
   use the option `-e' in the re-configure of the project.
 - Until now, on macOS systems, Bash was not linking with our internally
   built `libncurses'. With this commit, this has been fixed by setting
   `--withcurses=yes' for Bash's configure script. | 
|  | Until now we had manually inserted a `\' before the `_' of the sip_tpv
program. However, we also recently added a step in the configure script to
add a `\' before every `_' when writing the final LaTeX macro. This was
because some C compilers (when the host's is used) have an `_' in their
version that we had no control over.
With this commit, the `\' is removed from `sip_tpv' in its build-rule and
we let the backslash be inserted automatically. | 
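The automatic insertion of a backslash before every underscore, mentioned in the commit above, can be sketched with 'sed'. The function name `escape_underscores` is hypothetical and this is not necessarily how the configure script implements that step; it only shows the kind of substitution involved.

```shell
# Hypothetical helper: print the given string with a backslash added
# before every underscore, so that it is safe inside a LaTeX macro.
# In the 'sed' replacement, '\\' stands for one literal backslash.
escape_underscores() {
    printf '%s\n' "$1" | sed -e 's/_/\\_/g'
}

# A program name and an (assumed) compiler version string:
escape_underscores "sip_tpv"
escape_underscores "x86_64_gcc"
```

With such a step in place, names are stored without any manual escaping and the backslash is only added when the final LaTeX macro is written.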
|  | Until now, when you changed the version of a software in an already-built
system, its tarball would be downloaded, but it wouldn't actually
build. The only way would be to force the build by deleting the main target
of that file (under `.local/version-info/TYPE/PROGRAM'). This was because
the tarballs were an order-only prerequisite which was implemented some
time ago based on some theoretical argument that if the tarball's date
changes, the software should not be rebuilt (because we check the
checksum).
However, the problems this causes are more than those it solves: Users may
forget to delete the main target of the program and mistakenly think that
they are using the new version. The fact that all the numbers going into
the paper also contain this number further hides this.
With this commit, tarballs are no longer order-only and any time a version
of a software is updated, it will be automatically built and not cause
confusion and manual intervention by the users. As a result of this change,
I also had to correct the way we find the tarball from the list of
prerequisites. |