Age | Commit message (Collapse) | Author | Lines |
|
This commit primarily affects the configuration step of Maneage'd projects,
and in particular, updated versions of the many of the software (see
P.S.). So it shouldn't affect your high-level analysis other than the
version bumps of the software you use (and the software's possibly
improve/changed behavior).
The following software (and thus their dependencies) couldn't be updated as
described below:
- Cryptography: isn't building because it depends on a new
setuptools-rust package that has problems
(https://savannah.nongnu.org/bugs/index.php?61731), so it has been
commented in 'versions.conf'.
- SecretStorage: because it depends on Cryptography.
- Keyring: because it depends on SecretStorage.
- Astroquery: because it depends on Keyring.
This is a "squashed" commit after rebasing a development branch of 60
commits corresponding to a roughly two-month time interval. The following
people contributed to this branch.
- Boudewijn Roukema added all the R software infrastructure and the R
packages, as well as greatly helping in fixing many bugs during the
update.
- Raul Infante-Sainz helped in testing and debugging the build.
- Pedram Ashofteh Ardakani found and fixed a bug.
- Zahra Sharbaf helped in testing and found several bugs.
Below a description of the most noteworthy points is given.
- Software tarballs: all updated software now have a unified format
tarball (ustar; if not possible, pax) and unified compression (Lzip) in
Maneage's software repository in Zenodo
(https://doi.org/10.5281/zenodo.3883409). For more on this See
https://savannah.nongnu.org/task/?15699 . This won't affect any extra
software you would like to add; you can use any format recognized by
GNU Tar, and all common compression algorithms. This new requirement is
only for software that get merged to the core Maneage branch.
- Metastore (and thus libbsd and libmd) moved to highlevel: Metastore
(and the packages it depends on) is a high-level product that is only
relevant during the project development (like Emacs!): when the user
wants the file meta data (like dates) to be unchanged after checking
out branches. So it should be considered a high-level software, not
basic. Metastore also usually causes many more headaches and error
messages, so personally, I have stopped using it! Instead I simply
merge my branches in a separate clone, then pull the merge commit: in
this way, the files of my project aren't re-written during the checkout
phase and therefore their dates are untouched (which can conflict with
Make's dates on configuration files).
- The un-official cloned version of Flex (2.6.4-91 until this commit) was
causing problems in the building of Netpbm, so with this commit, it has
been moved back to version 2.6.4.
- Netpbm's official page had version 10.73.38 as the latest stable
tarball that was just released in late 2021. But I couldn't find our
previously-used version 10.86.99 anywhere (to see when it was released
and why we used it! Its at last more than one year old!). So the
official stable version is being used now.
- Improved instructions in 'README.md' for building software environment
in a Docker container (while having project source and output data
products on the local system; including the usage of the host's
'/dev/shm' to speed up temporary operations).
- Until now, the convention in Maneage was to put eight SPACE characters
before the comment lines within recipes. This was done because by
default GNU Emacs (also many other editors) show a TAB as eight
characters. However, in other text editors, online browsers, or even
the Git diff, a TAB can correspond to a different number of
characters. In such cases, the Maneage recipes wouldn't look too
interesting (the comments and the recipe commands would show a
different indentation!).
With this commit, all the comment lines in the Makefiles within the
core Maneage branch have a hash ('#') as their first character and a
TAB as the second. This allows the comment lines in recipes to have the
same indentation as code; making the code much more easier to read in a
general scenario including a 'git diff' (editor agnostic!).
P.S. List of updated software with their old and new versions
- Software with no version update are not mentioned.
- The old version of newly added software are shown with '--'.
Name (Basic) Old version New version
------------ ----------- -----------
Bzip2 1.0.6 1.0.8
CURL 7.71.1 7.79.1
Dash 0.5.10.2 0.5.11.5
File 5.39 5.41
Flock 0.2.3 0.4.0
GNU Bash 5.0.18 5.1.8
GNU Binutils 2.35 2.37
GNU Coreutils 8.32 9.0
GNU GCC 10.2.0 11.2.0
GNU M4 1.4.18 1.4.19
GNU Readline 8.0 8.1.1
GNU Tar 1.32 1.34
GNU Texinfo 6.7 6.8
GNU diffutils 3.7 3.8
GNU findutils 4.7.0 4.8.0
GNU gmp 6.2.0 6.2.1
GNU grep 3.4 3.7
GNU gzip 1.10 1.11
GNU libunistring 0.9.10 1.0
GNU mpc 1.1.0 1.2.1
GNU mpfr 4.0.2 4.1.0
GNU nano 5.2 6.0
GNU ncurses 6.2 6.3
GNU wget 1.20.3 1.21.2
Git 2.28.0 2.34.0
Less 563 590
Libxml2 2.9.9 2.9.12
Lzip 1.22-rc2 1.22
OpenSLL 1.1.1a 3.0.0
Patchelf 0.10 0.13
Perl 5.32.0 5.34.0
Podlators -- 4.14
Name (Highlevel) Old version New version
---------------- ----------- -----------
Apachelog4cxx 0.10.0-603 0.12.1
Astrometry.net 0.80 0.85
Boost 1.73.0 1.77.0
CFITSIO 3.48 4.0.0
Cmake 3.18.1 3.21.4
Eigen 3.3.7 3.4.0
Expat 2.2.9 2.4.1
FFTW 3.3.8 3.3.10
Flex 2.6.4-91 2.6.4
Fontconfig 2.13.1 2.13.94
Freetype 2.10.2 2.11.0
GNU Astronomy Utilities 0.12 0.16.1-e0f1
GNU Autoconf 2.69.200-babc 2.71
GNU Automake 1.16.2 1.16.5
GNU Bison 3.7 3.8.2
GNU Emacs 27.1 27.2
GNU GDB 9.2 11.1
GNU GSL 2.6 2.7
GNU Help2man 1.47.11 1.48.5
Ghostscript 9.52 9.55.0
ICU -- 70.1
ImageMagick 7.0.8-67 7.1.0-13
Libbsd 0.10.0 0.11.3
Libffi 3.2.1 3.4.2
Libgit2 1.0.1 1.3.0
Libidn 1.36 1.38
Libjpeg 9b 9d
Libmd -- 1.0.4
Libtiff 4.0.10 4.3.0
Libx11 1.6.9 1.7.2
Libxt 1.2.0 1.2.1
Netpbm 10.86.99 10.73.38
OpenBLAS 0.3.10 0.3.18
OpenMPI 4.0.4 4.1.1
Pixman 0.38.0 0.40.0
Python 3.8.5 3.10.0
R 4.0.2 4.1.2
SWIG 3.0.12 4.0.2
Util-linux 2.35 2.37.2
Util-macros 1.19.2 1.19.3
Valgrind 3.15.0 3.18.1
WCSLIB 7.3 7.7
Xcb-proto 1.14 1.14.1
Xorgproto 2020.1 2021.5
Name (Python) Old version New version
------------- ----------- -----------
Astropy 4.0 5.0
Beautifulsoup4 4.7.1 4.10.0
Beniget -- 0.4.1
Cffi 1.12.2 1.15.0
Cryptography 2.6.1 36.0.1
Cycler 0.10.0 0.11.0+}
Cython 0.29.21 0.29.24
Esutil 0.6.4 0.6.9
Extension-helpers -- 0.1
Galsim 2.2.1 2.3.3
Gast -- 0.5.3
Jinja2 -- 3.0.3
MPI4py 3.0.3 3.1.3
Markupsafe -- 2.0.1
Numpy 1.19.1 1.21.3
Packaging -- 21.3
Pillow -- 8.4.0
Ply -- 3.11
Pyerfa -- 2.0.0.1
Pyparsing 2.3.1 3.0.4
Pythran -- 0.11.0
Scipy 1.5.2 1.7.3
Setuptools 41.6.0 58.3.0
Six 1.12.0 1.16.0
Uncertainties 3.1.2 3.1.6
Wheel -- 0.37.0
Name (R) Old version New version
-------- ----------- -----------
Cli -- 2.5.0
Colorspace -- 2.0-1
Cowplot -- 1.1.1
Crayon -- 1.4.1
Digest -- 0.6.27
Ellipsis -- 0.3.2
Fansi -- 0.5.0
Farver -- 2.1.0
Ggplot2 -- 3.3.4
Glue -- 1.4.2
GridExtra -- 2.3
Gtable -- 0.3.0
Isoband -- 0.2.4
Labeling -- 0.4.2
Lifecycle -- 1.0.0
Magrittr -- 2.0.1
MASS -- 7.3-54
Mgcv -- 1.8-36
Munsell -- 0.5.0
Pillar -- 1.6.1
R-Pkgconfig -- 2.0.3
R6 -- 2.5.0
RColorBrewer -- 1.1-2
Rlang -- 0.4.11
Scales -- 1.1.1
Tibble -- 3.1.2
Utf8 -- 1.2.1
Vctrs -- 0.3.8
ViridisLite -- 0.4.0
Withr -- 2.4.2
|
|
On systems that allow it (like GNU/Linux systems), Maneage will build the
necessary software in shared memory (a directory that is actually in the
RAM, not on an SSD/HDD, on GNU/Linux systems, it is '/dev/shm'). This
allows Maneage to operate faster and not harm the HDD/SSD with all the
temporary writing of many small files.
Until now, we would only check that this directory exists and that it has
enough space. However, some systems also set the 'noexec' flag on shared
memory for security reasons [1]. This causes Maneage to crash upon building
of the software in later phases.
With this commit, at the very start of the configuration step, and after
all other shared-memory checks are done, a dummy executable script file is
created there and its execution is tested. If it doesn't work, shared
memory will not be used at all.
In the process, the steps dealing with the software building directory in
the configure script have been brought in one place and comments were added
to further clarify every step.
This commit was initially done by Boud Roukema and later edited by Mohammad
Akhlaghi.
[1] https://web.archive.org/web/20210624192819/https://serverfault.com/questions/72356/how-useful-is-mounting-tmp-noexec
|
|
Once a year, the texlive update system becomes incompatible with the
version from the previous year. Since a texlive install failure is
considered non-fatal by 'high-level.mk', so until now, the user could miss
the printed message and mistakenly believe that the configure is valid.
This commit explicitly adds a 10-second delay that should be enough for a
user who does the 'configure --existing-conf' step alone to notice that
there is a TeX Live problem. It also adds the explicit instruction of how
to allow an update from an earlier year's texlive installer to the warning
message (by deleting '.build/software/tarballs/install-tl-unx.tar.gz'). I
had to rediscover this a few times for old Maneage installs.
Also, a few lines in 'reproduce/software/shell/configure.sh' were indented
with a TAB (that is not recommended because TAB is displayed with different
widths on different browsers). So while doing this commit, those TABs were
also converted to a space.
|
|
When built in 'group' mode, the write permissions of all created files will
be activated for a certain group of users in the host operating system. The
user specifies the name of the group with the '--group' option at configure
time. At the very start, the './project' script checks to see if the given
group name actually exists or not (to avoid hard-to-debug errors popping up
later).
Until now, the checking 'sg' command (that was used to build the project
with group-writable permissions) would always fail due to the excessive
number of redirections. Therefore, it would always print the error message
and abort.
With this commit, the output of 'sg' is no longer re-directed (which also
helps users in debuggin). If the group does actually exist, it will just
print a small statement saying so, and if it fails, the error message is
printed. This fixed the problem, allowing maneage to be built in
group-mode.
I also noticed that the variable name keeping the group name
('reproducible_paper_group_name') used the old name for the project (which
was "Reproducible paper template"! So it has been changed/corrected to
'maneage_group_name'.
|
|
With a recent update of macOS systems (macOS Big Sur 11.2.3 and Xcode
12.4), there are many warnings when building C programs (for example the
simple program we compile to check the compiler, or some of the software
like `gzip'). It prints hundreds of warning lines for every source file
that are irrelevant for our builds, but really clutters the output.
With this commit, these warnings are disabled by adding
`-Wno-nullability-completeness' to the 'CPPFLAGS' environment
variable. This has also been added to the very first check of the C
compiler in the configure step.
|
|
Until now, each time there was a problem in the configuration of Maneage'd
projects and debugging was necessary, we had to take the following changes:
- Run the configuration on a single thread ('-j1') to see the building of
only the problematic software.
- Disable the Zenodo check manually by commenting those parts of
'reproduce/software/shell/configure.sh'. Because the internet connection
wastes a few seconds and is thus very annoying during repeated runs!
- Manually remove the '-k' option that was passed to Make (when building
the software). With the '-k', Make keeps going with the execution of
other targets if something crashes and this usually causes confusions
during the debugging.
Doing the manual changes within the code was both very annoying and prone
to errors (forgetting to correct it!).
With this commit, the existing '--debug' option has been generalized to the
software configuration phase of Maneage also. Until now, it was only
available in the analysis phase (and would directly be passed to the 'make'
command that would run the analysis). When this option is used, and the
project is in the software configuration phase, the Zenodo check won't be
done, it will use one single thread ('-j1'), and it will stop the execution
as soon as an error occurs (Make is not run with '-k').
|
|
Until now, the build directory contained a 'software/' directory (that
hosted all the built software), a 'tex/' subdirectory for the final
building of the paper, and many other directories containing
intermediate/final data of the specific project. But this mixing of built
software and data is against our modularity and minimal complexity
principles: built software and built data are separate things and keeping
them separate will enable many optimizations.
With this commit, the build directory of the core Maneage branch will only
contain two sub-directories: 'software/' and 'analysis/'. The 'software/'
directory has the same contents as before and is not touched in this
commit. However, the 'analysis/' directory is new and everything created in
the './project make' phase of the project will be created inside of this
directory.
To facilitate easy access to these top-level built directories, two new
variables are defined at the top of 'initialize.mk': 'badir', which is
short for "built-analysis directory" and 'bsdir', which is short for
"built-software directory".
HOW TO IMPLEMENT THIS CHANGE IN YOUR PROJECT. It is easy: simply replace
all occurances of '$(BDIR)' in your project's subMakefiles (except the ones
below) to '$(badir)'. To confirm if everything is fine before building your
project from scratch after merging, you can run the following command to
see where 'BDIR' is used and confirm the only remaning cases.
$ grep -r BDIR reproduce/analysis/*
--> make/verify.mk: innobdir=$$(echo $$infile | sed -e's|$(BDIR)/||g'); \
--> make/initialize.mk:badir=$(BDIR)/analysis
--> make/initialize.mk:bsdir=$(BDIR)/software
--> make/initialize.mk: $$sys_rm -rf $(BDIR)
--> make/top-prepare.mk:all: $(BDIR)/software/preparation-done.mk
'BDIR' should only be present in lines of the files above. If you see
'$(BDIR)' used anywhere else, simply change it to '$(badir)'. Ofcourse, if
your project assumes BDIR in other contexts, feel free to keep it, it will
not conflict. If anything un-expected happens, please post a comment on the
link below (you need to be registered on Savannah to post a comment):
https://savannah.nongnu.org/task/?15855
One consequence of this change is that the 'analysis/' subdirectory can be
optionally mounted on a separate partition. The need for this actually came
up for some new users of Maneage in a Docker image. Docker can fix
portability problems on systems that we haven't yet supported (even
Windows!), or had a chance to fix low-level issues on. However, Docker
doesn't have a GUI interface. So to see the built PDF or intermediate data,
it was necessary to copy the built data to the host system after every
change, which is annoying during working on a project. It would also need
two copies of the source: one in the host, one in the container. All these
frustrations can be fixed with this new feature.
To describe this scenario, README.md now has a new section titled "Only
software environment in the Docker image". It explains step-by-step how you
can make a Docker image to only host the built software environment. While
your project's source, software tarballs and 'BDIR/analysis' directories
are on your host operating system. It has been tested before this commit
and works very nicely.
|
|
Having entered 2021, it was necessary to update the copyright years at the
top of the source files. We recommend that you do this for all your
project-specific source files also.
|
|
Until now, the core Maneage branch included some configuration files for
Gnuastro's programs. This was actually a remnant of the distant past when
Maneage didn't actually build its own software and we had to rely on the
host's software versions. This file contained the configuration files
specific to Gnuastro for this project and also had a feature to avoid
checking the host's own configuration files.
However, we now build all our software ourselves with fixed configuration
files (for the version that is being installed and its version is
stored). So those extra configuration files were just extra and caused
confusion and problems in some scenarios. With this commit, those extra
files are now removed.
Also, two small issues are also addressed in parallel with this commit:
- When running './project make clean', the 'hardware-parameters.tex' macro
file (which is created by './project configure' is not deleted.
- The project title is now written into the default output's PDF's
properties (through 'hypersetup' in 'tex/src/preamble-header.tex')
through the LaTeX macro.
All these issues were found and fixed with the help of Samane Raji.
|
|
Until now, during the configure step it was checked if the host Operative
System were GNU/Linux, and if not, we assumed it is macOS. However, it can
be any other different OS! With this commit, now we explicity check if the
system is GNU/Linux or Darwin (macOS). If it is not any of them, a warning
message says to the user that the host system is different from which we
have checked so far (and invite to contact us if there is any problem).
In addition to this, if the system is macOS, now it checks if Xcode is
already installed in the host system. If it is not installed, a warning
message informs the user to do that in case a problem/crash in the
configure step occurs. We have found that it is convenient to have Xcode
installed in order to avoid some problems.
|
|
Following the previous commit, we recognized that the 'IFS' terms are not
necessary and can be even cause problems. So all their occurances in the
scripts of Maneage have been removed with this commit.
|
|
While a project is under development, the raw analysis software are not the
only necessary software in a project. We also need tools to all the edit
plain-text files within the Maneaged project. Usually people use their
operating system's plain-text editor. However, when working on the project
on a new computer, or in a container, the plain-text editors will have
different versions, or may not be present at all! This can be very annoying
and frustrating!
With this commit, Maneage now installs GNU Nano as part of the basic
tools. GNU Nano is a very simple and small plain text editor (the installed
size is only ~3.5MB, and it is friendly to new users). Therefore, any
Maneaged project can assume atleast Nano will be present (in particular
when no editor is available on the running system!). GNU Emacs and VIM
(both without extra dependencies, in particular without GUI support) are
also optionally available in 'high-level.mk' (by adding them to
'TARGETS.conf').
The basic idea for the more advanced editors (Emacs and VIM) is that
project authors can add their favorite editor while they are working on the
project, but upon publication they can remove them from 'TARGETS.conf'.
A few other minor things came up during this work and are now also fixed:
- The 'file' program and its libraries like 'libmagic' were linking to
system's 'libseccomp'! This dependency then leaked into Nano (which
depends on 'libmagic'). But this is just an extra feature of 'file',
only for the Linux kernel. Also, we have no dependency on it so far. So
'file' is not configured to not build with 'libseccomp'.
- A typo was fixed in the line where the physical core information is
being read on macOS.
- The top-level directories when running './project shell' are now quoted
(in case they have special characters).
|
|
Until now, no machine-related specifications were being documented in the
workflow. This information can become helpful when observing differences in
the outcome of both software and analysis segments of the workflow by
others (some software may behave differently based on host machine).
With this commit, the host machine's 'hardware class' and 'byte-order' are
collected and now available as LaTeX macros for the authors to use in the
paper. Currently it is placed in the acknowledgments, right after
mentioning the Maneage commit.
Furthermore, the project and configuration scripts are now capable of
dealing with input directory names that have SPACE (and other special
characters) by putting them inside double-quotes. However, having spaces
and metacharacters in the address of the build directory could cause
build/install failure for some software source files which are beyond the
control of Maneage. So we now check the user's given build directory
string, and if the string has any '@', '#', '$', '%', '^', '&', '*', '(',
')', '+', ';', and ' ' (SPACE), it will ask the user to provide a different
directory.
|
|
It was a long time that the Maneage software versions hadn't been updated.
With this commit, the versions of all basic software were checked and 17 of
that had newer versions were updated. Also, 16 high-level programs and
libraries were updated as well as 7 Python modules. The full list is
available below.
Basic Software (affecting all projects)
---------------------------------------
bash 5.0.11 -> 5.0.18
binutils 2.32 -> 2.35
coreutils 8.31 -> 8.32
curl 7.65.3 -> 7.71.1
file 5.36 -> 5.39
gawk 5.0.1 -> 5.1.0
gcc 9.2.0 -> 10.2.0
gettext 0.20.2 -> 0.21
git 2.26.2 -> 2.28.0
gmp 6.1.2 -> 6.2.0
grep 3.3 -> 3.4
libbsd 0.9.1 -> 0.10.0
ncurses 6.1 -> 6.2
perl 5.30.0 -> 5.32.0
sed 4.7 -> 4.8
texinfo 6.6 -> 6.7
xz 5.2.4 -> 5.2.5
Custom programs/libraries
-------------------------
astrometrynet 0.77 -> 0.80
automake 0.16.1 -> 0.16.2
bison 3.6 -> 3.7
cfitsio 3.47 -> 3.48
cmake 3.17.0 -> 3.18.1
freetype 2.9 -> 2.10.2
gdb 8.3 -> 9.2
ghostscript 9.50 -> 9.52
gnuastro 0.11 -> 0.12
libgit2 0.28.2 -> 1.0.1
libidn 1.35 -> 1.36
openmpi 4.0.1 -> 4.0.4
R 3.6.2 -> 4.0.2
python 3.7.4 -> 3.8.5
wcslib 6.4 -> 7.3
yaml 0.2.2 -> 0.2.5
Python modules
--------------
cython 0.29.6 -> 0.29.21
h5py 2.9.0 -> 2.10.0
matplotlib 3.1.1 -> 3.3.0
mpi4py 3.0.2 -> 3.0.3
numpy 1.17.2 -> 1.19.1
pybind11 2.4.3 -> 2.5.0
scipy 1.3.1 -> 1.5.2
|
|
When the host C compiler is used (either by calling '--host-cc' or on OSs
that we can't build the GNU C Compiler), Maneage will also not build the
Fortran compiler 'gfortran'. Until now, the './project configure' script
would give a big warning about the need for 'gfortran' and the fact that it
is missing, and would for 5 seconds, but it would continue anyway.
For projects that don't need 'gfortran', this can be confusing to the users
and for those that need 'gfortran', it means that a lot of time and cpu
cycles are wasted compiling non-fortran software that are unusable in the
end.
With this commit, the 'need_gfortarn' variable has been added
'reproduce/software/shell/configure.sh', in a new part that is devoted to
project-specific features. If it equals '0', then the 'gfortran' test (and
message!) isn't done at all, but if it is set to '1', then the configure
stage will halt immediately gfortran is not found and not built.
The default operations of the core Maneage branch don't need 'gfortran', so
by default it is set to 0. But 'gfortran' is necessary for all projects
that use Numpy (Python's numeric library) for example. So if your project
needs 'gfortran', please set this new variable to 1. As mentioned in the
comments of 'configure.sh', ideally we should detect this automatically,
but we haven't had the time to implement it yet.
|
|
In the previous commit (Commit 1bc00c9: Only using clang in macOS systems
that also have GCC) we set the used C compiler for high-level programs to
be 'clang' on macOS systems. But I forgot to do the same kind of change in
the configure script (to prefer 'clang' when we are testing for a C
compiler on the host).
With this commit, the compiler checking phases of the configure script have
been improved, so on macOS systems, we now first search for 'clang', then
search for 'gcc'.
While doing this, I also noticed that the 'rpath' checking command was done
before we actually define 'instdir'!!! So in effect, the 'rpath' directory
was being set to '/lib'! So with this commit, this test has been taken to
after defining 'instdir'.
|
|
Until now, when the user specified an input and software directory, the raw
string they entered was used. But when this string was a relative location,
this could be problematic in general scenarios.
With this commit, the same function that finds the absolute location of the
build directory is used to find the absolute address of the data and
software directories.
|
|
POSSIBLE EFFECT ON YOUR PROJECT: The changes in this commit may only cause
conflicts to your project if you have changed the software building
Makefiles in your project's branch (e.g., 'basic.mk', 'high-level.mk' and
'python.mk'). If your project has only added analysis, it shouldn't be
affected.
This is a large commit, involving a long series of corrections in a
differnt branch which is now finally being merged into the core Maneage
branch. All changes were related and came up naturally as the low-level
infrastructure was improved. So separating them in the end for the final
merge would have been very time consuming and we are merging them as one
commit.
In general, the software building Makefiles are now much more easier to
read, modify and use, along with several new features that have been
added. See below for the full list.
- Until now, Maneage needed the host to have a 'make' implementation
because Make was necessary to build Lzip. Lzip is then used to
uncompress the source of our own GNU Make. However, in the
minimalist/slim versions of operating systems (for example used to build
Docker images) Make isn't included by default. Since Lzip was the only
program before our own GNU Make was installed, we consulting Antonio
Diaz Diaz (creator of Lzip) and he kindly added the necessary
functionality to a new version of Lzip, which we are using now. Hence we
don't need to assume a Make implementation on the host any more. With
this commit, Lzip and GNU Make are built without Make, allowing
everything else to be safely built with our own custom version of GNU
Make and not using the host's 'make' at all.
- Until recently (Commit 3d8aa5953c4) GNU Make was built in
'basic.mk'. Therefore 'basic.mk' was written in a way that it can be
used with other 'make' implementations also (i.e., important shell
commands starting with '&&' and ending in '\' without any comments
between them!). Furthermore, to help in style uniformity, the rules in
'high-level.mk' and 'python.mk' also followed a similar structure. But
due to the point above, we can now guarantee that GNU Make is used from
the very first Makefile, so this hard-to-read structure has been removed
in the software build recipes and they are much more readable and
edit-friendly now.
- Until now, the default backup servers where at some fixed URLs, on our
own pages or on Gitlab. But recently we uploaded all the necessary
software to Zenodo (https://doi.org/10.5281/zenodo.3883409) which is
more suitable for this task (it promises longevity, has a fixed DOI,
while allowing us to add new content, or new software tarball
versions). With this commit, a small script has been written to extract
the most recent Zenodo upload link from the Zenodo DOI and use it for
downloading the software source codes.
- Until now, we primarily used the webpage of each software for
downloading its tarball. But this caused many problems: 1) Some of them
needed Javascript before the download, 2) Some URLs had a complex
dependency on the version number, 3) some servers would be randomly down
for maintenance and etc. So thanks to the point above, we now use the
Zenodo server as the primary download location. However, if a user wants
to use a custom software that is not (yet!) in Zenodo, the download
script gives priority to a custom URL that the users can give as Make
variables. If that variable is defined, then the script will use that
URL before going onto Zenodo. We now have a special place for such URLs:
'reproduce/software/config/urls.conf'. The old URLs (which are a good
documentation themselves) are preserved here, but are commented by
default.
- The software source code downloading and checksum verification step has
been moved into a Make function called 'import-source' (defined in the
'build-rules.mk' and loaded in all software Makefiles). Having taken all
the low-level steps there, I noticed that there is no more need for
having the tarball as a separate target! So with this commit, a single
rule is the only place that needs to be edited/added (greatly
simplifying the software building Makefiles).
- Following task #15272, A new option has been added to the './project'
script called '--all-highlevel'. When this option is given, the contents
of 'TARGETS.conf' are ignored and all the software in Maneage are built
(selected by parsing the 'versions.conf' file). This new option was
added to confirm the extensive changes made in all the software building
recipes and is great for development/testing purposes.
- Many of the software hadn't been tested for a long time! So after using
the newly added '--all-highlevel', we noticed that some need to be
updated. In general, with this commit, 'libpaper' and 'pcre' were added
as new software, and the versions of the following software was updated:
'boost', 'flex', 'libtirpc', 'openblas' and 'lzip'. A 'run-parts.in'
shell script was added in 'reproduce/software/shell/' which is installed
with 'libpaper'.
- Even though we intentionally add the necessary flags to add RPATH inside
the built executable at compilation time, some software don't do it
(different software on different operating systems!). Until now, for
historical reasons this check was done in different ways for different
software on GNU/Linux sytems. But now it is unified: if 'patchelf' is
present we apply it. Because of this, 'patchelf' has been put as a
top-level prerequisite, right after Tar and is installed before anything
else.
- In 'versions.conf', GNU Libtool is recognized as 'libtool', but in
'basic.mk', it was 'glibtool'! This caused many confusions and is
corrected with this commit (in 'basic.mk', it is also 'libtool').
- A new argument is added to the './project' script to allow easy loading
of the project's shell and environment for fast/temporary testing of
things in the same environment as the project. Before activating the
project's shell, we completely remove all host environment variables to
simulate the project's environment. It can be called with this command:
'./project shell'. A simple prompt has also been added to highlight that
the user is using the Maneage shell!
|
|
Until now, Maneage would accept the given build directory, regardless of
the free memory available there. This could cause confusing situations for
new users who don't know about the minimum storage requirement.
With this commit, after all other checks on the given build directory are
completed, the configure script will check the available space and warns
the user if there is less than almost 5GB free space available in the build
directory (with a 5 second delay).
It won't cause a crash because some projects may require roughly smaller
than this space (the default only needs roughly 2GB). But we also don't
want the host's partition to get too close to being full, causing them
problems elsewhere. We can change the behavior as desired in future
commits.
|
|
Until now, the English texts that embeds the list of software to
acknowledge in the paper was hard-wired into the low-level coding
('reproduce/software/shell/configure.sh' to be more specific). But this
file is very low-level, thus discouraging users to modify this surrounding
text.
While the list of software packages can be considered to be 'data' and is
fixed, the surrounding text to describe the lists is something the authors
should decide on. Authors of a scientific research paper take
responsibility for the full paper, including for the style of the
acknowledgments, even if these may well evolve into some standard text.
With this commit, authors who do *not* modify
'reproduce/software/config/acknowledge_software.sh' will have a default
text, with only a minor English correction from earlier versions of
Maneage. However, Authors choosing to use their own wording should be able
to modify the text parameters in
`reproduce/software/config/acknowledge_software.sh` in the obvious
way. This is much more modular than asking project authors to go looking
into the long and technical 'configure.sh' script.
Systematic issues: the file
`reproduce/software/config/acknowledge_software.sh` is an executable shell
script, because it has to be called by
`reproduce/software/shell/configure.sh`, which, in principle, does not yet
have access to `GNU make` (if I understand the bootstrap sequence
correctly). It is placed in `config/` rather than `shell/`, because the
user will expect to find configuration files in `config/`, not in `shell/`.
A possible alternative to avoid having a shell script as a configure file
would be to let `reproduce/software/config/acknowledge_software.sh` appear
to be a `make` file, but analyse it in `configure.sh` using `sed` to remove
whitespace around `=`, and adding other hacks to switch from `make` syntax
to `shell` syntax. However, this risks misleading the user, who will not
know whether s/he should follow `make` conventions or `shell` conventions.
|
|
Until now, when making the link to Gnuastro's configuration files, the
'configure.sh' script would incorrectly link to the old configuration
directory under the 'reproduce/software' directory. With this commit, it is
moved to the proper directory under 'reproduce/analysis'.
|
|
The project configuration requires a build-directory at configuration time,
two other directories can optionally be given to avoid downloading the
project's necessary data and software. It is possible to give these three
directories as command-line options, or by interactively giving them after
running the configure script.
Until now, when these directories weren't given as command-line options,
and the running shell was non-interactive, the configure script would crash
on the line trying to interactively read the user's given directories (the
'read' command).
With this commit, all the 'read' commands for these three directories are
now put within an 'if' statement. Therefore, when 'read' fails (the shell
is non-interactive), instead of a quiet crash, a descriptive message is
printed, telling the user that cause of the problem, and suggesting a fix.
This bug was found by Michael R. Crusoe.
|
|
Until now, the description of the input-data directory at configure time
included a description of the input data (created by reading the values of
'INPUTS.conf'). Maintaining this is easy for a single dataset, but it
becomes hard for a general project which may need many input datasets.
To avoid extra complexity (for maintaining this list), the description now
points a user of the project to the 'INPUTS.conf' file and asks them to
look inside of it for seeing the necessary data. This infact helps with the
users becoming familiar with the internal structure of Maneage and will
allow the authors to focus on not having to worry about updating the
low-level 'configure.sh' script.
|
|
When './project configure' is run, after the basic checks of the compiler,
a small statement is printed telling the user that some configuration
questions will now be asked to start building Maneage on the system. Until
now this description was confusing: it lead the reader to think that the
local configuration (which was recommended to read before continuing) is in
another file.
With this commit, the text has been edited to explictly mention that the
description of the steps following this notice should be read
carefully. Thus avoiding that confusion.
This issue was mentioned by Michael R. Crusoe.
|
|
Possible semantic conflicts (that may not show up as Git conflicts but may
cause a crash in your project after the merge):
1) The project title (and other basic metadata) should be set in
'reproduce/analysis/conf/metadata.conf'. Please include this file in
your merge (if it is ignored because of '.gitattributes'!).
2) Consider importing the changes in 'initialize.mk' and 'verify.mk' (if
you have added all analysis Makefiles to the '.gitattributes' file
(thus not merging any change in them with your branch). For example
with this command:
git diff master...maneage -- reproduce/analysis/make/initialize.mk
3) The old 'verify-txt-no-comments-leading-space' function has been
replaced by 'verify-txt-no-comments-no-space'. The new function will
also remove all white-space characters between the columns (not just
white space characters at the start of the line). Thus the resulting
check won't involve spacing between columns.
A common set of steps are always necessary to prepare a project for
publication. Until now, we would simply look at previous submissions and
try to follow them, but that was prone to errors and could cause
confusion. The internal infrastructure also didn't have some useful
features to make good publication possible. Now that the submission of a
paper fully devoted to the founding criteria of Maneage is complete
(arXiv:2006.03018), it was time to formalize the necessary steps for easier
submission of a project using Maneage and implement some low-level features
that can make things easier.
With this commit a first draft of the publication checklist has been added
to 'README-hacking.md', it was tested in the submission of arXiv:2006.03018
and zenodo.3872248. To help guide users on implementing the good practices
for output datasets, the outputs of the default project shown in the paper
now use the new features). After reading the checklist, please inspect
these.
Some other relevant changes in this commit:
- The publication involves a copy of the necessary software
tarballs. Hence a new target ('dist-software') was also added to
package all the project's software tarballs in one tarball for easy
distribution.
- A new 'dist-lzip' target has been defined for those who want to
distribute an Lzip-compressed tarball.
- The '\includetikz' LaTeX macro now has a second argument to allow
configuring the '\includegraphics' call when the plot should not be
built, but just imported.
|
|
Until now, Maneage would only build Flock before building everything else
using Make (calling 'basic.mk') in parallel. Flock was necessary to avoid
parallel downloads during the building of software (which could cause
network problems). But after recently trying Maneage on FreeBSD (which is
not yet complete, see bug #58465), we noticed that the BSD implemenation of
Make couldn't parse 'basic.mk' (in particular, complaining with the 'ifeq'
parts) and its shell also had some peculiarities.
It was thus decided to also install our own minimalist shell, Make and
compressor program before calling 'basic.mk'. In this way, 'basic.mk' can
now assume the same GNU Make features that high-level.mk and python.mk
assume. The pre-make building of software is now organized in
'reproduce/software/shell/pre-make-build.sh'.
Another nice feature of this commit is for macOS users: until now the
default macOS Make had problems for parallel building of software, so
'basic.mk' was built in one thread. But now that we can build the core
tools with GNU Make on macOS too, it uses all threads. Furthermore, since
we now run 'basic.mk' with GNU Make, we can use '.ONESHELL' and don't have
to finish every line of a long rule with a backslash to keep variables and
such.
Generally, the pre-make software are now organized like this: first we
build Lzip before anything else: it is downloaded as a simple '.tar' file
that is not compressed (only ~400kb). Once Lzip is built, the pre-make
phase continues with building GNU Make, Dash (a minimalist shell) and
Flock. All of their tarballs are in '.tar.lz'. Maneage then enters
'basic.mk' and the first program it builds is GNU Gzip (itself packaged as
'.tar.lz'). Once Gzip is built, we build all the other compression software
(all downloaded as '.tar.gz'). Afterwards, any compression standard for
other software is fine because we have it.
In the process, a bug related to using backup servers was found in
'reproduce/analysis/bash/download-multi-try' for calling outside of
'basic.mk' and removed Bash-specific features. As a result of that bug-fix,
because we now have multiple servers for software tarballs, the backup
servers now have their own configuration file in
'reproduce/software/config/servers-backup.conf'. This makes it much easier
to maintain the backup server list across the multiple places that we need
it.
Some other minor fixes:
- In building Bzip2, we need to specify 'CC' so it doesn't use 'gcc'.
- In building Zip, the 'generic_gcc' Make option caused a crash on FreeBSD
(which doesn't have GCC).
- We are now using 'uname -s' to specify if we are on a Linux kernel or
not, if not, we are still using the old 'on_mac_os' variable.
- While I was trying to build on FreeBSD, I noticed some further
corrections that could help. For example the 'makelink' Make-function
now takes a third argument which can be a different name compared to the
actual program (used for examle to make a link to '/usr/bin/cc' from
'gcc'.
- Until now we didn't know if the host's Make implementation supports
placing a '@' at the start of the recipe (to avoid printing the actual
commands to standard output). Especially in the tarball download phase,
there are many lines that are printed for each download which was really
annoying. We already used '@' in 'high-level.mk' and 'python.mk' before,
but now that we also know that 'basic.mk' is called with our custom GNU
Make, we can use it at the start for a cleaner stdout.
- Until now, WCSLIB assumed a Fortran compiler, but when the user is on a
system where we can't install GCC (or has activated the '--host-cc'
option), it may not be present and the project shouldn't break because
of this. So with this commit, when a Fortran compiler isn't present,
WCSLIB will be built with the '--disable-fortran' configuration option.
This commit (task #15667) was completed with help/checks by Raul
Infante-Sainz and Boud Roukema.
|
|
In time, some of the copyright license description had been mistakenly
shortened to two paragraphs instead of the original three that is
recommended in the GPL. With this commit, they are corrected to be exactly
in the same three paragraph format suggested by GPL.
The following files also didn't have a copyright notice, so one was added
for them:
reproduce/software/make/README.md
reproduce/software/bibtex/healpix.tex
reproduce/analysis/config/delete-me-num.conf
reproduce/analysis/config/verify-outputs.conf
|
|
Until this commit, when the version of Gnuastro doesn't match with the
version that the project was designed to use, the warning message saying
how to run the configure step was not showing the option `-e'. This
situation is normal when updating the version of Gnuastro to the most
recent one (with the project already configured). However, the use of this
option is more convenient than giving the top-build directory, etc, every
time. With this commit, the warning message has been changed in order show
also the option `-e' in the re-configure of the project.
|
|
Until now Maneage used the host's GNU Gettext if it was present. Gettext is
a relatively low-level software that enables programs to print messages in
different languages based on the host environment. Even though it has not
direct effect on the running of the software for Maneage and the lanugage
environment in Maneage is pre-determined, it is necessary to have it
because if the basic programs see it in the host they will link with it and
will have problems if/when the host's Gettext is updated.
With this commit (which is actually a squashed rebase of 9 commits by Raul
and Mohammad), Gettext and its two extra dependencies (libxml2 and
libunistring) are now installed within Maneage as a basic software and
built before GNU Bash. As a result, all programs built afterwards will
successfully link with our own internal version of Gettext and
libraries. To get this working, some of the basic software dependencies had
to updated and re-ordered and it has been tested in both GNU/Linux and
macoS.
Some other minor issues that are fixed with this commit
- Until this commit, when TeX was not installed, the warning message
saying how to run the configure step in order to re-configure the
project was not showing the option `-e'. However, the use of this option
is more convenient than entering the top-build directory and etc every
time. So with this commit, the warning message has been changed in order
use the option `-e' in the re-configure of the project.
- Until now, on macOS systems, Bash was not linking with our internally
built `libncurses'. With this commit, this has been fixed by setting
`--withcurses=yes' for Bash's configure script.
|
|
Until now, if GCC couldn't be built for any reason, Maneage would crash and
the user had no way forward. Since GCC is complicated, it may happen and is
frustrating to wait until the bug is fixed. Also, while debugging Maneage,
when we know GCC has no problem, because it takes so long, it discourages
testing.
With this commit, we have re-activated the `--host-cc' option. It was
already defined in the options of `./project', but its affect was nullified
by hard-coding it to zero in the configure script on GNU/Linux systems. So
with this commit that has been removed and the user can use their own C
compiler on a GNU/Linux operating system also.
Furthermore, to inform the user about this option and its usefulness, when
GCC fails to build, a clear warning message is printed, instructing the
user to post the problem as a bug and telling them how to continue building
the project with the `--host-cc' option.
|
|
Until now, at the end of the configuration step, we would tell the user
this: "To change the configuration later, please re-run './project
configure', DO NOT manually edit the relevant files". However, as Boud
suggested in Bug #58243, this is against our principle to encourage users
to modify Maneage.
With this commit, that explanation has been expanded by a few sentences to
tell the users what to change and warn them in case they decide to change
the build-directory.
|
|
Until now, we wouldn't explicity check for GNU gettext. If it was present
on the system, we would just add a link to it in Maneage's installation
directory. However, in bug #58248, Boud noticed that Git (a basic software)
actually needs it to complete its installation. Unfortunately we haven't
had the tiem to include a build of Gettext in Maneage. Because it is mostly
available on many systems, it hasn't been reported too commonly, it also
has many dependencies which make it a little time consuming to install.
So with this commit, we actually check for GNU gettext right after checking
the compiler and if its not available an informative error message is
written to inform the user of the problem, along with suggestions on fixing
it (how to install GNU gettext from their package manager).
|
|
Until now we only checked for the existance and write-ability of the build
directory. But we recently discovered that if the specified build-directory
is in a non-POSIX compatible partition (for example NTFS), permissions
can't be modified and this can cause crashs in some programs (in
particular, while building Perl, see [1]). The thing that makes this
problem hard to identify is that on such partitions, `chmod' will still
return 0 (so it was hard to find).
With this commit, a check has been added after the user specifies the
build-directory. If the proposed build directory is not able to handle
permissions as expected, the configure script will not continue and will
let the user know and will ask them for another directory.
Also, the two printed characters at the start of error messages were
changed to `**' (instead of `--'). When everything is good, we'll use `--'
to tell the user that their given directory will be used as the build
directory. And since there are multiple checks now, the final message to
specify a new build directory is now moved to the end and not repeated in
every check.
[1] https://savannah.nongnu.org/support/?110220
|
|
Until now, the message that we printed just before starting to build
software didn't actually print the current directory, but only `pwd'. With
this commit, this is fixed (it uses the `currentdir' variable that is
already found before).
|
|
Until now, throughout Maneage we were using the old name of "Reproducible
Paper Template". But we have finally decided to use Maneage, so to avoid
confusion, the name has been corrected in `README-hacking.md' and also in
the copyright notices.
Note also that in `README-hacking.md', the main Maneage branch is now
called `maneage', and the main Git remote has been changed to
`https://gitlab.com/maneage/project' (this is a new GitLab Group that I
have setup for all Maneage-related projects). In this repository there is
only one `maneage' branch to avoid complications with the `master' branch
of the projects using Maneage later.
|
|
In the previous commit, we remove the `-static' flag from building PatchELF
because it wasn't necessary any more. Howver, the comment for the check
still included it and could be confusing. This is corrected with this
commit. Also, we don't need the `good_static_libc' variable (that was only
defined to pass onto PatchELF). This has also been corrected.
|
|
Until now the software configuration parameters were defined under the
`reproduce/software/config/installation/' directory. This was because the
configuration parameters of analysis software (for example Gnuastro's
configurations) were placed under there too. But this was terribly
confusing, because the run-time options of programs falls under the
"analysis" phase of the project.
With this commit, the Gnuastro configuration files have been moved under
the new `reproduce/analysis/config/gnuastro' directory and the software
configuration files are directly under `reproduce/software/config'. A clean
build was done with this change and it didn't crash, but it may cause
crashes in derived projects, so after merging with Maneage, please
re-configure your project to see if anything has been missed. Please let us
know if there is a problem.
|
|
Until now we would simply return the version numbers as they were written
into the separate files and situations can happen where the version numbers
contain an underscore (`_'). However, this character is a methematical
character in LaTeX, causing LaTeX to complain and abort.
With this commit, a step has been added at the end of the configure script
to convert any possible `_' to `\_'. Once it is commented (a backslash is
put behind it), the underscore will be printed as it is in the final PDF.
This commit was originally written by Mohammad Akhlaghi
|
|
MissFITS is package for manipulating FITS files.
I added it as my first commit to the project for educational
purposes.
|
|
Until now, we defined `LIBRARY_PATH' to fix the problem of the `ld' linker
of Binutils needing several `*crt*.o' files to run. However, some software
(for example ImageMagick) over-write `LIBRARY_PATH', therefore there is no
other way than to put a link to these necessary files in our local build
directory.
With this commit, we fixed the problem by putting a link to the system's
relevant files in the local library directory. This fixed the problem with
ImageMagick. Later, when we build the GNU C Library in the project, we
should remove this step.
This bug reported by Raul Castellanos Sanchez.
|
|
Until now, when a Fortran compiler didn't exist on the host operating
system, the configure script would crash with a warning. But some projects
may not need Fortran, so this is just an extra/annoying crash!
With this commit, it will still print the warning, but instead of a crash,
it will just sleep for some seconds, then continue. Later, when if a
software needs Fortran, it's building will crash, but atleast the user was
warned.
In the future, we should add a step to check on the necessary software and
see if Fortran is necessary for the project or not. The project
configuration should indeed crash if Fortran is necessary, but we should
tell the user that software XXXX needs Fortran so we can't continue without
a Fortran compiler.
Also, a small sentence ("Project's configuration will continue in XXXX
seconds.") was added after all the warnings that won't cause a crash, so
user's don't think its a crash.
|
|
Until now, Make was just run ordinarily on the two Makefiles of the
software building phase. Therefore when there was a problem with one
software while building in parallel, Make would only complete the running
rules and stop afterwards. But when other rules don't depened on the
crashed rule, its a waste of time to stop the whole thing.
With this commit, both calls to Make in the `configure.sh' script are done
with the `-k' option (or `--keep-going' in GNU Make). With this option, if
a rule crashes, the other rules that don't depend on it will also be
run. Generally, anything that doesn't depend on the crashed rule will be
done. The `-k' option is a POSIX definition in Make, so it is present in
most implemenetations (for the call to `basic.mk').
|
|
Until now the shell scripts in the software building phase were in the
`reproduce/software/bash' directory. But given our recent change to a
POSIX-only start, the `configure.sh' shell script (which is the main
component of this directory) is no longer written with Bash.
With this commit, to fix that problem, that directory's name has been
changed to `reproduce/software/shell'.
|