Age | Commit message (Collapse) | Author | Lines |
|
Until now, no machine-related specifications were being documented in the
workflow. This information can become helpful when observing differences in
the outcome of both software and analysis segments of the workflow by
others (some software may behave differently based on host machine).
With this commit, the host machine's 'hardware class' and 'byte-order' are
collected and now available as LaTeX macros for the authors to use in the
paper. Currently it is placed in the acknowledgments, right after
mentioning the Maneage commit.
Furthermore, the project and configuration scripts are now capable of
dealing with input directory names that have SPACE (and other special
characters) by putting them inside double-quotes. However, having spaces
and metacharacters in the address of the build directory could cause
build/install failure for some software source files which are beyond the
control of Maneage. So we now check the user's given build directory
string, and if the string has any '@', '#', '$', '%', '^', '&', '*', '(',
')', '+', ';', and ' ' (SPACE), it will ask the user to provide a different
directory.
|
|
Until now, if the software source tarballs already existed on the system
they would be copied inside the project. However, the software source
tarballs are sometimes/mostly larger than their actual product and can
consume significant space (~375 MB in the core branch!).
With this commit, when the software are present on the system, their
symbolic link will be placed in 'BDIR/software/tarballs', not a full
copy. Also, because the tarballs in software tarball directory may
themselves be links, we use 'realpath' to find the final place of the
actual file and link to that location. Therefore if 'realpath' can't be
found (prior to installing Coreutils in Maneage), we will copy the tarballs
from the given software tarball directory. After Maneage has installed
Coreutils, the project's own 'realpath' will be used. Of course, if the
software are downloaded, their full downloaded copy will be kept in
'BDIR/software/tarballs', nothing has changed in the downloading scenario.
|
|
POSSIBLE EFFECT ON YOUR PROJECT: The changes in this commit may only cause
conflicts to your project if you have changed the software building
Makefiles in your project's branch (e.g., 'basic.mk', 'high-level.mk' and
'python.mk'). If your project has only added analysis, it shouldn't be
affected.
This is a large commit, involving a long series of corrections in a
differnt branch which is now finally being merged into the core Maneage
branch. All changes were related and came up naturally as the low-level
infrastructure was improved. So separating them in the end for the final
merge would have been very time consuming and we are merging them as one
commit.
In general, the software building Makefiles are now much more easier to
read, modify and use, along with several new features that have been
added. See below for the full list.
- Until now, Maneage needed the host to have a 'make' implementation
because Make was necessary to build Lzip. Lzip is then used to
uncompress the source of our own GNU Make. However, in the
minimalist/slim versions of operating systems (for example used to build
Docker images) Make isn't included by default. Since Lzip was the only
program before our own GNU Make was installed, we consulting Antonio
Diaz Diaz (creator of Lzip) and he kindly added the necessary
functionality to a new version of Lzip, which we are using now. Hence we
don't need to assume a Make implementation on the host any more. With
this commit, Lzip and GNU Make are built without Make, allowing
everything else to be safely built with our own custom version of GNU
Make and not using the host's 'make' at all.
- Until recently (Commit 3d8aa5953c4) GNU Make was built in
'basic.mk'. Therefore 'basic.mk' was written in a way that it can be
used with other 'make' implementations also (i.e., important shell
commands starting with '&&' and ending in '\' without any comments
between them!). Furthermore, to help in style uniformity, the rules in
'high-level.mk' and 'python.mk' also followed a similar structure. But
due to the point above, we can now guarantee that GNU Make is used from
the very first Makefile, so this hard-to-read structure has been removed
in the software build recipes and they are much more readable and
edit-friendly now.
- Until now, the default backup servers where at some fixed URLs, on our
own pages or on Gitlab. But recently we uploaded all the necessary
software to Zenodo (https://doi.org/10.5281/zenodo.3883409) which is
more suitable for this task (it promises longevity, has a fixed DOI,
while allowing us to add new content, or new software tarball
versions). With this commit, a small script has been written to extract
the most recent Zenodo upload link from the Zenodo DOI and use it for
downloading the software source codes.
- Until now, we primarily used the webpage of each software for
downloading its tarball. But this caused many problems: 1) Some of them
needed Javascript before the download, 2) Some URLs had a complex
dependency on the version number, 3) some servers would be randomly down
for maintenance and etc. So thanks to the point above, we now use the
Zenodo server as the primary download location. However, if a user wants
to use a custom software that is not (yet!) in Zenodo, the download
script gives priority to a custom URL that the users can give as Make
variables. If that variable is defined, then the script will use that
URL before going onto Zenodo. We now have a special place for such URLs:
'reproduce/software/config/urls.conf'. The old URLs (which are a good
documentation themselves) are preserved here, but are commented by
default.
- The software source code downloading and checksum verification step has
been moved into a Make function called 'import-source' (defined in the
'build-rules.mk' and loaded in all software Makefiles). Having taken all
the low-level steps there, I noticed that there is no more need for
having the tarball as a separate target! So with this commit, a single
rule is the only place that needs to be edited/added (greatly
simplifying the software building Makefiles).
- Following task #15272, A new option has been added to the './project'
script called '--all-highlevel'. When this option is given, the contents
of 'TARGETS.conf' are ignored and all the software in Maneage are built
(selected by parsing the 'versions.conf' file). This new option was
added to confirm the extensive changes made in all the software building
recipes and is great for development/testing purposes.
- Many of the software hadn't been tested for a long time! So after using
the newly added '--all-highlevel', we noticed that some need to be
updated. In general, with this commit, 'libpaper' and 'pcre' were added
as new software, and the versions of the following software was updated:
'boost', 'flex', 'libtirpc', 'openblas' and 'lzip'. A 'run-parts.in'
shell script was added in 'reproduce/software/shell/' which is installed
with 'libpaper'.
- Even though we intentionally add the necessary flags to add RPATH inside
the built executable at compilation time, some software don't do it
(different software on different operating systems!). Until now, for
historical reasons this check was done in different ways for different
software on GNU/Linux sytems. But now it is unified: if 'patchelf' is
present we apply it. Because of this, 'patchelf' has been put as a
top-level prerequisite, right after Tar and is installed before anything
else.
- In 'versions.conf', GNU Libtool is recognized as 'libtool', but in
'basic.mk', it was 'glibtool'! This caused many confusions and is
corrected with this commit (in 'basic.mk', it is also 'libtool').
- A new argument is added to the './project' script to allow easy loading
of the project's shell and environment for fast/temporary testing of
things in the same environment as the project. Before activating the
project's shell, we completely remove all host environment variables to
simulate the project's environment. It can be called with this command:
'./project shell'. A simple prompt has also been added to highlight that
the user is using the Maneage shell!
|
|
Until now, Maneage would only build Flock before building everything else
using Make (calling 'basic.mk') in parallel. Flock was necessary to avoid
parallel downloads during the building of software (which could cause
network problems). But after recently trying Maneage on FreeBSD (which is
not yet complete, see bug #58465), we noticed that the BSD implemenation of
Make couldn't parse 'basic.mk' (in particular, complaining with the 'ifeq'
parts) and its shell also had some peculiarities.
It was thus decided to also install our own minimalist shell, Make and
compressor program before calling 'basic.mk'. In this way, 'basic.mk' can
now assume the same GNU Make features that high-level.mk and python.mk
assume. The pre-make building of software is now organized in
'reproduce/software/shell/pre-make-build.sh'.
Another nice feature of this commit is for macOS users: until now the
default macOS Make had problems for parallel building of software, so
'basic.mk' was built in one thread. But now that we can build the core
tools with GNU Make on macOS too, it uses all threads. Furthermore, since
we now run 'basic.mk' with GNU Make, we can use '.ONESHELL' and don't have
to finish every line of a long rule with a backslash to keep variables and
such.
Generally, the pre-make software are now organized like this: first we
build Lzip before anything else: it is downloaded as a simple '.tar' file
that is not compressed (only ~400kb). Once Lzip is built, the pre-make
phase continues with building GNU Make, Dash (a minimalist shell) and
Flock. All of their tarballs are in '.tar.lz'. Maneage then enters
'basic.mk' and the first program it builds is GNU Gzip (itself packaged as
'.tar.lz'). Once Gzip is built, we build all the other compression software
(all downloaded as '.tar.gz'). Afterwards, any compression standard for
other software is fine because we have it.
In the process, a bug related to using backup servers was found in
'reproduce/analysis/bash/download-multi-try' for calling outside of
'basic.mk' and removed Bash-specific features. As a result of that bug-fix,
because we now have multiple servers for software tarballs, the backup
servers now have their own configuration file in
'reproduce/software/config/servers-backup.conf'. This makes it much easier
to maintain the backup server list across the multiple places that we need
it.
Some other minor fixes:
- In building Bzip2, we need to specify 'CC' so it doesn't use 'gcc'.
- In building Zip, the 'generic_gcc' Make option caused a crash on FreeBSD
(which doesn't have GCC).
- We are now using 'uname -s' to specify if we are on a Linux kernel or
not, if not, we are still using the old 'on_mac_os' variable.
- While I was trying to build on FreeBSD, I noticed some further
corrections that could help. For example the 'makelink' Make-function
now takes a third argument which can be a different name compared to the
actual program (used for examle to make a link to '/usr/bin/cc' from
'gcc'.
- Until now we didn't know if the host's Make implementation supports
placing a '@' at the start of the recipe (to avoid printing the actual
commands to standard output). Especially in the tarball download phase,
there are many lines that are printed for each download which was really
annoying. We already used '@' in 'high-level.mk' and 'python.mk' before,
but now that we also know that 'basic.mk' is called with our custom GNU
Make, we can use it at the start for a cleaner stdout.
- Until now, WCSLIB assumed a Fortran compiler, but when the user is on a
system where we can't install GCC (or has activated the '--host-cc'
option), it may not be present and the project shouldn't break because
of this. So with this commit, when a Fortran compiler isn't present,
WCSLIB will be built with the '--disable-fortran' configuration option.
This commit (task #15667) was completed with help/checks by Raul
Infante-Sainz and Boud Roukema.
|