Age | Commit message | Author | Lines |
|
POSSIBLE EFFECT ON YOUR PROJECT: The changes in this commit may only cause
conflicts to your project if you have changed the software building
Makefiles in your project's branch (e.g., 'basic.mk', 'high-level.mk' and
'python.mk'). If your project has only added analysis, it shouldn't be
affected.
This is a large commit, involving a long series of corrections in a
different branch which is now finally being merged into the core Maneage
branch. All changes were related and came up naturally as the low-level
infrastructure was improved. So separating them in the end for the final
merge would have been very time consuming and we are merging them as one
commit.
In general, the software building Makefiles are now much easier to
read, modify and use, along with several new features that have been
added. See below for the full list.
- Until now, Maneage needed the host to have a 'make' implementation
because Make was necessary to build Lzip. Lzip is then used to
uncompress the source of our own GNU Make. However, in the
minimalist/slim versions of operating systems (for example used to build
Docker images) Make isn't included by default. Since Lzip was the only
program built before our own GNU Make was installed, we consulted Antonio
Diaz Diaz (creator of Lzip) and he kindly added the necessary
functionality to a new version of Lzip, which we are using now. Hence we
don't need to assume a Make implementation on the host any more. With
this commit, Lzip and GNU Make are built without Make, allowing
everything else to be safely built with our own custom version of GNU
Make and not using the host's 'make' at all.
- Until recently (Commit 3d8aa5953c4) GNU Make was built in
'basic.mk'. Therefore 'basic.mk' was written in a way that it can be
used with other 'make' implementations also (i.e., important shell
commands starting with '&&' and ending in '\' without any comments
between them!). Furthermore, to help in style uniformity, the rules in
'high-level.mk' and 'python.mk' also followed a similar structure. But
due to the point above, we can now guarantee that GNU Make is used from
the very first Makefile, so this hard-to-read structure has been removed
in the software build recipes and they are much more readable and
edit-friendly now.
- Until now, the default backup servers were at some fixed URLs, on our
own pages or on Gitlab. But recently we uploaded all the necessary
software to Zenodo (https://doi.org/10.5281/zenodo.3883409) which is
more suitable for this task (it promises longevity, has a fixed DOI,
while allowing us to add new content, or new software tarball
versions). With this commit, a small script has been written to extract
the most recent Zenodo upload link from the Zenodo DOI and use it for
downloading the software source codes.
- Until now, we primarily used the webpage of each software for
downloading its tarball. But this caused many problems: 1) some of them
needed Javascript before the download, 2) some URLs had a complex
dependency on the version number, 3) some servers would be randomly down
for maintenance, etc. So thanks to the point above, we now use the
Zenodo server as the primary download location. However, if a user wants
to use a custom software that is not (yet!) in Zenodo, the download
script gives priority to a custom URL that the users can give as Make
variables. If that variable is defined, then the script will use that
URL before going onto Zenodo. We now have a special place for such URLs:
'reproduce/software/config/urls.conf'. The old URLs (which are a good
documentation themselves) are preserved here, but are commented by
default.
- The software source code downloading and checksum verification step has
been moved into a Make function called 'import-source' (defined in the
'build-rules.mk' and loaded in all software Makefiles). Having taken all
the low-level steps there, I noticed that there is no more need for
having the tarball as a separate target! So with this commit, a single
rule is the only place that needs to be edited/added (greatly
simplifying the software building Makefiles).
- Following task #15272, a new option has been added to the './project'
script called '--all-highlevel'. When this option is given, the contents
of 'TARGETS.conf' are ignored and all the software in Maneage are built
(selected by parsing the 'versions.conf' file). This new option was
added to confirm the extensive changes made in all the software building
recipes and is great for development/testing purposes.
- Many of the software hadn't been tested for a long time! So after using
the newly added '--all-highlevel', we noticed that some need to be
updated. In general, with this commit, 'libpaper' and 'pcre' were added
as new software, and the versions of the following software were updated:
'boost', 'flex', 'libtirpc', 'openblas' and 'lzip'. A 'run-parts.in'
shell script was added in 'reproduce/software/shell/' which is installed
with 'libpaper'.
- Even though we intentionally add the necessary flags to add RPATH inside
the built executable at compilation time, some software don't do it
(different software on different operating systems!). Until now, for
historical reasons this check was done in different ways for different
software on GNU/Linux systems. But now it is unified: if 'patchelf' is
present we apply it. Because of this, 'patchelf' has been put as a
top-level prerequisite, right after Tar and is installed before anything
else.
- In 'versions.conf', GNU Libtool is recognized as 'libtool', but in
'basic.mk', it was 'glibtool'! This caused much confusion and is
corrected with this commit (in 'basic.mk', it is also 'libtool').
- A new argument is added to the './project' script to allow easy loading
of the project's shell and environment for fast/temporary testing of
things in the same environment as the project. Before activating the
project's shell, we completely remove all host environment variables to
simulate the project's environment. It can be called with this command:
'./project shell'. A simple prompt has also been added to highlight that
the user is using the Maneage shell!
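The download priority described in the list above (custom URL first, then Zenodo, then the backup servers) can be sketched as follows. This is only an illustration: the function name 'candidate_urls', its argument order and the 'example.org' addresses are hypothetical and are not taken from Maneage's download script. It also assumes the custom URL is the full address of the tarball.

```shell
# Hypothetical helper: print the URLs that would be tried for one
# tarball, one per line, in order of priority.
candidate_urls() {
    tarball=$1      # tarball file name (hypothetical example below)
    custom_url=$2   # optional user-given URL; empty when not defined
    zenodo_url=$3   # upload link already resolved from the Zenodo DOI
    shift 3         # remaining arguments (if any) are backup servers

    # A user-defined URL takes priority over everything else.
    if [ -n "$custom_url" ]; then
        echo "$custom_url"
    fi

    # Zenodo is the primary download location.
    echo "$zenodo_url/$tarball"

    # Backup servers are only tried last.
    for server in "$@"; do
        echo "$server/$tarball"
    done
}

# Usage: no custom URL is given, so Zenodo comes first.
candidate_urls foo-1.0.tar.lz "" https://zenodo.example.org/files \
               https://backup.example.org/software
```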
|
|
Until now, Maneage would accept the given build directory, regardless of
the free space available there. This could cause confusing situations for
new users who don't know about the minimum storage requirement.
With this commit, after all other checks on the given build directory are
completed, the configure script will check the available space and warn
the user if there is less than roughly 5GB of free space in the build
directory (with a 5 second delay).
It won't cause a crash because some projects may require less than this
space (the default only needs roughly 2GB). But we also don't want the
host's partition to get too close to being full, causing problems
elsewhere. We can change the behavior as desired in future commits.
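A minimal sketch of such a free-space check, using only POSIX 'df' and 'awk'. The function name 'check_free_space' and the message text are assumptions for illustration, and the actual check in 'configure.sh' may be written differently.

```shell
# Hypothetical helper: return non-zero (and print a warning) when the
# partition holding DIR has less than MIN_KB kilobytes available.
check_free_space() {
    dir=$1
    min_kb=$2

    # 'df -P' gives the POSIX output format (one line per file system)
    # and '-k' forces 1024-byte units. The fourth column of the second
    # line is the available space.
    avail_kb=$(df -P -k "$dir" | awk 'NR==2 {print $4}')

    if [ "$avail_kb" -lt "$min_kb" ]; then
        echo "** WARNING: only ${avail_kb}KB are free in '$dir'."
        return 1
    fi
    return 0
}

# Usage: warn and wait 5 seconds (without aborting) when less than
# roughly 5GB is free in the current directory.
check_free_space . 5000000 || sleep 5
```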
|
|
Until now, the English text that embeds the list of software to
acknowledge in the paper was hard-wired into the low-level code
('reproduce/software/shell/configure.sh' to be more specific). But this
file is very low-level, thus discouraging users from modifying this
surrounding text.
While the list of software packages can be considered to be 'data' and is
fixed, the surrounding text to describe the lists is something the authors
should decide on. Authors of a scientific research paper take
responsibility for the full paper, including for the style of the
acknowledgments, even if these may well evolve into some standard text.
With this commit, authors who do *not* modify
'reproduce/software/config/acknowledge_software.sh' will have a default
text, with only a minor English correction from earlier versions of
Maneage. However, authors choosing to use their own wording should be able
to modify the text parameters in
`reproduce/software/config/acknowledge_software.sh` in the obvious
way. This is much more modular than asking project authors to go looking
into the long and technical 'configure.sh' script.
Systematic issues: the file
`reproduce/software/config/acknowledge_software.sh` is an executable shell
script, because it has to be called by
`reproduce/software/shell/configure.sh`, which, in principle, does not yet
have access to `GNU make` (if I understand the bootstrap sequence
correctly). It is placed in `config/` rather than `shell/`, because the
user will expect to find configuration files in `config/`, not in `shell/`.
A possible alternative to avoid having a shell script as a configure file
would be to let `reproduce/software/config/acknowledge_software.sh` appear
to be a `make` file, but analyse it in `configure.sh` using `sed` to remove
whitespace around `=`, and adding other hacks to switch from `make` syntax
to `shell` syntax. However, this risks misleading the user, who will not
know whether s/he should follow `make` conventions or `shell` conventions.
|
|
Until now, when making the link to Gnuastro's configuration files, the
'configure.sh' script would incorrectly link to the old configuration
directory under the 'reproduce/software' directory. With this commit, it is
moved to the proper directory under 'reproduce/analysis'.
|
|
The project configuration requires a build-directory at configuration time;
two other directories can optionally be given to avoid downloading the
project's necessary data and software. It is possible to give these three
directories as command-line options, or by interactively giving them after
running the configure script.
Until now, when these directories weren't given as command-line options,
and the running shell was non-interactive, the configure script would crash
on the line trying to interactively read the user's given directories (the
'read' command).
With this commit, all the 'read' commands for these three directories are
now put within an 'if' statement. Therefore, when 'read' fails (the shell
is non-interactive), instead of a quiet crash, a descriptive message is
printed, telling the user the cause of the problem, and suggesting a fix.
This bug was found by Michael R. Crusoe.
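The pattern of wrapping 'read' in an 'if' can be sketched like this. The function name 'ask_build_dir' and the wording of the messages are hypothetical. Only the idea (test the exit status of 'read' instead of letting the script stop silently) is taken from the description above.

```shell
# Hypothetical helper: ask for the build directory on standard input.
# The prompt and the error go to standard error, so standard output
# only contains the answer.
ask_build_dir() {
    printf 'Top-level build directory: ' >&2

    # 'read' returns non-zero when nothing can be read (for example
    # when standard input is not available in a non-interactive run).
    if read -r builddir; then
        echo "$builddir"
    else
        echo >&2
        echo "** ERROR: could not read the build directory" >&2
        echo "** interactively. Please give it as a command-line" >&2
        echo "** option instead." >&2
        return 1
    fi
}

# Usage: a non-interactive run with no input gets the descriptive
# message instead of a quiet crash.
ask_build_dir < /dev/null || echo "configuration stopped"
```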
|
|
Until now, the description of the input-data directory at configure time
included a description of the input data (created by reading the values of
'INPUTS.conf'). Maintaining this is easy for a single dataset, but it
becomes hard for a general project which may need many input datasets.
To avoid extra complexity (for maintaining this list), the description now
points a user of the project to the 'INPUTS.conf' file and asks them to
look inside it to see the necessary data. This in fact helps users become
familiar with the internal structure of Maneage and frees the authors from
having to worry about updating the low-level 'configure.sh' script.
|
|
When './project configure' is run, after the basic checks of the compiler,
a small statement is printed telling the user that some configuration
questions will now be asked to start building Maneage on the system. Until
now this description was confusing: it led the reader to think that the
local configuration (which was recommended to read before continuing) is in
another file.
With this commit, the text has been edited to explicitly mention that the
description of the steps following this notice should be read carefully,
thus avoiding that confusion.
This issue was mentioned by Michael R. Crusoe.
|
|
Possible semantic conflicts (that may not show up as Git conflicts but may
cause a crash in your project after the merge):
1) The project title (and other basic metadata) should be set in
'reproduce/analysis/conf/metadata.conf'. Please include this file in
your merge (if it is ignored because of '.gitattributes'!).
2) Consider importing the changes in 'initialize.mk' and 'verify.mk' (if
you have added all analysis Makefiles to the '.gitattributes' file,
thus not merging any change in them with your branch). For example
with this command:
git diff master...maneage -- reproduce/analysis/make/initialize.mk
3) The old 'verify-txt-no-comments-leading-space' function has been
replaced by 'verify-txt-no-comments-no-space'. The new function will
also remove all white-space characters between the columns (not just
white space characters at the start of the line). Thus the resulting
check won't involve spacing between columns.
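To illustrate point 3, here is a rough sketch of the kind of normalization described: comments and all white space are removed before the text is compared or checksummed. The function name 'normalize_table' is hypothetical and the exact commands in 'verify.mk' may differ. In particular, removing everything after a '#' is an assumption.

```shell
# Hypothetical helper: print FILE without comments, without any
# spaces or tabs, and without the empty lines that this leaves.
normalize_table() {
    sed -e 's/#.*$//' "$1" | tr -d ' \t' | sed -e '/^$/d'
}

# Usage: two tables that only differ in comments and column spacing
# give the same normalized text, and therefore the same checksum.
tmpdir=$(mktemp -d)
printf '# Column 1: ID\n1   2.5\n  3\t4.0\n' > "$tmpdir/a.txt"
printf '# Different comment\n1 2.5\n3 4.0\n'  > "$tmpdir/b.txt"
normalize_table "$tmpdir/a.txt" | cksum
normalize_table "$tmpdir/b.txt" | cksum
rm -r "$tmpdir"
```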
A common set of steps is always necessary to prepare a project for
publication. Until now, we would simply look at previous submissions and
try to follow them, but that was prone to errors and could cause
confusion. The internal infrastructure also didn't have some useful
features to make good publication possible. Now that the submission of a
paper fully devoted to the founding criteria of Maneage is complete
(arXiv:2006.03018), it was time to formalize the necessary steps for easier
submission of a project using Maneage and implement some low-level features
that can make things easier.
With this commit a first draft of the publication checklist has been added
to 'README-hacking.md'. It was tested in the submission of arXiv:2006.03018
and zenodo.3872248. To help guide users on implementing the good practices
for output datasets, the outputs of the default project shown in the paper
now use the new features. After reading the checklist, please inspect
these.
Some other relevant changes in this commit:
- The publication involves a copy of the necessary software
tarballs. Hence a new target ('dist-software') was also added to
package all the project's software tarballs in one tarball for easy
distribution.
- A new 'dist-lzip' target has been defined for those who want to
distribute an Lzip-compressed tarball.
- The '\includetikz' LaTeX macro now has a second argument to allow
configuring the '\includegraphics' call when the plot should not be
built, but just imported.
|
|
Until now, Maneage would only build Flock before building everything else
using Make (calling 'basic.mk') in parallel. Flock was necessary to avoid
parallel downloads during the building of software (which could cause
network problems). But after recently trying Maneage on FreeBSD (which is
not yet complete, see bug #58465), we noticed that the BSD implementation of
Make couldn't parse 'basic.mk' (in particular, complaining about the 'ifeq'
parts) and its shell also had some peculiarities.
It was thus decided to also install our own minimalist shell, Make and
compressor program before calling 'basic.mk'. In this way, 'basic.mk' can
now assume the same GNU Make features that high-level.mk and python.mk
assume. The pre-make building of software is now organized in
'reproduce/software/shell/pre-make-build.sh'.
Another nice feature of this commit is for macOS users: until now the
default macOS Make had problems for parallel building of software, so
'basic.mk' was built in one thread. But now that we can build the core
tools with GNU Make on macOS too, it uses all threads. Furthermore, since
we now run 'basic.mk' with GNU Make, we can use '.ONESHELL' and don't have
to finish every line of a long rule with a backslash to keep variables and
such.
Generally, the pre-make software are now organized like this: first we
build Lzip before anything else: it is downloaded as a simple '.tar' file
that is not compressed (only ~400kb). Once Lzip is built, the pre-make
phase continues with building GNU Make, Dash (a minimalist shell) and
Flock. All of their tarballs are in '.tar.lz'. Maneage then enters
'basic.mk' and the first program it builds is GNU Gzip (itself packaged as
'.tar.lz'). Once Gzip is built, we build all the other compression software
(all downloaded as '.tar.gz'). Afterwards, any compression standard for
other software is fine because we have it.
In the process, a bug related to using backup servers was found and fixed in
'reproduce/analysis/bash/download-multi-try'. The script was also adapted
for calling outside of 'basic.mk' and its Bash-specific features were
removed. As a result of that bug-fix,
because we now have multiple servers for software tarballs, the backup
servers now have their own configuration file in
'reproduce/software/config/servers-backup.conf'. This makes it much easier
to maintain the backup server list across the multiple places that we need
it.
Some other minor fixes:
- In building Bzip2, we need to specify 'CC' so it doesn't use 'gcc'.
- In building Zip, the 'generic_gcc' Make option caused a crash on FreeBSD
(which doesn't have GCC).
- We are now using 'uname -s' to check if we are on a Linux kernel or
not. If not, we are still using the old 'on_mac_os' variable.
- While I was trying to build on FreeBSD, I noticed some further
corrections that could help. For example the 'makelink' Make-function
now takes a third argument which can be a different name compared to the
actual program (used for example to make a link to '/usr/bin/cc' from
'gcc').
- Until now we didn't know if the host's Make implementation supports
placing a '@' at the start of the recipe (to avoid printing the actual
commands to standard output). Especially in the tarball download phase,
there are many lines that are printed for each download which was really
annoying. We already used '@' in 'high-level.mk' and 'python.mk' before,
but now that we also know that 'basic.mk' is called with our custom GNU
Make, we can use it at the start for a cleaner stdout.
- Until now, WCSLIB assumed a Fortran compiler, but when the user is on a
system where we can't install GCC (or has activated the '--host-cc'
option), it may not be present and the project shouldn't break because
of this. So with this commit, when a Fortran compiler isn't present,
WCSLIB will be built with the '--disable-fortran' configuration option.
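The WCSLIB item above can be sketched as a small check that selects the extra configure option. The function name 'fortran_configure_option' and the default compiler name 'gfortran' are assumptions. Only the '--disable-fortran' option comes from the description.

```shell
# Hypothetical helper: print the extra option for WCSLIB's configure
# script. Nothing is printed when the given Fortran compiler exists.
fortran_configure_option() {
    compiler=${1:-gfortran}   # assumed default compiler name
    if command -v "$compiler" > /dev/null 2>&1; then
        echo ""
    else
        echo "--disable-fortran"
    fi
}

# Usage: show the option that would be appended to WCSLIB's
# configure call on this system.
echo "WCSLIB extra option: '$(fortran_configure_option)'"
```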
This commit (task #15667) was completed with help/checks by Raul
Infante-Sainz and Boud Roukema.
|
|
Over time, some of the copyright license descriptions had been mistakenly
shortened to two paragraphs instead of the original three that are
recommended in the GPL. With this commit, they are corrected to be exactly
in the same three paragraph format suggested by GPL.
The following files also didn't have a copyright notice, so one was added
for them:
reproduce/software/make/README.md
reproduce/software/bibtex/healpix.tex
reproduce/analysis/config/delete-me-num.conf
reproduce/analysis/config/verify-outputs.conf
|
|
Until this commit, when the version of Gnuastro doesn't match with the
version that the project was designed to use, the warning message saying
how to run the configure step was not showing the option `-e'. This
situation is normal when updating the version of Gnuastro to the most
recent one (with the project already configured). However, the use of this
option is more convenient than giving the top-build directory, etc, every
time. With this commit, the warning message has been changed in order to
also show the option `-e' in the re-configure of the project.
|
|
Until now Maneage used the host's GNU Gettext if it was present. Gettext is
a relatively low-level software that enables programs to print messages in
different languages based on the host environment. Even though it has no
direct effect on the running of the software for Maneage and the language
environment in Maneage is pre-determined, it is necessary to have it
because if the basic programs see it in the host they will link with it and
will have problems if/when the host's Gettext is updated.
With this commit (which is actually a squashed rebase of 9 commits by Raul
and Mohammad), Gettext and its two extra dependencies (libxml2 and
libunistring) are now installed within Maneage as a basic software and
built before GNU Bash. As a result, all programs built afterwards will
successfully link with our own internal version of Gettext and
libraries. To get this working, some of the basic software dependencies had
to be updated and re-ordered, and it has been tested on both GNU/Linux and
macOS.
Some other minor issues that are fixed with this commit:
- Until this commit, when TeX was not installed, the warning message
saying how to run the configure step in order to re-configure the
project was not showing the option `-e'. However, the use of this option
is more convenient than entering the top-build directory, etc, every
time. So with this commit, the warning message has been changed in order
to use the option `-e' in the re-configure of the project.
- Until now, on macOS systems, Bash was not linking with our internally
built `libncurses'. With this commit, this has been fixed by setting
`--with-curses=yes' for Bash's configure script.
|
|
Until now, if GCC couldn't be built for any reason, Maneage would crash and
the user had no way forward. Since GCC is complicated, this may happen, and
it is frustrating to wait until the bug is fixed. Also, while debugging
Maneage, when we know GCC has no problem, its long build time discourages
testing.
With this commit, we have re-activated the `--host-cc' option. It was
already defined in the options of `./project', but its effect was nullified
by hard-coding it to zero in the configure script on GNU/Linux systems. So
with this commit that has been removed and the user can use their own C
compiler on a GNU/Linux operating system also.
Furthermore, to inform the user about this option and its usefulness, when
GCC fails to build, a clear warning message is printed, instructing the
user to post the problem as a bug and telling them how to continue building
the project with the `--host-cc' option.
|
|
Until now, at the end of the configuration step, we would tell the user
this: "To change the configuration later, please re-run './project
configure', DO NOT manually edit the relevant files". However, as Boud
suggested in Bug #58243, this is against our principle to encourage users
to modify Maneage.
With this commit, that explanation has been expanded by a few sentences to
tell the users what to change and warn them in case they decide to change
the build-directory.
|
|
Until now, we wouldn't explicitly check for GNU gettext. If it was present
on the system, we would just add a link to it in Maneage's installation
directory. However, in bug #58248, Boud noticed that Git (a basic software)
actually needs it to complete its installation. Unfortunately we haven't
had the time to include a build of Gettext in Maneage: because it is
available on most systems, the problem hasn't been reported often, and it
also has many dependencies which make it a little time consuming to install.
So with this commit, we actually check for GNU gettext right after checking
the compiler and if it's not available an informative error message is
written to inform the user of the problem, along with suggestions on fixing
it (how to install GNU gettext from their package manager).
|
|
Until now we only checked for the existence and write-ability of the build
directory. But we recently discovered that if the specified build-directory
is in a non-POSIX compatible partition (for example NTFS), permissions
can't be modified and this can cause crashes in some programs (in
particular, while building Perl, see [1]). The thing that makes this
problem hard to identify is that on such partitions, `chmod' will still
return 0.
With this commit, a check has been added after the user specifies the
build-directory. If the proposed build directory is not able to handle
permissions as expected, the configure script will not continue and will
let the user know and will ask them for another directory.
Also, the two printed characters at the start of error messages were
changed to `**' (instead of `--'). When everything is good, we'll use `--'
to tell the user that their given directory will be used as the build
directory. And since there are multiple checks now, the final message to
specify a new build directory is now moved to the end and not repeated in
every check.
[1] https://savannah.nongnu.org/support/?110220
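Because `chmod' returns 0 even when it has no effect, such a check has to look at the resulting permissions and not at the exit status. A minimal sketch follows. The function name `dir_handles_permissions' and the chosen modes are assumptions, and the check in `configure.sh' may be implemented differently.

```shell
# Hypothetical helper: return 0 only if permission changes inside DIR
# actually take effect.
dir_handles_permissions() {
    testfile="$1/.permission-test-$$"
    touch "$testfile" 2> /dev/null || return 1

    # Set two different modes and record what 'ls' reports for each.
    # Only the first 10 characters are kept, to ignore any extra
    # ACL/SELinux marker after the permission string.
    chmod 600 "$testfile"
    first=$(ls -l "$testfile" | cut -c1-10)
    chmod 755 "$testfile"
    second=$(ls -l "$testfile" | cut -c1-10)
    rm -f "$testfile"

    [ "$first" = "-rw-------" ] && [ "$second" = "-rwxr-xr-x" ]
}

# Usage: test the current directory.
if dir_handles_permissions .; then
    echo "-- permissions can be modified here"
else
    echo "** permissions can't be modified here"
fi
```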
|
|
Until now, the message that we printed just before starting to build
software didn't actually print the current directory, but only `pwd'. With
this commit, this is fixed (it uses the `currentdir' variable that is
already found before).
|
|
Until now, throughout Maneage we were using the old name of "Reproducible
Paper Template". But we have finally decided to use Maneage, so to avoid
confusion, the name has been corrected in `README-hacking.md' and also in
the copyright notices.
Note also that in `README-hacking.md', the main Maneage branch is now
called `maneage', and the main Git remote has been changed to
`https://gitlab.com/maneage/project' (this is a new GitLab Group that I
have setup for all Maneage-related projects). In this repository there is
only one `maneage' branch to avoid complications with the `master' branch
of the projects using Maneage later.
|
|
In the previous commit, we removed the `-static' flag from building PatchELF
because it wasn't necessary any more. However, the comment for the check
still included it and could be confusing. This is corrected with this
commit. Also, we don't need the `good_static_libc' variable (that was only
defined to pass onto PatchELF). This has also been corrected.
|
|
Until now the software configuration parameters were defined under the
`reproduce/software/config/installation/' directory. This was because the
configuration parameters of analysis software (for example Gnuastro's
configurations) were placed under there too. But this was terribly
confusing, because the run-time options of programs fall under the
"analysis" phase of the project.
With this commit, the Gnuastro configuration files have been moved under
the new `reproduce/analysis/config/gnuastro' directory and the software
configuration files are directly under `reproduce/software/config'. A clean
build was done with this change and it didn't crash, but it may cause
crashes in derived projects, so after merging with Maneage, please
re-configure your project to see if anything has been missed. Please let us
know if there is a problem.
|
|
Until now we would simply return the version numbers as they were written
into the separate files, and situations can happen where the version numbers
contain an underscore (`_'). However, this character is a special
(math-mode) character in LaTeX, causing LaTeX to complain and abort.
With this commit, a step has been added at the end of the configure script
to convert any possible `_' to `\_'. Once it is escaped (a backslash is
put before it), the underscore will be printed as it is in the final PDF.
This commit was originally written by Mohammad Akhlaghi
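The conversion described above can be sketched with a single `sed' substitution. The function name `escape_underscores' and the example version string are hypothetical, and the exact command used in the configure script may differ.

```shell
# Hypothetical helper: put a backslash before every underscore read
# from standard input, so LaTeX prints it literally.
escape_underscores() {
    sed -e 's/_/\\_/g'
}

# Usage, with a made-up version string that contains an underscore.
printf '%s\n' '1.2_rc1' | escape_underscores
```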
|
|
MissFITS is a package for manipulating FITS files.
I added it as my first commit to the project for educational
purposes.
|
|
Until now, we defined `LIBRARY_PATH' to fix the problem of the `ld' linker
of Binutils needing several `*crt*.o' files to run. However, some software
(for example ImageMagick) over-write `LIBRARY_PATH', therefore there is no
other way than to put a link to these necessary files in our local build
directory.
With this commit, we fixed the problem by putting a link to the system's
relevant files in the local library directory. This fixed the problem with
ImageMagick. Later, when we build the GNU C Library in the project, we
should remove this step.
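A rough sketch of the linking step described above. The function name `link_crt_files' is hypothetical, and where the host keeps its `*crt*.o' files differs between systems, so the source directory is an argument instead of a fixed path.

```shell
# Hypothetical helper: make symbolic links in LIBDIR to every
# '*crt*.o' file found in SYSDIR.
link_crt_files() {
    sysdir=$1    # host directory holding the '*crt*.o' files (assumed)
    libdir=$2    # the project's local library directory
    for f in "$sysdir"/*crt*.o; do
        # When the pattern matches nothing, the loop sees the literal
        # pattern: skip it.
        [ -e "$f" ] || continue
        ln -sf "$f" "$libdir/$(basename "$f")"
    done
}

# Usage on two temporary directories that stand in for the host's
# and the project's library directories.
sysdir=$(mktemp -d)
libdir=$(mktemp -d)
touch "$sysdir/crt1.o" "$sysdir/crti.o" "$sysdir/unrelated.o"
link_crt_files "$sysdir" "$libdir"
ls "$libdir"
rm -r "$sysdir" "$libdir"
```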
This bug was reported by Raul Castellanos Sanchez.
|
|
Until now, when a Fortran compiler didn't exist on the host operating
system, the configure script would crash with a warning. But some projects
may not need Fortran, so this is just an extra/annoying crash!
With this commit, it will still print the warning, but instead of a crash,
it will just sleep for some seconds, then continue. Later, if a
software needs Fortran, its building will crash, but at least the user was
warned.
In the future, we should add a step to check on the necessary software and
see if Fortran is necessary for the project or not. The project
configuration should indeed crash if Fortran is necessary, but we should
tell the user that software XXXX needs Fortran so we can't continue without
a Fortran compiler.
Also, a small sentence ("Project's configuration will continue in XXXX
seconds.") was added after all the warnings that won't cause a crash, so
users don't think it's a crash.
|
|
Until now, Make was just run ordinarily on the two Makefiles of the
software building phase. Therefore when there was a problem with one
software while building in parallel, Make would only complete the running
rules and stop afterwards. But when other rules don't depend on the
crashed rule, it's a waste of time to stop the whole thing.
With this commit, both calls to Make in the `configure.sh' script are done
with the `-k' option (or `--keep-going' in GNU Make). With this option, if
a rule crashes, all other rules that don't depend on it will still be
run. The `-k' option is defined by POSIX for Make, so it is present in
most implementations (relevant for the call to `basic.mk').
|
|
Until now the shell scripts in the software building phase were in the
`reproduce/software/bash' directory. But given our recent change to a
POSIX-only start, the `configure.sh' shell script (which is the main
component of this directory) is no longer written with Bash.
With this commit, to fix that problem, that directory's name has been
changed to `reproduce/software/shell'.
|