Age | Commit message (Collapse) | Author | Lines |
|
I needed to take these steps in a few occasions on a project I am building
over this pipeline. This will commonly happen when a team starts using this
pipeline, so it was added to make things easier.
|
|
To be more generic and recognizable, the `README-pipeline.md' script was
renamed to `README-hacking.md'. In essence, it is just that: to hack the
existing pipeline for your own project. We follow a similar naming
convention in many GNU software.
|
|
Two corrections were made in the Git hooks of Metastore.
1) The shebang at the start of the scripts now uses the absolute adress of
our installed bash, not the relative `.local/bin/bash'. Note that it is
possible to use Git within subdirectories and in that scenario, the
`.local' will fail.
2) The `$$user' section was removed from the command to find the user's
group. With the user as an argument, `groups' may print the user's name
first, then their list of groups. When this happens, the script would
be just repeating the user's name. But the raw `groups' command will
list the groups of the running user.
|
|
Until now, the check to see if the patchelf program should be used or not
(for GNU/Linux vs. Mac installations) was mistakenly added over the step
that we define the `sh' symbolic link, not over the call to patchelf. This
is corrected with this commit.
|
|
In this version, too many extra notices (just regarding a change from
branch to branch) are not printed with `-q'. Instead only a one line
statement is printed that it is saved or applied.
|
|
Until we see what happens with the pull request of our suggested features
in metastore, its version isn't written directly into the executable, so we
won't actually check it, but write the version directly into the paper.
|
|
After testing the built of Metastore on a server, I noticed that because
its `/etc/passwd' doesn't have the list of users, the `getpwuid' call
within metastore failed and wouldn't let it finish.
So I looked into the code and was able to implement a solution to this
problem by adding two options to it for default values for the user and
group. Also, file attributes are not necessary in our (current) use case of
metastore and caused crashes on our server, so they are also disabled.
|
|
Metastore depends on `bsd/string.h' to work properly (atleast on GNU/Linux
systems). The first system I tried building with had that library, so I
didn't notice! With this commit, we also build `libbsd' as part of the
pipeline.
Also, I couldn't find libbsd's version in any of its installed headers, so
like Libjpeg, we can't actually check and will directly write our internal
version into the paper.
|
|
The pipeline heavily depends on file meta data (and in particular the
modification dates), for example the configuration-Makefiles within the
pipeline are set as prerequisites to the rules of the pipeline.
However, when Git checks out a branch, it doesn't preserve the meta-data of
the files unique to that branch (for example program source files or
configuration-Makefiles). As a result, the rules that depend on them will
be re-done.
This is especially troublesome in the scenario of this reproducible paper
project because we commonly need to switch between branches (for example to
import recent work in the pipeline into the projects). After some
searching, I think the Metastore program is the best solution. Metastore is
now built as part of the pipeline and through two Git hooks, it is called
by Git to store the original meta-data of files into a binary file that is
version controlled (and managed by Metastore).
|
|
When building in group mode, users can manage them selves to work on
independent analysis steps and thus not cause conflicts. However, until
now, there was no way to avoid conflicts in building the final paper.
To fix this problem, when we are in group mode, the pipeline will create a
separate LaTeX build director for each user and also a separate PDF file
for each user. This will ensure that their compilations don't conflict.
|
|
With the current build system, Bash and AWK don't write RPATH into the
executables. This causes many problems in the pipeline (for example when
using the `$(shell)' function in Make which doesn't have
`LD_LIBRARY_PATH').
After consulting the Bash and Make mailing lists, so far, the best solution
was to use the Patchelf program to manually write RPATH in these
executables. With this commit, Patchelf is now installed in the pipeline
and used in Bash and AWK to fix this problem.
|
|
The build of bash has been made a little cleaner to help in readability and
management of the code.
|
|
The TIFF library can optionally depend on webp [1] and zstd [2]. But these
aren't commonly used in scientific datasets so to avoid a longer build and
managing of extra dependencies (atleast for now!), we are disabling
them. The problem is that they cause a dependency on the host system and if
they are updated/removed, the relevant pipeline programs will crash.
[1] https://en.wikipedia.org/wiki/WebP
[2] https://en.wikipedia.org/wiki/Zstandard
|
|
In the previous commit, the copyright year and owner were mistakenly
modified. They are corrected now.
|
|
While working on a pipeline based on this, I noticed many linking errors of
our installed Bash, complaining that it can't link with libreadline. This
was while readline was present in the proper directory and the Bash within
a recipe would work properly.
After some investigation, I found out that this is because Make's `foreach'
function (which was used to define the targets) was apparently calling Bash
without setting `LD_LIBRARY_PATH', causing this error.
To avoid such sitations, Bash now uses its internal build of readline and
we no longer ask it to link with the installed readline.
|
|
The targets of the links to have the extra common `ncurses' packages were
previously just `pkgconfig/*.pc'! But this would only work when run within
the `installed/lib' directory, not any other! So the targets for these
packages now use an absolute address.
|
|
In a few cases, I had used a signle quote to close `. This would not
display properly on the Gitlab webpage, so they are corrected.
|
|
If the `./for-group' script is not used properly, it can lead to the whole
pipeline being re-run. Therefore it is important to do a sanity check
immediately at the start of Make's processing and inform the user if there
is a problem.
With this commit, `./for-group' exports the `reproducible_paper_for_group'
variable which is used by both the initial `./configure' script, and later
in each call to Make. The `./configure' script will use it to write a value
in `reproduce/config/pipeline/LOCAL.mk' and Make will use it to compare
with the value in `reproduce/config/pipeline/LOCAL.mk'.
If there is an inconsistency, Make will not even attempt to build anything
and will just print a message and abort.
|
|
Until now, Gnuastro was only mentioned in the first acknowledgments
section, but not in the paragraph with all the program names. But these two
are not mutually exclusive. All the software should be mentioned in the
last paragraph and those that need special mention can be mentioned before
it.
|
|
Until now, there was no reference to `README-pipeline.md' within the
`README.md' file. Since `README.md' is the first file that someone reads
and the basic perpose and structure of the pipeline is described in
`README-pipeline.md', it was necessary to bring it up there.
|
|
Wget and cURL depend on many network related libraries by default and if
they are present on the host operating system, they will be linked
with. This causes problems for the pipeline when these libraries are
updated on the host system.
With this commit, I went through the configure time options of both Wget
and cURL and removed any library that didn't seem related to merely
downloading of files (possibly with SSL, because we do build OpenSSL in the
pipeline).
Also, I noticed a new version of cURL has come, so that is also updated.
|
|
In a previous implementation, we were using a `target' variable to define
the final target of several links, but with the new `sov' solution, we just
used its base name. However, we had forgot to correct two instances of
`target'. This is corrected now.
Also, the step to clean all already built outputs of the NCURSES library
has been simplified to a platform independent wildcard.
|
|
On Mac OS systems, the full version number is not used in the filename
given to libncurses. For example for version 6.1, it is called
`libncursesw.6.dylib'. So a more generic and easier to maintain and read
script is now used to be able to make links for both Mac and GNU/Linux
systems.
In short, instead of checking if we are in Mac every time, we just set the
suffixes at the start based on the machine once as variables and use those
to define the links.
|
|
The call to SED in `dependencies-build-rules.mk' had the file name before
the options. On some verions of SED, this would cause problems. So the
filename is now given after the options.
|
|
The new `--colormap' option was added to the call to Gnuastro's ConvertType
program. Since Gnuastro 0.8, ConvertType needs this option for converting a
single-channel dataset to a color-supporting format.
|
|
Readline is a prerequisite of Bash and AWK, while NCURSES is a prerequisite
of Readline. With the recent update of GNU Bash (and thus GNU Readline) on
my host operating system, the pipeline crashed and I noticed this hole in
the pipeline. In particular, AWK (which linked with Readline 7.0) would
complain about not finding it and abort.
|
|
A minor typo was fixed to help in readability.
|
|
Git needs cURL in its build. Until now, by chance cURL was always built
before Git, but while building this pipeline on a system, Git was built
before cURL and we found the problem.
I also noticed that we hadn't added `Your name <your@email.address>' to the
`for-group' script. This has been corrected now.
|
|
ccache is a super annoying program in the context of the reproduction
pipeline. On systems that use it, the `gcc' and `g++' that are found in
PATH are actually calls to `ccache' (so it can manage their call)!
Two steps have been taken to ignore and disable ccache (if it isn't ignored
properly!): 1) when making symbolic links to compilers, if a directory
containing `ccache' is present in the PATH, it is first removed, then we
look for the low-level programs that we won't be building. 2) The
`CCACHE_DISABLE' environment variable is set to 1 where necessary (with the
other environment variables).
|
|
On large projects, its often necessary to share the build directory between
the various users of the pipeline. To simplify the process a `for-group'
script is now added to the pipeline which is just a wrapper over the
`./configure' and `.local/bin/make' commands to make sure that the group
owner of the outputs and the permission flags are set properly.
|
|
During the last month, several core GNU programs were updated, so their
versions in the pipeline have also been updated.
|
|
After installing Bash, we would just blindly try to build the $(ibdir)/sh'
symbolic link. But that could fail if it already existed. To make things
clean, we now remove any link first before attempting to make a new one.
|
|
Since the current implementation of this pipeline officially started in
2018, all the files only had 2018 in their copyright years. This has now
been corrected to 2018-2019.
|
|
By giving this option specifically at the build time of Pkg-config, we'll
ensure that any package that uses pkg-config will first look into our
locally installed build.
|
|
Both Gzip and Gnuastro were being bootstrapped personally from their Git
repository until now. But fortunately a new release of both came out last
week and so to make things standard we are now using their standard
tarballs.
I also noticed that we weren't checking the version of Gzip or mentioning
it in the acknowledgement section. This was also corrected.
|
|
As part of the pipeline, `.texlive2018' may be created. So it is added to
`.gitignore'.
|
|
A closing double-quotation was missing in the command to read about
Automatic variables in Make.
|
|
Bzip2's verison is found differently from the other programs (because it
writes no standard error, not standard output!). So a custom function is
written for it which includes creating a temporary file. But we had forgot
to delete that file after the version is found.
|
|
While checking the build of the previous commit, a failure happened when
linking `reproduce/build/dependencies/installed/bin/sh' with the built Bash
(because the symbolic link already existed!). So a `-f' flag was added to
`ln' to just change it without complaining.
I also noticed that the Git build was also not in verbose mode. So this has
also been corrected.
|
|
While we were testing this pipeline on a Mac OS system, we found and
reported a problem in Gzip's build (bug #33689). However, since the Gzip
build is not verbose, it was necessary to run its `make' with
`V=1'. Generally, since almost all the programs are built in verbose mode
(where you can see the compilation commands), we have also set this flag in
any build to be clear and make it easier to spot bugs in the future.
|
|
To make things clear for a user of the pipeline its mentioned that the
given input directory is only read and nothing is written to it.
|
|
A minor correction was made in the checklist (since we only have one
`foreach' loop in the top-level Makefile) and also the version of Gnuastro
was incremented.
|
|
Some problems with using the number of threads in dependency building were
fixed.
|
|
The version of Git was updated to the most recent version (2.20.0).
|
|
Some host Make systems may not allow automatic passing of the number of
threads to sub-Makes. So while building the basic dependencies, we'll need
to explicity add the `-j' option to the Make files that can benefit most
from it: those that are dependencies of many others (Tar & Make), or are
the last to build (Coreutils).
|
|
An extra line regarding the removal of Gnuastro in the checklist was
removed (we don't report the Gnuastro version at the start of the paper any
more).
|
|
As is common in most projects, a generic "COPYING" file has been added in
the top directory to mark the license of the whole project.
|
|
To help and be more clear a link to this pipeline's dependency repository
has been added to `README.md'.
|
|
Gnuastro's BuildProgram is a little special: it is actually built during
the building of Gnuastro to keep important include and lib directory
information and if someone wants to use BuildProgram, this information is
necessary. So a special configuration is added for it in
`reproduce/config/gnuastro'. This configuration file will allow users to
set their own special configuration if they like, then it will load the
installed BuildProgram configuration file.
|
|
On Mac OS systems, CFITSIO doesn't use path to find the `curl-config'
program (used by to give the library header and linking options), but uses
an absolute path. Therefore the only way we can ask CFITSIO to look into
our build of cURL is to manually change that absolute address.
Also, since all the libraries are now linked dynamically, we don't need the
extra linking flags when building WCSLIB (so it finds CFITSIO).
|