XLSX I/O is a very simple and fast program and library for reading and
writing `.xls' and `.xlsx' files (mainly used by Microsoft Excel) to CSV
files. It has two separate executables that can be called for an Excel file
and will output a CSV plain text file that can then be used within the
pipeline with more standard tools.
|
|
Until now, the main download script could only check one server for the
given URL. However, the actual server that a file is downloaded from is
ultimately irrelevant for this project, because we check the file's
checksum. Relying on a single server can be very annoying, especially for
software (which is distributed over many servers): the server may not
properly communicate with the running system and even the 10 trials won't
be enough.
With this commit, the download script
`reproduce/analysis/bash/download-multi-try' can take a new optional
argument (a 5th argument). It assumes this argument is a space-separated
list of server(s) to use as backup for the original URL. When downloading
from the original URL fails, it will look into this list and try
downloading the same file from each given server.
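The fallback described above can be sketched as follows. This is only a
minimal illustration, not the contents of `download-multi-try': the function
names, the `FETCH_CMD' override and the assumption that every backup server
hosts the file under the same base name are all hypothetical.

```shell
# Hypothetical fetcher: download the URL ($1) into the output file ($2).
fetch_with_curl() {
    curl -fsSL -o "$2" "$1"
}

# Try the original URL first, then the same file name on each backup.
#   $1: original URL   $2: output file   $3: space-separated backup servers
download_with_backups() {
    url=$1
    out=$2
    backups=$3
    fetch=${FETCH_CMD:-fetch_with_curl}

    if "$fetch" "$url" "$out"; then
        return 0
    fi

    # Assumption: backups host the file under the same base name.
    name=${url##*/}
    for server in $backups; do
        if "$fetch" "$server/$name" "$out"; then
            return 0
        fi
    done
    return 1
}
```

The integrity of whatever is finally downloaded would still be decided by
the checksum, not by which server answered.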
|
|
Until now the shell scripts in the software building phase were in the
`reproduce/software/bash' directory. But given our recent change to a
POSIX-only start, the `configure.sh' shell script (which is the main
component of this directory) is no longer written with Bash.
With this commit, to fix that problem, that directory's name has been
changed to `reproduce/software/shell'.
|
|
Until now, the project would first ask for the basic directories, then it
would start testing the compiler. But that was problematic because the
build directory can come from a previous setting (with `./project configure
-e'). Also, it could confuse users to first ask for details, then suddenly
tell them that they don't have a working C library! We also need to store
the CPATH variable in the `LOCAL.conf' because in some cases, the compiler
won't work without it.
With this commit, the compiler checking has been moved to the start of the
configure script. Instead of putting the test program in the build
directory, we now make a temporary hidden directory in the source directory
and delete that directory as soon as the tests are done.
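A minimal sketch of such a check is shown below. The directory name
`.compiler-test-tmp', the function name and the test program are
hypothetical, they are not taken from the configure script.

```shell
# Check that the C compiler can build and run a trivial program.
#   $1: source directory in which the temporary hidden directory is made
check_c_compiler() {
    testdir="$1/.compiler-test-tmp"
    mkdir -p "$testdir" || return 1

    cat > "$testdir/test.c" <<'EOF'
#include <stdio.h>
int main(void) { printf("ok\n"); return 0; }
EOF

    status=1
    if ${CC:-cc} "$testdir/test.c" -o "$testdir/test" 2>/dev/null \
       && "$testdir/test" >/dev/null; then
        status=0
    fi

    # Delete the temporary directory whether or not the test passed.
    rm -rf "$testdir"
    return $status
}
```

Because the directory is removed before returning, nothing is left in the
source directory when the compiler is missing or broken.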
In the process, I also noticed that the copyright year of the two hidden
files wasn't updated and corrected it.
|
|
Until now Perl was built after Coreutils, but I recently noticed that
Coreutils actually uses Perl while creating its manpages. So it is now
built before Coreutils.
Also, while testing on an Amazon AWS EC2 server, we noticed that Coreutils
can't build its man page for `md5sum'. The problem was found to be due to
the fact that until now, we weren't actually setting LD_LIBRARY_PATH to our
installed library path in `basic.mk'. Therefore, it would crash because the
server had an older version of OpenSSL than the one that the template's
Coreutils was built with.
In the meantime (while addressing the issues above, because we only had one
thread on the AWS server) I also noticed a few programs that were using a
summarized compilation command (that just prints `CC xxx.c' instead of the
whole command) so I fixed them by adding `V=1'.
This bug was found by Idafen Santana Pérez.
|
|
Until now, the configuration Makefiles (in
`reproduce/software/config/installation' and `reproduce/analysis/config')
had a `.mk' suffix, similar to the workhorse Makefiles. Although they are
indeed Makefiles, given their nature (to only keep configuration
parameters), it is confusing (especially to early users) for them to also
have a `.mk' suffix (similar to the analysis or software building
Makefiles).
To address this issue, with this commit, all the configuration Makefiles
(in those directories) are now given a `.conf' suffix. This is also assumed
for all the files that are loaded.
The configuration (software building) and running of the template have been
checked with this change from scratch, but please report any error that may
not have been noticed.
THIS IS AN IMPORTANT CHANGE AND WILL CAUSE CRASHES OR UNEXPECTED BEHAVIORS
FOR PROJECTS THAT HAVE BRANCHED FROM THIS TEMPLATE. PLEASE CORRECT THE
SUFFIX OF ALL YOUR PROJECT'S CONFIGURATION MAKEFILES (IN THE DIRECTORIES
ABOVE), OTHERWISE THEY AREN'T AUTOMATICALLY LOADED ANYMORE.
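For projects that have branched from the template, the renaming could be
done with a small helper like the one below. This is only a sketch (the
function name is hypothetical); in a Git-tracked project, `git mv' is
probably preferable to plain `mv'.

```shell
# Rename every `*.mk' file directly inside directory $1 to `*.conf'.
rename_mk_to_conf() {
    for f in "$1"/*.mk; do
        # Skip the unexpanded pattern when no `.mk' file exists.
        [ -e "$f" ] || continue
        mv "$f" "${f%.mk}.conf"
    done
}
```

It would be called once for each of the two directories named above, for
example `rename_mk_to_conf reproduce/analysis/config'.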
|
|
Until now, GCC wouldn't build properly on Debian-based operating systems
because `ld' needed to link with several necessary C library features like
`crti.o' and `crtn.o' (this is an `ld' issue, not GCC). The solution is to
add the directory containing them to `LIBRARY_PATH'. In the previous
commit, I actually searched for these files, but while testing on another
system, I noticed that it can be problematic (other architectures may
exist).
With this commit, we now find the build architecture of the running GCC
(which is the same as that of `ld') and use it to add a fixed directory to
`LIBRARY_PATH'.
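The idea can be sketched as below. `-dumpmachine' is a standard GCC option
that prints the target triplet, but the function names and the assumption
that the start files are under `/usr/lib/TRIPLET' (as on Debian-based
multiarch systems) are illustrative and not necessarily what the template
does.

```shell
# Print the target triplet of the compiler (for example
# `x86_64-linux-gnu').
gcc_triplet() {
    ${CC:-gcc} -dumpmachine
}

# Assumption: `crti.o' and `crtn.o' are in `/usr/lib/TRIPLET'.
arch_libdir() {
    printf '/usr/lib/%s\n' "$1"
}

# Prepend a directory ($1) to a colon-separated path list ($2).
prepend_path() {
    if [ -n "$2" ]; then
        printf '%s:%s\n' "$1" "$2"
    else
        printf '%s\n' "$1"
    fi
}

# Possible usage:
#   LIBRARY_PATH=$(prepend_path "$(arch_libdir "$(gcc_triplet)")" \
#                               "$LIBRARY_PATH")
#   export LIBRARY_PATH
```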
|
|
Until now, when finding the versions of the TeXLive packages, we would assume
that `cat-date' is always present (because some packages don't have a
version!). However, apparently an update has been made in the TeXLive
Manager (`tlmgr') and `cat-date' is no longer present! As a result, none of
the TeXLive packages were being printed.
With this commit, it now assumes that `revision' is always present for
every package, but it also attempts to read `cat-date' (for backwards
compatibility). When `cat-version' isn't present, it will try printing
`revision' and if that is also not present, it will print the date.
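The fallback order in the last sentence can be sketched with a small filter.
It assumes the package information arrives on standard input as
`key: value' lines (the exact `tlmgr' output format is an assumption here),
and the function name is hypothetical.

```shell
# Print `cat-version' if present, otherwise `revision', otherwise
# `cat-date'. Fails (non-zero status) when none of them is present.
pkg_version() {
    awk '
        /^cat-version:/ { sub(/^[^:]*:[ \t]*/, ""); version  = $0 }
        /^revision:/    { sub(/^[^:]*:[ \t]*/, ""); revision = $0 }
        /^cat-date:/    { sub(/^[^:]*:[ \t]*/, ""); date     = $0 }
        END {
            if (version != "")       print version
            else if (revision != "") print revision
            else if (date != "")     print date
            else                     exit 1
        }'
}
```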
|
|
Until this commit, the check that ensures X11 is already available on the
host system was crashing on macOS systems. The reason is that the X11
libraries are usually placed in `/opt/X11/lib' on macOS systems. With this
commit, this issue has been fixed by adding this directory to the LDFLAGS.
|
|
Now that it's 2020, it's necessary to include this year in the copyright
statements.
|
|
Until this commit, the number `2' was missing in the checksum variable
name of that library. It was `libxml-checksum' but it should be
`libxml2-checksum'. With this commit, this issue has been fixed.
|
|
OpenMP takes a LONG TIME to build, so to keep things reproducible we are
explicitly disabling OpenMP. If a user needs OpenMP, it's trivial to just
add it as a prerequisite of R. The problem is that in some scenarios (based
on other dependencies and when they were built in the build directory),
OpenMP may be present when R is being installed and in others it may not.
We don't want the result to be different between the two builds.
|
|
With this commit, we now have the core R interpreter within the
template. We should later include instructions to install R packages
(possibly in a separate top-level Makefile like Python).
|
|
Until now, Ghostscript was using some host system's X11 libraries during
its build (and later at run-time). We should ideally install all these
necessary libraries within the template (Task #15481). But right now we are
too busy.
As a temporary work-around we try building a small dummy program that links
with some of those libraries before attempting to build Ghostscript. If it
fails, a notice is printed with the cause and a temporary solution is
suggested: how to install those libraries on the system when you have root
access.
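Such a link test can be sketched as below. The function name is
hypothetical and the library list in the usage part is only an example: the
libraries that are actually tested before building Ghostscript may differ.

```shell
# Return success only if a trivial C program links with the given
# linker flags (for example `-lX11').
can_link_with() {
    tmpdir=$(mktemp -d) || return 1
    printf 'int main(void) { return 0; }\n' > "$tmpdir/dummy.c"

    status=1
    if ${CC:-cc} "$tmpdir/dummy.c" -o "$tmpdir/dummy" "$@" 2>/dev/null; then
        status=0
    fi
    rm -rf "$tmpdir"
    return $status
}

# Hypothetical usage: print a notice instead of starting a long build.
if ! can_link_with -lX11 -lXext -lXt; then
    echo "Some X11 libraries could not be linked on this system." >&2
fi
```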
|
|
Until now when building Matplotlib on macOS systems, we were using the
default C compiler. However, while Yahya Sefidbakht (previously) and
Mahdieh Nabavi (now) were trying to build the template on their macOS
systems using the GNU Compiler Collection (GCC), we found that Matplotlib
needs
special macOS headers that GCC doesn't recognize.
With this commit, when Matplotlib is being built on macOS systems, it uses
`clang' and this fixed the problem (so far checked on Mahdieh's machine).
|
|
These two packages are necessary to build the GNU C Library.
|
|
When building the log4cxx tarball from its Git history, I noticed that
files with very long names are not packaged by tar (because by default
Automake uses the ancient v7 tar format that only supports file names of up
to 99 characters).
So I built the tarball with the `tar-ustar' option to Automake (by
modifying the log4cxx source) and the resulting tarball was able to compile
and run successfully. This has been described above the rule to build
log4cxx and I also sent an email to their development mailing list to
inform them of this problem. If they address it, I will remove the note on
the necessary corrections.
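To see whether a source tree is affected by this limit, the over-long names
can be listed with a helper like the one below (a sketch with a
hypothetical function name). Note that inside a tarball the top-level
directory name is also part of each path, so the real limit is reached even
sooner than this check suggests.

```shell
# List the files under directory $1 (with paths relative to it) whose
# path is longer than 99 characters.
list_long_names() {
    (cd "$1" && find . -type f | sed 's|^\./||' | awk 'length($0) > 99')
}
```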
|
|
Some minor corrections were made in the template:
- When making the distribution, `.swp' files (created by Vim) are also
removed.
- Autoconf is set as a prerequisite of Automake.
I was also trying to add the Apache log4cxx, but its default 0.10.0 tarball
needs some patches, so I have just left it half done until someone actually
needs it and we apply the patch.
|
|
Until now, the tarballs were the first normal prerequisite of the
software. As a result if their date changed, the whole software would be
re-built. However, for tarballs specifically, we actually check their
contents with a checksum, so their date is irrelevant (if it's newer than
the built program, but has the same checksum, there is no need to re-build
the software).
Also, passing the tarball name as an argument to the building process (for
example `gbuild') was redundant. It is now automatically found from the
list of order-only prerequisites within `gbuild' and `cbuild' (similar to
how it was previously found in the `pybuild' for Python building).
A `README.md' file has also been placed in `reproduce/software/make' to
help describe the shared properties of the software building
Makefiles. This will hopefully grow much larger in the future.
|
|
The tarball of HEALPix includes multiple languages and doesn't include the
ready-to-run GNU Build System by default: we actually have to build the
`./configure' script for the C/C++ libraries. So it was necessary to also
include GNU Autoconf and GNU Automake as prerequisites of HEALPix.
However, the official GNU Autoconf tarball (dating from 2012) doesn't build
on modern systems, so I just cloned it from its source and bootstrapped it
and built its modern tarball which we are using here.
|
|
As part of an effort to bring in all the dependencies of the LSST Science
pipeline (which includes the last commit), these software packages are now
available in the template.
|
|
With this commit these three software packages are now installable with
this template.
|
|
Since ImageMagick can take a long time to build, we are now building it in
parallel. Also, the part where we replace an `_' with `\_' in the software
version at the end of the configure script was removed. It is clearer and
more readable when the actual rule that includes such a name deals with
the underscore (as is the case for `sip_tpv' which already dealt with it).
Finally, I noticed that the checks at the start of `top-prepare' were
missing new-lines. I had forgotten that the Make single-shell variable
isn't activated in this stage yet.
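For reference, the underscore replacement mentioned above amounts to a
one-line `sed' call when done inside a rule. The function name below is
hypothetical, and the assumption that the escaping is needed for LaTeX
output is mine, it is not stated in this commit.

```shell
# Escape every underscore in a name: `_' becomes `\_'.
escape_underscores() {
    printf '%s\n' "$1" | sed 's/_/\\_/g'
}
```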
|
|
It had been some time since these three software packages were last
updated! With this commit the template now uses the most recent stable
release of these packages.
Also, the hosting server for ImageMagick was moved to my own webpage
because unfortunately ImageMagick removes its tarballs from its own
server.
|
|
Until now we were calling it `Sextractor', but the official way of writing
it is `SExtractor'. With this commit, this has been corrected.
|
|
New versions of astropy, bash, cmake, curl, findutils, gawk, gcc,
ghostscript, git, make and gsl have recently been released, so they are
updated with this commit.
About GNU Findutils and GNU Make: I was bootstrapping (building the tarball
of) these two separately because their standard tarball release
had problems on some systems. Both have been updated now so I am no longer
using my own webpage as their main URL.
A special note about GNU Make. I just noticed that during bootstrapping,
GNU Make would use the fixed version string of `4.2.90' for any commit!!!
But fortunately they have officially released their 4.2.90 version, so we
are safely using their own webpage. The only difference is the compression
format. My old bootstrapped build was `tar.lz', but the standard release is
`tar.gz'.
Also, all the basic programs (installed in `.local/bin') in `basic.mk' are
now existence-only dependencies (after a `|'), because later programs just
use them at a very basic level, so there is no need to rebuild everything
when Bash gets updated for example.
|
|
Until now, OpenMPI would complain about not having `ssh' or `rsh' as a
remote shell feature. However, such features should not be necessary in a
reproducible scenario and they also have major security issues.
With this commit, we are now using OpenMPI's `OMPI_MCA_plm_rsh_agent'
environment variable to disable any remote shell dependency for it (as
suggested by Boud). Therefore, any dependency for OpenSSH has been
removed. But I thought to keep the build instructions in case they may be
useful under some unforeseen scenario. However, to discourage people from
building it, a notice was added on top of the build instructions.
This bug was found, tested and solved thanks to Roberto Baena Gallé and
Boud Roukema.
This fixes bug #56724.
|
|
Until now we have followed the convention of using space characters before
comment lines in recipes, not tabs. This has been corrected in one case.
|
|
Until this commit, `fpack' and `funpack' were not installed by default
with the installation of CFITSIO. It is necessary to explicitly do a
`make fpack' and `make funpack' to have them installed. With this
commit, these two programs have been added.
|
|
When trying to install `libgit2' on my macOS system, it complained about
not finding `_iconv*' functions! But apparently `libgit2' has its own
implementation of `libiconv' that it uses if it can't find `libiconv' on
macOS. So, the solution to this problem was to add `-DUSE_ICONV=OFF' to
the configuration options of `libgit2'.
I reported this issue, which is now fixed thanks to the help of Mohammad
Akhlaghi.
|
|
Until now, we weren't explicitly telling libtiff to ignore JBIG-KIT. So on
some systems, it would try to link with it and would thus fail.
With this commit, we have disabled JBIG-KIT support for libtiff. I tried to
import it into the template, but it doesn't use the standard GNU Build
System. Maybe later we can add a set of build rules for it, but I don't
have time now.
Also, this problem with libtiff occurred while building Ghostscript, but
was fixed after adding this option to libtiff. So libtiff was added as a
dependency of Ghostscript.
This bug was reported by Sheeraz Ahmad.
|
|
Until now we weren't setting OpenSSH's `privsep-path' configure option. As
a result, it would try to use its default (`/var/empty'). Therefore, when
the host doesn't have `/var/empty', OpenSSH would crash because of not
having permissions to create this directory.
With this commit, we are now using OpenSSH's `--with-privsep-path'
configure-time option to explicitly use a directory within the project's
build directory.
This bug was found by Sheeraz Ahmad and Amina Aahil.
|
|
This was a bug in WCSLIB 6.3 that has been fixed in WCSLIB 6.4. From
WCSLIB's changelog: "The rule change to the Fortran makefile in v6.3 to add
getwcstab_f.o to the sharable library causes it to depend on CFITSIO to
resolve fits_get_wcstab(). Hence backed out of that change.".
The actual error was like this:
Undefined symbols for architecture x86_64:
"_fits_read_wcstab", referenced from:
_ftwcst_ in getwcstab_f.o
"_gFitsFiles", referenced from:
_ftwcst_ in getwcstab_f.o
ld: symbol(s) not found for architecture x86_64
|
|
These three libraries are dependencies of Biber, so we will need them
later, but since we don't build Biber from source now, we can't control
what library it links with. With this commit, we have just added their
versions, checksum, download URL and build rule in case they are useful in
other software.
Later, when we build Biber (and TeXLive in general) from source, we'll be
able to use these.
|
|
Until now, unlike most other programs, in Gnuastro we would run `make
check', but this causes a crash on some systems because of its
BuildProgram test: a linking error because the library isn't installed
yet. This is natural in cases like this (and must be corrected in future
Gnuastro releases).
With this commit, the checks of Gnuastro have been removed.
|
|
Until now, OpenMPI was being installed without any dependency. This was
fine because it would indeed build. But the moment you tried loading
something that depends on it (for example `mpi4py' through `astropy'), you
would get an error complaining that SSH isn't present.
With this commit, the pipeline now also installs OpenSSH to solve this
problem.
|
|
Until now, we were relying on WCSLIB's internal checking and linking with
CFITSIO. But on one macOS system (others had no problem!), we noticed that
it complains with undefined symbol linking errors to CFITSIO libraries.
With this commit, as a fast/ugly solution, we are explicitly adding
`-lcfitsio' to WCSLIB's `LIBS' variable so all binaries are linked with it
automatically. We'll be in touch with the WCSLIB author to see if a better
solution can be found.
|
|
Some of the software tarballs are not directly available on their webpage
(for example due to build problems on some systems, where we had to clone
the software and build its tarball ourselves until the next release). Until
now, only for this software, the tarballs were hosted on
`http://akhlaghi.org/src'. But now, I have moved a clone of the software
backup repository (including all its software) to
`http://akhlaghi.org/reproduce-software'.
To be more clear and have a single place for the backup software, the URL
of that software has also been updated in the project source.
|
|
Until now the only way to define the environment of the Make recipes was
through the exported Make variables (mostly in `initialize.mk' for the
analysis steps for example). However, there is only so much you can do with
environment variables! In some situations you want slightly more
complicated environment control, like setting an alias or running of
scripts (things that are commonly done in the `~/.bashrc' file of users to
configure their interactive, non-login shells).
With this commit, a `reproduce/software/bash/bashrc.sh' has been defined
for this job (which is currently empty!). Every major Make step of the
project adds this file as the `BASH_ENV' environment variable, so the shell
that is created to execute a recipe first executes this file, then the
recipe. Each top-level Makefile also defines a `PROJECT_STATUS' environment
variable that enables users to limit their environment setup based on the
condition in which it is being set up (in particular in the early phase of
`basic.mk', where the user can't make any assumption about the programs and
has to write a portable shell script).
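The mechanism can be tried outside of Make with the sketch below. A
non-interactive Bash executes the file named in `BASH_ENV' before running
its commands, which is what happens for every recipe. The temporary file,
the `SETUP_LEVEL' variable and the `early'/`later' values of
`PROJECT_STATUS' are hypothetical: the values that the Makefiles really set
are not listed in this commit message.

```shell
# Write a small startup file (hypothetical contents and location).
rcfile=$(mktemp)
cat > "$rcfile" <<'EOF'
# Only do the richer setup outside the early, portable-only phase.
if [ "$PROJECT_STATUS" = "early" ]; then
    SETUP_LEVEL=minimal
else
    SETUP_LEVEL=full
fi
export SETUP_LEVEL
EOF

# Run a command the way Make would run a recipe: a non-interactive Bash
# that first executes the file named in `BASH_ENV'.
#   $1: value of PROJECT_STATUS   $2: the recipe text
run_recipe() {
    PROJECT_STATUS=$1 BASH_ENV=$rcfile bash -c "$2"
}

run_recipe early 'echo "setup: $SETUP_LEVEL"'
run_recipe later 'echo "setup: $SETUP_LEVEL"'
```

The two calls should report a `minimal' and a `full' setup respectively,
showing that the same startup file can behave differently in each phase.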
|
|
Until now, there was no check on the integrity of the contents of the
downloaded/copied software tarballs: we only relied on the tarball
name. This could be bad for reproducibility and security: for example, on
one server a tarball may have the same name but different content.
With this commit, the SHA512 checksums of all the software are stored in
the newly created `checksums.mk' (similar to how the versions are stored in
the `versions.mk'). The resulting variable is then defined for each
software and after downloading/copying the file we check to see if the new
tarball has the same checksum as the stored value. If it doesn't, the
script will crash with an error, informing the user of the problem.
The only limitation now is a bootstrapping problem: if the host system
doesn't already have an `sha512sum' executable, we will not do any checksum
verification until we install our `sha512sum' (as part of GNU
Coreutils). All the tarballs downloaded after GNU Coreutils is built will
have their checksums validated. By default almost all GNU/Linux systems
will have a usable `sha512sum' (it has been part of GNU Coreutils for a
long time: from the Coreutils Changelog file, at least since 2013).
This completes task #15347.
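The check, including the bootstrapping case, can be sketched as below. The
function name and the messages are hypothetical, and how the expected value
is read from `checksums.mk' is left out.

```shell
# Verify that file $1 has the SHA512 checksum $2.
# Returns 0 on a match (or when no `sha512sum' is available yet, the
# bootstrapping case described above) and 1 on a mismatch.
verify_sha512() {
    if ! command -v sha512sum >/dev/null 2>&1; then
        echo "sha512sum not found: skipping check of $1" >&2
        return 0
    fi

    # `sha512sum' prints "CHECKSUM  FILENAME"; keep only the checksum.
    actual=$(sha512sum "$1" | awk '{print $1}')
    if [ "$actual" != "$2" ]; then
        echo "$1: checksum mismatch" >&2
        return 1
    fi
    return 0
}
```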
|
|
Until now, to work on a project, it was necessary to `./configure' it and
build the software. Then we had to run `.local/bin/make' to run the project
and do the analysis every time. If the project was a shared project between
many users on a large server, it was necessary to call the `./for-group'
script.
This way of managing the project had a major problem: since the user
directly called the lower-level `./configure' or `.local/bin/make' it was
not possible to provide high-level control (for example limiting the
environment variables). This was especially noticed recently with a bug
that was related to environment variables (bug #56682).
With this commit, this problem is solved using a single script called
`project' in the top directory. To configure and build the project, users
can now run these commands:
$ ./project configure
$ ./project make
To work on the project with other users in a group these commands can be
used:
$ ./project configure --group=GROUPNAME
$ ./project make --group=GROUPNAME
The old options to both configure and make the project are still valid. Run
`./project --help' to see a list. For example:
$ ./project configure -e --host-cc
$ ./project make -j8
The old `configure' script has been moved to
`reproduce/software/bash/configure.sh' and is called by the new `./project'
script. The `./project' script now just manages the options, then passes
control to the `configure.sh' script. For the "make" step, it also reads
the options, then calls Make. So in the lower-level nothing has
changed. Only the `./project' script is now the single/direct user
interface of the project.
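The option handling of such a wrapper can be sketched as below. This is not
the `./project' script itself: the function only prints the lower-level
command it would run, and the printed format is an illustrative assumption.

```shell
# Print the lower-level command that a wrapper like `./project' might
# run for the given operation and options.
project_dispatch() {
    operation=$1
    if [ $# -gt 0 ]; then shift; fi

    group=""
    rest=""
    for arg in "$@"; do
        case "$arg" in
            --group=*) group=${arg#--group=} ;;
            *)         rest="$rest $arg" ;;
        esac
    done

    case "$operation" in
        configure) echo "configure.sh group='$group'$rest" ;;
        make)      echo "make group='$group'$rest" ;;
        *)         echo "usage: ./project configure|make [OPTIONS]" >&2
                   return 1 ;;
    esac
}
```

Since every call passes through one place, that place can also clean the
environment before handing control to the lower-level step.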
On a parallel note: as part of bug #56682, we also found out that on some
macOS systems, the `DYLD_LIBRARY_PATH' environment variable has to be set
to blank. This is no problem because RPATH is automatically set in macOS
and the executables and libraries contain the absolute address of the
libraries they should link with. But having `DYLD_LIBRARY_PATH' can
conflict with some low-level system libraries and cause very hard to debug
linking errors (like that reported in the bug report).
This fixes bug #56682.
|
|
Until now we were only setting the `LD_LIBRARY_PATH' environment variable
for GNU/Linux systems. But macOS systems use the `DYLD_LIBRARY_PATH'.
With this commit, for better control over the environment, we are also
fixing `DYLD_LIBRARY_PATH' in all the places that we are setting the
general environment variables.
|
|
Some bugs have been fixed in the new version of WCSLIB, so it has been
updated in the template.
|
|
More than two releases and bug fixes have been made to libgit2. So we are
now using a more recent version in the template.
|
|
Configure script: when `texlive-ready-tlmgr' is not created, it is similar
to not having installed TeXLive. A check was added so in this scenario the
`./configure' script doesn't crash.
high-level.mk: `cairo' and `pixman' are now installed in parallel and with
`V=1' (so the full compilation and linking command is printed).
|
|
In one of the last few commits, the commands in the recipe of libgit2 were
merged with `&&' so it stops if anything fails. But I had forgotten to add a
`;' at the end of the `install_name_tool' command. This is corrected with
this commit.
|
|
While doing a clean build, several issues were found in the pipeline and
corrected. The main issue was that with the recent installation of
`libiconv', the GCC standard C++ library depends on `libiconv', so we need
to explicity add an `-liconv' to any C++ compilation.
The other corrected points are:
- The C++ compiler is now explicitly defined in `CXX'.
- libgit2 and WCSLIB's recipes weren't using `&&' between statements, so
if there was an error, it would still build the target!
- The CMake bootstrapping script now builds much faster in parallel.
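The difference that `&&' makes in a recipe can be seen with this small
demonstration, which is independent of the template's Makefiles:

```shell
# With `;' the last command decides the exit status, so a failure in an
# earlier step is hidden and Make would consider the target built.
status_semicolon=0
sh -c 'false; echo "installing"' >/dev/null || status_semicolon=$?

# With `&&' the chain stops at the first failure and reports it.
status_and=0
sh -c 'false && echo "installing"' >/dev/null || status_and=$?

echo "with ';': $status_semicolon, with '&&': $status_and"
```

The first status is zero even though `false' failed, while the second one
is non-zero, which is what lets Make stop before creating the target.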
|
|
Until now, we were using the `--enable-single' configure-time option of
FFTW to build its single-precision library. The reason was that according
to the output of `./configure --help', this was equivalent to
`--enable-float'. However, the single precision library wasn't being built.
As a result, on systems that already had it, SExtractor would use the
system's library and on other systems, it would simply not pass the
configure step.
With this commit, we are now using `--enable-float' which fixes this
problem and installs the `libfftw3f*' libraries (showing that it is not
equal to `--enable-single'!). Also, some optimization flags were added to
hopefully make it faster.
This issue was found thanks to Zahra Sharbaf and Hamed Altafi.
This fixes bug #56588.
|
|
Until now, we were letting the TeXLive installer use the default
CTAN-chosen mirror based on the host. But in many cases, this is not
efficient and sometimes those servers don't work.
With this commit, we manually set the server to use (`rit.edu'), which is
relatively fast and up to date. In this way, until we build TeXLive from
source, every user will be using the same CTAN mirror.
|
|
Until now, we weren't explicitly setting the C and Fortran compiler
environment variables (`CC' and `F77'). As a result, if the user's system
had already set them, there would be a problem (and the system's compilers
would be used).
With this commit, we are explicitly setting these two environment variables
at the start of `high-level.mk'.
This bug was found after a discussion with Elham Saremi.
|