Age | Commit message (Collapse) | Author | Lines |
|
We were developing the build of Numpy and Scipy on Mac in a parallel thread
and things seem to be working relatively well now. There were only two
problems:
1) GCC still has some random building issues on Mac.
2) ATLAS shared libraries can't be built on Mac (so we used OpenBLAS to
build Numpy and Scipy on both Mac and GNU/Linux).
But for now, none of these problems are critical. So, we can progress in
one branch.
There were only very minor conflicts in the merge.
|
|
After trying the build on a system with no Python library, I noticed that
Python's HDF5 module (`h5py') needs the HDF5 library and OpenMPI (to work
in parallel). So they were added. Finally `h5py' uses the `mpi4py' module
to communicate with OpenMPI, so it was also added. However, for some
reason, mpi4py doesn't work with this version of OpenMPI (as described in
the comments above).
So for now, h5py doesn't use it and can only work on a single thread, while
the HDF5 C library links with OpenMPI with no problem.
|
|
We could not get ATLAS shared libraries on Mac (while the static ATLAS
libraries are built and can be used successfully on Mac). So, the
pipeline now builds OpenBLAS, which both Numpy and Scipy can use on Mac
and GNU/Linux.
We also added FFTW as a dependency of Numpy, although Numpy is not linking
to FFTW for some reason. Since FFTW is a low level library used by
many programs, we have kept it as a dependency of Numpy anyway for now.
|
|
To help make it easier to re-use (like the rest of the "large" files).
|
|
In this commit we add the `h5py' Python package.
We also include `setuptools' as a main dependency of Python because since
the previous commit it (as well as `pip') is no longer installed with Python.
The Numpy version has also been incremented.
|
|
Numpy needs ATLAS as shared libraries. So we also need to build Python with
shared libraries. We also need to define site.cfg for numpy and scipy so we
define a master template:
`reproduce/config/pipeline/dependency-numpy-scipy.cfg'
Also, `Openssl' did not have rpath, so we added it with this commit.
|
|
An initial installation of atlas is now included in the pipeline,
but we are still trying to make it compile and build smoothly. In
the process, we found that GCC also needs some modifications
(for example rpath issues).
|
|
To ensure that we have all the necessary Python dependencies, I did an
offline build and noticed that several packages were also necessary for the
`./configure' step to finish (`libffi', `asn1crypto', `cffi', `jeepney',
`pycparser' and `secretstorage'). With this commit they are added.
|
|
Until this commit, we had some of the Python packages installed,
but they did not work properly because of the `PYTHONPATH' variable.
That is, the `python' used within the pipeline was the `python' of the
system instead of the pipeline's own `python'.
With this commit this issue has been fixed by setting the correct
`PYTHONPATH'. In this commit we also modify the installation of
`bzip2' because `CMake' was complaining about some libraries built
statically.
|
|
Until now the installation of Python and its packages (numpy, astropy,
astroquery, etc.) was done in the same `makefile' as the other programs.
With this commit the installation of Python and its packages has been
split off and is now independent of the other programs. The installation
of every Python package needs to be written explicitly because pip is
not used anymore.
|
|
As matplotlib is a general package for plotting and it is widely
used in science, we have added it to the pipeline.
When installing `python-dateutil', a dependency of matplotlib, we
found a conflict in the download of the tarball. This is because
the name has a dash (-) in the middle. In addition, the name starts
with 'python', so it matches the name of Python itself. Now it is
possible to install a package with any name, just by adding an `elif'
before the URL address.
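
The name clash described above can be illustrated with a small shell
sketch. It is not the pipeline's actual rule: the function name
`tarball_to_package', the generic "cut at the first dash" rule and the
version numbers in the usage line are all assumptions made for the example.
It only shows how an extra `elif' branch can catch a dashed name before the
generic rule mistakes it for Python.

```shell
#!/bin/sh
# Hypothetical helper (not from the pipeline): derive a package name from
# a tarball file name such as `NAME-VERSION.tar.gz'.
tarball_to_package() {
    tarball="$1"
    if [ -z "$tarball" ]; then
        echo "tarball_to_package: no tarball name given" >&2
        return 1
    elif [ "${tarball#python-dateutil-}" != "$tarball" ]; then
        # Special case: the generic rule below would wrongly return
        # `python' for this name, because of the dash in the middle.
        echo "python-dateutil"
    else
        # Assumed generic rule: keep everything before the first dash.
        echo "${tarball%%-*}"
    fi
}
```

For example, `tarball_to_package python-dateutil-2.8.0.tar.gz' prints
`python-dateutil', while `tarball_to_package python-3.7.2.tar.gz' still
prints `python'. Any other dashed name would need its own `elif' branch.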
|
|
All dependencies for building the astroquery package have been added.
Until now the Python dependencies were built in the same Makefile
as the high level libraries and programs. But, because astroquery
has many dependencies, we split the installation of Python and the
Python packages into a new Makefile.
The installation of the different packages is done using Python and
not pip, because we found some problems when doing it with pip.
Apparently there is some interference between the packages
installed by the pip of the system and the pip installed as part
of Python in the pipeline.
|
|
Astropy was added and one very important thing is that we have to
use the pypi tarball (https://pypi.org/) (which is bootstrapped)
and not the github tarball.
|
|
Python needs some packages to be really useful. Numpy is the most
important package for using Python and a lot of other packages
depend on it.
In this commit we add numpy to the pipeline. The tarball of numpy
is currently downloaded from fossies.
|
|
Many projects use Python so it is necessary to include it in the
pipeline.
|
|
With this commit, it is now possible to package the project into a tarball
or zip file, ready to be distributed to collaborators who only want to
modify the final paper (and not do the analysis technicalities), or for
uploading to sites like arXiv, or online LaTeX sharing pages.
|
|
I recently found another fork of metastore that allows its build on macOS
systems (https://github.com/mpctx/metastore). So I forked it into my own
fork with several other corrections (mostly cosmetic!), so it is now much
better suited for this pipeline.
Raul Infante-Sainz has already tested the building of metastore on his
macOS. In a previous test, we also noticed that libbsd should not be built
on Mac systems, so it is now a conditional prerequisite to metastore.
|
|
In this version, the many extra notices (just regarding a change from
branch to branch) are no longer printed with `-q'. Instead, only a one-line
statement is printed, saying that it is saved or applied.
|
|
After testing the build of Metastore on a server, I noticed that because
its `/etc/passwd' doesn't have the list of users, the `getpwuid' call
within metastore failed and wouldn't let it finish.
So I looked into the code and was able to implement a solution to this
problem by adding two options to it for default values for the user and
group. Also, file attributes are not necessary in our (current) use case of
metastore and caused crashes on our server, so they are also disabled.
|
|
Metastore depends on `bsd/string.h' to work properly (at least on GNU/Linux
systems). The first system I tried building with had that library, so I
didn't notice! With this commit, we also build `libbsd' as part of the
pipeline.
Also, I couldn't find libbsd's version in any of its installed headers, so
like Libjpeg, we can't actually check and will directly write our internal
version into the paper.
|
|
The pipeline heavily depends on file meta data (and in particular the
modification dates), for example the configuration-Makefiles within the
pipeline are set as prerequisites to the rules of the pipeline.
However, when Git checks out a branch, it doesn't preserve the meta-data of
the files unique to that branch (for example program source files or
configuration-Makefiles). As a result, the rules that depend on them will
be re-done.
This is especially troublesome in the scenario of this reproducible paper
project because we commonly need to switch between branches (for example to
import recent work in the pipeline into the projects). After some
searching, I think the Metastore program is the best solution. Metastore is
now built as part of the pipeline and through two Git hooks, it is called
by Git to store the original meta-data of files into a binary file that is
version controlled (and managed by Metastore).
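
A rough sketch of how two such hooks could be installed is given below. It
is an illustration under several assumptions, not the pipeline's actual
hooks: the choice of `pre-commit' and `post-checkout', the metadata file
name `.file-metadata' and the function name are all hypothetical. The
`-s', `-a', `-m' and `-f' options are those of upstream Metastore (save,
apply, include mtime, metadata file); the fork used here may differ.

```shell
#!/bin/sh
# Hypothetical installer: write two Git hooks into the directory given as
# the first argument (normally `.git/hooks').
install_metastore_hooks() {
    hookdir="$1"
    metafile="${2:-.file-metadata}"   # assumed name of the metadata file

    # Assumed hook 1: store the current metadata just before a commit, so
    # the metadata file is version controlled with the rest of the branch.
    cat > "$hookdir/pre-commit" <<EOF
#!/bin/sh
metastore -s -f "$metafile" && git add "$metafile"
EOF

    # Assumed hook 2: after a checkout, re-apply the stored metadata
    # (including modification times) so Make does not redo the rules.
    cat > "$hookdir/post-checkout" <<EOF
#!/bin/sh
metastore -a -m -f "$metafile"
EOF

    chmod +x "$hookdir/pre-commit" "$hookdir/post-checkout"
}
```

Calling `install_metastore_hooks .git/hooks' from the top of a repository
would write both hooks. They only do something useful if a `metastore'
executable is in the PATH that Git uses.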
|
|
With the current build system, Bash and AWK don't write RPATH into the
executables. This causes many problems in the pipeline (for example when
using the `$(shell)' function in Make which doesn't have
`LD_LIBRARY_PATH').
After consulting the Bash and Make mailing lists, so far, the best solution
was to use the Patchelf program to manually write RPATH in these
executables. With this commit, Patchelf is now installed in the pipeline
and used in Bash and AWK to fix this problem.
|
|
Wget and cURL depend on many network related libraries by default and if
they are present on the host operating system, they will be linked
with. This causes problems for the pipeline when these libraries are
updated on the host system.
With this commit, I went through the configure time options of both Wget
and cURL and removed any library that didn't seem related to merely
downloading of files (possibly with SSL, because we do build OpenSSL in the
pipeline).
Also, I noticed a new version of cURL has come out, so that is also updated.
|
|
Readline is a prerequisite of Bash and AWK, while NCURSES is a prerequisite
of Readline. With the recent update of GNU Bash (and thus GNU Readline) on
my host operating system, the pipeline crashed and I noticed this hole in
the pipeline. In particular, AWK (which linked with Readline 7.0) would
complain about not finding it and abort.
|
|
During the last month, several core GNU programs were updated, so their
versions in the pipeline have also been updated.
|
|
Both Gzip and Gnuastro were being bootstrapped personally from their Git
repository until now. But fortunately a new release of both came out last
week and so to make things standard we are now using their standard
tarballs.
I also noticed that we weren't checking the version of Gzip or mentioning
it in the acknowledgement section. This was also corrected.
|
|
A minor correction was made in the checklist (since we only have one
`foreach' loop in the top-level Makefile) and also the version of Gnuastro
was incremented.
|
|
The version of Git was updated to the most recent version (2.20.0).
|
|
The build systems of Libgit2 and WCSLIB on Mac OS do not account for
installation in non-standard addresses: `Libgit2' keeps the absolute
address of its build directory (not the installation directory) and WCSLIB
doesn't write any absolute address at all (so the system uses the first one
it finds).
To address these issues, we are now using Mac OS's `install_name_tool'
program to fix the absolute path within the installed shared library.
Since the version of the library is actually present in its shared library
name, in `dependency-versions.mk' we have also separated these two
libraries so later when their version is changed, we are careful in
correcting the shared library name also.
|
|
Some high-level programs like Wget and cURL need to be built in shared mode
because they also include dynamic loading of libraries. Therefore, if we
only build the lower-level libraries in static mode, our own build will be
ignored and they will go and find the system's shared libraries to link
with. Because of this, for now, we have manually set the `static_build'
variable in the configure script to `no'.
Also, if the downloader fails, we'll delete the output (an empty file in
the case of Wget) because it interferes with a target definition.
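
The clean-up on a failed download can be sketched in shell as follows. The
function name `download_or_clean' is hypothetical and the pipeline's real
rule may look different. The point is only that a failed downloader must
not leave behind a file that Make would later accept as an existing target.

```shell
#!/bin/sh
# Hypothetical helper: run a download command and, if it fails, delete
# the target so no empty or partial file is left behind.
# Usage: download_or_clean TARGET COMMAND [ARGS...]
download_or_clean() {
    target="$1"; shift
    if "$@"; then
        return 0
    else
        status=$?            # exit status of the failed downloader
        rm -f "$target"      # remove the (possibly empty) output file
        return "$status"
    fi
}

# Example call (the URL variable is a placeholder):
#   download_or_clean out.tar.gz wget -O out.tar.gz "$url"
```

Returning the downloader's own exit status keeps the failure visible to
the calling Make recipe, so the build still stops at that point.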
|
|
The TeX Live installer needs Wget to operate smoothly, especially on recent
Mac OS systems that don't have Wget pre-installed. Also, it would be good
for the pipeline to have its own downloader. So with this commit, the
pipeline also installs Wget and OpenSSL which is a dependency.
Many other small changes/fixes were done in this process.
|
|
The pipeline now installs GCC and all its necessary prerequisites.
|
|
Until now we weren't explicitly writing the full path of the dynamic
libraries necessary for linking a program. But now with
`-Wl,-rpath=$(ildir)' we ensure that the linker records the address of the
necessary dynamic libraries at link time, instead of leaving them to be
found at run time. `pkg-config' is also built when preparing the basics.
Several other minor corrections were made thanks to the great help of
Raúl Infante Sainz.
|
|
The high-level dependencies are now built without having access to the
system's PATH. To do this, all the necessary software that we aren't
building ourselves is now brought into the installed `bin/' directory
using a symbolic link to the corresponding software on the host. To do
this, it was also necessary to increase the number of basic/low-level
packages that we are building, and add several more (Diffutils and
Findutils).
With this process in place, we now have a list of the exact software
packages that we are not building ourselves, enabling easy building of all
such dependencies in the future.
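
The linking step can be sketched in shell as below. This is a minimal
illustration, not the pipeline's code: the function name
`link_host_programs' and the program list in the usage note are
assumptions.

```shell
#!/bin/sh
# Hypothetical helper: make a symbolic link in BINDIR for each named
# program that is found on the host.
# Usage: link_host_programs BINDIR PROGRAM...
link_host_programs() {
    bindir="$1"; shift
    mkdir -p "$bindir"
    for prog in "$@"; do
        # `command -v' prints the path of a program found in PATH.
        hostpath=$(command -v "$prog") || {
            echo "$prog: not found on the host" >&2
            continue
        }
        case "$hostpath" in
            # Only absolute paths are real executables on disk.
            /*) ln -sf "$hostpath" "$bindir/$prog" ;;
            # Shell builtins and functions print a bare name: skip them.
            *)  echo "$prog: not an executable file, skipped" >&2 ;;
        esac
    done
    return 0
}
```

After something like `link_host_programs "$bindir" ls sed', a build
started with `PATH="$bindir"' can only see the listed programs, and the
argument list doubles as the record of what is still taken from the host.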
|
|
While working on a research project using this pipeline, I noticed that we
don't have any `sh' executable within our PATH. However, some programs
(including Gnuastro's configure script, when it is checking for shells to
use with Libtool) check and use it. So after building Bash, we also build
an `sh' symbolic link to point to the built Bash executable.
|
|
In most analysis situations (except for simulations), an input dataset is
necessary, but that part of the pipeline was just left out and a general
`SURVEY' variable was set and never used. So with this commit, we actually
use a sample FITS file from the FITS standard webpage, show it (as well as
its histogram) and do some basic calculations on it.
This preparation of the input datasets is done in a generic way to enable
easy addition of more datasets if necessary.
|
|
A new version of the ghostscript package is now available, so the used
version in the pipeline (previously 9.25) has been incremented to 9.26.
|
|
When the C compiler is not GNU GCC, linking with GNU Binutils is going to
cause problems. So until the time that we can include GCC into this
pipeline, it's best to avoid Binutils also.
Also, for building CMake, we were relying on an installed CMake, but now,
we are using its own `./bootstrap' script, so it can be built even if the
host system doesn't have CMake.
Also, for TeX Live, we are now setting a custom file as main target to
avoid complications with symbolic links as targets in Make.
Finally, when the user says they don't want to re-write an existing
configuration file, no extra notices will be printed and the configure
script will immediately start building programs.
|
|
Since the final product of the pipeline is a LaTeX-created PDF file, it was
necessary to also have LaTeX within the pipeline. With this commit, TeX
Live is also built as part of the configuration and all the necessary
packages to build the PDF are also installed and mentioned in the paper
along with their versions.
|
|
TeX Live is now also downloaded and built by the reproduction
pipeline. Currently only the basic (TeX and LaTeX) source is built but no
extra packages, so the PDF building will fail. We'll add them in the next
commit.
|
|
To have better control over the build, GNU Binutils, Bzip2, GNU Gzip, and
XZ Utils have also been added to the pipeline. Some other minor cleanups
and fixes were also implemented throughout the process.
|
|
Until now, when a package was to be built statically, we were adding the
`--static' option to `CFLAGS'. This was the wrong place to put it! It
should be in the linking step (thus `LDFLAGS'). Also, based on Bash's
configure script, we are now using the more generic form of `-static'
(single dash, not double dash).
On the other hand, the `--disable-shared' option isn't available in many of
the packages and it is highly redundant with the `-static' option, so it
has been removed to avoid an extra warning in such packages.
|
|
To ensure the easy unpacking and building of the programs, Lzip and Tar are
now also built during the initial setup phase.
Some minor corrections were also applied to make things cleaner and
smoother.
|
|
All the libraries that define their version string as a macro in their
headers are now also checked in `reproduce/src/make/initialize.mk'.
Also, the CFITSIO tarball now follows the same versioning style as the rest
of the tarballs: a script is added to convert the version string into what
is included in the tarball.
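
As an illustration of such a conversion, here is a sketch that assumes the
upstream CFITSIO naming drops the dot and pads the number to four digits
(for example version 3.45 in a file named `cfitsio3450.tar.gz'). Both that
rule and the function name are assumptions for the example. The script
added in this commit is not reproduced here.

```shell
#!/bin/sh
# Hypothetical helper: convert a dotted version like `3.45' into the
# four-digit string assumed to be used in upstream tarball names.
cfitsio_tarball_version() {
    # Remove the dot, then append zeros until the string has four digits.
    echo "$1" \
        | sed -e 's/\.//' \
        | awk '{ s = $1; while (length(s) < 4) s = s "0"; print s }'
}
```

With this, `cfitsio_tarball_version 3.45' prints `3450', which can then be
substituted into the upstream file name while the pipeline keeps using the
same `NAME-VERSION' style as the other tarballs.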
|
|
The version of all programs is now checked in
`reproduce/make/src/initialize.mk' and the pipeline won't complete if any
of the program versions change from those listed in
`reproduce/config/pipeline/dependency-versions.mk'.
Since the pipeline is systematically checking all program versions, we
don't need Gnuastro's `--onlyversion' option any more. So it (and all
references to it) have been removed.
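
The kind of check described above can be sketched in shell as follows. The
function name `check_version' and the way the reported version is
extracted in the example call are assumptions. The real checks live in the
Makefile named above and are not reproduced here.

```shell
#!/bin/sh
# Hypothetical helper: fail when the version a program reports differs
# from the version the pipeline expects.
# Usage: check_version NAME EXPECTED REPORTED
check_version() {
    name="$1"; expected="$2"; reported="$3"
    if [ "$reported" != "$expected" ]; then
        echo "$name: expected version $expected, found $reported" >&2
        return 1
    fi
}

# Example call (the expected value is a placeholder variable):
#   check_version "GNU Make" "$make_version" \
#       "$(make --version | awk 'NR==1 {print $NF}')"
```

Because the function returns a non-zero status on a mismatch, a recipe
that calls it stops there, which is what keeps the pipeline from
completing with an unexpected program version.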
|
|
During the configuration step several new programs that were necessary for
a more complete controlled environment are now also downloaded and built
statically.
|
|
To enable easy/proper reproduction of results, all the high-level
dependencies are now built within the pipeline and installed in a fixed
directory that is added to the PATH of the Makefile. This includes GNU Bash
and GNU Make, which are then used to run the pipeline.
The `./configure' script will first build Bash and Make within itself, then
it will build
All the dependencies are also built to be static. So after they are built,
changing of the system's low-level libraries (like C library) won't change
the tarballs.
Currently the C library and C compiler aren't built within the pipeline,
but we'll hopefully add them to the build process also.
With this change, we now have full control of the shell and Make that will
be used in the pipeline, so we can safely remove some of the generalities
we had before.
|