| Age | Commit message | Author | Lines | 
|---|
|  | Until now, management of the software names and versions in the paper was
done manually (a macro had to be defined in `initialize.mk', then used in
`paper.tex', so they had to be manually set in two places). Managing this
was not easy.
To fix this, with this commit, each software building rule's target is a
text file that contains its human-readable name and its version. In the
end, the configure script sorts them by their name and writes them into a
LaTeX input file that we can easily import as a file into the main paper. | 
|  | Until now, these versions were written in each run. This was mainly
inherited from the old days of the pipeline, where we didn't know the
software on the host. But now that we have almost everything under control,
we can just write these LaTeX macros at the end of the configure script and
make `initialize.mk' simpler and also (very slightly!) speed-up/simplify
the processing. | 
|  | The new files that were just added didn't have a copyright. One has been
added for them with this commit. | 
|  | For testing, we had set an absurd value in a condition of the GCC build
recipe so that it always failed, but we had forgotten to revert it. It is
now corrected.
Also the order of making `g++' and `gfortran' was reversed for easier
readability (it doesn't matter which one is done first, it only matters
that `gcc' be done last). | 
|  | We were developing the build of Numpy and Scipy on Mac in a parallel thread
and things seem to be working relatively well now. There were only two
problems:
1) GCC still has some random building issues on Mac.
2) ATLAS shared libraries can't be built on Mac (so we used OpenBLAS to
   build Numpy and Scipy on both Mac and GNU/Linux).
But for now, neither of these problems is critical. So, we can progress in
one branch.
There were only very minor conflicts in the merge. | 
|  | We were not able to build `gcc' on Mac, so we are using links to the host
compilers. In this commit we also found that on Mac the HDF5 library
needs an explicit definition of the compilers. | 
|  | After trying the build on a system with no Python library, I noticed that
Python's HDF5 module (`h5py') needs the HDF5 library and OpenMPI (to work
in parallel). So they were added. Finally `h5py' uses the `mpi4py' module
to communicate with OpenMPI, so it was also added. However, for some
reason, mpi4py doesn't work with this version of OpenMPI (as described in
the comments above).
So for now, h5py doesn't use it and can only work on a single thread, while
the HDF5 C library links with OpenMPI with no problem. | 
|  | Until this commit, the installation of all Python packages was
done in a separate Makefile.
With this commit, the pipeline installs Python packages as part of the
high-level software. All the Python package rules remain in a
separate Makefile, but this Makefile is included in the high-level
dependency `reproduce/src/make/dependencies.mk'. | 
|  | We could not build ATLAS shared libraries on Mac (while the static ATLAS
libraries are built and can be used successfully on Mac). So, the
pipeline now builds OpenBLAS, which both Numpy and Scipy can use on Mac
and GNU/Linux.
We also added FFTW as a dependency of Numpy, although Numpy is not linking to
FFTW for some reason. Since FFTW is a low-level library used by
many programs, we have kept it as a dependency of Numpy anyway for now. | 
|  | After doing a systematic search for files without a copyright notice, a few
more were found that didn't have a notice. So a notice was added for them.
I used this Bash command to find the files:
for f in $(find ./ -type f); do \
  if [[ $f != *.git* ]]; then \
    n=$(grep -i copyright $f | wc -l); \
    echo "$n $f"; \
  fi; \
done | awk '$1==0' | 
|  | A short, all-permissive copyright notice was added to the configuration
files that were missing one. | 
|  | To help make it easier to re-use (like the rest of the "large" files). | 
|  | The Makefile that builds the shared libraries comes from Arch Linux so it
does not work easily on Mac. But the full ATLAS build goes successfully
for static libraries. For now we are disabling shared libraries on Mac.
Python was built explicitly with `clang' on Mac. | 
|  | Until now, we were using `flock' (file-lock) for downloading the input
datasets in series. But we couldn't do this when downloading the software
tarballs because `flock' wasn't yet available. Generally, unlike
processing, downloading is much better done in series than in parallel.
To enable serial downloads of the software also, with this commit we are
installing `flock' in the configure script (not in a Makefile). As a
result, besides `flock', we can also benefit from the other good features
of the `reproduce/src/bash/download-multi-try' script (for example
attempting the download again after some time).
Some GNU mirrors may have problems at the time of download, so with this
commit, we are using the main GNU FTP server for GNU programs. | 
|  | Until now, we were simply using the host's GCC for Mac systems. But we
found that except for a single step (fixing `rpath'), building GCC works on
Mac!!!  So, GCC is now part of the Mac build as well.
However, we are still having some problems in building ATLAS on Mac. It
works on GNU/Linux, but not on Mac. So for the time being (just
temporarily), we are avoiding ATLAS (and thus Scipy) on Mac systems. We
just filed an issue on the ATLAS discussion list to hopefully fix the
problem soon. | 
|  | We just noticed that recently the `paste' command on macOS doesn't work
with a pipe. So we are now simply using the `tr' command in reverse to
re-create the PATH (to find where to link to). | 
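The split-and-rejoin idea mentioned above can be sketched as follows. This is only an illustration, not the pipeline's actual code: the function name `filter_path' and the choice to drop entries matching a fixed string are assumptions.

```shell
# Hypothetical helper (not taken from the pipeline): drop every PATH
# entry that contains a given string, then re-join the rest with `tr'.
filter_path() {
    # $1: colon-separated PATH value; $2: fixed string to exclude.
    printf '%s' "$1" \
        | tr ':' '\n' \
        | grep -v -F -- "$2" \
        | tr '\n' ':' \
        | sed 's/:$//'      # Remove the trailing colon left by `tr'.
}
```

For example, `filter_path "$PATH" "/path/to/build"' would give a PATH value without the entries under the (hypothetical) `/path/to/build' directory, which could then be searched for the host's own executables.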
|  | Until now we were using a symbolic link to replace GCC, but Make doesn't
treat symbolic links like files. So it would rebuild the links every
time. With this commit, only for GCC on Mac systems, we are actually
copying the host's GCC executable to avoid this problem.
Also, a wrong comment for cURL was removed. | 
|  | Conflicts in `gcc' build comments and in mentioning software used in
paper fixed. | 
|  | We generalized the library suffixes to work on Mac and GNU/Linux. | 
|  | In this commit we add the `h5py' Python package.
We also include `setuptools' as a main dependency of Python because with the
previous commit it (as well as `pip') is no longer installed with Python.
Numpy version also has been incremented. | 
|  | Numpy needs ATLAS as shared libraries. So we also need to build Python with
shared libraries.  We also need to define site.cfg for numpy and scipy so we
define a master template:
`reproduce/config/pipeline/dependency-numpy-scipy.cfg'
Also `Openssl' did not have rpath so we added it with this commit. | 
|  | An initial installation of atlas is now included in the pipeline,
but we are still trying to make it compile and build smoothly. In
the process, we found that GCC also needs some modifications
(for example rpath issues). | 
|  | Until recently, there was no problem with the `makelink' script of
`dependencies-basic.mk' because it was called on separate recipe lines (and
thus separate shells). But recently we added a call to it within a single
shell (for GCC on Mac OS systems). So a previous call to it would affect
the next call. To fix this, in this commit, we are re-setting PATH to its
original value after each call finishes. | 
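A minimal sketch of restoring PATH after a call, assuming a wrapper function (the name `with_path' is hypothetical and is not the pipeline's `makelink' script):

```shell
# Hypothetical wrapper: run a command with a temporary PATH, then put
# the original PATH back so later calls in the same shell are unaffected.
with_path() {
    # $1: temporary PATH value; remaining arguments: command to run.
    saved_path="$PATH"
    PATH="$1"; shift
    "$@"
    status=$?
    PATH="$saved_path"      # Restore the original value.
    return "$status"
}
```

Running the command in a subshell would be another way to get the same isolation, at the cost of losing any variables the command sets.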
|  | Bzip2 has a special/separate Makefile to build shared libraries which
didn't work on macOS. So with this commit, we are allowing Bzip2 shared
libraries only on non-macOS systems.
Also, I noticed that macOS's `sed' doesn't support the `-i' option in the
same way as GNU `sed' (to do the change in place within the same file). So
we are using `-e' to write the changed Makefile in a temporary directory,
then renaming it. | 
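The write-then-rename approach described above can be sketched like this. The function name and the temporary file placed next to the original (rather than in a separate temporary directory) are assumptions for illustration.

```shell
# Hypothetical helper: a portable replacement for `sed -i'. The edited
# output goes to a temporary file, which is then renamed over the input.
edit_in_place() {
    # $1: sed expression; $2: file to modify.
    tmp="$2.tmp.$$"
    sed -e "$1" "$2" > "$tmp" && mv "$tmp" "$2"
}
```

Because `mv' only runs when `sed' succeeds, a failed edit leaves the original file untouched.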
|  | Until now, we were actually running all the programs to check their
versions during initialization. But now that the number of programs has
increased, this can be slow. With this commit, we simply report the version
as a constant string. Maybe later, we can follow the strategy of the TeX
Live packages and write them all at configure time. | 
|  | We still have a few problems with building GCC on a MacOS system. To allow
using the pipeline on this operating system, until we find the solution,
GCC is only built on non-Mac systems. On Mac, we'll just make a symbolic
link to the host's executables. | 
|  | To ensure that we have all the necessary Python dependencies, I did an
offline build and noticed that several packages were also necessary for the
`./configure' step to finish (`libffi', `asn1crypto', `cffi', `jeepney',
`pycparser' and `secretstorage'). With this commit they are added. | 
|  | Until now, we were only resetting the Python environment variables in the
actual processing Makefiles, not in the Makefile that builds Python and its
modules. They are now added there also. | 
|  | With the help of Raul, we were able to build many higher-level Python
packages to enable the installation of packages like Matplotlib and
Astroquery. With this commit, that work is being merged into the master
branch. | 
|  | Until this commit, we had some of the Python packages installed
but they did not work properly because of the `PYTHONPATH' variables.
That is, the `python' used by the pipeline was the `python' of the
system, not the pipeline's own `python'.
With this commit this issue has been fixed by setting the correct
`PYTHONPATH'. In this commit we also modify the installation of
`bzip2' because `CMake' was complaining about some libraries built
statically. | 
|  | In the libpng installation there was `ilibdir' instead of `ilidir'. | 
|  | Until now, the pipeline was not installing its own `gcc' but using the
system one by making a symbolic link.
With this commit, GNU GCC has been added into the pipeline. Right now
the installation does not work on Mac OS systems because of some conflicts
with `clang', but in principle it should work on GNU/Linux distributions. | 
|  | Until now the installation of Python and its packages (numpy, astropy,
astroquery, etc.) was done in the same `makefile'.
With this commit the installation of Python and its packages has been
split and now it is independent of the other programs. The installation
of all Python packages needs to be written explicitly because pip is
not used anymore. | 
|  | Until now, once the Git hooks had been installed (after the
installation of Metastore), if Metastore didn't exist (for example after
manually deleting the build directory for a re-build with the same
configuration as before) we couldn't run `git commit', and `git checkout'
would print an ugly warning.
With this commit, the two Git hooks check for the existence of Metastore
and if it doesn't exist, they won't do anything. | 
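A guard of this kind can be sketched as below. This is a hedged illustration, not the project's actual hook: the function name `run_if_available' is hypothetical, and the real hooks may test for Metastore differently.

```shell
# Hypothetical guard for a Git hook: run the program only when it can
# be found, otherwise succeed silently so Git is never blocked.
run_if_available() {
    # $1: name or path of the program; remaining arguments: its options.
    prog="$1"; shift
    if command -v "$prog" >/dev/null 2>&1; then
        "$prog" "$@"
    else
        return 0    # Program is missing: do nothing.
    fi
}
```

A hook would then call the function with the path to Metastore and the arguments it normally passes, and a missing executable would simply make the hook a no-op.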
|  | In an attempt to test the GCC build rule (without Binutils, because it's too
architecture dependent), all the necessary dependencies were moved to GCC
(from `ld'). `fortran' was also added to the languages supported by
GCC. This rule built GCC 8.2.0 nicely on my GNU/Linux system. But `gcc' is
still not a final target to build, so the rule is being ignored for now. | 
|  | As matplotlib is a general package for plotting and it is widely
used in science, we have added it to the pipeline.
When installing `python-dateutil' (a dependency of matplotlib), we
found a conflict in the download of the tarball. This is because
the name has a dash (-) in the middle. In addition, the name starts
with 'python', so it matches the name of Python itself. Now it is
possible to install a package with any name, just by adding an `elif'
before the URL address. | 
|  | All dependencies for building the astroquery package have been added.
Until now the Python dependencies were built in the same Makefile
as the high level libraries and programs. But, because astroquery
has many dependencies we split the Python and Python packages
installation into a new Makefile.
The installation of the different packages is done using Python and
not pip, because we found some problems when doing it with pip.
Apparently there is some interference between the packages
installed by the pip of the system and the pip installed as part
of Python in the pipeline. | 
|  | As in all programs, the build process of ncurses depends on the running
shell (Bash) and AWK. At the start of the building of ncurses, we remove
its library. But Bash and AWK depend on ncurses to run (this creates a
circular dependency). Therefore it's necessary to remove the Bash and AWK
executables when re-building ncurses.
This bug was found by Raul Infante Sainz. | 
|  | Raul Infante-Sainz added the building of Python (along with the Numpy and
Astropy packages) into the pipeline. That work is now being merged into the
main pipeline branch.
There was only this small problem that needed to be fixed: the Python
tarball's name after unpacking is actually `Python-X.X.X' (with a capital
P), not `python-X.X.X'. This has been corrected with this merge. | 
|  | The zip program wasn't placed correctly (in alphabetical order) and its URL
command had the wrong indentation! Both have no effect at all on the
processing and are only cosmetic (to help in readability). | 
|  | Astropy was added and one very important thing is that we have to
use the pypi tarball (https://pypi.org/) (which is bootstrapped)
and not the github tarball. | 
|  | Python needs some packages to be really useful. Numpy is the most
important package for using Python and a lot of other packages
depend on it.
In this commit we add numpy to the pipeline. The tarball of numpy
is currently taken from fossies. | 
|  | Many projects use Python so it is necessary to include it in the
pipeline. | 
|  | In the example running code of the wrapper script, I had just written
`./download-multi-try', but this script is meant to be run from the top of
the project directory. This could cause confusion.
So the example script now starts with `/path/to/download-multi-try'. | 
|  | We don't have a `.sh' suffix in the other scripts of `reproduce/src/bash',
so it was also removed from this script. | 
|  | Until now, downloading was treated similarly to any other operation in the
Makefile: if it crashes, the pipeline would crash. But network errors
aren't like processing errors: attempting to download a second time will
probably not crash (network relays are very complex and not reproducible
and packets get lost all the time)!
This is usually not felt in downloading one or two files, but when
downloading many thousands of files, it will happen every once in a while
and it's a real waste of time until you check to just press enter again!
With this commit we have the `reproduce/src/bash/download-multi-try.sh'
script in the pipeline which will repeat the download several times (with
increasing time intervals) before crashing and thus fix the problem. | 
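The retry strategy described above (several attempts, with a growing pause between them) can be sketched as follows. This is not the contents of the `download-multi-try' script: the function name, its arguments and the linear growth of the delay are assumptions for illustration.

```shell
# Hypothetical retry helper: run a command up to a maximum number of
# times, sleeping longer after each failure, and fail only at the end.
retry_with_backoff() {
    # $1: maximum attempts; $2: base delay in seconds;
    # remaining arguments: the command to run.
    max="$1"; base="$2"; shift 2
    attempt=1
    while [ "$attempt" -le "$max" ]; do
        if "$@"; then return 0; fi          # Success: stop retrying.
        if [ "$attempt" -lt "$max" ]; then
            sleep $((attempt * base))       # Wait longer each time.
        fi
        attempt=$((attempt + 1))
    done
    return 1                                # All attempts failed.
}
```

For example, `retry_with_backoff 5 10 curl -fL -o file.tar.gz https://example.org/file.tar.gz' (a placeholder URL) would try the download up to five times, waiting 10, 20, 30 and 40 seconds between the attempts.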
|  | In order to collaborate effectively in the project, even project members
that don't necessarily want (or have the capacity) to do the whole analysis
must be able to contribute to the project. Until now, the users of the
distributed tarball could only modify the text and not the figures (built
with PGFPlots) of the paper.
With this commit, the management of TeX source files in the pipeline was
slightly modified to allow this as cleanly as I could think of now! In
short, the hand-written TeX files are now kept in `tex/src' and for the
pipeline's generated TeX files (in particular the old `tex/pipeline.tex'),
we now have a `tex/pipeline' symbolic-link/directory that points to the
`tex' directory under the build directory.
When packaging the project, `tex/pipeline' will be a full directory with a
copy of all the necessary files. Therefore as far as LaTeX is concerned,
having a build-directory is no longer relevant. Many other small changes
were made to do this job cleanly which will just make this commit message
too long!
Also, the old `tarball' and `zip' targets are now `dist' and `dist-zip' (as
in the standard GNU Build system). | 
|  | With this commit, it is now possible to package the project into a tarball
or zip file, ready to be distributed to collaborators who only want to
modify the final paper (and not do the analysis technicalities), or for
uploading to sites like arXiv, or online LaTeX sharing pages. | 
|  | Until now, the group name to build the project actually went into the Git
source of the project! This doesn't allow exact reproducibility on
different machines (where the group name may be different).
With this commit, the `for-group' script has been modified to accept the
group name as its first argument and pass that onto `configure' and
Make. This is much better now, because not only the existence of a group
installation is checked, but also the name of the group. It also made
things simpler (in particular in `LOCAL.mk.in'). | 
|  | I recently found another fork of metastore that allows its build on macOS
systems (https://github.com/mpctx/metastore). So I forked it and added
several other corrections (mostly cosmetic!), so it is now much
better suited for this pipeline.
Raul Infante-Sainz has already tested the building of metastore on his
macOS. In a previous test, we also noticed that libbsd should not be built
on Mac systems, so it is now a conditional prerequisite to metastore. |