Age | Commit message (Collapse) | Author | Lines |
|
Until now, `metastore' did not depend on the necessaries programs that
we use to install it (`awk', `coreutils' and `sed'). They are not
official dependencies of `metastore', but we need them to install it.
With this commit, we put these programs as prerequisites of `metastore'
in order to be able to install it without any problem.
|
|
In two places, I had mistakenly put a <'> instead of a <`>, causing bad
highlighting in the markdown rendering. They have been corrected.
Also, one long line in was broken up into several.
|
|
Until now, the software building and analysis steps of the pipeline were
intertwined. However, these steps (of how to build a software, and how to
use it) are logically completely independent.
Therefore with this commit, the pipeline now has a new architecture
(particularly in the `reproduce' directory) to emphasize this distinction:
The `reproduce' directory now has the two `software' and `analysis'
subdirectories and the respective parts of the previous architecture have
been broken up between these two based on their function. There is also no
more `src' directory. The `config' directory for software and analysis is
now mixed with the language-specific directories.
Also, some of the software versions were also updated after some checks
with their webpages.
This new architecture will allow much more focused work on each part of the
pipeline (to install the software and to run them for an analysis).
|
|
All occurances of "pipeline" have been chanaged to "project" or "template"
withint the text (comments, READMEs, and comments) of the template. The
main template branch is now also named `template'.
This was all because `pipeline' is too generic and couldn't be
distinguished from the base, and customized project.
|
|
Since `.file-metadata' is a binary file, we can't include a copyright
inside of it so we have to use `README.md' to mention its copyright and
license notice. However, this was not done clearly and is now corrected.
|
|
Until now we weren't including the citation for FFTW (one of the template's
optional packages). With this commit, it is added.
|
|
Until now, there was a single `tex/src/references.tex' file that housed the
BibTex entries for everything (software and non-software).
Since we have started to include the BibTeX entry for more software, it
will be hard to manage the large (sometime unused) BibTeX entries of the
software in the middle of the non-software related citations in the text of
the paper.
Therefore, with this commit, a `tex/dependencies' directory has been made
which has a separate BibTeX entry file for each software that needs
one. After the software is built, this file is copied to the new
`.local/version-info/cite' directory. At the end, the configure script will
concatenate all the files in this directory into one file which will later
be used with `tex/src/references.tex' by BibLaTeX.
This greatly simplifies managing of citations. Allowing us to focus on the
software-building and paper-writing citations separately/cleanly (and thus
be more efficient in both).
|
|
Until now, we did not have `file'. It was in other project, where a
problem with `Astrometry-net' software, ends up with the necessity of
having `file' into the pipeline.
With this commit, we add `file' to the project. Since it is a low level
program, it is set in `dependencies-basic.mk' as a prerequisite of GCC.
|
|
Until now, the Scipy citation was only one paper and not the correct one
(it was the online manual).
With this commit, Scipy is properly cited using the two papers. Also
some modifications in the `tex/src/references.tex' have been done
(remove last page number).
|
|
Until now, name and version of all Python packages were indicated in the
final paper, but not the main paper of them (if it exists).
With this commit, some Python packages (Cython, Matplotlib, Numpy and
Scipy) are now properly acknoledged by citating the source paper.
`mpi4py' is also cited although this package is not yet included into
the pipeline.
|
|
Since we mixed the installation of Python packages with all other
software, it may occur that some Python packages start to be installed
before having installed `unzip'. As a consecuence, they could not be
decompressed and the installation will fail. In particular, tarballs of
Numpy and Setuptools are .zip files.
With this commit, we fix this issue by setting `unzip' as a prerequisit
of Numpy and Setuptools.
|
|
Until this commit, we were using the target (version number of the
program) in the `patchelf' for `awk' and `bash'. This makes an incorrect
linking in libraries because the target is not the bin program but just
a plain text containing the version number of the program.
With this commit we fix this issue by setting in the patchelf of `awk'
and `bash' the bin executable, and not the target (version number).
|
|
Yahya Sefidbakht reported the following error when building Pkg-config on
his Mac OS system (using GCC, not Clang). It is apparently because his
version of GCC doesn't support some speical feature on Mac that is
necessary to build Glib as part of Pkg-config.
With this commit, on Mac systems, for pkg-config we are explicity asking to
build with Clang (through the `CC' flag).
|
|
In order to get a consistent final result, in its later steps, the
configure script uses our own build of the basic command-line tools (like
`cat', `awk').
Also, a correction was made to the short option parsing errors when an
unwanted argument is given, and the `-?*' was changed to `-'?'*' to avoid
un-necessary shell interpretation (for example giving unreasonable
results).
|
|
On some systems, M4 isn't available, so the linking to the host system
fails, as a result, we can't build GNU Libtool.
The main reason we weren't building M4 was a bug with the most recent GNU C
library
(http://lists.gnu.org/archive/html/bug-gnulib/2019-04/msg00004.html). But I
found a patch used by Arch Linux which fixes the issue and allows M4 to be
built. As a result, the pipeline is now building M4 also and the patched M4
tarball is now uploaded to my own webpage as backup.
While doing the steps above, I also noticed that we weren't using a tab at
the start of the link definitions of `dependencies-basic.mk'. Although its
not necessary, to be consistent, its good for the lines to always start
with a tab.
|
|
The step where we check the possibility of using `sys/cdefs.h' was still
using `$$' for shell variables (in Make), not `$' (for the shell). This was
corrected.
Also, since Astropy needs two citations, the `,' in the citation command
would conflict with Make's parsing. So we just used an `echo' command to
re-write the version info.
In Astroquery, the prerequisite list was just reordered by length to be
more clear to the eye.
|
|
Until recently we were using an actual installed executable file for the
programs. So for Gnuastro, the target was called `astnoisechisel'. But
recently, this approach was changed and the target for each software is a
simple text file with the official software name and version.
So with this commit, we are simply using `gnuastro' for Gnuastro, not
`astnoisechisel'.
|
|
This work is now merged, I just added the new argument to the `pybuild'
function.
|
|
Until now, management of the software names and versions in the paper was
done manually (a macro had to be defined in `initialize.mk', then used in
`paper.tex', so they had to be manually set in two places). Managing this
was not easy.
To fix this, with this commit, each software building rule's target is a
text file that contains its human-readable name and its version. In the
end, the configure script sorts them by their name and writes them into a
LaTeX input file that we can easily import as a file into the main paper.
|
|
After trying to set the pipeline from scratch with no internet conection
(but all tarballs already downladed), `h5py' Python package complained
about not having access to download `pkgconfig'. After solving this
dependency, it also complained about not having `cython'.
With this commit, we add `pkgconfig' (Python) and `cython' to the
pipeline in order to be able to install `h5py' properly.
|
|
Until now, these versions were written in each run. This was mainly
inherited from the old days of the pipeline, where we didn't know the
software on the host. But now that we have almost everything under control,
we can just write these LaTeX macros at the end of the configure script and
make `initialize.mk' simpler and also (very slightly!) speed-up/simplify
the processing.
|
|
Until now, the steps to manage the command-line options of the configure
script were limited (couldn't accept an equal sign or space between the
option name and value). With this commit, it can now also accept optional
equal signs between the option name and value. Thus not causing many
confusions.
Also, it is more logically consistent for the link to the build-directory
to be placed in the top directory (as a hidden file like `.local' until
now), and not as a visible directory like `reproduce/build' (which we used
until now). Therefore, with this commit, the link to easily access the
build-directory is `.build' in the top source directory.
Finally, because `minmapsize' is too specific to Gnuastro and has now been
given its default value at the start of the configure script, the
description for `minmapsize' has been removed (to not confuse users who
don't use Gnuastro). If anyone is familiar enough with Gnuastro to change
it, they already know it from its book.
|
|
In the previous commit, a copyright notice was added after a systematic
search of the version controlled files. However, we missed `.gitignore'
(because we were discarding those with the `*.git*' pattern to avoid files
in the `.git' directory). This has been fixed by using this command (in the
top project directory) instead:
for f in $(find ./ -type f); do \
if [[ $f != ./.git/* ]]; then \
n=$(grep -i copyright $f | wc -l); \
echo "$n $f"; \
fi; \
done | awk '$1==0'
|
|
After doing a systematic search for files without a copyright notice, a few
more were found that didn't have a notice. So a notice was added for them.
I used this Bash command to find the files:
for f in $(find ./ -type f); do \
if [[ $f != *.git* ]]; then \
n=$(grep -i copyright $f | wc -l); \
echo "$n $f"; \
fi; \
done | awk '$1==0'
|
|
To help make it easier to re-use (like the rest of the "large" files).
|
|
In order to be more clear, a copyright statement was added to all the LaTeX
and README files.
|
|
With the options, it is now possible to run the configure script more
easily after the initial run. The `--help' option provides a nice and
complete introduction along with a listing of the input options and the
`-j' option can be use to manually set the number of threads.
|
|
Until now, we were using `flock' (file-lock) for downloading the input
datasets in series. But we couldn't do this when downloading the software
tarballs because `flock' wasn't yet available. Generally, unlike
processing, downloading is much better done in series than in parallel.
To enable serial downloads of the software also, with this commit we are
installing `flock' in the configure script (not in a Makefile). As a
result, besides `flock', we can also benefit from the other good features
of the `reproduce/src/bash/download-multi-try' script *(for example
attempting download again after some time).
Some GNU mirrors may have problems at the time of download, so with this
commit, we are using the main GNU FTP server for GNU programs.
|
|
The LaTeX macro for libgit2 was not properly used in `paper.tex'.
On Mac systems, after browsing the directory, a `.DS_Store' file was
created. So to keep things clean on those systems, it is added to the files
to be ignored by Git.
|
|
Until recently, there was no problem with the `makelink' script of
`dependencies-basic.mk' because it was called on separate recipe lines (and
thus separate shells). But recently we added a call to it within a single
shell (for GCC on Mac OS systems). So a previous call to it would effect
the next call. To fix this, in this commit, we are re-setting PATH to its
original value after each call finishes.
|
|
Bzip2 has a special/separate Makefile to build shared libraries which
didn't work on a macOS. So with this commit, we are allowing Bzip2 shared
libraries only on macOS systems.
Also, I noticed that macOS's `sed' doesn't have the `-i' option (to do the
change in place within the same file). So we are using `-e' to write the
changed Makefile in a temporary directory, then rename that.
|
|
Until now, we were actually running all the programs to check their
versions during initialization. But now that the number of programs has
increased, this can be slow. With this commit, we simply report the version
as a constant string. Maybe later, we can follow the strategy of the TeX
Live packages and write them all at configure time.
|
|
Since the `install' script also sets permissions manually, the permissions
that we define in `for-group' don't usually affect the installed
files. Therefore the installed files of one user can't be modified/deleted
by another. With this commit, after for-group finishes configuration, it
also adds the write flag for all group members in the whole installation
directory.
|
|
Until now the `./for-group' script would only add one argument to the Make
call, but in some situations, you need a second argument is well. With this
option, any possible fourth argument to `./for-group' is passed to Make.
|
|
We still have a few problems with building GCC on a MacOS system. To allow
using the pipeline on this operating system, until we find the solution,
GCC is only built on non-Mac systems. On Mac, we'll just make a symbolic
link to the host's executables.
|
|
To ensure that we have all the necessary Python dependencies, I done an
offline build and noticed that several packages were also necessary for the
`./configure' step to finish (`libffi', `asn1crypto', `cffi', `jeepney',
`pycparser' and `secretstorage'). With this commit they are added.
|
|
With the help of Raul, we were able to build many higher-level Python
packages to enable the installation of packages like Matplotlib and
Astroquery. With this commit, that work is being merged into the master
branch.
|
|
Until this commit, we had some of the python packages intalled
but they did not work properly because of the `PYTHONPATH' variables.
That is, the pipeline's `python' was the `python' of the system
instead of the pipeline's `python'.
With this commit this issue has been fixed by setting the correct
`PYTHONPATH'. In this commit we also modify the installation of
`bzip2' because `CMake' was complaining about some libraries built
statically.
|
|
This section was a little outdated and since then, a more clear/exact image
of using the Nix experience for the reproducible paper template has been
added.
|
|
In an attempt to test the GCC build rule (without Binutils, because its too
architecture dependent), all the necessary dependencies were moved to GCC
(from `ld'). Also `fortran' was also added to the languages supported by
GCC. This rule built GCC 8.2.0 nicely on my GNU/Linux system. But `gcc' is
still not a final target to built, so the rule is being ignored for now.
|
|
As matplotlib is a general package for plotting and it is widely
used in science, we have added it to the pipeline.
When installing a dependency of matplotlib `python-dateutil', we
found a conflict in the download of the tarball. This is because
the name has a dash (-) in the middle. In addition, the name starts
with 'python', so it is the same as the python itself. Now it is
possible to install any package with any name, just adding an elif
in before the URL direction.
|
|
All dependencies for building astroquery package have been done.
Until nowthe Python dependencies were built in the same Makefile
as the high level libraries and programs. But, because astroquery
has many dependencies we split the Python and Python packages
installation in a new Makefile.
The installation of differents packages are done using Python and
not pip, because we found some problems when doing it with pip.
Apparently there are some interferences between the packages
installed by the pip of the system and the pip installed as part
of Python in the pipeline.
|
|
As in all programs, the build process of ncurses depends on the running
shell (Bash) and AWK. At the start of the building of ncurses, we remove
its library. But Bash and AWK depend on ncurses to run (this creates a
circular dependency). Therefore its necessary to remove the Bash and AWK
executables when re-building ncurses.
This bug was found by Raul Infante Sainz.
|
|
Raul Infante-Sainz added the building of Python (along with the Numpy and
Astropy packages) into the pipeline. That work is now being merged into the
main pipeline branch.
There was only this small problem that needed to be fixed: the Python
tarball's name after unpacking is actually `Python-X.X.X' (with a captial
P), not `python-X.X.X'. This has been corrected with this merge.
|
|
The zip program wasn't placed correctly (in alphabetical order) and its URL
command had the wrong indentation! Both have no effect at all on the
processing and are only cosmetic (to help in readability).
|
|
Astropy was added and one very important thing is that we have to
use the pypi tarball (https://pypi.org/) (which is bootstrapped)
and not the github tarball.
|
|
Python needs some packages to be really useful. Numpy is the most
important package for using Python and a lot of other packages
depend on it.
In this commit we add numpy to the pipeline. The tarball of numpy
right now is fossies.
|
|
Many projects use Python so it is necessary include it in the
pipeline.
|
|
In the example running code of the wrapper script, I had just written
`./download-multi-try', but this script is meant to be run from the top of
the project directory. This could cause confusion.
So the example script now starts with `/path/to/download-multi-try'.
|
|
We don't have a `.sh' suffix in the other scripts of `reproduce/src/bash',
so it was also removed from this script.
|