paper-concept.git - Paper (Towards Long-term and Archivable Reproducibility)

Age	Commit message	Author	Lines
2019-04-14	Corrected copyright information for .file-metadata	Mohammad Akhlaghi	-0/+0
	Since `.file-metadata' is a binary file, we can't include a copyright inside of it so we have to use `README.md' to mention its copyright and license notice. However, this was not done clearly and is now corrected.
2019-04-14	Added citation for FFTW	Mohammad Akhlaghi	-0/+0
	Until now we weren't including the citation for FFTW (one of the template's optional packages). With this commit, it is added.
2019-04-12	Dependency BibTeX entries included only when necessary	Mohammad Akhlaghi	-0/+0
	Until now, there was a single `tex/src/references.tex' file that housed the BibTeX entries for everything (software and non-software). Since we have started to include the BibTeX entry for more software, it will be hard to manage the large (sometimes unused) BibTeX entries of the software in the middle of the non-software related citations in the text of the paper. Therefore, with this commit, a `tex/dependencies' directory has been made which has a separate BibTeX entry file for each software that needs one. After the software is built, this file is copied to the new `.local/version-info/cite' directory. At the end, the configure script will concatenate all the files in this directory into one file which will later be used with `tex/src/references.tex' by BibLaTeX. This greatly simplifies the management of citations, allowing us to focus on the software-building and paper-writing citations separately/cleanly (and thus be more efficient in both).
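The concatenation step described in this commit can be sketched roughly as follows. This is only an illustration under stated assumptions: the temporary directory, the two sample entry files and the output name `dependencies-bib.tex' are hypothetical, and the real configure script may do this differently.

```shell
# Sketch only: all paths and file names below are illustrative assumptions.
tmp=$(mktemp -d)
citedir="$tmp/version-info/cite"
mkdir -p "$citedir"

# Pretend that two software packages were built and each left its
# BibTeX entry in the citation directory.
printf '@article{fftw,\n  title = {FFTW},\n}\n'   > "$citedir/fftw.tex"
printf '@article{numpy,\n  title = {Numpy},\n}\n' > "$citedir/numpy.tex"

# Concatenate every entry file (in sorted glob order) into one file.
outbib="$tmp/dependencies-bib.tex"
: > "$outbib"
for f in "$citedir"/*.tex; do
    [ -f "$f" ] || continue      # The directory may be empty.
    cat "$f" >> "$outbib"
    echo >> "$outbib"            # Blank line between entries.
done

grep -c '^@' "$outbib"           # Number of BibTeX entries collected.
```

Only software that was actually built leaves a file in the directory, so unused entries never reach the final bibliography.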
2019-04-12	File is built as a dependency of GCC	Raul Infante-Sainz	-0/+0
	Until now, we did not have `file'. In another project, a problem with the `Astrometry-net' software made it necessary to have `file' in the pipeline. With this commit, we add `file' to the project. Since it is a low-level program, it is set in `dependencies-basic.mk' as a prerequisite of GCC.
2019-04-12	Fixed some Scipy-related packages citations	Raul Infante-Sainz	-0/+0
	Until now, the Scipy citation was only one paper and not the correct one (it was the online manual). With this commit, Scipy is properly cited using the two papers. Some modifications have also been made in `tex/src/references.tex' (removing the last page number).
2019-04-12	Acknowledged Scipy-related packages: Cython, Matplotlib, Numpy and Scipy	Raul Infante-Sainz	-0/+0
	Until now, the name and version of all Python packages were indicated in the final paper, but not their main paper (if it exists). With this commit, some Python packages (Cython, Matplotlib, Numpy and Scipy) are now properly acknowledged by citing their source paper. `mpi4py' is also cited although this package is not yet included in the pipeline.
2019-04-11	Unzip set as prerequisite of Numpy and Setuptools	Raul Infante-Sainz	-0/+0
	Since we mixed the installation of Python packages with all other software, some Python packages may start to be installed before `unzip' has been installed. As a consequence, they cannot be decompressed and the installation will fail. In particular, the tarballs of Numpy and Setuptools are .zip files. With this commit, we fix this issue by setting `unzip' as a prerequisite of Numpy and Setuptools.
2019-04-10	Using bin executable in patchelf for awk and bash	Raul Infante-Sainz	-0/+0
	Until this commit, we were using the target (version number of the program) in the `patchelf' call for `awk' and `bash'. This caused incorrect library linking because the target is not the executable but just a plain-text file containing the version number of the program. With this commit we fix this issue by giving `patchelf' the bin executable of `awk' and `bash', and not the target (version number).
2019-04-08	Using Clang on Mac OS systems for pkg-config	Mohammad Akhlaghi	-0/+0
	Yahya Sefidbakht reported an error when building Pkg-config on his Mac OS system (using GCC, not Clang). It is apparently because his version of GCC doesn't support some special feature on Mac that is necessary to build Glib as part of Pkg-config. With this commit, on Mac systems, for pkg-config we are explicitly asking to build with Clang (through the `CC' flag).
2019-04-07	Configure script using our build programs in final steps	Mohammad Akhlaghi	-0/+0
	In order to get a consistent final result, in its later steps, the configure script uses our own build of the basic command-line tools (like `cat', `awk'). Also, a correction was made to the short option parsing errors when an unwanted argument is given, and the `-?' was changed to `-'?'' to avoid unnecessary shell interpretation (for example giving unreasonable results).
2019-04-07	GNU M4 now built as a dependency of GNU Libtool	Mohammad Akhlaghi	-0/+0
	On some systems, M4 isn't available, so the linking to the host system fails and, as a result, we can't build GNU Libtool. The main reason we weren't building M4 was a bug with the most recent GNU C library (http://lists.gnu.org/archive/html/bug-gnulib/2019-04/msg00004.html). But I found a patch used by Arch Linux which fixes the issue and allows M4 to be built. As a result, the pipeline is now building M4 also and the patched M4 tarball is now uploaded to my own webpage as backup. While doing the steps above, I also noticed that we weren't using a tab at the start of the link definitions of `dependencies-basic.mk'. Although it's not necessary, to be consistent, it's good for the lines to always start with a tab.
2019-04-07	Corrections in configure script and astroquery, astropy rules	Mohammad Akhlaghi	-0/+0
	The step where we check the possibility of using `sys/cdefs.h' was still using `$$' for shell variables (in Make), not `$' (for the shell). This was corrected. Also, since Astropy needs two citations, the `,' in the citation command would conflict with Make's parsing. So we just used an `echo' command to re-write the version info. In Astroquery, the prerequisite list was just reordered by length to be more clear to the eye.
2019-04-07	Using gnuastro name for Gnuastro's target, not astnoisechisel	Mohammad Akhlaghi	-0/+0
	Until recently we were using an actual installed executable file for the programs. So for Gnuastro, the target was called `astnoisechisel'. But recently, this approach was changed and the target for each software is a simple text file with the official software name and version. So with this commit, we are simply using `gnuastro' for Gnuastro, not `astnoisechisel'.
2019-04-05	Imported work on adding Cython and Python's pkgconfig module	Mohammad Akhlaghi	-0/+0
	This work is now merged; I just added the new argument to the `pybuild' function.
2019-04-05	Software acknowledgement section is automatically generated	Mohammad Akhlaghi	-0/+0
	Until now, management of the software names and versions in the paper was done manually (a macro had to be defined in `initialize.mk', then used in `paper.tex', so they had to be manually set in two places). Managing this was not easy. To fix this, with this commit, each software building rule's target is a text file that contains its human-readable name and its version. In the end, the configure script sorts them by their name and writes them into a LaTeX input file that we can easily import as a file into the main paper.
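A minimal sketch of the idea in this commit, assuming one text file per software with its human-readable name and version on a single line. The directory, the file names, the version numbers and the output file `dependencies.tex' are all made up for this example; they are not taken from the configure script.

```shell
# Sketch only: names, versions and paths are illustrative assumptions.
tmp=$(mktemp -d)
verdir="$tmp/version-info"
mkdir -p "$verdir"

# Each build rule's target is a text file holding "Name Version".
echo "GNU Bash 5.0"  > "$verdir/bash"
echo "CFITSIO 3.45"  > "$verdir/cfitsio"
echo "GNU AWK 4.2.1" > "$verdir/gawk"

# Sort the entries by name (case-insensitively, in a fixed locale) and
# join them into one comma-separated sentence that LaTeX can \input.
outtex="$tmp/dependencies.tex"
cat "$verdir"/* \
    | LC_ALL=C sort -f \
    | awk '{ printf "%s%s", (NR>1 ? ", " : ""), $0 } END { print "." }' \
    > "$outtex"

cat "$outtex"
```

Because the list is generated from what was actually built, the names and versions no longer need to be kept in sync by hand in two places.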
2019-04-05	Python cython and pkgconfig packages added for h5py	Raul Infante-Sainz	-0/+0
	After trying to set up the pipeline from scratch with no internet connection (but all tarballs already downloaded), the `h5py' Python package complained about not having access to download `pkgconfig'. After solving this dependency, it also complained about not having `cython'. With this commit, we add `pkgconfig' (Python) and `cython' to the pipeline in order to be able to install `h5py' properly.
2019-04-04	Dependency version LaTeX macros written at the end of configure	Mohammad Akhlaghi	-0/+0
	Until now, these versions were written in each run. This was mainly inherited from the old days of the pipeline, where we didn't know the software on the host. But now that we have almost everything under control, we can just write these LaTeX macros at the end of the configure script and make `initialize.mk' simpler and also (very slightly!) speed-up/simplify the processing.
2019-04-04	Better option-reading in configure, using .build to access BDIR	Mohammad Akhlaghi	-0/+0
	Until now, the steps to manage the command-line options of the configure script were limited (couldn't accept an equal sign or space between the option name and value). With this commit, it can now also accept an optional equal sign between the option name and value, which avoids much confusion. Also, it is more logically consistent for the link to the build-directory to be placed in the top directory (as a hidden file like `.local' until now), and not as a visible directory like `reproduce/build' (which we used until now). Therefore, with this commit, the link to easily access the build-directory is `.build' in the top source directory. Finally, because `minmapsize' is too specific to Gnuastro and has now been given its default value at the start of the configure script, the description for `minmapsize' has been removed (to not confuse users who don't use Gnuastro). If anyone is familiar enough with Gnuastro to change it, they already know it from its book.
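One common way to accept both `--option=value' and `--option value' in a POSIX shell is sketched below. It is not the configure script's actual code: the function name `parse_opts', the long option names `--jobs' and `--build-dir', the short option `-b' and the variables `jobs' and `bdir' are assumptions made for this example.

```shell
# Sketch only: option and variable names are illustrative assumptions.
parse_opts() {
    jobs=1
    bdir=""
    while [ $# -gt 0 ]; do
        case "$1" in
            # Value attached with an equal sign: strip up to the first '='.
            --jobs=*)      jobs="${1#*=}"; shift ;;
            --build-dir=*) bdir="${1#*=}"; shift ;;
            # Value given as the next argument.
            -j|--jobs|-b|--build-dir)
                if [ $# -lt 2 ]; then
                    echo "option '$1' needs a value" >&2
                    return 1
                fi
                case "$1" in
                    -j|--jobs) jobs="$2" ;;
                    *)         bdir="$2" ;;
                esac
                shift 2 ;;
            *)
                echo "unknown option: $1" >&2
                return 1 ;;
        esac
    done
}

parse_opts --jobs=4 --build-dir /tmp/build && echo "jobs=$jobs bdir=$bdir"
```

Checking the argument count before `shift 2' avoids an error when an option is given as the last argument without its value.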
2019-04-02	.gitignore with copyright using better search for copyright notice	Mohammad Akhlaghi	-0/+0
	In the previous commit, a copyright notice was added after a systematic search of the version controlled files. However, we missed `.gitignore' (because we were discarding those with the `.git' pattern to avoid files in the `.git' directory). This has been fixed by using this command (in the top project directory) instead:
	for f in $(find ./ -type f); do \
	  if [[ $f != ./.git/* ]]; then \
	    n=$(grep -i copyright $f | wc -l); \
	    echo "$n $f"; \
	  fi; \
	done | awk '$1==0'
2019-04-02	Copyright notice added to remaining files	Mohammad Akhlaghi	-0/+0
	After doing a systematic search for files without a copyright notice, a few more were found that didn't have a notice. So a notice was added for them. I used this Bash command to find the files:
	for f in $(find ./ -type f); do \
	  if [[ $f != .git ]]; then \
	    n=$(grep -i copyright $f | wc -l); \
	    echo "$n $f"; \
	  fi; \
	done | awk '$1==0'
2019-03-29	Added copyright information in dependency-versions.mk	Mohammad Akhlaghi	-0/+0
	To help make it easier to re-use (like the rest of the "large" files).
2019-03-29	Added Copyright to all TeX and README files	Mohammad Akhlaghi	-0/+0
	In order to be more clear, a copyright statement was added to all the LaTeX and README files.
2019-03-28	Configure script now has options	Mohammad Akhlaghi	-0/+0
	With the options, it is now possible to run the configure script more easily after the initial run. The `--help' option provides a nice and complete introduction along with a listing of the input options and the `-j' option can be used to manually set the number of threads.
2019-03-28	flock is now built in configure, to allow serial downloads	Mohammad Akhlaghi	-0/+0
	Until now, we were using `flock' (file-lock) for downloading the input datasets in series. But we couldn't do this when downloading the software tarballs because `flock' wasn't yet available. Generally, unlike processing, downloading is much better done in series than in parallel. To enable serial downloads of the software also, with this commit we are installing `flock' in the configure script (not in a Makefile). As a result, besides `flock', we can also benefit from the other good features of the `reproduce/src/bash/download-multi-try' script (for example attempting download again after some time). Some GNU mirrors may have problems at the time of download, so with this commit, we are using the main GNU FTP server for GNU programs.
2019-03-19	Minor corrections: typo and adding file to .gitignore	Mohammad Akhlaghi	-0/+0
	The LaTeX macro for libgit2 was not properly used in `paper.tex'. On Mac systems, after browsing the directory, a `.DS_Store' file was created. So to keep things clean on those systems, it is added to the files to be ignored by Git.
2019-03-18	Resetting path in script to make symbolic links to system programs	Mohammad Akhlaghi	-0/+0
	Until recently, there was no problem with the `makelink' script of `dependencies-basic.mk' because it was called on separate recipe lines (and thus separate shells). But recently we added a call to it within a single shell (for GCC on Mac OS systems). So a previous call to it would affect the next call. To fix this, in this commit, we are re-setting PATH to its original value after each call finishes.
2019-03-18	No Bzip2 shared libraries on macOS systems	Mohammad Akhlaghi	-0/+0
	Bzip2 has a special/separate Makefile to build shared libraries which didn't work on macOS. So with this commit, we are allowing Bzip2 shared libraries only on non-macOS systems. Also, I noticed that macOS's `sed' doesn't have the `-i' option (to do the change in place within the same file). So we are using `-e' to write the changed Makefile in a temporary directory, then rename that.
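The portable alternative to in-place editing mentioned in this commit can be sketched as below: the edited copy goes to a temporary file that is then renamed over the original. The sample Makefile contents and the substitution are hypothetical and are not the actual change made to the Bzip2 Makefile.

```shell
# Sketch only: the file contents and the substitution are assumptions.
tmp=$(mktemp -d)
printf 'CC=gcc\nCFLAGS=-O2\n' > "$tmp/Makefile"

# Write the edited copy to a temporary file, then rename it over the
# original. The rename only happens if sed succeeded.
sed -e 's/^CC=gcc$/CC=clang/' "$tmp/Makefile" > "$tmp/Makefile.tmp" \
    && mv "$tmp/Makefile.tmp" "$tmp/Makefile"

cat "$tmp/Makefile"
```

This form behaves the same with GNU and BSD versions of `sed', since it does not rely on how either of them handles `-i'.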
2019-03-11	Not checking software versions in initialize.mk	Mohammad Akhlaghi	-0/+0
	Until now, we were actually running all the programs to check their versions during initialization. But now that the number of programs has increased, this can be slow. With this commit, we simply report the version as a constant string. Maybe later, we can follow the strategy of the TeX Live packages and write them all at configure time.
2019-03-11	for-group gives write permission to all built software in the end	Mohammad Akhlaghi	-0/+0
	Since the `install' script also sets permissions manually, the permissions that we define in `for-group' don't usually affect the installed files. Therefore the installed files of one user can't be modified/deleted by another. With this commit, after for-group finishes configuration, it also adds the write flag for all group members in the whole installation directory.
2019-03-08	For-group script can allow two arguments to Make call	Mohammad Akhlaghi	-0/+0
	Until now the `./for-group' script would only add one argument to the Make call, but in some situations, you need a second argument as well. With this commit, any possible fourth argument to `./for-group' is passed to Make.
2019-03-08	Using system's GCC on Mac	Mohammad Akhlaghi	-0/+0
	We still have a few problems with building GCC on a MacOS system. To allow using the pipeline on this operating system, until we find the solution, GCC is only built on non-Mac systems. On Mac, we'll just make a symbolic link to the host's executables.
2019-03-07	Several new Python packages added for full build	Mohammad Akhlaghi	-0/+0
	To ensure that we have all the necessary Python dependencies, I did an offline build and noticed that several packages were also necessary for the `./configure' step to finish (`libffi', `asn1crypto', `cffi', `jeepney', `pycparser' and `secretstorage'). With this commit they are added.
2019-03-06	Imported work on many basic Python modules	Mohammad Akhlaghi	-0/+0
	With the help of Raul, we were able to build many higher-level Python packages to enable the installation of packages like Matplotlib and Astroquery. With this commit, that work is being merged into the master branch.
2019-03-06	Astroquery, astropy, matplotlib and numpy are now in the pipeline	Raul Infante-Sainz	-0/+0
	Until this commit, we had some of the Python packages installed but they did not work properly because of the `PYTHONPATH' variable. That is, the `python' used in the pipeline was the system's `python' instead of the pipeline's own. With this commit this issue has been fixed by setting the correct `PYTHONPATH'. In this commit we also modify the installation of `bzip2' because `CMake' was complaining about some libraries built statically.
2019-03-01	Elaboration in README-hacking.mk's future improvements section	Mohammad Akhlaghi	-0/+0
	This section was a little outdated and since then, a more clear/exact image of using the Nix experience for the reproducible paper template has been added.
2019-02-23	GCC build rule doesn't depend on Binutils	Mohammad Akhlaghi	-0/+0
	In an attempt to test the GCC build rule (without Binutils, because it's too architecture dependent), all the necessary dependencies were moved to GCC (from `ld'). `fortran' was also added to the languages supported by GCC. This rule built GCC 8.2.0 nicely on my GNU/Linux system. But `gcc' is still not a final target to build, so the rule is being ignored for now.
2019-02-21	Matplotlib is now in the pipeline	Raul Infante-Sainz	-0/+0
	As matplotlib is a general package for plotting and is widely used in science, we have added it to the pipeline. When installing `python-dateutil' (a dependency of matplotlib), we found a conflict in the download of the tarball. This is because the name has a dash (-) in the middle. In addition, the name starts with 'python', so it is the same as that of Python itself. Now it is possible to install a package with any name, just by adding an `elif' before the URL.
2019-02-20	Installed astroquery in the pipeline	Raul Infante-Sainz	-0/+0
	All the dependencies needed to build the astroquery package have been added. Until now the Python dependencies were built in the same Makefile as the high-level libraries and programs. But because astroquery has many dependencies, we split the installation of Python and the Python packages into a new Makefile. The different packages are installed using Python and not pip, because we found some problems when doing it with pip. Apparently there is some interference between the packages installed by the system's pip and the pip installed as part of Python in the pipeline.
2019-02-20	Pipeline's Bash and AWK deleted when re-building ncurses	Mohammad Akhlaghi	-0/+0
	As in all programs, the build process of ncurses depends on the running shell (Bash) and AWK. At the start of the building of ncurses, we remove its library. But Bash and AWK depend on ncurses to run (this creates a circular dependency). Therefore it's necessary to remove the Bash and AWK executables when re-building ncurses. This bug was found by Raul Infante Sainz.
2019-02-13	Imported recent work on building Python within the pipeline	Mohammad Akhlaghi	-0/+0
	Raul Infante-Sainz added the building of Python (along with the Numpy and Astropy packages) into the pipeline. That work is now being merged into the main pipeline branch. There was only this small problem that needed to be fixed: the Python tarball's name after unpacking is actually `Python-X.X.X' (with a capital P), not `python-X.X.X'. This has been corrected with this merge.
2019-02-13	Minor cosmetic corrections in software tarball downloading rule	Mohammad Akhlaghi	-0/+0
	The zip program wasn't placed correctly (in alphabetical order) and its URL command had the wrong indentation! Both have no effect at all on the processing and are only cosmetic (to help in readability).
2019-02-13	Astropy installed in the pipeline	Raul Infante-Sainz	-0/+0
	Astropy was added and one very important thing is that we have to use the pypi tarball (https://pypi.org/) (which is bootstrapped) and not the github tarball.
2019-02-07	Numpy is now in the pipeline	Raul Infante-Sainz	-0/+0
	Python needs some packages to be really useful. Numpy is the most important package for using Python and a lot of other packages depend on it. In this commit we add numpy to the pipeline. The numpy tarball is currently taken from fossies.
2019-02-07	Python is now in the pipeline	Raul Infante-Sainz	-0/+0
	Many projects use Python so it is necessary to include it in the pipeline.
2019-02-06	Minor correction in description of downloading wrapper	Mohammad Akhlaghi	-0/+0
	In the example running code of the wrapper script, I had just written `./download-multi-try', but this script is meant to be run from the top of the project directory. This could cause confusion. So the example script now starts with `/path/to/download-multi-try'.
2019-02-06	Removed .sh suffix in download wrapper script	Mohammad Akhlaghi	-0/+0
	We don't have a `.sh' suffix in the other scripts of `reproduce/src/bash', so it was also removed from this script.
2019-02-06	Wrapper script for multiple attempts at downloading inputs	Mohammad Akhlaghi	-0/+0
	Until now, downloading was treated similar to any other operation in the Makefile: if it crashes, the pipeline would crash. But network errors aren't like processing errors: attempting to download a second time will probably not crash (network relays are very complex and not reproducible and packets get lost all the time)! This is usually not felt when downloading one or two files, but when downloading many thousands of files, it will happen every once in a while and it's a real waste of time until you check and just press enter again! With this commit we have the `reproduce/src/bash/download-multi-try.sh' script in the pipeline which will repeat the download several times (with increasing time intervals) before crashing and thus fix the problem.
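The retry strategy described in this commit (several attempts with growing pauses before giving up) can be sketched as the small function below. This is not the contents of the `download-multi-try' script: the function name `retry', its arguments and the `flaky' test command are assumptions made for illustration, and a real use would pass the downloader command instead.

```shell
# Sketch only. Usage: retry MAX_TRIES BASE_DELAY COMMAND [ARGS...]
# Note: the helper variables are global (plain POSIX sh has no `local').
retry() {
    max_tries=$1; base_delay=$2; shift 2
    try=1
    while :; do
        if "$@"; then return 0; fi
        if [ "$try" -ge "$max_tries" ]; then
            echo "retry: giving up after $try attempts: $*" >&2
            return 1
        fi
        sleep $(( try * base_delay ))   # Wait longer after each failure.
        try=$(( try + 1 ))
    done
}

# Demonstration with a command that fails twice before succeeding
# (it stands in for an unreliable download).
count_file=$(mktemp)
echo 0 > "$count_file"
flaky() {
    n=$(( $(cat "$count_file") + 1 ))
    echo "$n" > "$count_file"
    [ "$n" -ge 3 ]                      # Succeeds from the third call on.
}

retry 5 0 flaky && echo "succeeded after $(cat "$count_file") attempts"
```

With a non-zero base delay, for example `retry 5 10 COMMAND', the pauses would be 10, 20, 30 and 40 seconds, which gives a temporarily unavailable server time to recover.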
2019-02-06	Better management for .tex directories to build from tarball	Mohammad Akhlaghi	-0/+0
	In order to collaborate effectively in the project, even project members that don't necessarily want (or have the capacity) to do the whole analysis must be able to contribute to the project. Until now, the users of the distributed tarball could only modify the text and not the figures (built with PGFPlots) of the paper. With this commit, the management of TeX source files in the pipeline was slightly modified to allow this as cleanly as I could think of now! In short, the hand-written TeX files are now kept in `tex/src' and for the pipeline's generated TeX files (in particular the old `tex/pipeline.tex'), we now have a `tex/pipeline' symbolic-link/directory that points to the `tex' directory under the build directory. When packaging the project, `tex/pipeline' will be a full directory with a copy of all the necessary files. Therefore as far as LaTeX is concerned, having a build-directory is no longer relevant. Many other small changes were made to do this job cleanly which will just make this commit message too long! Also, the old `tarball' and `zip' targets are now `dist' and `dist-zip' (as in the standard GNU Build system).
2019-02-05	Ability to package project into tarball or zip file	Mohammad Akhlaghi	-0/+0
	With this commit, it is now possible to package the project into a tarball or zip file, ready to be distributed to collaborators who only want to modify the final paper (and not do the analysis technicalities), or for uploading to sites like arXiv, or online LaTeX sharing pages.
2019-02-05	for-group: better check of group name and fixed make argument	Mohammad Akhlaghi	-0/+0
	A few issues came up while testing the `for-group' script in one of the projects based on this pipeline that are being fixed with this commit: 1) We are ultimately using the `sg' command to use the specified group, not `chgrp'. So in cases where `chgrp' has problems, this would cause a wrong error. So for the test of the given group's existence, we are now directly calling `sg'. 2) In the call to `make' we were mistakenly giving make the `$2' (which is `make' on the command-line) argument. Since `./for-group' now takes the group name as its first argument, this should have been `$3'. 3) To help in readability, and also allow for group names with a space, `reproducible_paper_group_name' is now defined and exported before the final call to `sg'.