paper-concept.git - Paper (Towards Long-term and Archivable Reproducibility)

Age	Commit message	Author	Lines
2019-02-21	Matplotlib is now in the pipeline	Raul Infante-Sainz	-0/+0
	As matplotlib is a general plotting package that is widely used in science, we have added it to the pipeline. When installing `python-dateutil' (a dependency of matplotlib), we found a conflict in the download of the tarball: the name has a dash (-) in the middle and starts with `python', so it was treated the same as Python itself. It is now possible to install a package with any name by adding an `elif' condition before the URL address.
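	The name conflict described above can be illustrated with a minimal Python sketch. It is not the pipeline's Makefile rule: the helper `split_tarball', the regular expression and the version numbers in the usage note are all assumptions made for illustration. It only shows why cutting the name at the first dash confuses `python-dateutil' with Python, while cutting at the last dash followed by a digit does not.

```python
import re

# Hypothetical pattern (not taken from the pipeline): the package name is
# everything before the last dash that is followed by a digit; the rest,
# up to the archive suffix, is the version.
_TARBALL = re.compile(
    r"^(?P<name>.+)-(?P<version>\d[^-]*?)\.(?:tar\.[a-z0-9]+|tgz|zip)$"
)


def split_tarball(filename):
    """Return (package name, version) for a tarball file name."""
    match = _TARBALL.match(filename)
    if match is None:
        raise ValueError(f"unrecognized tarball name: {filename}")
    return match.group("name"), match.group("version")


def naive_name(filename):
    """Cut at the first dash: this is the kind of rule that conflicts."""
    return filename.split("-")[0]
```

	With illustrative file names, `naive_name("python-dateutil-2.8.0.tar.gz")' gives `python', while `split_tarball' on the same name gives the pair (`python-dateutil', `2.8.0').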
2019-02-20	Installed astroquery in the pipeline	Raul Infante-Sainz	-0/+0
	All the dependencies for building the astroquery package have been added. Until now, the Python dependencies were built in the same Makefile as the high-level libraries and programs. But because astroquery has many dependencies, we split the installation of Python and the Python packages into a new Makefile. The different packages are installed using Python and not pip, because we found some problems when doing it with pip. Apparently there is some interference between the packages installed by the system's pip and the pip installed as part of Python in the pipeline.
2019-02-20	Pipeline's Bash and AWK deleted when re-building ncurses	Mohammad Akhlaghi	-0/+0
	As in all programs, the build process of ncurses depends on the running shell (Bash) and AWK. At the start of the building of ncurses, we remove its library. But Bash and AWK depend on ncurses to run (this creates a circular dependency). Therefore it's necessary to remove the Bash and AWK executables when re-building ncurses. This bug was found by Raul Infante-Sainz.
2019-02-13	Imported recent work on building Python within the pipeline	Mohammad Akhlaghi	-0/+0
	Raul Infante-Sainz added the building of Python (along with the Numpy and Astropy packages) into the pipeline. That work is now being merged into the main pipeline branch. There was only one small problem that needed to be fixed: the Python tarball's name after unpacking is actually `Python-X.X.X' (with a capital P), not `python-X.X.X'. This has been corrected with this merge.
2019-02-13	Minor cosmetic corrections in software tarball downloading rule	Mohammad Akhlaghi	-0/+0
	The zip program wasn't placed correctly (in alphabetical order) and its URL command had the wrong indentation! Both have no effect at all on the processing and are only cosmetic (to help in readability).
2019-02-13	Astropy installed in the pipeline	Raul Infante-Sainz	-0/+0
	Astropy was added. Importantly, we have to use the PyPI tarball (https://pypi.org/), which is already bootstrapped, and not the GitHub tarball.
2019-02-07	Numpy is now in the pipeline	Raul Infante-Sainz	-0/+0
	Python needs some packages to be really useful. Numpy is the most important package for using Python and a lot of other packages depend on it. In this commit we add numpy to the pipeline. The numpy tarball is currently taken from fossies.
2019-02-07	Python is now in the pipeline	Raul Infante-Sainz	-0/+0
	Many projects use Python, so it is necessary to include it in the pipeline.
2019-02-06	Minor correction in description of downloading wrapper	Mohammad Akhlaghi	-0/+0
	In the example running code of the wrapper script, I had just written `./download-multi-try', but this script is meant to be run from the top of the project directory. This could cause confusion. So the example script now starts with `/path/to/download-multi-try'.
2019-02-06	Removed .sh suffix in download wrapper script	Mohammad Akhlaghi	-0/+0
	We don't have a `.sh' suffix in the other scripts of `reproduce/src/bash', so it was also removed from this script.
2019-02-06	Wrapper script for multiple attempts at downloading inputs	Mohammad Akhlaghi	-0/+0
	Until now, downloading was treated like any other operation in the Makefile: if it crashed, the pipeline would crash. But network errors aren't like processing errors: attempting to download a second time will probably not crash (network relays are very complex and not reproducible, and packets get lost all the time)! This is usually not felt when downloading one or two files, but when downloading many thousands of files, it will happen every once in a while, and it's a real waste of time until you check and just press enter again! With this commit we have the `reproduce/src/bash/download-multi-try.sh' script in the pipeline, which will repeat the download several times (with increasing time intervals) before crashing and thus fix the problem.
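	The retry idea can be sketched in Python as follows. This is only an illustration, not the Bash script itself: the function name, the number of attempts and the linearly growing wait are assumptions, and the actual download is passed in as a callable so that the waiting logic stays separate from the network call.

```python
import time


def download_multi_try(fetch, max_tries=5, base_wait=1.0, sleep=time.sleep):
    """Call fetch() until it succeeds, waiting longer after each failure.

    fetch     -- callable doing one download attempt; it is assumed to
                 raise OSError (the base class of urllib's URLError) on
                 a network failure.
    max_tries -- hypothetical default, not the script's real value.
    base_wait -- seconds; attempt n is followed by n * base_wait seconds.
    sleep     -- injectable so that the waits can be checked in a test.
    """
    for attempt in range(1, max_tries + 1):
        try:
            return fetch()
        except OSError:
            if attempt == max_tries:
                raise  # give up: let the caller (the pipeline) crash
            sleep(base_wait * attempt)
```

	A caller would wrap its own download function, for example `download_multi_try(lambda: urllib.request.urlretrieve(url, out))', so that only the final failure stops the pipeline.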
2019-02-06	Better management for .tex directories to build from tarball	Mohammad Akhlaghi	-0/+0
	In order to collaborate effectively in the project, even project members that don't necessarily want (or have the capacity) to do the whole analysis must be able to contribute to the project. Until now, the users of the distributed tarball could only modify the text and not the figures (built with PGFPlots) of the paper. With this commit, the management of TeX source files in the pipeline was slightly modified to allow this as cleanly as I could think of now! In short, the hand-written TeX files are now kept in `tex/src' and for the pipeline's generated TeX files (in particular the old `tex/pipeline.tex'), we now have a `tex/pipeline' symbolic-link/directory that points to the `tex' directory under the build directory. When packaging the project, `tex/pipeline' will be a full directory with a copy of all the necessary files. Therefore as far as LaTeX is concerned, having a build-directory is no longer relevant. Many other small changes were made to do this job cleanly which will just make this commit message too long! Also, the old `tarball' and `zip' targets are now `dist' and `dist-zip' (as in the standard GNU Build system).
2019-02-05	Ability to package project into tarball or zip file	Mohammad Akhlaghi	-0/+0
	With this commit, it is now possible to package the project into a tarball or zip file, ready to be distributed to collaborators who only want to modify the final paper (and not do the analysis technicalities), or for uploading to sites like arXiv, or online LaTeX sharing pages.
2019-02-05	for-group: better check of group name and fixed make argument	Mohammad Akhlaghi	-0/+0
	A few issues came up while testing the `for-group' script in one of the projects based on this pipeline that are being fixed with this commit: 1) We are ultimately using the `sg' command to use the specified group, not `chgrp'. So in cases where `chgrp' has problems, this would cause a wrong error. So to test the given group's existence, we are now directly calling `sg'. 2) In the call to `make' we were mistakenly giving make the `$2' (which is `make' on the command-line) argument. Since `./for-group' now takes the group name as its first argument, this should have been `$3'. 3) To help in readability, and also allow for group names with a space, `reproducible_paper_group_name' is now defined and exported before the final call to `sg'.
2019-02-01	Group name is now part of the local configuration	Mohammad Akhlaghi	-0/+0
	Until now, the group name to build the project actually went into the Git source of the project! This doesn't allow exact reproducibility on different machines (where the group name may be different). With this commit, the `for-group' script has been modified to accept the group name as its first argument and pass that on to `configure' and Make. This is much better now, because not only the existence of a group installation is checked, but also the name of the group. It also made things simpler (in particular in `LOCAL.mk.in').
2019-02-01	Configure script ending message now accounts for group building	Mohammad Akhlaghi	-0/+0
	Until now, the `./configure' script would only print the `.local/bin/make -j8' command. But when configured for groups, a different command should be used. It now does a check just before running and suggests the proper command.
2019-02-01	Configure script now runs under /bin/bash	Mohammad Akhlaghi	-0/+0
	Until now it was `/bin/sh', but on Debian systems, this can cause problems because by default they use a much weaker shell (dash) which doesn't recognize functions.
2019-01-24	Updated fork of metastore allows building on macOS	Mohammad Akhlaghi	-0/+0
	I recently found another fork of metastore that allows it to be built on macOS systems (https://github.com/mpctx/metastore). So I merged it into my own fork along with several other corrections (mostly cosmetic!), so it is now much better suited for this pipeline. Raul Infante-Sainz has already tested the building of metastore on his macOS. In a previous test, we also noticed that libbsd should not be built on Mac systems, so it is now a conditional prerequisite to metastore.
2019-01-23	Removing files ending with a ~ in the git checkout hook	Mohammad Akhlaghi	-0/+0
	While editing files, some editors create temporary `~' files that can cause problems in metastore's ability to delete their host directory if it's not on the other branch. With this commit, a `find' call was added to the post checkout Git hook to remove such temporary files before metastore is called. Also, some comments were added to both git hooks to make them easier to understand for a beginner.
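	The effect of that `find' call can be sketched in Python. This is an illustration under assumptions, not the hook itself: the function name is hypothetical, and skipping the `.git' directory is a choice made here that the commit message does not mention.

```python
import os


def remove_backup_files(top):
    """Delete every file under `top` whose name ends with '~'.

    Returns the list of removed paths, so a caller can report them.
    """
    removed = []
    for dirpath, dirnames, filenames in os.walk(top):
        # Assumption of this sketch: never descend into Git's own data.
        if ".git" in dirnames:
            dirnames.remove(".git")
        for name in filenames:
            if name.endswith("~"):
                path = os.path.join(dirpath, name)
                os.remove(path)
                removed.append(path)
    return removed
```

	Running this before the meta-data is applied leaves directories that exist only on the previous branch empty, so they can then be deleted.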
2019-01-23	New note to checklist for including pipeline-origin in new clone	Mohammad Akhlaghi	-0/+0
	I needed to take these steps on a few occasions in a project I am building over this pipeline. This will commonly happen when a team starts using this pipeline, so it was added to make things easier.
2019-01-23	README-pipeline.md is now called README-hacking.md	Mohammad Akhlaghi	-0/+0
	To be more generic and recognizable, the `README-pipeline.md' file was renamed to `README-hacking.md'. In essence, it is just that: to hack the existing pipeline for your own project. A similar naming convention is followed in many GNU software packages.
2019-01-23	Corrections in metastore's git hooks	Mohammad Akhlaghi	-0/+0
	Two corrections were made in the Git hooks of Metastore. 1) The shebang at the start of the scripts now uses the absolute address of our installed bash, not the relative `.local/bin/bash'. Note that it is possible to use Git within subdirectories and in that scenario, the relative `.local' will fail. 2) The `$$user' section was removed from the command to find the user's group. With the user as an argument, `groups' may print the user's name first, then their list of groups. When this happens, the script would just be repeating the user's name. But the raw `groups' command will list the groups of the running user.
2019-01-23	Corrected check for patchelf when building Bash	Mohammad Akhlaghi	-0/+0
	Until now, the check to see if the patchelf program should be used or not (for GNU/Linux vs. Mac installations) was mistakenly placed over the step where we define the `sh' symbolic link, not over the call to patchelf. This is corrected with this commit.
2019-01-22	Updated to newly modified version of metastore	Mohammad Akhlaghi	-0/+0
	In this version, the many extra notices (just regarding a change from branch to branch) are no longer printed with `-q'. Instead, only a one-line statement is printed, saying that the meta-data was saved or applied.
2019-01-22	Not checking metastore's version temporarily	Mohammad Akhlaghi	-0/+0
	Until we see what happens with the pull request of our suggested features in metastore, its version isn't written directly into the executable, so we won't actually check it, but write the version directly into the paper.
2019-01-22	Using fork of metastore to work when getpwuid isn't usable	Mohammad Akhlaghi	-0/+0
	After testing the build of Metastore on a server, I noticed that because its `/etc/passwd' doesn't have the list of users, the `getpwuid' call within metastore failed and wouldn't let it finish. So I looked into the code and was able to implement a solution to this problem by adding two options to it for default values for the user and group. Also, file attributes are not necessary in our (current) use case of metastore and caused crashes on our server, so they are also disabled.
2019-01-21	Libbsd added as a dependency of Metastore	Mohammad Akhlaghi	-0/+0
	Metastore depends on `bsd/string.h' to work properly (at least on GNU/Linux systems). The first system I tried building with had that library, so I didn't notice! With this commit, we also build `libbsd' as part of the pipeline. Also, I couldn't find libbsd's version in any of its installed headers, so like Libjpeg, we can't actually check it and will directly write our internal version into the paper.
2019-01-21	Metastore package now installed to allow keeping file meta-data	Mohammad Akhlaghi	-0/+0
	The pipeline heavily depends on file meta-data (and in particular the modification dates). For example, the configuration-Makefiles within the pipeline are set as prerequisites to the rules of the pipeline. However, when Git checks out a branch, it doesn't preserve the meta-data of the files unique to that branch (for example program source files or configuration-Makefiles). As a result, the rules that depend on them will be re-done. This is especially troublesome in the scenario of this reproducible paper project because we commonly need to switch between branches (for example to import recent work in the pipeline into the projects). After some searching, I think the Metastore program is the best solution. Metastore is now built as part of the pipeline and, through two Git hooks, it is called by Git to store the original meta-data of files into a binary file that is version controlled (and managed by Metastore).
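	The save-then-apply cycle can be sketched in Python. This is not how Metastore works internally: it swaps in a JSON file for Metastore's binary file, handles only modification times, and the function names are hypothetical. It only shows the principle the two hooks rely on: record the dates before leaving a branch, and put them back after the checkout so that Make does not see the files as newer.

```python
import json
import os


def save_mtimes(paths, store):
    """Record the modification time (in nanoseconds) of each path."""
    record = {path: os.stat(path).st_mtime_ns for path in paths}
    with open(store, "w") as handle:
        json.dump(record, handle)


def apply_mtimes(store):
    """Restore the recorded modification times on files that still exist."""
    with open(store) as handle:
        record = json.load(handle)
    for path, mtime_ns in record.items():
        if os.path.exists(path):
            # Keep the current access time; only the modification time
            # matters for Make's prerequisite checks.
            atime_ns = os.stat(path).st_atime_ns
            os.utime(path, ns=(atime_ns, mtime_ns))
```

	In this sketch, `save_mtimes' plays the role of the hook that stores the meta-data and `apply_mtimes' the role of the hook that runs after a checkout.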