paper-concept.git - Paper (Towards Long-term and Archivable Reproducibility)

Age	Commit message	Author	Lines
2019-04-30	Imported some recent/parallel work, conflicts fixed	Mohammad Akhlaghi	-24/+62
	Especially because of the new convention regarding backslashes, there were many conflicts that are now fixed. But none were substantial.
2019-04-30	End-of-line Backslashes no longer right under each other	Mohammad Akhlaghi	-25/+25
	When we need to quote the new-line character we end the line with a backslash (`\'). Until now, our convention has been to put all such backslashes under each other to help in visual inspection. But this causes a lot of confusion in version control: if only one line becomes longer, the whole block is marked as changed, which makes it hard to see the actual change. It also makes debugging the code (adding some temporary lines) hard. With this commit, I went through all the files and tried to fix all such cases so only a single white space character is between the last command character and the backslash. Where there was an empty line (ending with a backslash, to help in visually separating the code into blocks), I put the backslash right under the previous line's. This completes task #15259.
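	The convention above can be illustrated with a minimal shell sketch. It is not taken from the project's files; the `echo' command and its arguments are placeholders chosen only for illustration.

```shell
# Old convention (shown only as a comment): backslashes aligned in one
# column, so lengthening a single line re-indents the whole block in a diff.
#   echo alpha          \
#        beta           \
#        gamma
#
# New convention: exactly one space between the last character of the
# command and the backslash, so a diff touches only the changed line.
words=$(echo alpha \
             beta \
             gamma)
echo "$words"
```

	Both layouts are equivalent to the shell; only the size of the resulting diffs differs.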
2019-04-30	Better configure checks to see if GCC can be built	Raul Infante-Sainz	-24/+62
	Until now, to test if GCC can use `sys/cdefs.h', we were building a small test program using it. But after testing on an Ubuntu 14.04, we noticed that the GCC test during the configure script passes, but GCC still can't be built. After some investigation we noticed that the header is available in other directories, but during the build of GCC, those directories aren't used, and it only assumes it to be under `/usr/include'. So with this commit, we are only checking this particular location for this header, not a test run of GCC. After fixing this, we noticed that GCC's build crashed again because it couldn't link with `libc.a' (or `libc.so'). So we also added a check for this library and added a new warning to inform the user what they might be able to do. Finally, we noticed that in one of the last steps of building GCC, we weren't using `&&', but `;', so the GCC name file would be built, even when the GCC build failed.
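	The two points above (checking a fixed header location, and `&&' versus `;') can be sketched as follows. This is only an illustration under assumptions: the warning text and the marker file names are hypothetical, and `false' stands in for a failed GCC build.

```shell
# Check the one location that GCC's build assumes, instead of compiling a
# test program (which may find the header through other directories).
cdefs=/usr/include/sys/cdefs.h
if [ -f "$cdefs" ]; then
    cdefs_status=found
else
    cdefs_status=missing
    echo "WARNING: $cdefs not found; GCC may fail to build" >&2
fi

# With `;' the marker file is written even when the previous step fails.
# With `&&' it is written only after success.
tmpdir=$(mktemp -d)
sh -c 'false ;  touch "$1"' sh "$tmpdir/semicolon-marker"
sh -c 'false && touch "$1"' sh "$tmpdir/and-marker" || true
ls "$tmpdir"
```

	Only `semicolon-marker' is created, which mirrors how the target file could appear even though the build had failed.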
2019-04-22	High-level software now specified in TARGETS.mk	Mohammad Akhlaghi	-5/+2
	Until now, to specify which high-level software you want the project to contain, it was necessary to go into the `high-level.mk' Makefile that is complicated and can create bugs. With this commit, a new `reproduce/software/config/installation/TARGETS.mk' file has been created that is easily/cleanly in charge of documenting the final high-level software that must be built for the project. Also, until now, FFTW was set as a dependency of Numpy while we couldn't actually get Numpy to use it! It was just there for future reference and to justify its build rule. But now that many software won't be built and there is no problem with having rules even though a project might not use them, it has been removed.
2019-04-17	Corrected bibtex entry for Astrometry-net and Swarp	Raul Infante-Sainz	-1/+1
	Until now, there were errors in the citation of Astrometry-net and Scamp papers. With this commit, we fix these problems. The Swarp bibtex has also been modified to follow the aesthetic of the citation style we have right now in the project. We also added the `dependency-bib.tex' as a prerequisite of `paper.bbl'.
2019-04-15	New architecture to separate software-building and analysis steps	Mohammad Akhlaghi	-57/+90
	Until now, the software building and analysis steps of the pipeline were intertwined. However, these steps (of how to build a software, and how to use it) are logically completely independent. Therefore with this commit, the pipeline now has a new architecture (particularly in the `reproduce' directory) to emphasize this distinction: The `reproduce' directory now has the two `software' and `analysis' subdirectories and the respective parts of the previous architecture have been broken up between these two based on their function. There is also no more `src' directory. The `config' directory for software and analysis is now mixed with the language-specific directories. Some of the software versions were also updated after some checks with their webpages. This new architecture will allow much more focused work on each part of the pipeline (to install the software and to run them for an analysis).
2019-04-14	Replaced all occurrences of pipeline in text	Mohammad Akhlaghi	-58/+59
	All occurrences of "pipeline" have been changed to "project" or "template" within the text (comments and READMEs) of the template. The main template branch is now also named `template'. This was all because `pipeline' is too generic and couldn't be distinguished from the base, and customized project.
2019-04-13	Corrected copyright notices and info about adding copyright info	Mohammad Akhlaghi	-5/+1
	Until now, the files that people were meant to change didn't have a proper copyright notice (for example `Copyright (C) YOUR NAME.'). This was wrong because the license does not convey copyright ownership. So the name of the file's original author must always be included, along with the names of the people who modify it (and add their own copyright-able modifications). With this commit, the file's original author (and email) are added to the copyright notice and when more than one person has modified a file, each name has its own copyright notice. Based on this, the description for adding a copyright notice in `README-hacking.md' has also been modified.
2019-04-12	Dependency BibTeX entries included only when necessary	Mohammad Akhlaghi	-2/+26
	Until now, there was a single `tex/src/references.tex' file that housed the BibTeX entries for everything (software and non-software). Since we have started to include the BibTeX entry for more software, it will be hard to manage the large (sometimes unused) BibTeX entries of the software in the middle of the non-software related citations in the text of the paper. Therefore, with this commit, a `tex/dependencies' directory has been made which has a separate BibTeX entry file for each software that needs one. After the software is built, this file is copied to the new `.local/version-info/cite' directory. At the end, the configure script will concatenate all the files in this directory into one file which will later be used with `tex/src/references.tex' by BibLaTeX. This greatly simplifies managing of citations, allowing us to focus on the software-building and paper-writing citations separately/cleanly (and thus be more efficient in both).
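	The concatenation step described above could look roughly like the sketch below. It is not the configure script's actual code: the temporary directory stands in for `.local/version-info/cite', and the `.tex' suffix, the file names and the entry contents are assumptions made for illustration.

```shell
# Hypothetical stand-ins for the citation directory and the merged file.
citedir=$(mktemp -d)
allcites=$(mktemp)

# Pretend two software packages were built and each copied its entry here.
printf '@article{astropy,\n}\n' > "$citedir/astropy.tex"
printf '@article{numpy,\n}\n'   > "$citedir/numpy.tex"

# Concatenate every entry into one file. The `-f' test skips the
# unmatched glob pattern, so an empty directory gives an empty output.
: > "$allcites"
for f in "$citedir"/*.tex; do
    if [ -f "$f" ]; then
        cat "$f" >> "$allcites"
    fi
done
grep -c '^@' "$allcites"
```

	Because only built software copies its entry into the directory, unused entries never reach the merged file.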
2019-04-12	Configure script dealing properly with empty software directories	Mohammad Akhlaghi	-20/+34
	Until now, we hadn't actually tested the case where a whole software directory (Python modules in particular) is empty. So the configure script finished with some errors in this case. With this commit, this step of the configure script was modified to deal with such cases cleanly. Also, in `initialize.mk', I added a `-f' to the symbolic link command, so it doesn't complain if the file link already exists.
2019-04-07	Configure script using our build programs in final steps	Mohammad Akhlaghi	-35/+44
	In order to get a consistent final result, in its later steps, the configure script uses our own build of the basic command-line tools (like `cat', `awk'). Also, a correction was made to the short option parsing errors when an unwanted argument is given, and the `-?' was changed to `-'?'' to avoid un-necessary shell interpretation (for example giving unreasonable results).
2019-04-07	GNU M4 now built as a dependency of GNU Libtool	Mohammad Akhlaghi	-1/+1
	On some systems, M4 isn't available, so the linking to the host system fails and, as a result, we can't build GNU Libtool. The main reason we weren't building M4 was a bug with the most recent GNU C library (http://lists.gnu.org/archive/html/bug-gnulib/2019-04/msg00004.html). But I found a patch used by Arch Linux which fixes the issue and allows M4 to be built. As a result, the pipeline is now building M4 also and the patched M4 tarball is now uploaded to my own webpage as backup. While doing the steps above, I also noticed that we weren't using a tab at the start of the link definitions of `dependencies-basic.mk'. Although it's not necessary, to be consistent, it's good for the lines to always start with a tab.
2019-04-07	Corrections in configure script and astroquery, astropy rules	Mohammad Akhlaghi	-13/+15
	The step where we check the possibility of using `sys/cdefs.h' was still using `$$' for shell variables (in Make), not `$' (for the shell). This was corrected. Also, since Astropy needs two citations, the `,' in the citation command would conflict with Make's parsing. So we just used an `echo' command to re-write the version info. In Astroquery, the prerequisite list was just reordered by length to be more clear to the eye.
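	The difference between `$$' and `$' mentioned above can be seen in a small shell sketch (the variable name and value are illustrative assumptions). In a Makefile recipe, `$$var' reaches the shell as `$var'; in a plain shell script, `$$' is the shell's process ID.

```shell
# In a plain shell script, `$$' expands to the process ID, so `$$var'
# silently becomes "<PID>var" instead of the variable's value.
var=value
wrong="$$var"    # process ID followed by the literal text "var"
right="$var"     # the intended expansion
echo "wrong: $wrong"
echo "right: $right"
```

	Such a mistake raises no error, which may explain why it can survive when code is moved from a Makefile into a shell script.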
2019-04-07	--host-cc configure option to avoid building GCC, M4 mandatory	Mohammad Akhlaghi	-0/+57
	In some cases (especially when debugging the pipeline), it's very time-consuming to install GCC. With this commit, a `--host-cc' option has been added to avoid building the C compiler when necessary. The test to see if `sys/cdefs.h' is available on the system (necessary to build GCC) has also been moved to the configure script to print a more visible warning and also use the new `host_cc' variable to let `dependencies-basic.mk' know that GCC shouldn't be built. Finally, we are having problems installing M4 from source, so it has been set as a mandatory dependency.
2019-04-05	Software acknowledgement section is automatically generated	Mohammad Akhlaghi	-116/+64
	Until now, management of the software names and versions in the paper was done manually (a macro had to be defined in `initialize.mk', then used in `paper.tex', so they had to be manually set in two places). Managing this was not easy. To fix this, with this commit, each software building rule's target is a text file that contains its human-readable name and its version. In the end, the configure script sorts them by their name and writes them into a LaTeX input file that we can easily import as a file into the main paper.
2019-04-04	Dependency version LaTeX macros written at the end of configure	Mohammad Akhlaghi	-1/+116
	Until now, these versions were written in each run. This was mainly inherited from the old days of the pipeline, where we didn't know the software on the host. But now that we have almost everything under control, we can just write these LaTeX macros at the end of the configure script and make `initialize.mk' simpler and also (very slightly!) speed-up/simplify the processing.
2019-04-04	Better option checks and values in the configure script	Mohammad Akhlaghi	-27/+29
	Double quotes were placed around the checked values so they can contain spaces. Also, some checks were added for options that don't accept a value.
2019-04-04	Configure script also accepts short options with no delimiter	Mohammad Akhlaghi	-13/+28
	Until now, the short options to the configure script needed a delimiter (either white-space or an `=') between the name and value. With this commit, for short options, it also accepts the value immediately touching the option name. Also, when trying to find the absolute address of a given path, a check was added to abort if it doesn't exist.
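	A minimal sketch of option parsing that accepts all three forms (white-space, `=', or no delimiter) is given below. It is not the configure script's code: the function name `parse_jobs' and the long name `--jobs' are hypothetical, and only the `-j' option is handled.

```shell
# Parse `-j8', `-j 8', `-j=8', `--jobs=8' and `--jobs 8'; print the value.
parse_jobs() {
    njobs=""
    while [ $# -gt 0 ]; do
        case "$1" in
            -j|--jobs)                      # value is the next word
                if [ $# -lt 2 ]; then
                    echo "option $1 needs a value" >&2; return 1
                fi
                njobs="$2"; shift ;;
            -j=*|--jobs=*) njobs="${1#*=}" ;;   # value after the `='
            -j*)           njobs="${1#-j}" ;;   # value touching the name
            *) echo "unknown option: $1" >&2; return 1 ;;
        esac
        shift
    done
    echo "$njobs"
}

parse_jobs -j8
parse_jobs --jobs=8
```

	The order of the `case' patterns matters: the `=' form has to be tested before the no-delimiter form, otherwise `-j=8' would yield `=8'.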
2019-04-04	--existing-conf doesn't take any values in configure script	Mohammad Akhlaghi	-7/+7
	Until now we were (wrongly) assuming that the configure script's `--existing-conf' option takes a value, while this is not the case.
2019-04-04	Numpy and Scipy build on Mac imported into the main branch	Mohammad Akhlaghi	-17/+0
	We were developing the build of Numpy and Scipy on Mac in a parallel thread and things seem to be working relatively well now. There were only two problems: 1) GCC still has some random building issues on Mac. 2) ATLAS shared libraries can't be built on Mac (so we used OpenBLAS to build Numpy and Scipy on both Mac and GNU/Linux). But for now, none of these problems are critical. So, we can progress in one branch. There were only very minor conflicts in the merge.
2019-04-04	Better option-reading in configure, using .build to access BDIR	Mohammad Akhlaghi	-202/+86
	Until now, the steps to manage the command-line options of the configure script were limited (couldn't accept an equal sign or space between the option name and value). With this commit, it also accepts an optional equal sign between the option name and value, which avoids much confusion. Also, it is more logically consistent for the link to the build-directory to be placed in the top directory (as a hidden file like `.local' until now), and not as a visible directory like `reproduce/build' (which we used until now). Therefore, with this commit, the link to easily access the build-directory is `.build' in the top source directory. Finally, because `minmapsize' is too specific to Gnuastro and has now been given its default value at the start of the configure script, the description for `minmapsize' has been removed (to not confuse users who don't use Gnuastro). If anyone is familiar enough with Gnuastro to change it, they already know it from its book.
2019-04-02	Python packages are installed as high level program dependencies	Raul Infante-Sainz	-17/+0
	Until this commit, the installation of all Python packages was done in a separate Makefile. With this commit, the pipeline installs Python packages as part of the high-level software. All Python package rules remain in a separate Makefile, but this Makefile is included in the high-level dependency `reproduce/src/make/dependencies.mk'.
2019-03-28	Configure script now has options	Mohammad Akhlaghi	-40/+254
	With the options, it is now possible to run the configure script more easily after the initial run. The `--help' option provides a nice and complete introduction along with a listing of the input options and the `-j' option can be used to manually set the number of threads.
2019-03-28	flock is now built in configure, to allow serial downloads	Mohammad Akhlaghi	-2/+61
	Until now, we were using `flock' (file-lock) for downloading the input datasets in series. But we couldn't do this when downloading the software tarballs because `flock' wasn't yet available. Generally, unlike processing, downloading is much better done in series than in parallel. To enable serial downloads of the software also, with this commit we are installing `flock' in the configure script (not in a Makefile). As a result, besides `flock', we can also benefit from the other good features of the `reproduce/src/bash/download-multi-try' script (for example attempting download again after some time). Some GNU mirrors may have problems at the time of download, so with this commit, we are using the main GNU FTP server for GNU programs.
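	The idea of attempting a download again after some time can be sketched as a generic retry loop. This is not the `download-multi-try' script itself: the function name `retry', its arguments and the stand-in commands are assumptions for illustration.

```shell
# Run a command up to `max' times, sleeping `wait_s' seconds between
# failed attempts; return non-zero if every attempt fails.
retry() {
    max="$1"; wait_s="$2"; shift 2
    n=1
    while ! "$@"; do
        if [ "$n" -ge "$max" ]; then
            return 1
        fi
        n=$((n + 1))
        sleep "$wait_s"
    done
    return 0
}

# `true' stands in for a downloader call that succeeds on the first try.
retry 3 0 true && echo "downloaded"
```

	In real use the last arguments would be the downloader command and its options, and the waiting time would be longer than zero.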
2019-02-20	Installed astroquery in the pipeline	Raul Infante-Sainz	-0/+18
	All dependencies for building the astroquery package have been added. Until now the Python dependencies were built in the same Makefile as the high level libraries and programs. But, because astroquery has many dependencies we split the Python and Python packages installation in a new Makefile. The installation of different packages is done using Python and not pip, because we found some problems when doing it with pip. Apparently there are some interferences between the packages installed by the pip of the system and the pip installed as part of Python in the pipeline.
2019-02-01	Group name is now part of the local configuration	Mohammad Akhlaghi	-31/+25
	Until now, the group name to build the project actually went into the Git source of the project! This doesn't allow exact reproducibility on different machines (where the group name may be different). With this commit, the `for-group' script has been modified to accept the group name as its first argument and pass that onto `configure' and Make. This is much better now, because not only the existence of a group installation is checked, but also the name of the group. It also made things simpler (in particular in `LOCAL.mk.in').
2019-02-01	Configure script ending message now accounts for group building	Mohammad Akhlaghi	-1/+6
	Until now, the `./configure' script would only print the `.local/bin/make -j8' command. But when configured for groups, a different command should be used. It now does a check just before running and suggests the proper command.
2019-02-01	Configure script now runs under /bin/bash	Mohammad Akhlaghi	-1/+1
	Until now it was `/bin/sh', but on Debian systems, this can cause problems because by default they use a much weaker shell (dash) which doesn't recognize functions.
2019-01-18	Sanity check to run the Make with proper group permissions	Mohammad Akhlaghi	-0/+40
	If the `./for-group' script is not used properly, it can lead to the whole pipeline being re-run. Therefore it is important to do a sanity check immediately at the start of Make's processing and inform the user if there is a problem. With this commit, `./for-group' exports the `reproducible_paper_for_group' variable which is used by both the initial `./configure' script, and later in each call to Make. The `./configure' script will use it to write a value in `reproduce/config/pipeline/LOCAL.mk' and Make will use it to compare with the value in `reproduce/config/pipeline/LOCAL.mk'. If there is an inconsistency, Make will not even attempt to build anything and will just print a message and abort.
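	The consistency check described above can be sketched in shell as follows. The project does this comparison inside Make; here the function name `check_group' is hypothetical, and its argument stands in for the value recorded in `reproduce/config/pipeline/LOCAL.mk' at configure time.

```shell
# Compare the value exported by the caller with the value recorded at
# configure time; abort with a message when they disagree.
check_group() {
    recorded="$1"
    current="${reproducible_paper_for_group:-}"
    if [ "$recorded" != "$current" ]; then
        echo "Group setting mismatch: configured with '$recorded'," \
             "but called with '$current'. Aborting." >&2
        return 1
    fi
    return 0
}

# Example call with matching values (the value `yes' is an assumption).
reproducible_paper_for_group=yes
check_group yes && echo "consistent"
```

	Doing this test before any target is evaluated is what prevents an accidental full re-run.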
2019-01-02	Copyright year updated to 2019	Mohammad Akhlaghi	-1/+1
	Since the current implementation of this pipeline officially started in 2018, all the files only had 2018 in their copyright years. This has now been corrected to 2018-2019.
2018-12-13	Added extra note to the input datasets directory	Mohammad Akhlaghi	-1/+4
	To make things clear for a user of the pipeline, it's mentioned that the given input directory is only read and nothing is written to it.
2018-12-13	Fixed numthreads in dependencies also	Mohammad Akhlaghi	-0/+1
	Some problems with using the number of threads in dependency building were fixed.
2018-12-11	Passing -j build options to dependency building bottle-necks	Mohammad Akhlaghi	-0/+1
	Some host Make systems may not allow automatic passing of the number of threads to sub-Makes. So while building the basic dependencies, we'll need to explicitly add the `-j' option to the Make files that can benefit most from it: those that are dependencies of many others (Tar & Make), or are the last to build (Coreutils).
2018-12-05	Corrected comment on downloader in configure script	Mohammad Akhlaghi	-12/+5
	The comment above the downloader section of the configure script was not up to date with how the pipeline uses a downloader during configuration and building now. So it was updated.
2018-12-05	Configuring on multiple threads	Mohammad Akhlaghi	-2/+2
	Until now we had constrained the configuration step to one thread to easily see failures on other systems. But with most tests passing successfully now, we are using the total number of available threads.
2018-12-04	Shared library absolute address fixed in Libgit2 and WCSLIB on Mac OS	Mohammad Akhlaghi	-2/+3
	The build systems of Libgit2 and WCSLIB on Mac OS do not account for installation in non-standard addresses: `Libgit2' keeps the absolute address of its build directory (not the installation directory) and WCSLIB doesn't write any absolute address at all (so the system uses the first one it finds). To address these issues, we are now using Mac OS's `install_name_tool' program to fix the absolute path within the installed shared library. Since the version of the library is actually present in its shared library name, in `dependency-versions.mk' we have also separated these two libraries so later when their version is changed, we are careful in correcting the shared library name also.
2018-12-03	Checking Mac OS host for configuring OpenSSL	Mohammad Akhlaghi	-1/+18
	OpenSSL can't automatically detect the architecture of Mac OS systems, so as it suggests on its Wiki, it needs some help for doing that. With this commit, we are checking the build on Mac OS with the presence of `otool' (Mac OS's object file displaying tool). If it's there, we'll add the OpenSSL configuration options suggested by OpenSSL's Wiki.
2018-12-03	Added rpath in basic dependencies, remove input if download fails	Mohammad Akhlaghi	-0/+1
	Until now, we weren't including the `rpath' linking options to the basic dependencies. They are now added. Also, when the download of an input file fails for any reason, an empty file won't be left in its place any more.
2018-12-03	Trusted CA certificates also downloaded for Wget usage	Mohammad Akhlaghi	-2/+2
	To enable easy downloading of HTTPS links with Wget (this pipeline's default downloader), we need a set of trusted CA certificates. Until the time that we can generate one ourselves, one generic set of trusted CA certificates is now downloaded like a tarball and placed in the OpenSSL configuration directory. With these CA certificates, within the pipeline we can now safely use the pipeline's own installed Wget.
2018-12-03	Preference for shared library linking	Mohammad Akhlaghi	-41/+51
	Some high-level programs like Wget and cURL need to be built in shared mode because they also include dynamic loading of libraries. Therefore, if we only build the lower-level libraries in static mode, our own build will be ignored and they will go and find the system's shared libraries to link with. Because of this, for now, we have manually set the `static_build' variable in the configure script to `no'. Also, if the downloader fails, we'll delete the output (an empty file in the case of Wget) because it interferes with a target definition.
2018-12-02	Configure building on one thread for debugging	Mohammad Akhlaghi	-2/+2
	To help in debugging, we are only running the Makefiles within the configure script on one thread.
2018-12-02	Wget and OpenSSL now installed as a basic dependency	Mohammad Akhlaghi	-3/+33
	The TeX Live installer needs Wget to operate smoothly, especially on recent Mac OS systems that don't have Wget pre-installed. Also, it would be good for the pipeline to have its own downloader. So with this commit, the pipeline also installs Wget and OpenSSL which is a dependency. Many other small changes/fixes were done in this process.
2018-12-01	Improved TeXLive installation checks	Mohammad Akhlaghi	-7/+6
	Thanks to the check by Cristina Martínez, some corrections were made when we attempt to download and install TeXLive. Further checks and corrections will be made in due time.
2018-11-30	cURL now also available as downloader with -L flag	Mohammad Akhlaghi	-4/+13
	The main reason I wasn't using cURL as a downloading tool was that I wasn't familiar with how to ask it to follow a re-direct. But I just found out that it's with the `-L' configure time option. So it is now added as a downloader tool to the pipeline.
2018-11-30	Setting libgit2 to build statically in any case	Mohammad Akhlaghi	-2/+2
	On the Libgit2 webpage, it is recommended to build it statically on Mac systems. By default we are doing this on Linux systems, but the `-static' flag failed on Mac. But apparently CMake might be able to deal with the issue in a different way.
2018-11-29	Optional rpath link option, CMake search path set, static WCSLIB	Mohammad Akhlaghi	-1/+25
	Thanks to a test build on Raul Infante Sainz's Mac OS computer, we were able to address some issues and will be trying them after this commit: a) The LLVM linker on that computer didn't recognize `-rpath-link'! So at configure time we now check for it and only include it when the linker recognizes it. b) CMake corrections: 1) `CMAKE_LIBRARY_PATH' is now defined so CMake can look in our custom directory to find the necessary libraries. 2) To build and install the CMake built programs, we now simply use `make' and `make install'. c) To avoid particular linking problems with WCSLIB (which has special problems compared to other libraries), we are now deleting the shared library version (both on GNU and Mac systems).
2018-11-29	Ignoring building of GCC for pipeline	Mohammad Akhlaghi	-1/+10
	GNU Binutils (which provides the GNU Linker) is not ported to Mac OS systems. GCC also takes a very long time to build, and if we are to still have linking problems with LLVM's linker, it would be better to just ignore GCC also and use the system's C compiler and linker together. So for the time being, GCC isn't a main target of the basic dependencies and won't be installed. But we have kept the rules that were checked on a GNU/Linux operating system.
2018-11-29	GCC is now installed by the pipeline	Mohammad Akhlaghi	-2/+2
	The pipeline now installs GCC and all its necessary prerequisites.
2018-11-26	Making lock file directory	Mohammad Akhlaghi	-2/+2
	We had forgotten to add the rule to build the lock file directory for downloading data. This has been corrected.
2018-11-25	Rule of tex/pipeline.tex now defined in paper.mk not top Makefile	Mohammad Akhlaghi	-1/+1
	To avoid redundant steps in the top-level Makefile and make it simpler and easier to follow, we now define the base names of all the Makefiles in the `makesrc' variable of the top-level Makefile. `makesrc' is then used to define the Makefiles to include and the necessary TeX macros at the same time. This is much more clear and obvious than the previous case where we had to list the Makefiles and TeX macro files separately in the top level Makefile.