aboutsummaryrefslogtreecommitdiff
path: root/reproduce/analysis/make
AgeCommit message (Collapse)AuthorLines
2020-03-08Menke+20 example: properly count number of papers with softwareMohammad Akhlaghi-4/+15
Until now, I was mistakenly multiplying the fraction of papers in that journal. This is corrected with this commit.
2020-03-02Described the first analysis phase with a demo subMakefileMohammad Akhlaghi-10/+10
Until now, there was no explanation on an actual analysis phase, therefore with this commit an example scenario with a readable Makefile is included. The Data lineage graph was also simplified to both be more readable, and also to correspond to this new explanation and subMakefile. Some random edits/typos were also corrected and some references added for discussion.
2020-02-16Two values from the input dataset are now written into the paperMohammad Akhlaghi-2/+15
This was done just to get going with describing the analysis process.
2020-02-16Menke+2020 data is now imported and ready for later steps in plain textMohammad Akhlaghi-6/+72
The main problems with this dataset was the names of the journals (which sometimes have single quotes or apostrophes in them that is really annoying for SED)! But ultimately, for the simple study we want to do here, the journal names are irrelevant, so in the end I just ignored the names. Later we can set an identifier for the journals if necessary. But now we have the basic information in a way that is usable in a plot to show in this paper.
2020-01-18Raw draft (until now as a separate repository) importedMohammad Akhlaghi-3/+4
Until now, I was writing the paper without the template. But we will soon be adding a tutorial to the template, and I thought it will be good to have an example demonstration here too. So I just brought the hole project into the template structure, allowing us to add the template analysis later when its ready, and also allowing us to easily reproduce this paper ofcourse (without having to worry about the host's TeXLive installation.
2020-01-18First set of customizations doneMohammad Akhlaghi-132/+1
The unnecessary parts were removed and the project now runs.
2020-01-01Verification function checks if file existsMohammad Akhlaghi-3/+13
Until now, if the file to be verified didn't exist, a different checksum would be generated, and it would stop, but it wasn't immediately clear if the differing checksum is because the file doesn't exist at all! With this commit, before calculating the checksum, we first make sure if the file exists. If it doesn't exist an explicit error is printed and thus will help the project editor to find the cause of the problem.
2020-01-01Verification of output values and data added within templateMohammad Akhlaghi-22/+143
Until now, the only verification that the template provided was the published PDF. Users had to manually compare the published and generated PDFs (numbers, plots, tables) and see if they obtained the same result. However, this type of manual verification is not good and is prone to frustration and missing important differences. With this commit, a new Makefile has been added in the analysis steps: `verify.mk'. It provides facilities to easily verify the results that go into the paper. For example tables that go into making the paper's plots, or the LaTeX macros that blend into the text. See the updated parts in `README-hacking.md` for a more complete explanation. This completes task #15497.
2020-01-01Copyright statements updated to include 2020Mohammad Akhlaghi-7/+7
Now that its 2020, its necessary to include this year in the copyright statements.
2019-11-29Download links directly to actual file if it exists in INDIRMohammad Akhlaghi-2/+8
Until now, when an input dataset already exists in `INDIR', the template would just make a symbolic link to it in the build directory. However, in many cases, the files in INDIR will actually be links to other locations on the filesystem and some programs have problems following too many links. With this commit, the template is now using the `readlink' program (part of GNU Coreutils) to follow a possible link and point the link in the build directory directly to an actual non-link file.
2019-10-31Minor corrections in distribution and autoconf prerequisite of automakeMohammad Akhlaghi-0/+1
Some minor corrections were made in the template: - When making the distribution, `.swp' files (created by Vim) are also removed. - Autoconf is set as a prerequisite of Automake I was also trying to add the Apache log4cxx, but its default 0.10.0 tarball needs some patches, so I have just left it half done until someone actually needs it and we apply the patch.
2019-10-19Minor improvments in packaging of project with make distMohammad Akhlaghi-12/+14
The steps to package the project have been made slightly more clear and also the temporary directory that is created for packaging is deleted after the tarball is made.
2019-10-11Properly working make clean when in group modeMohammad Akhlaghi-3/+4
Until now, when you ran `make clean', all the directories under `$(BDIR)/tex/' would be deleted except for `macros' and `build'. This was good for the single-user mode. But in group mode, this would delete the user-specific TeX build directory because its called `build-USER', not `build'. With this commit, to fix the problem, we define the new `texbtopdir' and based on the group condition, and use that to specify which directory to not delete.
2019-10-01Minor corrections in configure and prepare phaseMohammad Akhlaghi-10/+10
Since ImageMagick can take long to build, we are now building it in parallel. Also, the part where we replace an `_' with `\_' in the software version at the end of the configure script was removed. It is more clear/readable that the actual rule that includes such a name deals with the underline (as is the case for `sip_tpv' which already dealt with it). Finally, I noticed that the checks at the start of `top-prepare' were missing new-lines. I had forgot that the Make single-shell variable isn't activated in this stage yet.
2019-10-01Infrastructure to keep preparation resultsMohammad Akhlaghi-17/+50
A special directory is now defined in `initialize.mk' that can be used in both the preparation and build phases. Also, the contents of prepared results can now be conditionally read during `./project make'.
2019-10-01Preparation phase added before final buildingMohammad Akhlaghi-0/+126
In many real-world scenarios, `./project make' can really benefit from having some basic information about the data before being run. For example when quering a server. If we know how many datasets were downloaded and their general properties, it can greatly optmize the process when we are designing the solution to be run in `./project make'. Therefore with this commit, a new phase has been added to the template's design: `./project prepare'. In the raw template this is empty, because the simple analysis done in the template doesn't warrant it. But everything is ready for projects using the template to add preparation phases prior to the analysis.
2019-09-26Working project when downloaded from arXivMohammad Akhlaghi-2/+4
Until now, we were assuming that the users would just clone the project in Git. But after submitting arXiv:1909.11230, and trying to build directly from the arXiv source, I noticed several problems that wouldn't allow users to build it automatically. So I tried the build step by step and was able to find a fix for the several issues that came up. The scripting parts of the fix were primarily related to the fact that the unpacked arXiv tarball isn't under version control, so some checks had to be put there. Also, we wanted to make it easy to remove the extra files, so an extra `--clean-texdit' option was added to `./project'. Finally, some manual corrections were necessary (prior to running `./project', which are now described in `README.md'. Most of the later steps can be automated and we should do it later, I just don't have enough time now.
2019-09-25Won't copy previous distribution builds in new distributionMohammad Akhlaghi-1/+1
Until now, the pipeline was instructed to only ignore the current temporary project distribution directory. So if there were directories from previous builds, they were wrongly included in the current tarball. With this commit, we don't just ignore the directory of the current distribution, but generally, all directories starting with `paper-v*'.
2019-09-16Git checksum printed even when on a tagMohammad Akhlaghi-2/+2
Until now, when the commit was tagged, `git describe' would just print the tag and no longer the commit checksum. This is bad because the checksum is a much more robust way to confirm the point in history. With this commit the `--long' option has been added to `git describe' to fix this issue. From now on, when we are on a tag, it will print the tag followed by a `-0-' and the first characters of the checksum.
2019-09-16Distribution tarball now builds in arXivMohammad Akhlaghi-8/+25
`./project make dist' will package all the LaTeX-specific files (and analysis source files) into one `tar.gz' file that is ready to upload to servers like arXiv. However, it wasn't updated for some time, so running it would complain about not having a `configure' script in the top of the project. With this commit, it now works with the new file-structure of the project and also copies all the BibLaTeX source files and `paper.bbl' into the top tarball directory, which allows arXiv to build the paper as intended. The output of `./project make dist' has been uploaded and tested on arXiv and it is built by arXiv perfectly. Also, a short description of all the special `make' targets was added to the output of `./project --help'.
2019-08-22OpenMPI environment variable used to disable need for OpenSSHMohammad Akhlaghi-1/+5
Until now, OpenMPI would complain about not having `ssh' or `rsh' as a remote shell feature. However, such features should not be necessary in a reproducible scenario and they also have major security issues. With this commit, we are now using OpenMPI's `OMPI_MCA_plm_rsh_agent' environment variable to disable any remote shell dependency for it (as suggested by Boud). Therefore, any dependency for OpenSSH has been removed. But I thought to keep the build instructions incase it may be useful under some un-foreseen scenario. However, to discourage people from building it, a notice was added ontop of the build instructions. This bug was found, tested and solved thanks to Roberto Baena Gallé and Boud Roukema. This fixes bug #56724.
2019-08-01Git hooks removed after doing a distcleanMohammad Akhlaghi-0/+5
Until now, when you needed to completely clean a project (with `./project make distclean') the Git hooks that are installed during configure time would cause problems when committing (the `pre-commit' hook in particular won't allow you to commit anything!). With this commit, before deleting the software, the template first removes these Git hooks.
2019-08-01Bash startup script for every recipeMohammad Akhlaghi-0/+4
Until now the only way to define the environment of the Make recipes was through the exported Make variables (mostly in `initialize.mk' for the analysis steps for example). However, there is only so much you can do with environment variables! In some situations you want slightly more complicated environment control, like setting an alias or running of scripts (things that are commonly done in the `~/.bashrc' file of users to configure their interactive, non-login shells). With this commit, a `reproduce/software/bash/bashrc.sh' has been defined for this job (which is currently empty!). Every major Make step of the project adds this file as the `BASH_ENV' environment variable, so the shell that is created to execute a recipe first executes this file, then the recipe. Each top-level Makefile also defines a `PROJECT_STATUS' environment variable that enables users to limit their envirnoment setup based on the condition it is being setup (in particular in the early phase of `basic.mk', where the user can't make any assumption about the programs and has to write a portable shell script).
2019-07-28Corrected typo in environment before running makeMohammad Akhlaghi-3/+3
We recently moved the system's `rm' program absolute address to a shell variable that is found during the `./project' script. But I had forgot to account for the difference between the Make and Bash variable naming differences. I had also forgot to add a value to the HOME variable. With this commit both are corrected: the system's `rm' path is now called `sys_rm' and the HOME variable is set.
2019-07-28Single wrapper instead of old ./configure, Makefile and ./for-groupMohammad Akhlaghi-18/+20
Until now, to work on a project, it was necessary to `./configure' it and build the software. Then we had to run `.local/bin/make' to run the project and do the analysis every time. If the project was a shared project between many users on a large server, it was necessary to call the `./for-group' script. This way of managing the project had a major problem: since the user directly called the lower-level `./configure' or `.local/bin/make' it was not possible to provide high-level control (for example limiting the environment variables). This was especially noticed recently with a bug that was related to environment variables (bug #56682). With this commit, this problem is solved using a single script called `project' in the top directory. To configure and build the project, users can now run these commands: $ ./project configure $ ./project make To work on the project with other users in a group these commands can be used: $ ./project configure --group=GROUPNAME $ ./project make --group=GROUPNAME The old options to both configure and make the project are still valid. Run `./project --help' to see a list. For example: $ ./project configure -e --host-cc $ ./project make -j8 The old `configure' script has been moved to `reproduce/software/bash/configure.sh' and is called by the new `./project' script. The `./project' script now just manages the options, then passes control to the `configure.sh' script. For the "make" step, it also reads the options, then calls Make. So in the lower-level nothing has changed. Only the `./project' script is now the single/direct user interface of the project. On a parallel note: as part of bug #56682, we also found out that on some macOS systems, the `DYLD_LIBRARY_PATH' environment variable has to be set to blank. This is no problem because RPATH is automatically set in macOS and the executables and libraries contain the absolute address of the libraries they should link with. But having `DYLD_LIBRARY_PATH' can conflict with some low-level system libraries and cause very hard to debug linking errors (like that reported in the bug report). This fixes bug #56682.
2019-07-27DYLD_LIBRARY_PATH also fixed for macOS systemsMohammad Akhlaghi-7/+8
Until now we were only setting the `LD_LIBRARY_PATH' environment variable for GNU/Linux systems. But macOS systems use the `DYLD_LIBRARY_PATH'. With this commit, for better control over the environment, we are also fixing `DYLD_LIBRARY_PATH' in all the places that we are setting the general environment variables.
2019-06-29Added citation for TIDES, sorted progs alphabeticallyMohammad Akhlaghi-5/+9
While reviewing Prasenjit's commits, I noticed that we had forgot to add the citation for TIDES, also to make things clear, the program/library build rules are now sorted alphabetically. Finally, I noticed that after building the TiKZ PDF figures, it is crashing (like on Prasenjit's computer). After looking around, I noticed its because we were setting the of the `TEXINPUTS' environment variable to be the installed TeX Live directory (which was ultimately redundant because by default TeX will look into where it was installed). The important thing is just that we remove any possible value the host system has, not to set new directories.
2019-06-28Corrections to basic buildPrasenjit Saha-1/+1
Several corrections were necessary in the basic build: 1) the version of GCC on some systems includes an `_' which would cause a crash when building the PDF. 2) libcharset had to be manually added to the Git build.
2019-06-13Permission flags of top.mk set to 644 like others, not 755Mohammad Akhlaghi-0/+0
Until now, for some reason, the permission flags of `top.mk' were 755 (good for an executable), not 644 (which is what they should be for a plain text file that is run by another program). This is corrected with this commit.
2019-05-21Imported Matplotlib installation, no conflictsMohammad Akhlaghi-0/+0
There weren't any conflicts in this merge.
2019-05-21ImageMagick is now included into the projectRaul Infante-Sainz-0/+0
With this commit, ImageMagick software has been added into the project. This software is useful to deal and treat images from the command line. Since it is widely used and a lot of other programs rely on it, it is worth to have it into the project.
2019-05-21Source directory links to build directory all managed in configureMohammad Akhlaghi-5/+2
Until now, the `tex/build' symbolic link was put in the clone/source tree when the build-directory's `tex' directory was being built. Thanks to Roberto Baena, we just found a bug because of this behavior: when a second group member is trying to build the pipeline, since the build directory's `tex' directory already exists, no `tex/build' will be put in their clone/source directory. As a result, the PDF building will crash. To fix this (and keep things organized), the two `tex/build' and `tex/tikz' links (to the build directory) are now built in the configure step while it is building all the top-level directories. They are no longer built within the Makefiles. Also, a comment was added on top of every directory built during the configuration phase to be clear. This fixes bug #56362.
2019-04-30End-of-line Backslashs no longer right under each otherMohammad Akhlaghi-9/+9
When we need to quote the new-line character we end the line with a backslash (`\'). Until now, our convention has been to put all such backslashes under each other to help in visual inspection. But this causes a lot of confusion in version control: if only one line's length is larger, the whole block will be marked as changed and thus makes it hard to visually see the actual change. It also makes debuging the code (adding some temporary lines) hard. With this commit, I went through all the files and tried to fix all such cases so only a single white space character is between the last command character and the backslash. Where there was an empty line (ending with a backslash, to help in visually separating the code into blocks), I put the backslash right under the previous line's. This completes task #15259.
2019-04-29Fixed a few architecture remnants in initialize.mkMohammad Akhlaghi-33/+3
In a few cases, `reproduce/analysis/make/initialize.mk' still assumed the old architecture. With this commit, they have been corrected.
2019-04-17Corrected bibtex entry for Astrometry-net and SwarpRaul Infante-Sainz-1/+1
Until now, there were erros in the citation of Astrometry-net and Scamp papers. With this commit, we fix these problems. The Swarp bibtex has also been modify to follow the stetic of the citation style we have right now in the project. We also added the `dependency-bib.tex' as a prerequisite of `paper.bbl'.
2019-04-15New architecture to separate software-building and analysis stepsMohammad Akhlaghi-0/+831
Until now, the software building and analysis steps of the pipeline were intertwined. However, these steps (of how to build a software, and how to use it) are logically completely independent. Therefore with this commit, the pipeline now has a new architecture (particularly in the `reproduce' directory) to emphasize this distinction: The `reproduce' directory now has the two `software' and `analysis' subdirectories and the respective parts of the previous architecture have been broken up between these two based on their function. There is also no more `src' directory. The `config' directory for software and analysis is now mixed with the language-specific directories. Also, some of the software versions were also updated after some checks with their webpages. This new architecture will allow much more focused work on each part of the pipeline (to install the software and to run them for an analysis).