paper-concept.git - Paper (Towards Long-term and Archivable Reproducibility)

Age	Commit message	Author	Lines
2020-09-03	Imported recent work in Maneage, minor conflicts fixed	Mohammad Akhlaghi	-12/+24
	Only two small conflicts came up: * The addition of the hardware architecture macro in 'paper.tex' (which was removed for now, but will be added as the referee has requested within the text). * The usage of "" around directory variables in 'paper.mk'.
2020-08-28	Edited README.md to remove installation of a text editor	Mohammad Akhlaghi	-10/+7
	With the previous commit, we now build Nano by default within Maneage, and project authors can ask to install Emacs and Vim within 'TARGETS.conf'. So the instructions to install a text editor when building within a Docker image have been removed.
2020-08-25	README.md: added explanation on copying files from Docker image	Mohammad Akhlaghi	-2/+17
	When building Maneage inside a Docker container, users eventually want to extract the final outputs from the container into their host operating system to inspect them more comfortably. So with this commit, a short explanation has been added on how to do this. We also noticed that it is much better if the 'Dockerfile' is stored and run in an empty directory; otherwise, it will start parsing the full directory and its subdirectories as the docker image's environment.
2020-08-20	Imported recent updates in Maneage, minor conflicts fixed	Mohammad Akhlaghi	-1/+10
	Some very minor conflicts came up and were easily corrected. They were mostly in parts that are also shared with the demonstration in the core Maneage branch.
2020-07-17	README.md now has description of building project in Docker	Mohammad Akhlaghi	-0/+218
	Docker is a "container" technology that allows an almost independent operating system to run on the host. It is useful when the host OS doesn't support some features or has internal problems (for example its C library or C compiler have problems). Fortunately a Maneaged project can easily be built within a Docker image and a minimal image operating system. With this commit, a section has been added to 'README.md' to describe this process. Each step of the Dockerfile is explained, to help users who may not be too familiar with Docker, or Docker users who are not familiar with Maneage.
2020-07-04	Improved comments in paper.mk and README.md	Mohammad Akhlaghi	-7/+9
	In 'README.md' I tried to explain a little better that TeXLive will only install its necessary packages, not the full TeXLive library! Also in paper.mk, I slightly improved the comments with very minor edits. Both these parts are slated to go into the core Maneage branch, so it's important to maintain them here for now.
2020-07-03	Edits in README.md on Docker images and internet for TeXLive	Mohammad Akhlaghi	-9/+13
	The explanation was made more clear.
2020-07-01	Edited the Docker container explanations in README.md	Mohammad Akhlaghi	-79/+99
	The explanations are now more clear for someone that is less familiar with Docker.
2020-06-30	README.md: improved explanation on running Docker	Mohammad Akhlaghi	-57/+124
	With the new features in Maneage to install the necessary Xorg libraries, the explanations of the Docker image creation also needed to be updated.
2020-06-28	README.md now has descriptions to build a Dockerfile	Mohammad Akhlaghi	-29/+154
	Docker is a very commonly used program these days for building projects in an almost independent operating system. So the instructions to build a Dockerfile for the project were added in README.md.
2020-06-07	Added SoftwareHeritage link, minor typo corrections and clarifications	Mohammad Akhlaghi	-2/+2
	The git history of the project is now archived on SoftwareHeritage and a link to it was added in the "Reproducible supplement" tag just under the abstract. Some corrections were also made in the text. In particular, the part explaining the separation of software and data reproducibility was slightly clarified.
2020-06-06	IMPORTANT: Added publication checklist, improved relevant infrastructure	Mohammad Akhlaghi	-58/+90
	Possible semantic conflicts (that may not show up as Git conflicts but may cause a crash in your project after the merge): 1) The project title (and other basic metadata) should be set in 'reproduce/analysis/conf/metadata.conf'. Please include this file in your merge (if it is ignored because of '.gitattributes'!). 2) Consider importing the changes in 'initialize.mk' and 'verify.mk' if you have added all analysis Makefiles to the '.gitattributes' file (thus not merging any change in them with your branch). For example with this command: git diff master...maneage -- reproduce/analysis/make/initialize.mk 3) The old 'verify-txt-no-comments-leading-space' function has been replaced by 'verify-txt-no-comments-no-space'. The new function will also remove all white-space characters between the columns (not just white-space characters at the start of the line). Thus the resulting check won't involve spacing between columns. A common set of steps is always necessary to prepare a project for publication. Until now, we would simply look at previous submissions and try to follow them, but that was prone to errors and could cause confusion. The internal infrastructure also lacked some useful features to make good publication possible. Now that the submission of a paper fully devoted to the founding criteria of Maneage is complete (arXiv:2006.03018), it was time to formalize the necessary steps for easier submission of a project using Maneage and implement some low-level features that can make things easier. With this commit a first draft of the publication checklist has been added to 'README-hacking.md'; it was tested in the submission of arXiv:2006.03018 and zenodo.3872248. To help guide users on implementing the good practices for output datasets, the outputs of the default project shown in the paper now use the new features. After reading the checklist, please inspect these. 
Some other relevant changes in this commit: - The publication involves a copy of the necessary software tarballs. Hence a new target ('dist-software') was also added to package all the project's software tarballs in one tarball for easy distribution. - A new 'dist-lzip' target has been defined for those who want to distribute an Lzip-compressed tarball. - The '\includetikz' LaTeX macro now has a second argument to allow configuring the '\includegraphics' call when the plot should not be built, but just imported.
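The behaviour described for 'verify-txt-no-comments-no-space' can be illustrated with a minimal shell sketch. This is not the actual Maneage code: the function name 'checksum_no_comments_no_space', the use of 'cksum' and the sample tables are assumptions made only for illustration.

```shell
# Hypothetical helper (NOT the actual Maneage implementation): print a
# checksum of a plain-text table after dropping comment lines and ALL
# white-space characters (spaces and tabs).
checksum_no_comments_no_space () {
    # Delete lines whose first non-blank character is '#', remove every
    # space and tab from the remaining lines, then checksum the result.
    sed -e '/^[[:space:]]*#/d' "$1" | tr -d ' \t' | cksum
}

# Two illustrative tables: same values, different comments and spacing.
tmpdir=$(mktemp -d)
printf '# created on some date\n1   2.5\n3   4.5\n' > "$tmpdir/a.txt"
printf '# created on another date\n1 2.5\n  3\t4.5\n' > "$tmpdir/b.txt"

sum_a=$(checksum_no_comments_no_space "$tmpdir/a.txt")
sum_b=$(checksum_no_comments_no_space "$tmpdir/b.txt")
[ "$sum_a" = "$sum_b" ] && echo "tables match"
rm -rf "$tmpdir"
```

Because the column spacing is discarded before the checksum is computed, a change in column alignment or in a comment (for example a creation date) does not cause a verification failure in this sketch.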
2020-06-04	README.md, separated scenarios of building from tarball	Mohammad Akhlaghi	-23/+60
	The previous explanation was not too clear and simply following it was confusing. The issue was that with the tarball you have three scenarios: 1) only build the PDF using existing figures. 2) only build the PDF, but build the figures yourself, 3) build the full Maneaged project. Hopefully this distinction is now more clear from the README.md file.
2020-06-04	README.md: improved points on building from tarball	Mohammad Akhlaghi	-14/+11
	Some extra explanation can help the user understand the difference between a Git-based project and a distributed tarball.
2020-06-04	Verification activated, README added, Proper metadata in plot data	Mohammad Akhlaghi	-21/+17
	All the steps following the to-be-added (in 'README-hacking.md') publication checklist prior to the final check from new clone have been added: - 'README.md' file has been set. - "Reproducible supplement" was added just above the keywords, pointing to Zenodo. - A link to the to-be-uploaded data underlying the plot was added in the caption of the tools-per-year plot. - A new meta-data configuration file was added to store basic project metadata to be used throughout the project. This will later be taken into Maneage. For example the project title is now stored here and written into the paper's LaTeX source and output datasets automatically. - Verification was activated and the plot's data and LaTeX macro files are now automatically verified. - Complete metadata was added for the data underlying the plot. - A generic function was added in 'initialize.mk' that will automatically write project info and copyright in all plain-text outputs.
2020-05-22	Corrected copyright notices to fit GPL suggested format	Mohammad Akhlaghi	-1/+1
	In time, some of the copyright license descriptions had been mistakenly shortened to two paragraphs instead of the original three that are recommended in the GPL. With this commit, they are corrected to be exactly in the same three-paragraph format suggested by the GPL. The following files also didn't have a copyright notice, so one was added for them: reproduce/software/make/README.md reproduce/software/bibtex/healpix.tex reproduce/analysis/config/delete-me-num.conf reproduce/analysis/config/verify-outputs.conf
2020-04-25	IMPORTANT: Primary Maneage repositories are now under maneage.org	Mohammad Akhlaghi	-2/+2
	Until now, the primary Maneage URLs were under GitLab, but since we now have a dedicated URL and Git repository, it's better to transfer to this as soon as possible. Therefore with this commit, throughout Maneage, any place that Maneage was referenced through GitLab has been corrected. Please correct your project's remote to point to the new repository at `git.maneage.org/project.git', and please make sure it follows the `maneage' branch. There is no more `master' branch on Maneage.
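As a rough sketch of the remote correction requested above, the commands below change the URL of a remote inside a throw-away repository. The remote name 'origin-maneage' and the 'https://' scheme are assumptions (the commit message gives neither); the old URL is the GitLab address mentioned in the 2020-04-20 commit.

```shell
# Demonstration in a temporary repository. ASSUMPTIONS: the remote is
# called 'origin-maneage' and the new address is served over https.
repo=$(mktemp -d)
git init -q "$repo"

# A project that still points to the old GitLab address...
git -C "$repo" remote add origin-maneage https://gitlab.com/maneage/project

# ...is switched to the new repository under maneage.org.
git -C "$repo" remote set-url origin-maneage https://git.maneage.org/project.git

# Confirm which URL the remote now uses.
new_url=$(git -C "$repo" remote get-url origin-maneage)
echo "$new_url"
rm -rf "$repo"
```

In a real project the same 'git remote set-url' call would be run in the project's own clone, with whatever name its Maneage remote actually has, followed by a fetch of the `maneage' branch.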
2020-04-20	Maneage instead of Template in README-hacking.md and copyright notices	Mohammad Akhlaghi	-9/+6
	Until now, throughout Maneage we were using the old name of "Reproducible Paper Template". But we have finally decided to use Maneage, so to avoid confusion, the name has been corrected in `README-hacking.md' and also in the copyright notices. Note also that in `README-hacking.md', the main Maneage branch is now called `maneage', and the main Git remote has been changed to `https://gitlab.com/maneage/project' (this is a new GitLab Group that I have set up for all Maneage-related projects). In this repository there is only one `maneage' branch to avoid complications with the `master' branch of the projects using Maneage later.
2020-01-23	IMPORTANT: Project preparation is now also done with project make	Mohammad Akhlaghi	-15/+1
	Until now, the main commands to run the project were these: `./project configure' (to build the software), `./project prepare' (to possibly arrange input datasets and build special configuration Makefiles) and finally `./project make' to run the project. The main logic behind the "prepare" phase (`top-prepare.mk') is to build configuration files that can be fed into the "make" step and optimize its operation, for example when the number of inputs necessary for the majority of the analysis is smaller than the total number of inputs. With "prepare" (when necessary), you go through the raw inputs and select the ones that are necessary for the rest of the project. The output of `top-prepare.mk' is a configuration file (a Make variable) that keeps the IDs (numbers, names, etc). That configuration file would then be used in `top-make.mk' to identify the lower-level targets and allow optimal project organization and management. But the last two are both part of the analysis, and while they indeed need different calls to Make to be executed, many projects don't actually need a preparation phase: ultimately, it's an implementation choice by the project developers and doesn't concern the project users (or the developers when they are running it). To avoid confusing the users, or simply annoying them when a project doesn't need it, with this commit, the top-level `top-prepare.mk' and `top-make.mk' Makefiles are called with the single `./project make' command and `./project prepare' has been dropped. I noticed this while writing the paper on this system.
2020-01-01	Copyright statements updated to include 2020	Mohammad Akhlaghi	-1/+1
	Now that it's 2020, it's necessary to include this year in the copyright statements.
2019-10-01	Preparation phase added before final building	Mohammad Akhlaghi	-5/+18
	In many real-world scenarios, `./project make' can really benefit from having some basic information about the data before being run, for example when querying a server. If we know how many datasets were downloaded and their general properties, it can greatly optimize the process when we are designing the solution to be run in `./project make'. Therefore with this commit, a new phase has been added to the template's design: `./project prepare'. In the raw template this is empty, because the simple analysis done in the template doesn't warrant it. But everything is ready for projects using the template to add preparation phases prior to the analysis.
2019-09-28	Main project script sets executable flags at configure time	Mohammad Akhlaghi	-3/+3
	Until now, when the project's source was downloaded from something like arXiv, in `README.md', we were instructing readers to set the executable flags of all the files that need it. But except for `./project', the reader shouldn't have to worry about the project internals! Once it's executable, `./project' can easily fix the executable flags of all the files that need it automatically. With this commit, in `README.md', we just instruct the reader to set the executable flag of `./project' and any other file that needs an executable flag is given one at the start of the set of commands for `./project configure'. In customized projects, if an author needs executable flags on any other files, they can easily add them there without involving the user.
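The idea of letting the top-level script set the remaining executable flags can be sketched as follows. This is only an illustration under stated assumptions: the function name 'set_exec_flags' and the list of files are hypothetical, and the real `./project' script may do this differently.

```shell
# Hypothetical sketch: give executable flags to a list of project files.
# The file names used below are illustrative only.
set_exec_flags () {
    # $1 is the project's top directory; the remaining arguments are the
    # files (relative to it) that need an executable flag.
    topdir=$1; shift
    for f in "$@"; do
        # Silently skip files that are not present in this project.
        if [ -f "$topdir/$f" ]; then chmod +x "$topdir/$f"; fi
    done
}

# Demonstration on a temporary directory tree.
demo=$(mktemp -d)
mkdir -p "$demo/reproduce/software/bash"
printf '#!/bin/sh\necho configured\n' > "$demo/reproduce/software/bash/configure.sh"
set_exec_flags "$demo" reproduce/software/bash/configure.sh missing-file.sh

flag_ok=no
[ -x "$demo/reproduce/software/bash/configure.sh" ] && flag_ok=yes
echo "executable: $flag_ok"
rm -rf "$demo"
```

With something like this at the start of the configure step, the reader only needs to make `./project' itself executable, and authors of customized projects can extend the file list in one place.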
2019-09-27	Changing to the cloned directory added to README.md	Mohammad Akhlaghi	-0/+1
	Konrad Hinsen pointed out that this part was missing from the instructions in `README.md' after cloning. So it is added.
2019-09-26	Minor edits/clarifications in README.md	Mohammad Akhlaghi	-13/+7
	The two modifications to the LaTeX source of an arXiv-downloaded source weren't rendered properly on Gitlab, so they are corrected to be in the same line and not have a separate code-block.
2019-09-26	Working project when downloaded from arXiv	Mohammad Akhlaghi	-1/+77
	Until now, we were assuming that the users would just clone the project in Git. But after submitting arXiv:1909.11230, and trying to build directly from the arXiv source, I noticed several problems that wouldn't allow users to build it automatically. So I tried the build step by step and was able to find a fix for the several issues that came up. The scripting parts of the fix were primarily related to the fact that the unpacked arXiv tarball isn't under version control, so some checks had to be put there. Also, we wanted to make it easy to remove the extra files, so an extra `--clean-texdit' option was added to `./project'. Finally, some manual corrections were necessary (prior to running `./project'), which are now described in `README.md'. Most of the later steps can be automated and we should do it later; I just don't have enough time now.
2019-09-18	README.md written to be more generic and easy to customize	Mohammad Akhlaghi	-37/+38
	Until now customizing it was a little more detailed, for example the copyright statement wasn't generic and was about "this template". So the user would have to correct it. With this commit, the copyright statement just says "this project", so it can apply to the raw template and also any customization of it. Also, some minor edits were made in the various parts of the text to make it more clear.
2019-08-28	Minor cosmetic markdown corrections in README.md	Mohammad Akhlaghi	-2/+2
	The Copyright year is now on a separate line (by adding a backslash), and the `file-metadata' is now enclosed in two "`" characters to show differently after rendering.
2019-07-28	Single wrapper instead of old ./configure, Makefile and ./for-group	Mohammad Akhlaghi	-6/+6
	Until now, to work on a project, it was necessary to `./configure' it and build the software. Then we had to run `.local/bin/make' to run the project and do the analysis every time. If the project was a shared project between many users on a large server, it was necessary to call the `./for-group' script. This way of managing the project had a major problem: since the user directly called the lower-level `./configure' or `.local/bin/make' it was not possible to provide high-level control (for example limiting the environment variables). This was especially noticed recently with a bug that was related to environment variables (bug #56682). With this commit, this problem is solved using a single script called `project' in the top directory. To configure and build the project, users can now run these commands: $ ./project configure $ ./project make To work on the project with other users in a group these commands can be used: $ ./project configure --group=GROUPNAME $ ./project make --group=GROUPNAME The old options to both configure and make the project are still valid. Run `./project --help' to see a list. For example: $ ./project configure -e --host-cc $ ./project make -j8 The old `configure' script has been moved to `reproduce/software/bash/configure.sh' and is called by the new `./project' script. The `./project' script now just manages the options, then passes control to the `configure.sh' script. For the "make" step, it also reads the options, then calls Make. So in the lower-level nothing has changed. Only the `./project' script is now the single/direct user interface of the project. On a parallel note: as part of bug #56682, we also found out that on some macOS systems, the `DYLD_LIBRARY_PATH' environment variable has to be set to blank. This is no problem because RPATH is automatically set in macOS and the executables and libraries contain the absolute address of the libraries they should link with. 
But having `DYLD_LIBRARY_PATH' can conflict with some low-level system libraries and cause very hard to debug linking errors (like that reported in the bug report). This fixes bug #56682.
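The single-wrapper design described in this commit can be sketched in a few lines of shell. This is a hedged illustration and not the actual `./project' script: the function name 'project_dispatch' is hypothetical, and it only prints the lower-level command it would call instead of running it.

```shell
# Hypothetical sketch of a single front-end: it reads the operation and
# its options, then reports which lower-level command it would run.
project_dispatch () {
    operation=$1; shift
    group=""
    passthrough=""
    for arg in "$@"; do
        case $arg in
            --group=*) group=${arg#--group=} ;;          # keep the group name
            *)         passthrough="$passthrough $arg" ;; # forward the rest
        esac
    done
    case $operation in
        configure) cmd="reproduce/software/bash/configure.sh" ;;
        make)      cmd=".local/bin/make" ;;
        *) echo "unknown operation: $operation" >&2; return 1 ;;
    esac
    # A real wrapper could sanitize the environment at this point (for
    # example with 'env -i') before passing control to the command.
    echo "group='$group' $cmd$passthrough"
}

project_dispatch configure --group=GROUPNAME
project_dispatch make -j8
```

Because every call goes through one entry point, high-level control (such as limiting environment variables) has a single place to live, while the lower-level scripts stay unchanged.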
2019-04-14	Replaced all occurrences of pipeline in text	Mohammad Akhlaghi	-11/+11
	All occurrences of "pipeline" have been changed to "project" or "template" within the text (comments and READMEs) of the template. The main template branch is now also named `template'. This was all because `pipeline' is too generic and couldn't be distinguished from the base, and customized project.
2019-04-14	Corrected copyright information for .file-metadata	Mohammad Akhlaghi	-1/+2
	Since `.file-metadata' is a binary file, we can't include a copyright inside of it so we have to use `README.md' to mention its copyright and license notice. However, this was not done clearly and is now corrected.
2019-04-13	Corrected copyright notices and info about adding copyright info	Mohammad Akhlaghi	-4/+2
	Until now, the files that people were meant to change didn't have a proper copyright notice (for example `Copyright (C) YOUR NAME.'). This was wrong because the license does not convey copyright ownership. So the name of the file's original author must always be included, even when people modify it (and add their own copyrightable modifications). With this commit, the file's original author (and email) are added to the copyright notice and when more than one person has modified a file, both names have their individual copyright notice. Based on this, the description for adding a copyright notice in `README-hacking.md' has also been modified.
2019-04-11	.file-metadata also given a copyright in top README	Mohammad Akhlaghi	-3/+5
	Since `.file-metadata' is a binary file and we couldn't put a copyright notice within it, it has been mentioned in `README.md' to have the same copyright. Also, the copyright modification step in `README-hacking.md' was brought to a later step to be more clear that it should always be done (on new files or files that are changed).
2019-04-07	Copyright notice added to all files missing one	Mohammad Akhlaghi	-3/+6
	Until now, for short files, we only had a license notice, not an actual copyright notice. With this commit, a copyright notice has also been added. We use this new command to find these files, suggested by `ineiev@gnu.org'.
2019-03-29	Added Copyright to all TeX and README files	Mohammad Akhlaghi	-0/+25
	In order to be more clear, a copyright statement was added to all the LaTeX and README files.
2019-01-23	README-pipeline.md is now called README-hacking.md	Mohammad Akhlaghi	-1/+1
	To be more generic and recognizable, the `README-pipeline.md' file was renamed to `README-hacking.md'. In essence, it is just that: to hack the existing pipeline for your own project. Many GNU software packages follow a similar naming convention.
2019-01-17	README-pipeline.md referenced in README.md	Mohammad Akhlaghi	-1/+3
	Until now, there was no reference to `README-pipeline.md' within the `README.md' file. Since `README.md' is the first file that someone reads and the basic purpose and structure of the pipeline is described in `README-pipeline.md', it was necessary to bring it up there.
2018-12-06	Edited README.md to show example dependency repository	Mohammad Akhlaghi	-3/+3
	To help and be more clear a link to this pipeline's dependency repository has been added to `README.md'.
2018-12-05	Updated README.md	Mohammad Akhlaghi	-15/+15
	The README.md file was updated to reflect recent changes in the pipeline (especially regarding the downloader).
2018-11-22	Spell check in two READMEs	Mohammad Akhlaghi	-2/+2
	A spellcheck was run on the two README files.
2018-11-22	Minor edit/correction in README.md	Mohammad Akhlaghi	-9/+8
	The note to the pipeline designers was corrected to display properly on Gitlab.
2018-11-22	Placeholder in README.md for pipeline dependencies	Mohammad Akhlaghi	-2/+8
	A placeholder link is now used in `README.md' to encourage the pipeline designers to keep a backup of all the dependencies they use.
2018-11-22	Top level READMEs renamed to be similar to actual project	Mohammad Akhlaghi	-944/+49
	Until now, we were advising the users to rename the two README files after cloning the project. This was because online Git browsers usually display the `README.md' file, so we wanted the description of the pipeline to be visible in the pipeline, and later when a project adopts it, they can have their own `README.md'. But the problem is that any change in `README.md' will later cause conflicts with a project's `README.md'. So we are now using the same naming convention as the papers that use the pipeline.
2018-11-22	Checklist defining remote moved to top	Mohammad Akhlaghi	-19/+26
	In the checklist, we are now defining the remote host of the repository at an early stage. This is because we will need it in the `README.md' file (which now has a placeholder `XXXXXXX' instead of a valid URL).
2018-11-22	Using .local instead of ./.local in READMEs	Mohammad Akhlaghi	-6/+6
	Until now, in the instructions, we were suggesting to run `./.local/bin/make', but the `./' part is extra: this is already a directory and so the shell will be able to find it. So to make things more clear and easy to read/write, we removed the `./' part from the calls to our custom Make installation.
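The reasoning above rests on a standard shell rule: a command name that contains a slash is treated as a path relative to the current directory and is not searched for in `PATH'. The sketch below checks this with a stand-in script; the temporary directory and the fake `make' are assumptions made only for the demonstration.

```shell
# Build a throw-away directory with a stand-in for the custom Make.
demo=$(mktemp -d)
mkdir -p "$demo/.local/bin"
printf '#!/bin/sh\necho "custom make called"\n' > "$demo/.local/bin/make"
chmod +x "$demo/.local/bin/make"

# Both spellings contain a slash, so the shell resolves them relative
# to the current directory without consulting PATH.
out_with_dot=$(cd "$demo" && ./.local/bin/make)
out_without_dot=$(cd "$demo" && .local/bin/make)

[ "$out_with_dot" = "$out_without_dot" ] && echo "both forms run the same file"
rm -rf "$demo"
```

The leading `./' is only needed for a bare file name in the current directory (one with no slash in it), which is why it could be dropped from the instructions here.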
2018-11-21	Changing of README files in checklist	Mohammad Akhlaghi	-19/+25
	When you point to this project, the `README.md' file is the default file that opens on GitLab and other online git repositories. Since a reproduction pipeline project is different from the actual pipeline, it's best for the default text that opens to describe the paper, not the pipeline. The old `README.md' is also kept, but it's now called `README-pipeline.md'.
2018-11-21	Pulling into pipeline branch instead of fetching in README.md	Mohammad Akhlaghi	-13/+15
	In the previous commit, we were recommending to fetch the work from this pipeline. But since we have a separate `pipeline' branch, we can simply checkout to that branch and pull all the recent changes. So with this commit, the steps to get recent updates to the pipeline are updated.
2018-11-21	Fetching pipeline updates explained in README.md checklist	Mohammad Akhlaghi	-7/+26
	Since the pipeline will evolve along with the projects that use it, it can be useful for projects to fetch updates in the pipeline. So the checklist in `README.md' was updated to explain how to do this cleanly.
2018-11-21	Updated description of Make in README.md	Mohammad Akhlaghi	-7/+13
	Until now, because we didn't build the dependencies internally, it was important for the pipeline to be usable with any version of Make. But because of the new installation of dependencies (including GNU Make), that is no longer the case. So we can safely use GNU Make and this needs to be mentioned in `README.md'.
2018-11-20	GNU Coreutils now built in basic dependencies	Mohammad Akhlaghi	-4/+6
	GNU Coreutils are basic programs that can help in the configuration of higher-level programs. Because of that, it was a dependency of almost all software built in `dependencies.mk'. To make things more clear, easier to read and faster (when building in parallel), the building of Coreutils is now moved to the `dependencies-basic.mk' rules. There, it is built alongside Bash. Since `dependencies-basic.mk' is run and completed before `dependencies.mk', we can be sure that Coreutils is present by the time we want to build the higher-level programs. Also, Zlib is now added as a dependency of Git (it is necessary for its build).
2018-11-19	Minor corrections for easy applying of checklist	Mohammad Akhlaghi	-26/+30
	After going through the checklist for starting a new project based on the pipeline, I noticed some parts that could be modified to be more clear. They are now applied.