paper-concept.git - Paper (Towards Long-term and Archivable Reproducibility)

Age	Commit message (Collapse)	Author	Lines
2021-04-17	IMPORTANT: print-general-metadata new name for print-copyright	Mohammad Akhlaghi	-64/+79
	Summary: - Use the new name of this variable in your Makefiles. - In 'metadata.conf', remove fixed URL prefixes for DOIs ('https://doi.org/') or arXiv ('https://arxiv.org/abs'). Until now, the Make variable that would print the general metadata (of whole project) into each to-be-published dataset was called 'print-copyright'! But it now does much more than simply printing the copyright, it will also print a lot of metadata like arXiv ID, Zenodo DOI and etc into plain-text outputs. The out-dated name could thus be misleading and cause confusions. With this commit, the variable is therefore called 'print-general-metadata'. After merging your project with the Maneage branch, please replace any usage of 'print-copyright' to 'print-general-metadata'. Also with this commit, 'README-hacking.md' mentions 'metadata.conf' and 'print-general-metadata' in the "Publication checklist" section and reminds you to keep the first up to date, and use the second in your to-be-published datasets.
2021-01-04	README-hacking.md: edits and improvements to publication checklist	Mohammad Akhlaghi	-23/+40
	After going through the publication checklist, some edits were made to make things more clear. Also, an item was added to remind the project author that the commit hashes on the uploaded data files should be the same.
2021-01-02	Copyright year updated in all source files	Mohammad Akhlaghi	-4/+4
	Having entered 2021, it was necessary to update the copyright years at the top of the source files. We recommend that you do this for all your project-specific source files also.
2020-12-01	README-hacking.md: recommended to push maneage after merging	Mohammad Akhlaghi	-4/+11
	Until now at the end of the updating process, we hadn't explicity talked about pushing the branches. So people would usually only push their 'master' branch to their remote. While the merged 'master' branch does contain the commits from the core Maneage branch, having a no-updated 'maneage' branch reference on their remote can be confusing. With this commit, at the end of the process to merge with the 'maneage' branch we explicitly recommend to push both the 'master' and 'maneage' branches.
2020-11-28	README-hacking.md: updated paper to cite for using Maneage	Mohammad Akhlaghi	-4/+3
	Until now, we were asking the users of Maneage to cite the first paper that used its primoridal version (arXiv:1505:01664). But there is now a paper that fully describes the concept (arXiv:2006.03018). With this commit, in the 'citation' section of 'README-hacking.md' we now ask to cite the new paper.
2020-10-09	Update README-hacking.md with elaphrocentre ArXiv:2010.03742	Boud Roukema	-2/+11
	This commit updates README-hacking.md with the URIs for the 'elaphrocentre' galaxy formation pipeline paper arXiv:2010.03742. This makes three papers currently in the peer review pipeline: arXiv:2006.03018, arXiv:2007.11779, and arXiv:2010.03742, each chronologically corresponding to various stages of the review process.
2020-09-24	Gnuastro's analysis configuration files removed	Mohammad Akhlaghi	-5/+3
	Until now, the core Maneage branch included some configuration files for Gnuastro's programs. This was actually a remnant of the distant past when Maneage didn't actually build its own software and we had to rely on the host's software versions. This file contained the configuration files specific to Gnuastro for this project and also had a feature to avoid checking the host's own configuration files. However, we now build all our software ourselves with fixed configuration files (for the version that is being installed and its version is stored). So those extra configuration files were just extra and caused confusion and problems in some scenarios. With this commit, those extra files are now removed. Also, two small issues are also addressed in parallel with this commit: - When running './project make clean', the 'hardware-parameters.tex' macro file (which is created by './project configure' is not deleted. - The project title is now written into the default output's PDF's properties (through 'hypersetup' in 'tex/src/preamble-header.tex') through the LaTeX macro. All these issues were found and fixed with the help of Samane Raji.
2020-07-28	README-hacking.md: added new paper using Maneage (arXiv:2007.11779)	Boud Roukema	-4/+15
	Roukema+2020 (arXiv:2007.11779) is a newly published (as preprint) paper that uses Maneage, so it is being added to the list of published or submitted papers in 'README-hacking.md'. The Software Heritage URL sticks out way beyond the standard number of columns in the plain text form of the updated 'README-hacking.md' file, when rendered using markdown, it shouldn't look so bad. Also, see the related task https://savannah.nongnu.org/task/index.php?15736 (Raul+2020 should be Infante-Sainz+2020) for a suggestion of a more standard machine-readable format. It should be mentioned and emphasised to the reader that one should very carefully and obediently note and pay attention to the noteworthy fact that a few distracting words [1] such as "Note that" are removed in this commit. ;) [1] https://en.wiktionary.org/wiki/pontification
2020-07-20	README-hacking.md: clarify Zenodo usage in publication checklist	Boud Roukema	-8/+23
	This commit clarifies the initial usage of Zenodo for reserving a Zenodo identifier and starting an 'unpublished' upload. Some other minor wording changes are done here.
2020-06-06	IMPORTANT: Added publication checklist, improved relevant infrastructure	Mohammad Akhlaghi	-16/+330
	Possible semantic conflicts (that may not show up as Git conflicts but may cause a crash in your project after the merge): 1) The project title (and other basic metadata) should be set in 'reproduce/analysis/conf/metadata.conf'. Please include this file in your merge (if it is ignored because of '.gitattributes'!). 2) Consider importing the changes in 'initialize.mk' and 'verify.mk' (if you have added all analysis Makefiles to the '.gitattributes' file (thus not merging any change in them with your branch). For example with this command: git diff master...maneage -- reproduce/analysis/make/initialize.mk 3) The old 'verify-txt-no-comments-leading-space' function has been replaced by 'verify-txt-no-comments-no-space'. The new function will also remove all white-space characters between the columns (not just white space characters at the start of the line). Thus the resulting check won't involve spacing between columns. A common set of steps are always necessary to prepare a project for publication. Until now, we would simply look at previous submissions and try to follow them, but that was prone to errors and could cause confusion. The internal infrastructure also didn't have some useful features to make good publication possible. Now that the submission of a paper fully devoted to the founding criteria of Maneage is complete (arXiv:2006.03018), it was time to formalize the necessary steps for easier submission of a project using Maneage and implement some low-level features that can make things easier. With this commit a first draft of the publication checklist has been added to 'README-hacking.md', it was tested in the submission of arXiv:2006.03018 and zenodo.3872248. To help guide users on implementing the good practices for output datasets, the outputs of the default project shown in the paper now use the new features). After reading the checklist, please inspect these. Some other relevant changes in this commit: - The publication involves a copy of the necessary software tarballs. Hence a new target ('dist-software') was also added to package all the project's software tarballs in one tarball for easy distribution. - A new 'dist-lzip' target has been defined for those who want to distribute an Lzip-compressed tarball. - The '\includetikz' LaTeX macro now has a second argument to allow configuring the '\includegraphics' call when the plot should not be built, but just imported.
2020-06-04	README-hacking.md: minor edits in description of merging with Maneage	Mohammad Akhlaghi	-7/+15
	The recently added description for this step in the last commit needed some edits to be more clear and encourage re-building the project from scratch anytime authors merge with Maneage.
2020-06-03	README-hacking.md: Improved section on ignoring some files in Maneage	Mohammad Akhlaghi	-22/+53
	When some files should not be merged, until now we were suggesting to also add deleted files to the '.gitattributes' file. However, this feature of Git doesn't work for deleted files and they would still show up in the 'master' branch after a merge. So with this commit, we have added a simple AWK command to run after a merge that will automatically detect and delete such files (using the output of 'git status --porcelain'). Also, two minor typos were corrected in the newly added 'servers-backup.conf' file: the copyright year was wrong and there was no new-line at the end of the file (a good convention!).
2020-05-25	Unified reference to GNU/Linux and free software	Mohammad Akhlaghi	-2/+2
	One of the main reasons to building Maneage is to properly acknowledge/attribute the authors of software in research. So we have adopted a standard of never referring to the GNU-based operating systems running the Linux kernel simply as "Linux", we avoid terms like "Open Sourse" and use Free Software instead (in the same spirit). With this commit, a few instances of the cases above have been corrected, they had slipped through our fingers when we initially imported them into the project. In the special case of the "Journal for Open Source Software", we simply replaced it with its abbreviation (JOSS). This was done because in effect we were generally using journal name abbreviations in almost all the citations already. To avoid any inconsistancies, the names of the three other journals that weren't abbreviated are also abbreviated.
2020-05-22	Corrected copyright notices to fit GPL suggested format	Mohammad Akhlaghi	-10/+10
	In time, some of the copyright license description had been mistakenly shortened to two paragraphs instead of the original three that is recommended in the GPL. With this commit, they are corrected to be exactly in the same three paragraph format suggested by GPL. The following files also didn't have a copyright notice, so one was added for them: reproduce/software/make/README.md reproduce/software/bibtex/healpix.tex reproduce/analysis/config/delete-me-num.conf reproduce/analysis/config/verify-outputs.conf
2020-04-28	Astropy will no longer be installed by default	Mohammad Akhlaghi	-23/+30
	Until now Gnuastro and Astropy where installed by default in any clean build of Maneage. Gnuastro is used to do the demonstration analysis that is reported in the paper and Astropy was just there to help in testing the building of the MANY tools it depends on! It (and its dependencies) also had several papers that helped show software citation. However, as Boud suggested in task #15619, the burden of installing them for a new user may be too much and any future changes will cause merge conflicts. It may also give the impression that Maneage is only/mainly written for astronomers. So with this commit, I am removing Astropy as a default target. But we can only remove Gnuastro after we include an alternative analysis in the demonstration `delete-me' files. Following Boud's suggestion in that task, `TARGETS.conf' was also added to the files to be ignored in any future merge (in the checklist of `README-hacking.mk'). The solution was already described there, but mainly focused on the deleted `delete-me' files. So with this commit, I brought out this item as a more prominent item in the list. Maybe we can later add the analysis done in the Maneage paper (not yet published). In terms of testing the software builds, we already have task #15272 (Single target to build all high-level software, for testing) that aims to have a single configure option to install ALL high-level software and we can ask people to try if they like and report errors.
2020-04-26	README-hacking.md: described why automatic preparation only occurs once	Zahra Sharbaf	-1/+20
	Recently (since Commit 7d0c5ef77), the preparation is not run automatically every time. It is only run automatically the first time and needs to be manually called with the `--prepare-redo' option. But this wasn't explained in `README-hacking.md' (currently the main documentation of Maneage). With this commit, a description about invoking the preparation process after the first attempt of the running project has been added to `README-hacking.md'.
2020-04-25	IMPORTANT: Primary Maneage repositories are now under maneage.org	Mohammad Akhlaghi	-11/+10
	Until now, the primary Maneage URLs were under GitLab, but since we now have a dedicated URL and Git repository, its better to transfer to this as soon as possible. Therefore with this commit, throughout Maneage, any place that Maneage was referenced through GitLab has been corrected. Please correct your project's remote to point to the new repository at `git.maneage.org/project.git', and please make sure it follows the `maneage' branch. There is no more `master' branch on Maneage.
2020-04-21	README-hacking.md: removed any mention of tags	Mohammad Akhlaghi	-94/+68
	Tags are not a fixed piece of history (they can easily be moved and not imported in a different repository), so they are only confusing in the context of Maneage (where people should branch-off the main project). the raw commit hashes are a much more robust way to store a precise moment in history. Before this commit, I removed all Tags from the main Git repositories of Maneage and thus removed any mention of Tags with `README-hacking.md'. Ofcourse, if a project decides to use tags is upto them, but we won't implement it in the main branch.
2020-04-21	README-hacking.md: minor clarifications in checklist	Mohammad Akhlaghi	-28/+32
	Roberto Baena recently tried building a new project with Maneage and provided the following suggestions to make it more clear for a new user: 1) In the part where we talk about creating a Git repository, we should highlight that it must be empty. This is because some (for example Gitlab) propose to include a `README' file. But if the project is not empty, Git will not allow pushing to it. 2) The `(can be done later)' comment was removed from the "Delete dummy parts") to avoid confusion about applying some of them, but not others: if only some are done, it may cause problems in the build.
2020-04-20	README-hacking.md: Removed TeXLive year problem and numberd checklist	Mohammad Akhlaghi	-21/+11
	We recently fixed the problem of TeXLive that hard-codes the year of its build in its installation directory. But the note on this problem was still kept in `README-hacking.md'. That part is now removed. Also, to help in following the checklist, it is now an ordered list.
2020-04-20	Maneage instead of Template in README-hacking.md and copyright notices	Mohammad Akhlaghi	-217/+211
	Until now, throughout Maneage we were using the old name of "Reproducible Paper Template". But we have finally decided to use Maneage, so to avoid confusion, the name has been corrected in `README-hacking.md' and also in the copyright notices. Note also that in `README-hacking.md', the main Maneage branch is now called `maneage', and the main Git remote has been changed to `https://gitlab.com/maneage/project' (this is a new GitLab Group that I have setup for all Maneage-related projects). In this repository there is only one `maneage' branch to avoid complications with the `master' branch of the projects using Maneage later.
2020-04-17	IMPORTANT: software config directly under reproduce/software/config	Mohammad Akhlaghi	-30/+29
	Until now the software configuration parameters were defined under the `reproduce/software/config/installation/' directory. This was because the configuration parameters of analysis software (for example Gnuastro's configurations) were placed under there too. But this was terribly confusing, because the run-time options of programs falls under the "analysis" phase of the project. With this commit, the Gnuastro configuration files have been moved under the new `reproduce/analysis/config/gnuastro' directory and the software configuration files are directly under `reproduce/software/config'. A clean build was done with this change and it didn't crash, but it may cause crashes in derived projects, so after merging with Maneage, please re-configure your project to see if anything has been missed. Please let us know if there is a problem.
2020-04-01	Corrected reference for Infante-Sainz+2020 in README-hacking.md	Raul Infante-Sainz	-1/+1
	Until this commit, the year of this paper was 2019 and the linking url was the temporal one. However, the final official publication year is 2020. With this commit, the year and the url have been changed to the final ones.
2020-02-18	README-hacking.md: corrected typo in project command	Mohammad Akhlaghi	-1/+1
	I had forgot to add a `./' before the call to `project' for the `--check-config'.
2020-02-05	Imported Raul's additions to README-hacking.md, no conflicts	Mohammad Akhlaghi	-2/+16
	There were no conflicts in this merge.
2020-02-01	IMPORTANT: reproduce/software/bash renamed to reproduce/software/shell	Mohammad Akhlaghi	-2/+2
	Until now the shell scripts in the software building phase were in the `reproduce/software/bash' directory. But given our recent change to a POSIX-only start, the `configure.sh' shell script (which is the main component of this directory) is no longer written with Bash. With this commit, to fix that problem, that directory's name has been changed to `reproduce/software/shell'.
2020-01-27	Moving basic configuration of Git section in README-hacking.md	Raul Infante-Sainz	-13/+13
	Until this commit, the small section of `README-hacking.md' in which it is explained how to do the first configuration of Git was at the beginning of the section `First custom commit'. However, it is better to have it just before the item `Your first commit' in that section. With this commit, this change has been done. Now the reader has the necessary steps for configuring Git just before it is needed for making the first commit.
2020-01-23	IMPORTANT: Project preparation is now also done with project make	Mohammad Akhlaghi	-21/+12
	Until now, the main commands to run the project were these: `./project configure' (to build the software), `./project prepare' (to possibly arrange input datasets and build special configuration Makefiles) and finally `./project make' to run the project. The main logic behind the "prepare" phase `top-prepare.mk' is to build configuration files that can be fed into the "make" step and optimize its operation. For example when the total number of necessary inputs for the majority of the analysis is not as large as the total number of inputs. With "prepare" (when necessary), you go through the raw inputs, select the ones that are necessary for the rest of the project. The output of `top-prepare.mk' is a configuration file (a Make variable) that keeps the IDs (numbers, names, etc). That configuration file would then be used in the `top-make.mk' to identify the lower level targets and allow optimal project organization and management. But the last two are both part of the analysis, and while they indeed need different calls to Make to be executed, many projects don't actually need a preparation phase: ultimately, its an implementation choice by the project developers and doesn't concern the project users (or the developers when they are running it). To avoid confusing the users, or simply annoying them when a projet doesn't need it, with this commit, the top-level `top-prepare.mk' and `top-make.mk' Makefiles are called with the single `./project make' command and `./project prepare' has been dropped. I noticed this while writing the paper on this system.
2020-01-22	Adding Raul as contributor of README-hacking.md	Raul Infante-Sainz	-1/+2
	Since I (Raul) did some changes (and I hope to do more :-)) in the `README-hacking.md', I am adding my information at the beginning of this file.
2020-01-22	Adding basic configuration of Git in README-hacking.md	Raul Infante-Sainz	-1/+14
	Until this commit, we were asuming that Git was already properly configured. However, in order to be as complete as possible, it would be good if the basic commands to configure Git were in the `README-hacking.md'. With this commit, a small paragraph has been added in order to have the basic Git configuration commands (i.e. to configure the name, email, and favorite text editor).
2020-01-20	IMPORTANT!!! Configuration Makefiles now have a .conf suffix	Mohammad Akhlaghi	-33/+33
	Until now, the configuration Makefiles (in `reproduce/software/config/installation' and `reproduce/analysis/config') had a `.mk' suffix, similar to the workhorse Makefiles. Although they are indeed Makefiles, but given their nature (to only keep configuration parameters), it is confusing (especially to early users) for them to also have a `.mk' (similar to the analysis or software building Makefiles). To address this issue, with this commit, all the configuration Makefiles (in those directories) are now given a `.conf' suffix. This is also assumed for all the files that are loaded. The configuration (software building) and running of the template have been checked with this change from scratch, but please report any error that may not have been noticed. THIS IS AN IMPORTANT CHANGE AND WILL CAUSE CRASHES OR UNEXPECTED BEHAVIORS FOR PROJECTS THAT HAVE BRANCHED FROM THIS TEMPLATE. PLEASE CORRECT THE SUFFIX OF ALL YOUR PROJECT'S CONFIGURATION MAKEFILES (IN THE DIRECTORIES ABOVE), OTHERWISE THEY AREN'T AUTOMATICALLY LOADED ANYMORE.
2020-01-19	New --check-config option to ./project to check software build status	Mohammad Akhlaghi	-31/+14
	Until now, it was necessry to run a long `while true' loop to see what is currently being built at configure time. So with this commit, a new `--checkconfig' option has been added to `./project' that can be called to run that loop and make it easier to check.
2020-01-18	README-hacking.md: edits and corrections for easier customization	Mohammad Akhlaghi	-8/+10
	The checklist descriptions were slightly edited to be more clear. Also, while following them, I noticed that while removing the "delete-me" parts on `verify.mk', would cause an error: the `if [ $$m == delete-me ];' statement we were saying to delete cause an error because `elif' was the first statement Bash would see. So with this commit, the `download' conditional (which isn't instructed to be deleted) was set to be the top (with an `if') and the `delete-me' conditional now has an `elif'.
2020-01-17	README-hacking.md: script to list installed programs before configure	Mohammad Akhlaghi	-10/+18
	Until now, the small one-line script that lists programs was introduced in the checklist after running `./project configure'. But people would mostly miss it because they would wait until the configuration is complete. With this commit, that point has been put above the `./project configure' step. Readers are instructed to open a new terminal and run that script, then go to the next step so they see the directories get filled actively. It will also help them understand what is going on.
2020-01-13	Minor corrections in referencing Infante-Sainz+2019	Mohammad Akhlaghi	-6/+5
	Until now the actual journal webpage was used for Raul's paper. However, the journal webpage needs authorized access for people to read it, therefore its will be inaccessible for many people. A better and more well known place for the paper (atleast in astronomy) is the ADS link. In the ADS link, if someone has access to the journal, they will get the journal's version and if not, they will get the arXiv version. It also has a common BibTeX export tool for all journals. We had also done this for the other papers in that list. With this commit, I thus changed the URL for the paper, and also removed the "issue" number (4 in this case), since that is mostly irrelevant, only the volume and page numbers are usually used for the other papers too.
2020-01-13	Added Infante-Sainz et al. 2019 as most recent paper using this template	Raul Infante-Sainz	-3/+4
	The "SDSS extended PSFs" paper was already included as an example of papers wich uses this template. However, the reference was the arXiv one. With this commit, since the paper has been finally published, it has been added the final reference to the journal.
2020-01-01	Verification function checks if file exists	Mohammad Akhlaghi	-5/+7
	Until now, if the file to be verified didn't exist, a different checksum would be generated, and it would stop, but it wasn't immediately clear if the differing checksum is because the file doesn't exist at all! With this commit, before calculating the checksum, we first make sure if the file exists. If it doesn't exist an explicit error is printed and thus will help the project editor to find the cause of the problem.
2020-01-01	Minor corrections in `README-hacking.md' after verification	Mohammad Akhlaghi	-4/+7
	In the previous commit, I had forgot to update a small part in the checklist (when modifying `top-make.mk') which is now corrected. I also added a few sentences in the description of how to customize the verification to make it easier to understand.
2020-01-01	Verification of output values and data added within template	Mohammad Akhlaghi	-9/+51
	Until now, the only verification that the template provided was the published PDF. Users had to manually compare the published and generated PDFs (numbers, plots, tables) and see if they obtained the same result. However, this type of manual verification is not good and is prone to frustration and missing important differences. With this commit, a new Makefile has been added in the analysis steps: `verify.mk'. It provides facilities to easily verify the results that go into the paper. For example tables that go into making the paper's plots, or the LaTeX macros that blend into the text. See the updated parts in `README-hacking.md` for a more complete explanation. This completes task #15497.
2020-01-01	README-hacking.md checklist now also ignores changes in paper.tex	Mohammad Akhlaghi	-7/+12
	In the previous commit, we added the files to ignore from the template branch, but only the files that had been deleted. With this commit, `paper.tex' is also added to the files that must be ignored from the template branch (the file remains in the project, but in the template branch, its contents are just dummy place-holders).
2020-01-01	Added step README-hacking's checklist to avoid merging dummy files	Mohammad Akhlaghi	-0/+15
	During the checklist we guide the user to delete the dummy `delete-me*' files from their custom branch. Later, if the dummy files are updated in the template's master branch, if the user merges with the template branch, these files will be written back into their project! This is very annoying! With this commit, a step was added in the `README-hacking.md' checklist, just after deleting the dummy files to avoid this problem using the `.gitattributes' file, telling Git to keep the changes as implemented in the merging branch (`ours').
2020-01-01	Copyright statements updated to include 2020	Mohammad Akhlaghi	-3/+3
	Now that its 2020, its necessary to include this year in the copyright statements.
2019-11-06	Added 1911.01430 as most recent paper using this template	Mohammad Akhlaghi	-0/+7
	Raul's paper (that uses this template) was just published on arXiv today (congratulations Raul!). So it has been added to the list of papers using this template.
2019-10-30	Added ./project prepare in the checklist of README-hacking.md	Mohammad Akhlaghi	-2/+3
	Since adding this new step, I had forgot to mention it in the checklist of `README-hacking.md'. It is added with this commit.
2019-10-29	Minor edits to README-hacking.md	Mohammad Akhlaghi	-28/+31
	The part on using shared memory was edited for a few things that weren't clear.
2019-10-29	Further small edit in using shared memory of README-hacking.md	Mohammad Akhlaghi	-3/+2
	Some typos were fixed.
2019-10-29	Minor edits to suggestion on using shared memory	Mohammad Akhlaghi	-3/+9
	The edits help it be more clear, and remind the reader to delete any remaining file in the RAM in the end.
2019-10-29	Suggestion on good usage of /dev/shm in README-hacking.md	Mohammad Akhlaghi	-6/+57
	When you are working with large files and there is some good RAM in the system (large/powerful computers), it is beneficial to work in the shared memory directory and not the actual persistent storage (like HDD or SSD). With this commit, a fully working demo has been added to `README-hacking.md' (under the tips of "Make programming") to show how to effectively work in situations like this.
2019-10-01	Preparation phase added before final building	Mohammad Akhlaghi	-35/+53
	In many real-world scenarios, `./project make' can really benefit from having some basic information about the data before being run. For example when quering a server. If we know how many datasets were downloaded and their general properties, it can greatly optmize the process when we are designing the solution to be run in `./project make'. Therefore with this commit, a new phase has been added to the template's design: `./project prepare'. In the raw template this is empty, because the simple analysis done in the template doesn't warrant it. But everything is ready for projects using the template to add preparation phases prior to the analysis.
2019-09-26	Minor edit in README-hacking.md	Mohammad Akhlaghi	-3/+2
	The description of arXiv:1909.11230 was slightly modified to be in the same format as the other papers.