Age | Commit message (Collapse) | Author | Lines |
|
Until now, the main commands to run the project were these: `./project
configure' (to build the software), `./project prepare' (to possibly
arrange input datasets and build special configuration Makefiles) and
finally `./project make' to run the project.
The main logic behind the "prepare" phase `top-prepare.mk' is to build
configuration files that can be fed into the "make" step and optimize its
operation. This is useful, for example, when the inputs needed for the
majority of the analysis are fewer than the total number of inputs. With
"prepare" (when necessary), you go through the raw inputs and select the
ones that are necessary for the rest of the project. The output
of `top-prepare.mk' is a configuration file (a Make variable) that keeps
the IDs (numbers, names, etc). That configuration file would then be used
in the `top-make.mk' to identify the lower level targets and allow optimal
project organization and management.
But the last two are both part of the analysis, and while they indeed need
different calls to Make to be executed, many projects don't actually need a
preparation phase: ultimately, it's an implementation choice by the project
developers and doesn't concern the project users (or the developers when
they are running it).
To avoid confusing the users, or simply annoying them when a project doesn't
need it, with this commit, the top-level `top-prepare.mk' and `top-make.mk'
Makefiles are called with the single `./project make' command and
`./project prepare' has been dropped. I noticed this while writing the
paper on this system.
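
A minimal sketch of how one command can chain the two phases. The helper
name `run_analysis' and the directory argument are assumptions of this
sketch (the location of the two Makefiles is not stated here); only the
file names `top-prepare.mk' and `top-make.mk' come from the message above.

```shell
# Hypothetical helper: run the preparation Makefile, then the analysis
# Makefile, stopping at the first failure. Extra arguments (for example
# `-j8') are passed to both calls of Make.
run_analysis() {
    dir=$1
    shift
    make -f "$dir/top-prepare.mk" "$@" &&
        make -f "$dir/top-make.mk" "$@"
}
```

Chaining with `&&' means the analysis never starts from a failed or
half-written preparation output.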
|
|
Until now, the configuration Makefiles (in
`reproduce/software/config/installation' and `reproduce/analysis/config')
had a `.mk' suffix, similar to the workhorse Makefiles. Although they are
indeed Makefiles, given their nature (to only keep configuration
parameters), it is confusing (especially to early users) for them to also
have a `.mk' suffix (like the analysis or software building Makefiles).
To address this issue, with this commit, all the configuration Makefiles
(in those directories) are now given a `.conf' suffix. This is also assumed
for all the files that are loaded.
The configuration (software building) and running of the template have been
checked with this change from scratch, but please report any error that may
not have been noticed.
THIS IS AN IMPORTANT CHANGE AND WILL CAUSE CRASHES OR UNEXPECTED BEHAVIORS
FOR PROJECTS THAT HAVE BRANCHED FROM THIS TEMPLATE. PLEASE CORRECT THE
SUFFIX OF ALL YOUR PROJECT'S CONFIGURATION MAKEFILES (IN THE DIRECTORIES
ABOVE), OTHERWISE THEY AREN'T AUTOMATICALLY LOADED ANYMORE.
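
For projects that branched from the template, the renaming could be
sketched as below. The function name `rename_conf' is hypothetical, and
plain `mv' is an assumption of this sketch; inside a Git repository
`git mv' would be the natural substitute.

```shell
# Rename every `*.mk' file directly inside the given directory to
# `*.conf'. Files with other suffixes are left alone.
rename_conf() {
    dir=$1
    for f in "$dir"/*.mk; do
        [ -e "$f" ] || continue          # the glob matched nothing
        mv -- "$f" "${f%.mk}.conf"
    done
}
```

It would be called once per directory named above, for example
`rename_conf reproduce/analysis/config'.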
|
|
Until now, it was necessary to run a long `while true' loop to see what is
currently being built at configure time. So with this commit, a new
`--checkconfig' option has been added to `./project' that can be called to
run that loop and make it easier to check.
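
The loop itself is not shown in this message. A rough sketch of the idea,
assuming the build records each finished program as a file in some
directory (which directory that is must be supplied by the caller); both
function names are hypothetical.

```shell
# Print the most recently modified entries of a directory, newest first.
# The second argument (default 5) limits how many are shown.
show_latest() {
    ls -t -- "$1" | head -n "${2:-5}"
}

# Repeat the listing every two seconds until interrupted with Ctrl-C.
watch_build() {
    while true; do
        clear
        show_latest "$1"
        sleep 2
    done
}
```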
|
|
The checklist descriptions were slightly edited to be more clear. Also,
while following them, I noticed that removing the "delete-me" parts of
`verify.mk' would cause an error: once the `if [ $$m == delete-me ];'
statement was deleted (as instructed), `elif' was the first statement
Bash would see. So with this commit, the `download'
conditional (which isn't instructed to be deleted) was set to be the top
(with an `if') and the `delete-me' conditional now has an `elif'.
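
The reordered conditional can be sketched in plain shell as follows. The
function name and the echoed messages are placeholders; in a Make recipe
the variable would be written `$$m', and the portable `=' is used here in
place of `=='.

```shell
# `download' comes first with `if', so the `delete-me' branch can be
# removed later without leaving a dangling `elif'.
verify_kind() {
    m=$1
    if [ "$m" = download ]; then
        echo "checking downloaded input"
    elif [ "$m" = delete-me ]; then
        echo "checking demo output"
    else
        echo "unknown target: $m" >&2
        return 1
    fi
}
```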
|
|
Until now, the small one-line script that lists programs was introduced in
the checklist after running `./project configure'. But people would mostly
miss it because they would wait until the configuration is complete.
With this commit, that point has been put above the `./project configure'
step. Readers are instructed to open a new terminal and run that script,
then go to the next step so they see the directories get filled
actively. It will also help them understand what is going on.
|
|
Until now the actual journal webpage was used for Raul's paper. However,
the journal webpage needs authorized access for people to read it,
therefore it will be inaccessible for many people. A better and more well
known place for the paper (at least in astronomy) is the ADS link. In the
ADS link, if someone has access to the journal, they will get the journal's
version and if not, they will get the arXiv version. It also has a common
BibTeX export tool for all journals. We had also done this for the other
papers in that list.
With this commit, I thus changed the URL for the paper, and also removed
the "issue" number (4 in this case), since that is mostly irrelevant: as
with the other papers, only the volume and page numbers are usually used.
|
|
The "SDSS extended PSFs" paper was already included as an example of
papers which use this template. However, the reference was the arXiv
one. With this commit, since the paper has finally been published, the
final journal reference has been added.
|
|
Until now, if the file to be verified didn't exist, a different checksum
would be generated, and it would stop, but it wasn't immediately clear if
the differing checksum is because the file doesn't exist at all!
With this commit, before calculating the checksum, we first check that
the file exists. If it doesn't exist, an explicit error is printed, which
will help the project editor find the cause of the problem.
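
A minimal sketch of the check, assuming a SHA-256 checksum computed with
GNU `sha256sum' (the message does not say which checksum tool is used).
The function name and the error messages are hypothetical.

```shell
# Return 2 when the file is missing, 1 when the checksum differs, and 0
# when it matches. The existence test comes first, so a missing file is
# never reported as a mere checksum mismatch.
verify_checksum() {
    file=$1
    expected=$2
    if [ ! -f "$file" ]; then
        echo "verify: '$file' does not exist" >&2
        return 2
    fi
    actual=$(sha256sum "$file" | awk '{print $1}')
    if [ "$actual" != "$expected" ]; then
        echo "verify: checksum of '$file' differs from the expected one" >&2
        return 1
    fi
}
```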
|
|
In the previous commit, I had forgotten to update a small part in the
checklist (when modifying `top-make.mk') which is now corrected.
I also added a few sentences in the description of how to customize the
verification to make it easier to understand.
|
|
Until now, the only verification that the template provided was the
published PDF. Users had to manually compare the published and generated
PDFs (numbers, plots, tables) and see if they obtained the same
result. However, this type of manual verification is frustrating and
prone to missing important differences.
With this commit, a new Makefile has been added in the analysis steps:
`verify.mk'. It provides facilities to easily verify the results that go
into the paper. For example tables that go into making the paper's plots,
or the LaTeX macros that blend into the text. See the updated parts in
`README-hacking.md' for a more complete explanation.
This completes task #15497.
|
|
In the previous commit, we added the files to ignore from the template
branch, but only the files that had been deleted. With this commit,
`paper.tex' is also added to the files that must be ignored from the
template branch (the file remains in the project, but in the template
branch, its contents are just dummy place-holders).
|
|
During the checklist we guide the user to delete the dummy `delete-me*'
files from their custom branch. Later, if the dummy files are updated in
the template's master branch and the user merges with the template branch,
these files will be written back into their project! This is very annoying!
With this commit, a step was added in the `README-hacking.md' checklist,
just after deleting the dummy files to avoid this problem using the
`.gitattributes' file, telling Git to keep the changes as implemented in
the merging branch (`ours').
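
The exact lines that the checklist adds are not reproduced here. As a
sketch of the general Git mechanism, assuming a merge driver named `ours'
is what `.gitattributes' refers to: the attribute selects the driver for a
path, and the driver itself is defined once in the repository's
configuration. The function name `keep_ours' is hypothetical.

```shell
# Run from the top of the repository. The pattern is appended to
# `.gitattributes'; the `ours' driver is set to `true', a command that
# succeeds without touching the current branch's version of the file.
keep_ours() {
    printf '%s merge=ours\n' "$1" >> .gitattributes
    git config merge.ours.driver true
}
```

For example, `keep_ours paper.tex' would mark that one file.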
|
|
Now that it's 2020, it's necessary to include this year in the copyright
statements.
|
|
Raul's paper (that uses this template) was just published on arXiv today
(congratulations Raul!). So it has been added to the list of papers using
this template.
|
|
Since adding this new step, I had forgotten to mention it in the checklist of
`README-hacking.md'. It is added with this commit.
|
|
The part on using shared memory was edited for a few things that weren't
clear.
|
|
Some typos were fixed.
|
|
The edits make it more clear, and remind the reader to delete any
remaining files in the RAM at the end.
|
|
When you are working with large files and there is some good RAM in the
system (large/powerful computers), it is beneficial to work in the shared
memory directory and not the actual persistent storage (like HDD or
SSD).
With this commit, a fully working demo has been added to
`README-hacking.md' (under the tips of "Make programming") to show how to
effectively work in situations like this.
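
This is not the demo from `README-hacking.md', only a small sketch of the
same idea. It assumes `/dev/shm' is the shared memory directory (as on
GNU/Linux) and falls back to `$TMPDIR' or `/tmp' elsewhere; the function
name `in_shm' is hypothetical.

```shell
# Run a command inside a scratch directory in shared memory, then remove
# the directory so the RAM is freed even if the command failed.
in_shm() {
    root=/dev/shm
    { [ -d "$root" ] && [ -w "$root" ]; } || root=${TMPDIR:-/tmp}
    scratch=$(mktemp -d "$root/project-XXXXXX") || return 1
    status=0
    ( cd "$scratch" && "$@" ) || status=$?
    rm -rf -- "$scratch"
    return "$status"
}
```

Anything that must survive has to be copied to persistent storage by the
command itself, before the scratch directory is deleted.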
|
|
In many real-world scenarios, `./project make' can really benefit from
having some basic information about the data before being run, for example
when querying a server. If we know how many datasets were downloaded and
their general properties, it can greatly optimize the process when we are
designing the solution to be run in `./project make'.
Therefore with this commit, a new phase has been added to the template's
design: `./project prepare'. In the raw template this is empty, because the
simple analysis done in the template doesn't warrant it. But everything is
ready for projects using the template to add preparation phases prior to
the analysis.
|
|
The description of arXiv:1909.11230 was slightly modified to be in the same
format as the other papers.
|
|
This paper was published on arXiv today and is a good example for people to
see the application of this system in practice.
|
|
After a re-read on Gitlab, it has been slightly edited to be more clear.
|
|
When you want to publish your project, it is very convenient to have a
single file that contains the whole history. So a tip is added to
`README-hacking.md' that describes how to do this with `git bundle'.
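
The tip itself is in `README-hacking.md'; as a brief sketch of the
command involved (the function name and the bundle file name are chosen
by the caller and are not taken from the tip):

```shell
# Pack every branch and tag of the current repository, with the full
# history, into one file.
bundle_history() {
    git bundle create "$1" --all
}
```

The resulting file can be checked with `git bundle verify' and cloned like
a normal repository, for example `git clone project.bundle project'.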
|
|
Until now, the paper's title and author information were set in
`tex/src/preamble-header.tex'. But they are actually shown in the final PDF
paper and a much better place to keep them is the top-level `paper.tex'.
With this commit, the setting of the title and author names has been moved
to `paper.tex', just after importing all the preambles. However, the basic
package importation and low-level settings are still set in
`tex/src/preamble-header.tex', because they are relatively low-level.
This task was suggested by Deepak (Indian Institute of Astrophysics).
|
|
Until now, when describing the sections to remove for customizing a
project, I had mistakenly repeated the `%% Start of main body.'
statement.
With this commit, the second one is changed to `%% End of main body.'
This issue was reported by Deepak.
|
|
After the checklist was applied in the 5th Indo-French Astronomy School, we
found some cases in the checklist that were extra (and thus had to be
removed), or were needed (and thus were added).
Also the non-necessary steps for a first commit were moved to a
separate/new section in the checklist for the people to add after doing
their first commit.
Also, the software part of the paper was moved to an appendix.
|
|
Until now the only way to define the environment of the Make recipes was
through the exported Make variables (mostly in `initialize.mk' for the
analysis steps for example). However, there is only so much you can do with
environment variables! In some situations you want slightly more
complicated environment control, like setting an alias or running of
scripts (things that are commonly done in the `~/.bashrc' file of users to
configure their interactive, non-login shells).
With this commit, a `reproduce/software/bash/bashrc.sh' has been defined
for this job (which is currently empty!). Every major Make step of the
project adds this file as the `BASH_ENV' environment variable, so the shell
that is created to execute a recipe first executes this file, then the
recipe. Each top-level Makefile also defines a `PROJECT_STATUS' environment
variable that enables users to limit their environment setup based on the
condition under which it is being set up (in particular in the early phase of
`basic.mk', where the user can't make any assumption about the programs and
has to write a portable shell script).
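
A small sketch of the mechanism: a non-interactive Bash sources the file
named by `BASH_ENV' before running its commands. The status value
`configure_basic', the `greet' function and both helper names are
hypothetical; this message does not list the real `PROJECT_STATUS' values.

```shell
# Write a startup file that defines a helper in every phase except the
# (hypothetical) early one, where only portable shell should be assumed.
write_bashrc() {
    cat > "$1" <<'EOF'
case "$PROJECT_STATUS" in
    configure_basic) ;;
    *) greet() { echo "hello from bashrc"; } ;;
esac
EOF
}

# Run a command string the way a recipe line would be run: in a
# non-interactive Bash that first sources the file named by BASH_ENV.
run_recipe() {
    BASH_ENV=$1 PROJECT_STATUS=$2 bash -c "$3"
}
```

A shell function is used instead of an alias because Bash does not expand
aliases in non-interactive shells unless `expand_aliases' is enabled.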
|
|
Until now there was no clear explanation on `.file-metadata' within the
project. But since it sometimes appears in the Git changed files and it's
binary, it was necessary to add a short explanation to inform users.
With this commit a section has been added to the "Project architecture"
section of `README-hacking.md' to give some context on what it is and how
to deal with it.
This was suggested by Hamed Altafi.
|
|
Until now, to work on a project, it was necessary to `./configure' it and
build the software. Then we had to run `.local/bin/make' to run the project
and do the analysis every time. If the project was a shared project between
many users on a large server, it was necessary to call the `./for-group'
script.
This way of managing the project had a major problem: since the user
directly called the lower-level `./configure' or `.local/bin/make' it was
not possible to provide high-level control (for example limiting the
environment variables). This was especially noticed recently with a bug
that was related to environment variables (bug #56682).
With this commit, this problem is solved using a single script called
`project' in the top directory. To configure and build the project, users
can now run these commands:
$ ./project configure
$ ./project make
To work on the project with other users in a group these commands can be
used:
$ ./project configure --group=GROUPNAME
$ ./project make --group=GROUPNAME
The old options to both configure and make the project are still valid. Run
`./project --help' to see a list. For example:
$ ./project configure -e --host-cc
$ ./project make -j8
The old `configure' script has been moved to
`reproduce/software/bash/configure.sh' and is called by the new `./project'
script. The `./project' script now just manages the options, then passes
control to the `configure.sh' script. For the "make" step, it also reads
the options, then calls Make. So in the lower-level nothing has
changed. Only the `./project' script is now the single/direct user
interface of the project.
On a parallel note: as part of bug #56682, we also found out that on some
macOS systems, the `DYLD_LIBRARY_PATH' environment variable has to be set
to blank. This is no problem because RPATH is automatically set in macOS
and the executables and libraries contain the absolute address of the
libraries they should link with. But having `DYLD_LIBRARY_PATH' can
conflict with some low-level system libraries and cause very hard to debug
linking errors (like that reported in the bug report).
This fixes bug #56682.
|
|
Until now the description of the commit message guidelines wasn't clear
enough and could cause confusion, in particular because it didn't describe
why some basic formatting issues are mandatory.
With this commit, it is explained that the main reason we require
contributors to follow this format is "consistency" within the
project. Also generally it was edited to become more elaborate and explain
the points more clearly.
I also ran a spell check over the whole file and fixed a few typos.
This correction was suggested by Mohammad-reza Khellat.
|
|
Until now there was no guideline in `README-hacking.md' to describe/suggest
a good format for commit messages.
With this commit a point has been added in the "Tips" section to help new
developers contribute more smoothly.
The necessity of this paragraph was pointed out by Mohammad-reza Khellat.
|
|
The new command-box wasn't being rendered properly, so another correction
is made here. I also added the prompt `$' sign in another box of commands.
|
|
After checking the previous commit on Gitlab (to see how it is rendered), I
noticed that the code has come in the same line, not as a separate
box. Hopefully this commit will fix it.
|
|
It is useful to visually see how the building of software is progressing
when running configure. I have been using a simple Bash `while' loop for
this, so I added it in the `README-hacking.md' to be useful for others too.
|
|
A title in the checklist was mistakenly using "project" (customized
template) instead of "template".
|
|
Since we just download the binary source of TeXLive, we need to keep it up
to date with the server. So it has been incremented to 2019 (TeXLive 2019
was released April 29th).
A note was also added in the Checklist to keep the users informed on how to
update TeXLive if necessary.
|
|
Until now, to specify which high-level software you want the project to
contain, it was necessary to go into the `high-level.mk' Makefile that is
complicated and can create bugs.
With this commit, a new `reproduce/software/config/installation/TARGETS.mk'
file has been created that is easily/cleanly in charge of documenting the
final high-level software that must be built for the project.
Also, until now, FFTW was set as a dependency of Numpy while we couldn't
actually get Numpy to use it! It was just there for future reference and to
justify its build rule. But now that much of the software won't be built
and there is no problem with having rules even though a project might not
use them, it has been removed.
|
|
In two places, I had mistakenly put a <'> instead of a <`>, causing bad
highlighting in the markdown rendering. They have been corrected.
Also, one long line was broken up into several.
|
|
Until now, the customization checklist of `README-hacking.md' had the same
name for the base template's remote and branch. This was confusing and
would cause Git to print a warning.
With this commit, like before, the template's remote is now called
`template-origin', and `template' is only the branch name.
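
The checklist's own commands are not quoted here. A sketch with standard
Git commands, assuming a fresh clone whose remote is still called `origin'
and whose checked-out branch is `master'; the function name is
hypothetical.

```shell
# Give the remote and the branch distinct names, so `template-origin'
# always means the remote and `template' always means the branch.
adopt_template() {
    git remote rename origin template-origin &&
        git branch -m master template
}
```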
|
|
Until now, the main template branch was called `template'. However, the
standard Git convention is that the main branch of a project be called
`master'. Many systems rely on this default and it is also easier for new
users (who have been accustomed to this convention).
So with this commit, the main template branch is `master', but in
`README-hacking.md', we instruct the users on how to rename it to
`template' as part of their customization. This is in fact better, because
when we are actually developing the template in a separate fork, we can
refer/use the `master' branch like any other project. And when we are
working on a project that uses this template, we will be referring to the
main template branch as `template'.
|
|
Until now, the software building and analysis steps of the pipeline were
intertwined. However, these steps (of how to build software, and how to
use it) are logically completely independent.
Therefore with this commit, the pipeline now has a new architecture
(particularly in the `reproduce' directory) to emphasize this distinction:
The `reproduce' directory now has the two `software' and `analysis'
subdirectories and the respective parts of the previous architecture have
been broken up between these two based on their function. There is also no
more `src' directory. The `config' directory for software and analysis is
now mixed with the language-specific directories.
Also, some of the software versions were updated after some checks
with their webpages.
This new architecture will allow much more focused work on each part of the
pipeline (to install the software and to run them for an analysis).
|
|
All occurrences of "pipeline" have been changed to "project" or "template"
within the text (comments and READMEs) of the template. The main template
branch is now also named `template'.
This was all because `pipeline' is too generic and couldn't distinguish
between the base template and a customized project.
|
|
Until now, the files that people were meant to change didn't have a
proper copyright notice (for example `Copyright (C) YOUR NAME.'). This was
wrong because the license does not convey copyright ownership. So the name
of the file's original author must always be included, and people who
modify it add their own name (for their own copyright-able modifications).
With this commit, the file's original author (and email) are added to the
copyright notice and when more than one person modified a file, both names
have their individual copyright notice.
Based on this, the description for adding a copyright notice in
`README-hacking.md' has also been modified.
|
|
Since `.file-metadata' is a binary file and we couldn't put a copyright
notice within it, it has been mentioned in `README.md' to have the same
copyright.
Also, the copyright modification step in `README-hacking.md' was brought to
a later step to be more clear that it should always be done (on new files
or files that are changed).
|
|
Until now, for short files, we only had a license notice, not an actual
copyright notice. With this commit, a copyright notice has also been
added. We use this new command to find these files, suggested by
`ineiev@gnu.org'.
|
|
Until now, the steps to manage the command-line options of the configure
script were limited (couldn't accept an equal sign or space between the
option name and value). With this commit, it can now also accept optional
equal signs between the option name and value, thus avoiding much
confusion.
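
This is not the configure script's own code, only a sketch of how both
forms can be accepted. The two option names (`--jobs', `--build-dir') and
the variables they set are hypothetical.

```shell
# Accept `--name=value' and `--name value' alike. Results are left in the
# variables `njobs' and `builddir'.
parse_options() {
    njobs= builddir=
    while [ $# -gt 0 ]; do
        case $1 in
            --jobs=*)      njobs=${1#*=} ;;
            --build-dir=*) builddir=${1#*=} ;;
            --jobs|--build-dir)
                # Space-separated form: the value is the next argument.
                if [ $# -lt 2 ]; then
                    echo "option $1 needs a value" >&2
                    return 1
                fi
                case $1 in
                    --jobs) njobs=$2 ;;
                    *)      builddir=$2 ;;
                esac
                shift ;;
            *)  echo "unknown option: $1" >&2
                return 1 ;;
        esac
        shift
    done
}
```

Only the text up to the first `=' is stripped, so a value that itself
contains an equal sign is kept whole.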
Also, it is more logically consistent for the link to the build-directory
to be placed in the top directory (as a hidden file like `.local' until
now), and not as a visible directory like `reproduce/build' (which we used
until now). Therefore, with this commit, the link to easily access the
build-directory is `.build' in the top source directory.
Finally, because `minmapsize' is too specific to Gnuastro and has now been
given its default value at the start of the configure script, the
description for `minmapsize' has been removed (to not confuse users who
don't use Gnuastro). If anyone is familiar enough with Gnuastro to change
it, they already know it from its book.
|
|
In order to be more clear, a copyright statement was added to all the LaTeX
and README files.
|
|
This section was a little outdated and since then, a more clear/exact image
of using the Nix experience for the reproducible paper template has been
added.
|
|
In order to collaborate effectively in the project, even project members
that don't necessarily want (or have the capacity) to do the whole analysis
must be able to contribute to the project. Until now, the users of the
distributed tarball could only modify the text and not the figures (built
with PGFPlots) of the paper.
With this commit, the management of TeX source files in the pipeline was
slightly modified to allow this as cleanly as I could think of now! In
short, the hand-written TeX files are now kept in `tex/src' and for the
pipeline's generated TeX files (in particular the old `tex/pipeline.tex'),
we now have a `tex/pipeline' symbolic-link/directory that points to the
`tex' directory under the build directory.
When packaging the project, `tex/pipeline' will be a full directory with a
copy of all the necessary files. Therefore as far as LaTeX is concerned,
having a build-directory is no longer relevant. Many other small changes
were made to do this job cleanly which will just make this commit message
too long!
Also, the old `tarball' and `zip' targets are now `dist' and `dist-zip' (as
in the standard GNU Build system).
|