Age | Commit message (Collapse) | Author | Lines |
|
As described in Maneage's commit 2bd2e2f18 (which I found while testing
this project), the existing download recipe had problems when using a local
copy of the input dataset. It was first fixed here, then implemented there.
Also, to clarify things for a new user, some long comments were added at
the top of 'INPUTS.conf' to describe each of the variables, that comment
has also been put here (and is also in commit 2bd2e2f18 of Maneage).
|
|
The minor conflict was with 'reproduce/software/make/high-level.mk', and in
particular because we implemented the fix to Maneage's Task #15664 in this
project first. After it was moved to the main Maneage branch some minor
stylistic corrections were done to it, thus causing the conflict. To
resolve the conflict, I simply imported the full Maneage version of the
file with this command:
git checkout maneage -- reproduce/software/make/high-level.mk
The other conflicts were due to the deleted files (that were resolved as
described in 'README-hacking.md') and the LaTeX files that I had told
'.gitattributes' to ignore from the Maneage branch.
|
|
In time, some of the copyright license description had been mistakenly
shortened to two paragraphs instead of the original three that is
recommended in the GPL. With this commit, they are corrected to be exactly
in the same three paragraph format suggested by GPL.
The following files also didn't have a copyright notice, so one was added
for them:
reproduce/software/make/README.md
reproduce/software/bibtex/healpix.tex
reproduce/analysis/config/delete-me-num.conf
reproduce/analysis/config/verify-outputs.conf
|
|
A few small conflicts showed up here and there. They are fixed with this
merge.
|
|
[Compared to first submission to DSJ last week with 11436 words in raw PDF,
we have decreased the paper by ~1000 words to 10493 :-)]
As with the previous commits, the moment Boud changed the structure of
sentences, I was able to find the redundancies and remove them! This is a
fascinating feature of collaboration I had never felt before: it is so hard
to find redundancies in my own raw text, but even a minor correction by
someone else suddeny breaks my mental memories/barrier on the sentence,
allowing me to be more critical to it!
Anyway, besides such corrections, I fixed a few other things: 1) In the
DSJ's recently published papers, ther is no `~' between "Figure" and its
number. 2) I noticed that in `tex/src/figure-src-inputconf.tex' I was
actually using manually input strings for the filename, checksum and size!
This was contrary to the whole philosophy of Maneage(!), I must have rushed
and forgot! So LaTeX variables are now defined and used.
|
|
Until now, throughout Maneage we were using the old name of "Reproducible
Paper Template". But we have finally decided to use Maneage, so to avoid
confusion, the name has been corrected in `README-hacking.md' and also in
the copyright notices.
Note also that in `README-hacking.md', the main Maneage branch is now
called `maneage', and the main Git remote has been changed to
`https://gitlab.com/maneage/project' (this is a new GitLab Group that I
have setup for all Maneage-related projects). In this repository there is
only one `maneage' branch to avoid complications with the `master' branch
of the projects using Maneage later.
|
|
A few minor conflicts occurred and were fixed.
|
|
Until now, there was no explanation on an actual analysis phase, therefore
with this commit an example scenario with a readable Makefile is included.
The Data lineage graph was also simplified to both be more readable, and
also to correspond to this new explanation and subMakefile.
Some random edits/typos were also corrected and some references added for
discussion.
|
|
The main problems with this dataset was the names of the journals (which
sometimes have single quotes or apostrophes in them that is really annoying
for SED)! But ultimately, for the simple study we want to do here, the
journal names are irrelevant, so in the end I just ignored the names. Later
we can set an identifier for the journals if necessary.
But now we have the basic information in a way that is usable in a plot to
show in this paper.
|
|
Until now, the configuration Makefiles (in
`reproduce/software/config/installation' and `reproduce/analysis/config')
had a `.mk' suffix, similar to the workhorse Makefiles. Although they are
indeed Makefiles, but given their nature (to only keep configuration
parameters), it is confusing (especially to early users) for them to also
have a `.mk' (similar to the analysis or software building Makefiles).
To address this issue, with this commit, all the configuration Makefiles
(in those directories) are now given a `.conf' suffix. This is also assumed
for all the files that are loaded.
The configuration (software building) and running of the template have been
checked with this change from scratch, but please report any error that may
not have been noticed.
THIS IS AN IMPORTANT CHANGE AND WILL CAUSE CRASHES OR UNEXPECTED BEHAVIORS
FOR PROJECTS THAT HAVE BRANCHED FROM THIS TEMPLATE. PLEASE CORRECT THE
SUFFIX OF ALL YOUR PROJECT'S CONFIGURATION MAKEFILES (IN THE DIRECTORIES
ABOVE), OTHERWISE THEY AREN'T AUTOMATICALLY LOADED ANYMORE.
|
|
Now that its 2020, its necessary to include this year in the copyright
statements.
|
|
Until now, when an input dataset already exists in `INDIR', the template
would just make a symbolic link to it in the build directory. However, in
many cases, the files in INDIR will actually be links to other locations on
the filesystem and some programs have problems following too many links.
With this commit, the template is now using the `readlink' program (part of
GNU Coreutils) to follow a possible link and point the link in the build
directory directly to an actual non-link file.
|
|
Until now, the software building and analysis steps of the pipeline were
intertwined. However, these steps (of how to build a software, and how to
use it) are logically completely independent.
Therefore with this commit, the pipeline now has a new architecture
(particularly in the `reproduce' directory) to emphasize this distinction:
The `reproduce' directory now has the two `software' and `analysis'
subdirectories and the respective parts of the previous architecture have
been broken up between these two based on their function. There is also no
more `src' directory. The `config' directory for software and analysis is
now mixed with the language-specific directories.
Also, some of the software versions were also updated after some checks
with their webpages.
This new architecture will allow much more focused work on each part of the
pipeline (to install the software and to run them for an analysis).
|