Age | Commit message | Author | Lines |
|
Some minor conflicts (all expected from the commit messages in the Maneage
branch) occurred but were easily fixed.
|
|
When built in 'group' mode, the write permissions of all created files will
be activated for a certain group of users in the host operating system. The
user specifies the name of the group with the '--group' option at configure
time. At the very start, the './project' script checks to see if the given
group name actually exists or not (to avoid hard-to-debug errors popping up
later).
Until now, the checking 'sg' command (that was used to build the project
with group-writable permissions) would always fail due to the excessive
number of redirections. Therefore, it would always print the error message
and abort.
With this commit, the output of 'sg' is no longer re-directed (which also
helps users in debugging). If the group does actually exist, it will just
print a small statement saying so, and if it fails, the error message is
printed. This fixed the problem, allowing maneage to be built in
group-mode.
I also noticed that the variable name keeping the group name
('reproducible_paper_group_name') used the old name for the project (which
was "Reproducible paper template")! So it has been changed/corrected to
'maneage_group_name'.
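A minimal sketch of the kind of check described above. The helper name
'check_group' is an assumption made only for this illustration; the real
'./project' script may be organized differently. The output of 'sg' is
deliberately left un-redirected:

```shell
# Hypothetical helper (the name 'check_group' is an assumption).
# Returns 0 if a command can be run under the given group, 1 otherwise.
check_group () {
    maneage_group_name="$1"

    # 'sg' is free to print its own messages: nothing is redirected.
    if sg "$maneage_group_name" "echo Group $maneage_group_name exists"; then
        return 0
    else
        echo "ERROR: group '$maneage_group_name' could not be used." >&2
        return 1
    fi
}
```

For example 'check_group mygroup || exit 1' would stop a script early when
the group is not usable.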
|
|
Until now, the './project' script included a '--minmapsize' option which
is an option to one of the original programs that was used in Maneage
(Gnuastro). Such an option doesn't exist in many other programs, so it is
not a suitable option for the generic Maneage project (and can just cause
confusion). It was also not used in any part of Maneage any more!
With this commit, this option is removed from the core Maneage './project'
script and if any project uses it, they can implement it in their own
branch.
|
|
Until now, each time there was a problem in the configuration of Maneage'd
projects and debugging was necessary, we had to make the following changes:
- Run the configuration on a single thread ('-j1') to see the building of
only the problematic software.
- Disable the Zenodo check manually by commenting those parts of
'reproduce/software/shell/configure.sh'. Because the internet connection
wastes a few seconds and is thus very annoying during repeated runs!
- Manually remove the '-k' option that was passed to Make (when building
the software). With the '-k', Make keeps going with the execution of
other targets if something crashes and this usually causes confusion
during the debugging.
Doing the manual changes within the code was both very annoying and prone
to errors (forgetting to correct it!).
With this commit, the existing '--debug' option has been generalized to the
software configuration phase of Maneage also. Until now, it was only
available in the analysis phase (and would directly be passed to the 'make'
command that would run the analysis). When this option is used, and the
project is in the software configuration phase, the Zenodo check won't be
done, it will use one single thread ('-j1'), and it will stop the execution
as soon as an error occurs (Make is not run with '-k').
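The effect of '--debug' on the Make call can be sketched as follows. The
function name and its arguments are assumptions for illustration, not the
actual code of 'configure.sh':

```shell
# Hypothetical helper: print the Make options for the software
# configuration phase.
#   $1: 1 if '--debug' was given, anything else otherwise.
#   $2: number of threads requested by the user.
software_make_flags () {
    debug="$1"
    jobs="$2"
    if [ "$debug" = 1 ]; then
        # Debug mode: a single thread, and no '-k', so Make stops at the
        # first error.
        echo "-j1"
    else
        # Normal mode: keep going with other targets after a failure.
        echo "-k -j$jobs"
    fi
}
```

In the same spirit, the Zenodo check would be wrapped in a condition on the
same debug variable, so it is skipped during repeated debugging runs.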
|
|
There were only three very small conflicts that have been fixed.
|
|
Until now, the build strategy of the paper was to have a single output PDF
that contains either (1) the full paper with the appendices in the same PDF,
or (2) only the main body of the paper with no appendices.
But the editor in chief of CiSE recently recommended publishing the
appendices as a supplement, that is, a separate PDF (on its webpage). So with
this commit, the project can make either (1) a single PDF (containing both
the main body and the appendices) that will be published on arXiv and will
be the default output (this is the same as before), or (2) two PDFs: one that
is only the main body of the paper and another that is only the appendices.
Since the appendices will be printed as a PDF in any case now, the old
'--no-appendix' option has been replaced by '--supplement'. Also, the
internal shell/TeX variable 'noappendix' has been renamed to
'separatesupplement'.
|
|
Until now there was only a 'clean' (to delete all files created during the
'make' phase) and the 'distclean' (to delete all files during configuration
and make). But sometimes we don't want to delete all the files created
during the full 'make' phase, we only want to delete the files that were
created by LaTeX for building the paper.
With this commit, a new target has been added for this job. You can now
run it with the following command:
./project make texclean
Only the files in '$(BDIR)/tex/build' will be deleted (and the 'tikz'
directory under that location is recreated, ready for a future build).
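As a rough shell equivalent of what this target does (a sketch only: the
real rule is a Make recipe, and the function name here is an assumption):

```shell
# Sketch of a 'texclean'-like step; $1 is the build directory (BDIR).
texclean () {
    bdir="$1"

    # Refuse to run without a build directory, to avoid touching '/tex'.
    [ -n "$bdir" ] || { echo "texclean: no build directory given" >&2; return 1; }

    # Delete everything LaTeX produced ...
    rm -rf "$bdir/tex/build"

    # ... and recreate the (now empty) 'tikz' directory for the next build.
    mkdir -p "$bdir/tex/build/tikz"
}
```

Files from the analysis (outside of 'tex/build') are not touched, so the
next run only has to rebuild the PDF.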
|
|
Having entered 2021, it was necessary to update the copyright years at the
top of the source files. We recommend that you do this for all your
project-specific source files also.
|
|
Some minor edits were made to the paper to shorten it. In particular the
example of IPOL was removed from the main body of the paper, and we'll just
rely on the more extensive review of IPOL in the appendix. I also updated
the referee report to account for the new Appendix A that is just an
extended introduction.
Also, I noticed that the Menke+20 paper that we replicate here has recently
been published in the iScience journal. So its bibliography was updated
from the bioRxiv information to the journal information.
Also, the number of words (after removing abstract and captions and
accounting for figures) is now only printed when the project is built with
'--no-appendix'. This was done because this information is
extra/annoying/unnecessary for the case where there is an appendix.
|
|
This only concerns the TeX sources in the default branch. In case you don't
use them, there should only be a clean conflict in 'paper.tex' (that is
obvious and easy to fix). Conflicts may only happen in some of the
'tex/src/preamble-*.tex' files if you have actually changed them for your
project. But generally any conflict that does arise by this commit with
your project branch should be very clear and easy to fix and test.
In short, from now on things will be even easier: any LaTeX configuration
that you want to do for your project can be done in
'tex/src/preamble-project.tex', so you don't have to worry about any other
LaTeX preamble file. They are either templates (like the ones for PGFPlots
and BibLaTeX) or low-level things directly related to Maneage. Until now,
this distinction wasn't too clear.
Here is a summary of the improvements:
- Two new options to './project make': with '--highlight-new' and
'--highlight-notes' it is now possible to activate highlighting on the
command-line. Until now, there was a LaTeX macro for this at the start
of 'paper.tex' (\highlightchanges). But changing that line would change
the Git commit hash, making it hard for the readers to trust that this
is the same PDF. With these two new run-time options, the printed commit
hash will not change.
- paper.tex: the sentences are formatted as one sentence per line (and one
line per sentence). This helps in version controlling narrative and
following the changes per sentence. A description of this format (and
its advantages) is also included in the default text.
- The internal Maneage preambles have been modified:
- 'tex/src/preamble-header.tex' and 'tex/src/preamble-style.tex' have
been merged into one preamble file called
'tex/src/preamble-maneage-default-style.tex'. This helps a lot in
simply removing it when you use a journal style file for example.
- Things like the options to highlight parts of the text are now put in
a special 'tex/src/preamble-maneage.tex'. This helps highlight that
these are Maneage-specific features that are independent of the style
used in the paper.
- There is a new 'tex/src/preamble-project.tex' that is the place you
can add your project-specific customizations.
|
|
There is an answer for all the referee points now. I also did some minor
edits in the paper. But we are still over the limit by around 250 words.
The only remaining point that is not yet addressed (and has '####' around
it) is the discussion on parallelization and its effect on reproducibility.
|
|
A new directory has been added at the top of the project's source called
'peer-review'. The raw reviews of the paper by the editors and referees have
been added there as '1-review.txt'. All the main points raised by the
referees have been listed in a numbered list and addressed (mostly) in
'1-answers.txt'. The text of the paper now also includes all the
implemented answers to the various points.
|
|
Until now, the core Maneage 'paper.tex' had a '\highlightchanges' macro
that defines two LaTeX macros: '\new' and '\tonote'.
When '\highlightchanges' was defined, anything that was written within
'\new' became dark green (highlighting new things that have been
added). Also, anything that was written in '\tonote' was put within a '[]'
and became dark red (to show that there is a note here that should be
addressed later).
When '\highlightchanges' wasn't defined, anything within the '\new' element
would be black (like the rest of the text), and the things in '\tonote'
would not be shown at all.
Commenting out the '\newcommand{\highlightchanges}{}' line within 'paper.tex'
(to toggle the modes above) changes the source: it has to be committed and
thus creates a different Git hash.
But this different commit hash could create a false sense in the reader
that other things have also been changed and the only way they could
confirm was to actually go and look into the project history (which they
will not usually have time to do, and thus won't be able to trust the two
modes of the text).
Also, the added highlights and the note highlights were bundled together
into one macro, so you couldn't only have one of them.
With this commit, the choice of highlighting either one of the two is now
done as two new run-time options to the './project' script (which are
passed to the Makefiles, and written into the 'project.tex' file which is
loaded into 'paper.tex'). In this way, we can generate two PDFs with the
same Git commit (project's state): one with the selected highlights and
another one without it.
This issue actually came up for me while implementing the changes here: we
need to submit one PDF to the journal/referees with highlights on the added
features. But we also need to submit another PDF to arXiv and Zenodo
without any highlights. If the PDFs have different commit hashes, the
referees may associate it with other changes in any part of the work. For
example https://oadoi.org/10.22541/au.159724632.29528907 that mentions
"Another version of the manuscript was published on arXiv: 2006.03018",
while the only difference was a few words in the abstract after the journal
complained on the abstract word-count of our first submission (where the
commit hashes matched with arXiv/Zenodo).
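The mechanism can be sketched like this: the run-time choices are written
into a generated TeX file, so nothing under version control changes. The
function name and the two macro names ('\highlightnew', '\highlightnotes')
are assumptions for illustration and may differ from the real ones:

```shell
# Hypothetical sketch: write the highlighting choices into a generated
# TeX file (for example the 'project.tex' that is loaded by 'paper.tex').
#   $1: 1 to highlight new text, $2: 1 to show notes, $3: output file.
write_highlight_macros () {
    highlightnew="$1"
    highlightnotes="$2"
    out="$3"

    # Start from an empty file on every run.
    : > "$out"

    # Only define a macro when the respective option was given.
    if [ "$highlightnew" = 1 ]; then
        printf '%s\n' '\newcommand{\highlightnew}{}' >> "$out"
    fi
    if [ "$highlightnotes" = 1 ]; then
        printf '%s\n' '\newcommand{\highlightnotes}{}' >> "$out"
    fi
}
```

Because the generated file lives in the build directory, two PDFs built
with different options still report the same commit hash.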
|
|
Given the referee reports, after discussing with the editors of CiSE, we
decided that it is important to include the complete appendix we had before
that included a thorough review of existing tools and methods. However, the
appendix will not be published in the paper (due to the strict word-count
limit). It will only be used in the arXiv/Zenodo versions of the paper.
This actually created a technical problem: we want the commit hash of the
project source to remain the same when the paper is built with an appendix
or without it.
To fix this problem the choice of including an appendix has gone into the
'project' script as a run-time option called '--no-appendix'. So by default
(when someone just runs './project make'), the PDF will have an appendix,
but when we want to submit to the journal, or when the appendix isn't
needed for a certain reason, we can use this new option. The appendix also
has its own separate bibliography.
Some other corrections made in this commit:
1. Some new references were added that had an '_' in their source, they
were corrected in 'references.tex'.
2. I noticed that 'preamble-style.tex' is not actually used in this paper,
so it has been deleted.
|
|
There weren't any conflicts in this merge.
|
|
Following the previous commit, we recognized that the 'IFS' terms are not
necessary and can even cause problems. So all their occurrences in the
scripts of Maneage have been removed with this commit.
|
|
Only two small conflicts came up:
* The addition of the hardware architecture macro in 'paper.tex' (which
was removed for now, but will be added as the referee has requested
within the text).
* The usage of "" around directory variables in 'paper.mk'.
|
|
While a project is under development, the raw analysis software is not the
only necessary software in a project. We also need tools to edit all the
plain-text files within the Maneaged project. Usually people use their
operating system's plain-text editor. However, when working on the project
on a new computer, or in a container, the plain-text editors will have
different versions, or may not be present at all! This can be very annoying
and frustrating!
With this commit, Maneage now installs GNU Nano as part of the basic
tools. GNU Nano is a very simple and small plain text editor (the installed
size is only ~3.5MB, and it is friendly to new users). Therefore, any
Maneaged project can assume at least Nano will be present (in particular
when no editor is available on the running system!). GNU Emacs and VIM
(both without extra dependencies, in particular without GUI support) are
also optionally available in 'high-level.mk' (by adding them to
'TARGETS.conf').
The basic idea for the more advanced editors (Emacs and VIM) is that
project authors can add their favorite editor while they are working on the
project, but upon publication they can remove them from 'TARGETS.conf'.
A few other minor things came up during this work and are now also fixed:
- The 'file' program and its libraries like 'libmagic' were linking to
system's 'libseccomp'! This dependency then leaked into Nano (which
depends on 'libmagic'). But this is just an extra feature of 'file',
only for the Linux kernel. Also, we have no dependency on it so far. So
'file' is now configured to not build with 'libseccomp'.
- A typo was fixed in the line where the physical core information is
being read on macOS.
- The top-level directories when running './project shell' are now quoted
(in case they have special characters).
|
|
Until now, no machine-related specifications were being documented in the
workflow. This information can become helpful when observing differences in
the outcome of both software and analysis segments of the workflow by
others (some software may behave differently based on host machine).
With this commit, the host machine's 'hardware class' and 'byte-order' are
collected and now available as LaTeX macros for the authors to use in the
paper. Currently they are placed in the acknowledgments, right after
mentioning the Maneage commit.
Furthermore, the project and configuration scripts are now capable of
dealing with input directory names that have SPACE (and other special
characters) by putting them inside double-quotes. However, having spaces
and metacharacters in the address of the build directory could cause
build/install failure for some software source files which are beyond the
control of Maneage. So we now check the user's given build directory
string, and if the string has any of '@', '#', '$', '%', '^', '&', '*', '(',
')', '+', ';', or ' ' (SPACE), it will ask the user to provide a different
directory.
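One possible way to write such a check in portable shell is sketched below.
The function name is an assumption; the list of characters is the one given
above:

```shell
# Hypothetical helper: return 0 if the build directory name is safe,
# 1 if it contains any of:  @ # $ % ^ & * ( ) + ;  or SPACE.
builddir_is_safe () {
    # Inside a bracket expression all of these characters are literal
    # (the '^' is not the first character, so it does not negate).
    if printf '%s' "$1" | grep -q '[@#$%^&*()+; ]'; then
        return 1
    fi
    return 0
}
```

A caller could then loop, asking for a new directory until
'builddir_is_safe "$bdir"' succeeds.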
|
|
Some very minor conflicts came up and were easily corrected. They were
mostly in parts that are also shared with the demonstration in the core
Maneage branch.
|
|
Until now, './project --check-config' would only print the names of the
software that were being built. Besides that, it is also useful to know
which packages have most recently finished.
With this commit, we now print the last 5 built software packages with
'--check-config' also, and the output has also been placed in a row of '='s
to help separate it in each round. Also some more sanity checks have been
added so it doesn't print error messages.
|
|
Until now, the 'shell' mode of the './project' script was missing in the
top output of './project --help' and in the error message printed when no
operation was given, or when more than one operation was given.
This is now corrected.
|
|
Only two conflicts came up in the newly added comments of 'paper.mk' in the
Maneage branch. It happened because in this project we don't use
'pdflatex', but 'latex' alone.
|
|
POSSIBLE EFFECT ON YOUR PROJECT: The changes in this commit may only cause
conflicts to your project if you have changed the software building
Makefiles in your project's branch (e.g., 'basic.mk', 'high-level.mk' and
'python.mk'). If your project has only added analysis, it shouldn't be
affected.
This is a large commit, involving a long series of corrections in a
different branch which is now finally being merged into the core Maneage
branch. All changes were related and came up naturally as the low-level
infrastructure was improved. So separating them in the end for the final
merge would have been very time consuming and we are merging them as one
commit.
In general, the software building Makefiles are now much easier to
read, modify and use, along with several new features that have been
added. See below for the full list.
- Until now, Maneage needed the host to have a 'make' implementation
because Make was necessary to build Lzip. Lzip is then used to
uncompress the source of our own GNU Make. However, in the
minimalist/slim versions of operating systems (for example used to build
Docker images) Make isn't included by default. Since Lzip was the only
program built before our own GNU Make was installed, we consulted Antonio
Diaz Diaz (creator of Lzip) and he kindly added the necessary
functionality to a new version of Lzip, which we are using now. Hence we
don't need to assume a Make implementation on the host any more. With
this commit, Lzip and GNU Make are built without Make, allowing
everything else to be safely built with our own custom version of GNU
Make and not using the host's 'make' at all.
- Until recently (Commit 3d8aa5953c4) GNU Make was built in
'basic.mk'. Therefore 'basic.mk' was written in a way that it can be
used with other 'make' implementations also (i.e., important shell
commands starting with '&&' and ending in '\' without any comments
between them!). Furthermore, to help in style uniformity, the rules in
'high-level.mk' and 'python.mk' also followed a similar structure. But
due to the point above, we can now guarantee that GNU Make is used from
the very first Makefile, so this hard-to-read structure has been removed
in the software build recipes and they are much more readable and
edit-friendly now.
- Until now, the default backup servers were at some fixed URLs, on our
own pages or on Gitlab. But recently we uploaded all the necessary
software to Zenodo (https://doi.org/10.5281/zenodo.3883409) which is
more suitable for this task (it promises longevity, has a fixed DOI,
while allowing us to add new content, or new software tarball
versions). With this commit, a small script has been written to extract
the most recent Zenodo upload link from the Zenodo DOI and use it for
downloading the software source codes.
- Until now, we primarily used the webpage of each software for
downloading its tarball. But this caused many problems: 1) Some of them
needed Javascript before the download, 2) Some URLs had a complex
dependency on the version number, 3) some servers would be randomly down
for maintenance, etc. So thanks to the point above, we now use the
Zenodo server as the primary download location. However, if a user wants
to use a custom software that is not (yet!) in Zenodo, the download
script gives priority to a custom URL that the users can give as Make
variables. If that variable is defined, then the script will use that
URL before going onto Zenodo. We now have a special place for such URLs:
'reproduce/software/config/urls.conf'. The old URLs (which are a good
documentation themselves) are preserved here, but are commented by
default.
- The software source code downloading and checksum verification step has
been moved into a Make function called 'import-source' (defined in the
'build-rules.mk' and loaded in all software Makefiles). Having taken all
the low-level steps there, I noticed that there is no more need for
having the tarball as a separate target! So with this commit, a single
rule is the only place that needs to be edited/added (greatly
simplifying the software building Makefiles).
- Following task #15272, a new option has been added to the './project'
script called '--all-highlevel'. When this option is given, the contents
of 'TARGETS.conf' are ignored and all the software in Maneage are built
(selected by parsing the 'versions.conf' file). This new option was
added to confirm the extensive changes made in all the software building
recipes and is great for development/testing purposes.
- Many of the software hadn't been tested for a long time! So after using
the newly added '--all-highlevel', we noticed that some need to be
updated. In general, with this commit, 'libpaper' and 'pcre' were added
as new software, and the versions of the following software were updated:
'boost', 'flex', 'libtirpc', 'openblas' and 'lzip'. A 'run-parts.in'
shell script was added in 'reproduce/software/shell/' which is installed
with 'libpaper'.
- Even though we intentionally add the necessary flags to add RPATH inside
the built executable at compilation time, some software don't do it
(different software on different operating systems!). Until now, for
historical reasons this check was done in different ways for different
software on GNU/Linux systems. But now it is unified: if 'patchelf' is
present we apply it. Because of this, 'patchelf' has been put as a
top-level prerequisite, right after Tar and is installed before anything
else.
- In 'versions.conf', GNU Libtool is recognized as 'libtool', but in
'basic.mk', it was 'glibtool'! This caused much confusion and is
corrected with this commit (in 'basic.mk', it is also 'libtool').
- A new argument is added to the './project' script to allow easy loading
of the project's shell and environment for fast/temporary testing of
things in the same environment as the project. Before activating the
project's shell, we completely remove all host environment variables to
simulate the project's environment. It can be called with this command:
'./project shell'. A simple prompt has also been added to highlight that
the user is using the Maneage shell!
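The download priority described in the list above (a user-given URL first,
then the backup server) can be sketched as follows. The function name and
the 'example.org' addresses are placeholders for illustration, not the real
download script or servers:

```shell
# Hypothetical helper: print the URLs to try, in order of priority.
#   $1: custom URL given by the user (may be empty).
#   $2: base URL of the backup server (for example the Zenodo upload).
#   $3: name of the tarball.
download_candidates () {
    custom_url="$1"
    backup_base="$2"
    tarball="$3"

    # A user-defined URL, when present, is always tried first.
    if [ -n "$custom_url" ]; then
        echo "$custom_url"
    fi

    # The backup server is the fall-back (and the default).
    echo "$backup_base/$tarball"
}

# Example usage (placeholder addresses):
download_candidates "" "https://example.org/backup" "foo-1.0.tar.lz"
```

A downloader would then go over this list and stop at the first URL that
gives a tarball with the expected checksum.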
|
|
Until now, the English text that embeds the list of software to
acknowledge in the paper was hard-wired into the low-level code
('reproduce/software/shell/configure.sh' to be more specific). But this
file is very low-level, thus discouraging users from modifying this
surrounding text.
While the list of software packages can be considered to be 'data' and is
fixed, the surrounding text to describe the lists is something the authors
should decide on. Authors of a scientific research paper take
responsibility for the full paper, including for the style of the
acknowledgments, even if these may well evolve into some standard text.
With this commit, authors who do *not* modify
'reproduce/software/config/acknowledge_software.sh' will have a default
text, with only a minor English correction from earlier versions of
Maneage. However, authors choosing to use their own wording should be able
to modify the text parameters in
`reproduce/software/config/acknowledge_software.sh` in the obvious
way. This is much more modular than asking project authors to go looking
into the long and technical 'configure.sh' script.
Systematic issues: the file
`reproduce/software/config/acknowledge_software.sh` is an executable shell
script, because it has to be called by
`reproduce/software/shell/configure.sh`, which, in principle, does not yet
have access to `GNU make` (if I understand the bootstrap sequence
correctly). It is placed in `config/` rather than `shell/`, because the
user will expect to find configuration files in `config/`, not in `shell/`.
A possible alternative to avoid having a shell script as a configure file
would be to let `reproduce/software/config/acknowledge_software.sh` appear
to be a `make` file, but analyse it in `configure.sh` using `sed` to remove
whitespace around `=`, and adding other hacks to switch from `make` syntax
to `shell` syntax. However, this risks misleading the user, who will not
know whether s/he should follow `make` conventions or `shell` conventions.
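To illustrate the design (a shell-syntax configuration file that the
configure script simply sources), here is a small sketch. The variable
names and the wording are assumptions, not the actual contents of
'acknowledge_software.sh':

```shell
# Write an example configuration file to $1 (hypothetical variable names).
write_example_ack_conf () {
    cat > "$1" <<'EOF'
# Edit the wording here; the software list itself is generated separately.
thank_software_intro="This research was done with the following free software:"
thank_software_outro="We thank their developers."
EOF
}

# Build the acknowledgment text.
#   $1: the configuration file, $2: the generated list of software.
software_acknowledgment () {
    # The configuration is plain shell, so it can be loaded with '.'
    # even before any Make implementation is available.
    . "$1"
    echo "$thank_software_intro $2. $thank_software_outro"
}
```

The user only ever edits the short file with the two text variables, and
follows shell conventions throughout (no spaces around the '=').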
|
|
When publishing a project, it is necessary to also publish the source code
of all necessary software of the project. We had recently added a new
'./project make' target called 'dist-software' for this job, but had
forgotten to add it in the output of './project --help'! There was also a
small bug inside of it that didn't allow the successful copying of the
created tarball to the top project directory.
With this commit, an explanation for this target has been added in the
output of './project --help' and that bug has been fixed.
|
|
Until now we were using 'which' for this job, but throughout Maneage, we
have used 'type'. So to help in consistency, 'type' is now also used for
this final command in this project.
|
|
There were two small warnings that are removed with this commit:
- In the end, when we print the number of words in the PDF, we hadn't
accounted for the fact that 'paper.pdf' doesn't always exist (for
example when './project make clean' is run). So a check was added to
only print the number of words when a PDF exists.
- I noticed that the '$(texdir)/to-publish' directory was being built both
in 'initialize.mk' and in 'demo-plot.mk'. So the one in 'demo-plot.mk'
has been removed.
|
|
Some minor conflicts came up in 'initialize.mk' and 'verify.mk'. For the
former, I chose the version on Maneage, for the latter, I kept the 'master'
version on the checksums of this project, but kept the Maneage version for
the rest of the improvements there (like printing the verified files as
LaTeX comments in 'verify.tex').
While testing the conflicts, I noticed a bug (in the LaTeX macro for the
number of years in the Menke+20 paper) in the previous build, thanks to the
verification step :-)! Fortunately it wasn't actually printed in the PDF,
so a normal reader wouldn't have noticed it.
The bug was caused by the recently added meta-data/commented lines in the
'tools-per-year.txt' file: when calculating the number of years studied in
that paper, we were simply counting all the lines and we had forgotten to
correct this after adding comments. As a result, the un-used LaTeX macro
file was saying that they have studied 47 years instead of the real 31
years! This element was actually used in the very first (+40 page!) draft
of the paper that was summarized to fit into the journal limits.
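The kind of bug described here is easy to reproduce: counting all lines of
a table also counts its commented metadata. A minimal sketch of a count that
ignores comments (the function name is an assumption, and this is not
necessarily how the fix was implemented in the project):

```shell
# Hypothetical helper: count the data rows of a plain-text table,
# ignoring lines that start with '#' and empty lines.
count_data_lines () {
    awk '!/^#/ && NF > 0 { n++ } END { print n+0 }' "$1"
}
```

Using 'wc -l' on the same file would also include every comment line in the
result.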
|
|
Possible semantic conflicts (that may not show up as Git conflicts but may
cause a crash in your project after the merge):
1) The project title (and other basic metadata) should be set in
'reproduce/analysis/conf/metadata.conf'. Please include this file in
your merge (if it is ignored because of '.gitattributes'!).
2) Consider importing the changes in 'initialize.mk' and 'verify.mk' if
you have added all analysis Makefiles to the '.gitattributes' file
(thus not merging any change in them with your branch). For example
with this command:
git diff master...maneage -- reproduce/analysis/make/initialize.mk
3) The old 'verify-txt-no-comments-leading-space' function has been
replaced by 'verify-txt-no-comments-no-space'. The new function will
also remove all white-space characters between the columns (not just
white space characters at the start of the line). Thus the resulting
check won't involve spacing between columns.
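The idea behind item 3 can be sketched in shell as below. This is only an
illustration of the normalization, with an assumed function name; the real
function is defined in the analysis Makefiles and may use other tools:

```shell
# Hypothetical sketch: print a plain-text table without its comment
# lines and without any SPACE or TAB characters, so a checksum of the
# result does not depend on the spacing between columns.
normalize_txt () {
    sed -e '/^[[:space:]]*#/d' "$1" | tr -d ' \t'
}
```

Two tables that differ only in comments or column spacing then give the
same output, and therefore the same checksum.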
A common set of steps is always necessary to prepare a project for
publication. Until now, we would simply look at previous submissions and
try to follow them, but that was prone to errors and could cause
confusion. The internal infrastructure also didn't have some useful
features to make good publication possible. Now that the submission of a
paper fully devoted to the founding criteria of Maneage is complete
(arXiv:2006.03018), it was time to formalize the necessary steps for easier
submission of a project using Maneage and implement some low-level features
that can make things easier.
With this commit a first draft of the publication checklist has been added
to 'README-hacking.md'; it was tested in the submission of arXiv:2006.03018
and zenodo.3872248. To help guide users on implementing the good practices
for output datasets, the outputs of the default project shown in the paper
now use the new features. After reading the checklist, please inspect
these.
Some other relevant changes in this commit:
- The publication involves a copy of the necessary software
tarballs. Hence a new target ('dist-software') was also added to
package all the project's software tarballs in one tarball for easy
distribution.
- A new 'dist-lzip' target has been defined for those who want to
distribute an Lzip-compressed tarball.
- The '\includetikz' LaTeX macro now has a second argument to allow
configuring the '\includegraphics' call when the plot should not be
built, but just imported.
|
|
The minor conflict was with 'reproduce/software/make/high-level.mk', and in
particular because we implemented the fix to Maneage's Task #15664 in this
project first. After it was moved to the main Maneage branch some minor
stylistic corrections were done to it, thus causing the conflict. To
resolve the conflict, I simply imported the full Maneage version of the
file with this command:
git checkout maneage -- reproduce/software/make/high-level.mk
The other conflicts were due to the deleted files (that were resolved as
described in 'README-hacking.md') and the LaTeX files that I had told
'.gitattributes' to ignore from the Maneage branch.
|
|
David and Raul had both reported that because 'pdftotext' wasn't available
on their system, the project failed (even though the PDF was built!). So
with this commit, we first check if the system has 'pdftotext' and call it
only if it is available.
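A minimal sketch of such a guard, using 'type' to test for the program. The
function name and the printed message are assumptions for illustration:

```shell
# Hypothetical helper: print the number of words in a PDF, but only if
# the PDF exists and 'pdftotext' is available. Never fails the build.
print_word_count () {
    pdf="$1"
    if [ -f "$pdf" ] && type pdftotext > /dev/null 2>&1; then
        # 'pdftotext FILE -' writes the extracted text to standard output.
        words=$(pdftotext "$pdf" - | wc -w | tr -d ' ')
        echo "Number of words in '$pdf': $words"
    fi
    return 0
}
```

When either condition is not met, nothing is printed and the caller
continues normally.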
Some minor edits were made, building upon Boud's previous commit.
|
|
In time, some of the copyright license descriptions had been mistakenly
shortened to two paragraphs instead of the original three that are
recommended in the GPL. With this commit, they are corrected to be exactly
in the same three-paragraph format suggested by the GPL.
The following files also didn't have a copyright notice, so one was added
for them:
reproduce/software/make/README.md
reproduce/software/bibtex/healpix.tex
reproduce/analysis/config/delete-me-num.conf
reproduce/analysis/config/verify-outputs.conf
|
|
Following the fact that the DSJ editor decided that this paper doesn't fit
into their scope, we decided to submit it to IEEE's Computing in Science
and Engineering (CiSE). So with this commit the text was re-written to fit
into their style and word-count limitations.
|
|
Until now, the primary Maneage URLs were under GitLab, but since we now
have a dedicated URL and Git repository, it is better to transfer to this as
soon as possible. Therefore with this commit, throughout Maneage, any place
that Maneage was referenced through GitLab has been corrected.
Please correct your project's remote to point to the new repository at
`git.maneage.org/project.git', and please make sure it follows the
`maneage' branch. There is no more `master' branch on Maneage.
|
|
Until now, throughout Maneage we were using the old name of "Reproducible
Paper Template". But we have finally decided to use Maneage, so to avoid
confusion, the name has been corrected in `README-hacking.md' and also in
the copyright notices.
Note also that in `README-hacking.md', the main Maneage branch is now
called `maneage', and the main Git remote has been changed to
`https://gitlab.com/maneage/project' (this is a new GitLab Group that I
have set up for all Maneage-related projects). In this repository there is
only one `maneage' branch to avoid complications with the `master' branch
of the projects using Maneage later.
|
|
Until now, the preparation phase was always executed before the final build
phase when running `./project make'. But when a project needs it,
preparation can be slow, and repeating it on every run unnecessarily slows
down the work while the project is growing (when the focus is on the
analysis that is done after preparation).
With this commit, preparation will be done automatically the first time
that the project is run (when `.build/software/preparation-done.mk' doesn't
exist). However, after preparation has completed once, future runs of
`./project make' won't do preparation any more (it is done by calling
`top-prepare.mk'). They will directly call `top-make.mk' for the analysis.
To manually invoke preparation after the first attempt, the `./project
make' script should be run with the new `--prepare-redo' option.
Also, since the preparation phase is now automatically done before the
analysis phase, the long notice that describes running `./project make' at
the end of the preparation phase has been removed in `top-prepare.mk'. It
now just prints a short line, saying the preparation has been completed.
Finally, when the project has not been run with the proper group
configuration, it ends with an `exit 1' so the main `./project' script
doesn't proceed any further.
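The decision described above can be sketched in POSIX shell as follows.
This is a minimal illustration under stated assumptions, not the actual
`./project' code: the function names and the `0'/`1' flag standing in for
`--prepare-redo' are hypothetical, and the Makefiles are only printed
instead of being given to Make.

```shell
# Hypothetical check: preparation is needed when it was requested again
# (second argument is 1) or when the "done" file does not exist yet.
needs_preparation () {
    donefile="$1"    # for example .build/software/preparation-done.mk
    redo="$2"        # 1 when '--prepare-redo' was given, 0 otherwise
    if [ "$redo" = 1 ] || ! [ -f "$donefile" ]; then
        return 0
    fi
    return 1
}

# Print the Makefiles that one './project make' run would call, in order.
run_project_make () {
    if needs_preparation "$1" "$2"; then
        echo "top-prepare.mk"
    fi
    echo "top-make.mk"
}
```

On a first run this prints both Makefiles. Once the "done" file exists,
only `top-make.mk' is printed unless the redo flag is set.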
|
|
Until now, when `./project make' was run after an unsuccessful run of
`./project configure', it would just say to run `./project configure'. But
for a first time user, this could be confusing because when the
configuration is done in parallel, the error message can be very high on
the command-line outputs and not seen clearly.
With this commit, the error message is more complete and describes the
problem and what the users should do in which circumstance.
|
|
Until now the shell scripts in the software building phase were in the
`reproduce/software/bash' directory. But given our recent change to a
POSIX-only start, the `configure.sh' shell script (which is the main
component of this directory) is no longer written with Bash.
With this commit, to fix that problem, that directory's name has been
changed to `reproduce/software/shell'.
|
|
Until now, the initial project scripts were primarily tested with GNU
Bash. But Bash is not generally available on all systems (it has many
features beyond POSIX). Because of this, effectively we were imposing the
requirement on the user that they must have Bash installed. We recently
started to remove this requirement by setting the shebang of `project' and
`reproduce/software/bash/configure.sh' to `/bin/sh'. After doing so, Raul
and Gaspar reported an error on their systems.
To fix the problem, I installed Dash (a minimalist POSIX-compliant shell)
on my computer and temporarily set the shebangs to `/bin/dash', ran the
project configuration step and fixed all issues that came up. With this
commit, it can go all the way to building GCC on my system's Dash. After
this stage (when `high-level.mk' is called), there is no problem, because
we have our own version of GNU Bash and that installed version is used.
Probably some more issues still remain and will hopefully be found in the
future.
While doing this, I also noticed the following minor issues:
- The `./project configure' option `--input-dir' was not recognized
because it was mistakenly checking `--inputdir'. It has been corrected.
- The test C programs now use the `<<EOF' method instead of `echo'.
- In `basic.mk', the extra space between `syspath' and `:=' was removed
(it was an ancient relic!).
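As an illustration of the `<<EOF' method mentioned in the list above, the
sketch below writes a small test C program with a here-document. The
function name and the program's contents are hypothetical. The commit
message doesn't state the motivation, but one likely reason (an assumption)
is that `echo' handles backslashes such as `\n' differently between shells
like Bash and Dash, while a here-document with a quoted delimiter passes
the text through unchanged.

```shell
# Hypothetical helper: write a minimal C test program into the file given
# as the first argument. Quoting the delimiter ('EOF') stops the shell from
# expanding '$' or backslashes inside the here-document.
write_test_program () {
    cat > "$1" <<'EOF'
#include <stdio.h>
int main(void) { printf("Good!\n"); return 0; }
EOF
}
```

The resulting file can then be given to the C compiler being tested.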
|
|
Until now, the hashbang of these two shell scripts was set to `/bin/bash',
hence assuming that GNU Bash exists on the host system! But this is an
extra requirement on the host operating system and these two scripts should
be written such that they operate on a POSIX shell (the generic `/bin/sh'
which can point to any shell program).
With this commit this has been implemented! We may confront some errors as
the system is run on other systems, but we should fix such errors and work
hard to make these two scripts as POSIX-compatible as possible (runnable on
any shell, so as not to force users to install Bash before running the
project).
This completes Task #15525.
|
|
Until now, the main commands to run the project were these: `./project
configure' (to build the software), `./project prepare' (to possibly
arrange input datasets and build special configuration Makefiles) and
finally `./project make' to run the project.
The main logic behind the "prepare" phase (`top-prepare.mk') is to build
configuration files that can be fed into the "make" step and optimize its
operation. This is useful, for example, when the number of inputs needed by
the majority of the analysis is smaller than the total number of
inputs. With "prepare" (when necessary), you go through the raw inputs and
select the ones that are necessary for the rest of the project. The output
of `top-prepare.mk' is a configuration file (a Make variable) that keeps
the IDs (numbers, names, etc). That configuration file would then be used
in the `top-make.mk' to identify the lower level targets and allow optimal
project organization and management.
But the last two are both part of the analysis, and while they indeed need
different calls to Make to be executed, many projects don't actually need a
preparation phase: ultimately, it's an implementation choice by the project
developers and doesn't concern the project users (or the developers when
they are running it).
To avoid confusing the users, or simply annoying them when a project doesn't
need it, with this commit, the top-level `top-prepare.mk' and `top-make.mk'
Makefiles are called with the single `./project make' command and
`./project prepare' has been dropped. I noticed this while writing the
paper on this system.
|
|
Until now, it was necessary to run a long `while true' loop to see what is
currently being built at configure time. So with this commit, a new
`--checkconfig' option has been added to `./project' that can be called to
run that loop and make it easier to check.
|
|
Now that it's 2020, it's necessary to include this year in the copyright
statements.
|
|
Until now, after removing all environment variables, we were just giving
Make the top Makefile to execute. By default, for every target, Make will
try many built-in rules (which is good when compiling programs, but
redundant in other cases). All these checks also populate the debugging
output of Make (with `-d'). So it's easier and slightly faster to just tell
Make to ignore builtin rules and variables.
With this commit, to address this issue, the `project' script runs
`.local/bin/make' with `--no-builtin-rules' and `--no-builtin-variables'.
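As a rough sketch, the resulting Make call could be assembled like this.
The function name is hypothetical, the command is only printed instead of
being executed, and any other options that the real `project' script gives
to Make are left out. In GNU Make, the two long options have the short
forms `-r' and `-R'.

```shell
# Hypothetical helper: print a Make command line that disables the
# built-in implicit rules and the built-in variables.
build_make_command () {
    make_program="$1"    # for example .local/bin/make
    makefile="$2"        # the top-level Makefile to execute
    echo "$make_program --no-builtin-rules --no-builtin-variables -f $makefile"
}
```

For example, `build_make_command .local/bin/make top-make.mk' prints the
command that would run `top-make.mk' without the built-in rules.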
|
|
A special directory is now defined in `initialize.mk' that can be used in
both the preparation and build phases. Also, the contents of prepared
results can now be conditionally read during `./project make'.
|
|
In many real-world scenarios, `./project make' can really benefit from
having some basic information about the data before being run, for example
when querying a server. If we know how many datasets were downloaded and
their general properties, it can greatly optimize the process when we are
designing the solution to be run in `./project make'.
Therefore with this commit, a new phase has been added to the template's
design: `./project prepare'. In the raw template this is empty, because the
simple analysis done in the template doesn't warrant it. But everything is
ready for projects using the template to add preparation phases prior to
the analysis.
|
|
Until now, when the project's source was downloaded from something like
arXiv, in `README.md', we were instructing them to set the executable flags
of all the files that need it. But except for `./project', the reader
shouldn't have to worry about the project internals! Once it's executable,
`./project' can easily fix the executable flags of all the files that need
it automatically.
With this commit, in `README.md', we just instruct the reader to set the
executable flag of `./project' and any other file that needs an executable
flag is given one at the start of the set of commands for `./project
configure'. In customized projects, if an author needs executable flags on
any other files, they can easily add it there without involving the user.
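A minimal sketch of such a step is shown below. It assumes a hypothetical
function name and takes the list of files as arguments; the actual files
that `./project configure' fixes are not listed in this message.

```shell
# Hypothetical helper: give the executable flag to every existing regular
# file among the arguments, and silently skip names that don't exist.
set_executable_flags () {
    for f in "$@"; do
        if [ -f "$f" ]; then
            chmod +x "$f"
        fi
    done
}
```

A customized project would only need to add its own file names to the
arguments of such a call.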
|
|
Until now, we were assuming that the users would just clone the project in
Git. But after submitting arXiv:1909.11230, and trying to build directly
from the arXiv source, I noticed several problems that wouldn't allow users
to build it automatically. So I tried the build step by step and was able
to find a fix for the several issues that came up.
The scripting parts of the fix were primarily related to the fact that the
unpacked arXiv tarball isn't under version control, so some checks had to
be put there. Also, we wanted to make it easy to remove the extra files, so
an extra `--clean-texdit' option was added to `./project'.
Finally, some manual corrections were necessary (prior to running
`./project'), which are now described in `README.md'. Most of the later
steps can be automated and we should do it later, I just don't have enough
time now.
|
|
`./project make dist' will package all the LaTeX-specific files (and
analysis source files) into one `tar.gz' file that is ready to upload to
servers like arXiv. However, it wasn't updated for some time, so running it
would complain about not having a `configure' script in the top of the
project.
With this commit, it now works with the new file-structure of the project
and also copies all the BibLaTeX source files and `paper.bbl' into the top
tarball directory, which allows arXiv to build the paper as intended.
The output of `./project make dist' has been uploaded and tested on arXiv
and it is built by arXiv perfectly.
Also, a short description of all the special `make' targets was added to
the output of `./project --help'.
|