Age | Commit message (Collapse) | Author | Lines |
|
A few minor conflicts occurred and were fixed.
|
|
Until now, there was no explanation on an actual analysis phase, therefore
with this commit an example scenario with a readable Makefile is included.
The Data lineage graph was also simplified to both be more readable, and
also to correspond to this new explanation and subMakefile.
Some random edits/typos were also corrected and some references added for
discussion.
|
|
The main problems with this dataset was the names of the journals (which
sometimes have single quotes or apostrophes in them that is really annoying
for SED)! But ultimately, for the simple study we want to do here, the
journal names are irrelevant, so in the end I just ignored the names. Later
we can set an identifier for the journals if necessary.
But now we have the basic information in a way that is usable in a plot to
show in this paper.
|
|
Until now, the configuration Makefiles (in
`reproduce/software/config/installation' and `reproduce/analysis/config')
had a `.mk' suffix, similar to the workhorse Makefiles. Although they are
indeed Makefiles, but given their nature (to only keep configuration
parameters), it is confusing (especially to early users) for them to also
have a `.mk' (similar to the analysis or software building Makefiles).
To address this issue, with this commit, all the configuration Makefiles
(in those directories) are now given a `.conf' suffix. This is also assumed
for all the files that are loaded.
The configuration (software building) and running of the template have been
checked with this change from scratch, but please report any error that may
not have been noticed.
THIS IS AN IMPORTANT CHANGE AND WILL CAUSE CRASHES OR UNEXPECTED BEHAVIORS
FOR PROJECTS THAT HAVE BRANCHED FROM THIS TEMPLATE. PLEASE CORRECT THE
SUFFIX OF ALL YOUR PROJECT'S CONFIGURATION MAKEFILES (IN THE DIRECTORIES
ABOVE), OTHERWISE THEY AREN'T AUTOMATICALLY LOADED ANYMORE.
|
|
Now that its 2020, its necessary to include this year in the copyright
statements.
|
|
Until now, when an input dataset already exists in `INDIR', the template
would just make a symbolic link to it in the build directory. However, in
many cases, the files in INDIR will actually be links to other locations on
the filesystem and some programs have problems following too many links.
With this commit, the template is now using the `readlink' program (part of
GNU Coreutils) to follow a possible link and point the link in the build
directory directly to an actual non-link file.
|
|
Until now, the software building and analysis steps of the pipeline were
intertwined. However, these steps (of how to build a software, and how to
use it) are logically completely independent.
Therefore with this commit, the pipeline now has a new architecture
(particularly in the `reproduce' directory) to emphasize this distinction:
The `reproduce' directory now has the two `software' and `analysis'
subdirectories and the respective parts of the previous architecture have
been broken up between these two based on their function. There is also no
more `src' directory. The `config' directory for software and analysis is
now mixed with the language-specific directories.
Also, some of the software versions were also updated after some checks
with their webpages.
This new architecture will allow much more focused work on each part of the
pipeline (to install the software and to run them for an analysis).
|