author | Mohammad Akhlaghi <mohammad@akhlaghi.org> | 2022-04-15 04:57:54 +0200 |
---|---|---|
committer | Mohammad Akhlaghi <mohammad@akhlaghi.org> | 2022-04-15 05:22:19 +0200 |
commit | 91799fe4b6d62230e99a1520a23a0d30c3eb963e (patch) | |
tree | 7b04b6e7ec5901b2d7d3f4096ec8dac2b78da001 /for-group | |
parent | c5d7f2adbea2038d240868e0192fb306256e3b92 (diff) |
IMPORTANT: more generic, robust and secure INPUTS.conf and download.mk
SUMMARY: it is necessary to update your 'INPUTS.conf' and 'download.mk'.
Until now, adding an input file involved several steps that needed manual
(and inconvenient!) intervention: for every file, you needed to define four
variables in 'INPUTS.conf', and in 'reproduce/analysis/make/download.mk'
you had to use a shell 'if/elif/else' condition (which becomes complex for
a large number of files) to link the names of the input files to those
variables. Besides the inconvenience, this could cause bugs (typos!).
Furthermore, only a basic MD5 checksum was used for verifying the files.
With this commit, a new structure has been defined for 'INPUTS.conf' that,
thanks to some pretty useful GNU Make features, removes the need for
users to manually edit 'reproduce/analysis/make/download.mk', and reduces
the number of variables necessary for each file to three (from
four). Furthermore, we now use the SHA256 checksum for input data
validation.
Regarding the trick used in 'INPUTS.conf' (from the newly added description
in 'download.mk'): In GNU Make, '.VARIABLES' "... expands to a list of the
names of all global variables defined so far" (from the "Other Special
Variables" section of the GNU Make manual). Assuming that the pattern
'INPUT-%-sha256' is only used for input files, we find all the variables
whose names contain the input file names (the '%' is the filename). Finally,
using the pattern-substitution function ('patsubst'), we remove the fixed
strings at the start and end of each variable name, leaving only the
file names.
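
As a minimal sketch of that trick (not the actual contents of
'download.mk'): the two variable definitions and their placeholder values
below are hypothetical, as is the 'input-names' variable name. Only the
'INPUT-%-sha256' pattern, '.VARIABLES' and 'patsubst' come from the
description above; 'filter' is assumed here as one way to select the
matching variable names.

```make
# Hypothetical entries in the style of 'INPUTS.conf' (placeholder values,
# not real checksums).
INPUT-abc.fits-sha256 = PLACEHOLDER_CHECKSUM_1
INPUT-xyz.txt-sha256  = PLACEHOLDER_CHECKSUM_2

# From all variable names defined so far, keep only those matching the
# input pattern, then strip the fixed prefix and suffix so that only the
# file names (the '%' part) remain.
input-names := $(patsubst INPUT-%-sha256,%, \
                 $(filter INPUT-%-sha256,$(.VARIABLES)))

# Print the recovered names while the Makefile is being parsed. They are
# sorted because the order of '.VARIABLES' should not be relied upon.
$(info Input files: $(sort $(input-names)))

# Empty default target, so that running 'make' on this sketch alone has
# something to do.
all: ;
```

Since the file names are derived from the variable names themselves, adding
a new input in this sketch only requires adding its variables; no list of
files has to be maintained separately.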
Steps you need to take:
- INPUTS.conf: translate your old format to the new format (after
carefully reading the description in the comments at the start of the
file). After applying the new standards, you don't need to use the
variables of 'INPUTS.conf' directly in your Makefiles! For example, if
one of your input datasets is called 'abc.fits', the checksum variable
will be 'INPUT-abc.fits-sha256' and in your high-level Makefiles, you
can simply set '$(indir)/abc.fits' as a prerequisite (like you probably
did already).
- reproduce/analysis/make/download.mk: for the definition and rule of
'inputdatasets', simply use the Maneage branch, and remove anything you
had added in your project.
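
To illustrate the 'abc.fits' example from the first step, here is a hedged
sketch. Only the variable name 'INPUT-abc.fits-sha256' and the prerequisite
'$(indir)/abc.fits' are taken from the text above. The checksum value is a
placeholder, the default for 'indir' is hypothetical (in a real project it
is assumed to be defined elsewhere), and the 'abc-size.txt' target with its
recipe is invented for illustration.

```make
# In 'INPUTS.conf' (new format): the value is a placeholder, not a real
# SHA256 checksum.
INPUT-abc.fits-sha256 = PLACEHOLDER_SHA256

# Hypothetical default; a real project is assumed to define 'indir'
# elsewhere.
indir ?= inputs

# In a high-level Makefile: the input file is named directly as a
# prerequisite, and no variable from 'INPUTS.conf' is used here. The rule
# that creates '$(indir)/abc.fits' is assumed to come from 'download.mk'.
abc-size.txt: $(indir)/abc.fits
	wc -c $< > $@
```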
In the process, I also noticed that 'README-hacking.md' still referred to
'master' as the main project branch, while we have used 'main' in the paper
(which is also the common convention with Git).