aboutsummaryrefslogtreecommitdiff
path: root/reproduce/analysis/config/INPUTS.conf
blob: 395815306b092bb002eb923d5d345604813f6b90 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
# This project's input file information (metadata).
#
# For each input (external) data file that is used within the project,
# three variables are suggested here (two of them are mandatory). These
# variables will be used by 'reproduce/analysis/make/download.mk' to import
# the dataset into the project (within the build directory):
#
#   - If the file already exists locally in '$(INDIR)' (the optional input
#     directory that may have been specified at configuration time with
#     '--input-dir'), a symbolic link will be added in '$(indir)' (in the
#     build directory). A symbolic link is used to avoid extra storage when
#     files are large.
#
#   - If the file doesn't exist in '$(INDIR)', or no input directory was
#     specified at configuration time, then the file is downloaded from a
#     specific URL.
#
# In both cases, before placing the file (or its link) in the build
# directory, 'reproduce/analysis/make/download.mk' will check the SHA256
# checksum of the dataset and if it differs from the pre-defined value (set
# for that file, here), it will abort (since this is not the intended
# dataset).
#
# Therefore, the two variables specifying the URL and SHA256 checksum of
# the file are MANDATORY. The third variable (INPUT-%-size) showing the
# human-readable size of the file (from 'ls -lh') is optional (but
# recommended: because it gives future scientists to get a feeling of the
# volume of data they need to input: will become important if the
# size/number of files is large).
#
# The naming convension is critical for the input files to be properly
# imported into the project. In the patterns below, the '%' is the full
# file name (including its suffix): for example in the demo input of this
# file in the 'maneage' branch, we have 'INPUT-wfpc2.fits-sha256':
# therefore, the input file (within the project's '$(indir)') is called
# 'wfpc2.fits'. This allows you to simply set '$(indir)/wfpc2.fits' as the
# pre-requisite of any recipe that needs the input file: you will rarely
# (if at all!) need to use these variables directly.
#
#   INPUT-%-sha256: The sha256 checksum of the file. You can generate the
#                   SHA256 checksum of a file with the 'sha256sum FILENAME'
#                   command (where 'FILENAME' is the name of your
#                   file). this is very important for an automatic
#                   verification of the file: that it hasn't changed
#                   between different runs of the project (locally or in
#                   the URL). There are more robust checksum algorithms
#                   like the 'SHA' standards.
#
#   INPUT-%-url: The URL to download the file if it is not available
#                locally. It can happen that during the first phases of
#                your project the data aren't yet public. In this case, you
#                set a phony URL like this (just as a clear place-holder):
#                'https://this.file/is/not/yet/public'.
#
#   INPUT-%-size: The human-readable size of the file (output of 'ls
#                 -lh'). This is not used by default but can help other
#                 scientists who would like to run your project get a
#                 good feeling of the necessary network and storage
#                 capacity that is necessary to start the project.
#
# The input dataset's name (that goes into the '%') can be different from
# the URL's file name (last component of the URL, after the last '/'). Just
# note that it is assumed that the local copy (outside of your project) is
# also called '%' (if your local copy of the input dataset and the only
# repository names are the same, be sure to set '%' accordingly).
#
# Copyright (C) 2018-2022 Mohammad Akhlaghi <mohammad@akhlaghi.org>
#
# Copying and distribution of this file, with or without modification, are
# permitted in any medium without royalty provided the copyright notice and
# this notice are preserved.  This file is offered as-is, without any
# warranty.





# Demo dataset used in the histogram plot (remove when customizing).
INPUT-wfpc2.fits-size = 62K
INPUT-wfpc2.fits-url  = https://fits.gsfc.nasa.gov/samples/WFPC2ASSNu5780205bx.fits
INPUT-wfpc2.fits-sha256 = 9851bc2bf9a42008ea606ec532d04900b60865daaff2f233e5c8565dac56ad5f