# This project's input file information (metadata). # # For each input (external) data file that is used within the project, # three variables are suggested here (only the verification variable is # strictly mandatory). These variables will be used by the download rule of # 'reproduce/analysis/make/initialize.mk' to import the dataset into the # project (within the build directory): # # - If the file already exists locally in '$(INDIR)' (the optional input # directory that may have been specified at configuration time with # '--input-dir'), a symbolic link will be added in '$(indir)' (in the # build directory). A symbolic link is used to avoid extra storage when # files are large. # # - If the file doesn't exist in '$(INDIR)', or no input directory was # specified at configuration time, then the file is downloaded from the # specified URL for that dataset. # # In both cases, before placing the file (or its link) in the build # directory, the download rule of 'reproduce/analysis/make/initialize.mk' # will check the verification of the dataset and if it differs from the # pre-defined value (set for that file, here), it will abort (since this is # not the intended dataset). # # Verification (two modes) # ------------------------ # - SHA256 checksum. This will check the full contents of the file, and # is generic to any data format. However, if the server inserts custom # headers like the query date or query code and etc, this form of # validation is not useful: because every download will have different # headers. In such cases, you should use the other verification methods # below. In other words, this method is only good for files that are # "static" on the server (and left there unchanged). If the file is # generated at request time, the server usually inserts custom run-time # dependent headers; making it impossible to verify with an SHA # checksum of the whole file. # - The FITS Standard's 'DATASUM' (which will only check the data, not # the headers). According to the FITS standard, this sum ignores all # headers, and is only calculated on a HDU's data. By default, this # will require Gnuastro (which can easily calculate and return the # value on the command-line), and it assumes HDU number 1 (counting # from 0). You can modify the defaults by modifying the rule in # 'reproduce/analysis/make/initialize.mk'. # # Automatic writing of verification # --------------------------------- # In case you would like Maneage to find the checksum upon downloading, put # the string '--auto-replace--' instead of a checksum. This can be helpful # for large datasets; where downloading only for adding the checksum is not # easy/possible and can be buggy. In this scenario, upon downloading the # file its checksum will be calculated and will be replaced with the # '--auto-replace--' in this file. But since this file is under version # control, be sure to commit all the updated checksums after your downloads # are finished! # # Variable description # -------------------- # The naming convension is critical for the input files to be properly # imported into Maneage. In the patterns below, the '%' is the full file # name (including its suffix): for example in the demo input of this file # in the 'maneage' branch, we have 'INPUT-wfpc2.fits-sha256': therefore, # the input file (within the project's '$(indir)') is called # 'wfpc2.fits'. This allows you to simply set '$(indir)/wfpc2.fits' as the # pre-requisite of any recipe that needs the input file: you will rarely # (if at all!) need to use these variables directly. # # INPUT-%-sha256: The sha256 checksum of the file. You can generate the # SHA256 checksum of a file with the 'sha256sum FILENAME' # command (where 'FILENAME' is the name of your # file). Don't use this if you give the 'fitsdatasum' # keyvalue. # # INPUT-%-fitsdatasum: The FITS standard DATASUM value for HDU number 1 # of the FITS file (counting from 0). Don't use this # if you give the 'sha256' keyword. # # INPUT-%-fitshdu: The HDU identifier (counter from 0, or name) to use # for the verification. This is only relevant in the # 'fitsdatasum' verification method and optional (if not # given, HDU number 1 is used; counting from 0). # # INPUT-%-url: The URL to download the file if it is not available # locally. It can happen that during the first phases of # your project the data aren't yet public. In this case, you # set a phony URL like this (just as a clear place-holder): # 'https://this.file/is/not/yet/public'. # # INPUT-%-size: The human-readable size of the file (output of 'ls # -lh'). This is not used by default but can help other # scientists who would like to run your project get a # good feeling of the necessary network and storage # capacity that is necessary to start the project. # # Therefore, the the verification variable is MANDATORY in any case. The # variable with a URL is only necessary if you do not have the file # locally. However, The size variable is optional (but recommended: because # it gives future scientists a feeling of the volume of data they need to # input to run your project: will become important if the size/number of # files is large). # # The input dataset's name (that goes into the '%') can be different from # the URL's file name (last component of the URL, after the last '/'). Just # note that it is assumed that the local copy (outside of your project) is # also called '%' (if your local copy of the input dataset and the only # repository names are the same, be sure to set '%' accordingly). # # Copyright (C) 2018-2022 Mohammad Akhlaghi # # Copying and distribution of this file, with or without modification, are # permitted in any medium without royalty provided the copyright notice and # this notice are preserved. This file is offered as-is, without any # warranty. # Demo dataset used in the histogram plot # --------------------------------------- # # Remove this part while you are entering your project's datasets. # # Since the demonstration dataset is a FITS file, we have also added the # two '$(INPUT-%-fits*)' variables as a demonstration. But they are # commented because the SHA256 method is also possible for this file (its # not generated on the server at query time; it is a static file on the # server). INPUT-wfpc2.fits-size = 62K INPUT-wfpc2.fits-url = https://fits.gsfc.nasa.gov/samples/WFPC2ASSNu5780205bx.fits INPUT-wfpc2.fits-sha256 = 9851bc2bf9a42008ea606ec532d04900b60865daaff2f233e5c8565dac56ad5f #INPUT-wfpc2.fits-fitshdu = 0 #INPUT-wfpc2.fits-fitsdatasum = 2218330266