From 3c9bf3aff30f02c7d31bd86f36c4db2520f8ffa4 Mon Sep 17 00:00:00 2001
From: Mohammad Akhlaghi
Date: Fri, 3 May 2024 13:07:41 +0200
Subject: Configuration: no dependency on /bin/sh and useful run-time options

SUMMARY: no change is necessary in your project; this commit only changes
how already-existing software is built. Some handy options have also been
added to the top-level project script and the copyright years have been
updated.

Until now, if the host's '/bin/sh' had conflicts with the Maneage
environment, the configuration of Maneage would crash as soon as we entered
the building of high-level software. The full scenario is described in the
comments of the newly added 'reproduce/software/shell/prep-source.sh'. This
is most relevant when building older Maneage'd projects in newer
environments.

With this commit, the following changes were made to avoid the problem
above:

 - Maneage edits the source code of all installed software to replace
   '/bin/sh' with Maneage's own shell before the programs are built.
   Through this, we were able to solve the problem described above.

 - The portable '#!/usr/bin/env sh' shebang is now used at the start of
   the scripts that run during configure time, so each script uses the
   first available shell that it finds in its PATH (the system's before
   Dash is built, then Dash, and after Dash is built, Bash).

 - For TeXLive, since we don't install it from source, it was necessary
   to add the libraries needed by the local '/bin/sh' to
   LD_LIBRARY_PATH.

Some high-level options have been added to the './project' script to
simplify certain operations:

 --keep-going: do not stop upon the first crash, but keep going on to
   build targets until all build-able targets have been built. This is
   very useful for debugging large pipelines and allows you to isolate
   the problematic part of your project.

 --highlight-all: equivalent to calling both '--highlight-new' and
   '--highlight-notes'.
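
For illustration only, the source edit described in the first item of the
list above could look roughly like the sketch below. This is not the
contents of 'prep-source.sh' (which is outside this diff); the function name
'replace_bin_sh' and its two arguments are hypothetical, and 'grep -rlI' and
'mktemp' are assumed to be available (they are common, but not strictly
POSIX).

```shell
# Hypothetical helper (NOT the real 'prep-source.sh'): replace every
# '/bin/sh' string in the text files of a source tree with another shell.
replace_bin_sh () {
    srcdir="$1"      # Unpacked source directory (assumed argument).
    newshell="$2"    # Absolute path of the replacement shell (assumed).

    # List the affected text files first ('-I' skips binary files), so
    # the edits below cannot influence the search.
    list=$(mktemp)
    grep -rlI '/bin/sh' "$srcdir" > "$list" || true

    # Edit each file through a temporary copy; writing back with 'cat'
    # keeps the original file's permissions (for example the executable
    # bit of 'configure' scripts).
    while IFS= read -r f; do
        sed "s|/bin/sh|$newshell|g" "$f" > "$list.tmp"
        cat "$list.tmp" > "$f"
    done < "$list"
    rm -f "$list" "$list.tmp"
}
```

It would be called once per unpacked tarball, before its 'configure' step,
for example as: replace_bin_sh ./some-source /path/to/installed/bin/bash
(both paths are placeholders).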
---
 reproduce/analysis/bash/download-multi-try         | 155 --------------------
 reproduce/analysis/bash/download-multi-try.sh      | 160 +++++++++++++++++++++
 reproduce/analysis/config/INPUTS.conf              |   9 +-
 .../analysis/config/delete-me-squared-num.conf     |   2 +-
 reproduce/analysis/config/metadata.conf            |   2 +-
 reproduce/analysis/config/pdf-build.conf           |   2 +-
 reproduce/analysis/config/verify-outputs.conf      |   2 +-
 reproduce/analysis/make/delete-me.mk               |   2 +-
 reproduce/analysis/make/initialize.mk              |   6 +-
 reproduce/analysis/make/paper.mk                   |  12 +-
 reproduce/analysis/make/prepare.mk                 |   2 +-
 reproduce/analysis/make/top-make.mk                |   2 +-
 reproduce/analysis/make/top-prepare.mk             |   2 +-
 reproduce/analysis/make/verify.mk                  |   2 +-
 14 files changed, 188 insertions(+), 172 deletions(-)
 delete mode 100755 reproduce/analysis/bash/download-multi-try
 create mode 100755 reproduce/analysis/bash/download-multi-try.sh

(limited to 'reproduce/analysis')

diff --git a/reproduce/analysis/bash/download-multi-try b/reproduce/analysis/bash/download-multi-try
deleted file mode 100755
index d7e9be2..0000000
--- a/reproduce/analysis/bash/download-multi-try
+++ /dev/null
@@ -1,155 +0,0 @@
-#!/bin/sh
-#
-# Attempt downloading multiple times before crashing whole project. From
-# the top project directory (for the shebang above), this script must be
-# run like this:
-#
-#   $ /path/to/download-multi-try downloader lockfile input-url downloaded-name
-#
-# NOTE: The 'downloader' must contain the option to specify the output name
-# in its end. For example "wget -O". Any other option can also be placed in
-# the middle.
-#
-# Due to temporary network problems, a download may fail suddenly, but
-# succeed in a second try a few seconds later. Without this script that
-# temporary glitch in the network will permanently crash the project and
-# it can't continue. The job of this script is to be patient and try the
-# download multiple times before crashing the whole project.
-#
-# LOCK FILE: Since there is usually only one network port to the outside
-# world, downloading is done much faster in serial, not in parallel. But
-# the project's processing may be done in parallel (with multiple threads
-# needing to download different files at the same time). Therefore, this
-# script uses the 'flock' program to only do one download at a time. To
-# benefit from it, any call to this script must be given the same lock
-# file. If your system has multiple ports to the internet, or for any
-# reason, you don't want to use a lock file, set the 'lockfile' name to
-# 'nolock'.
-#
-# Copyright (C) 2019-2023 Mohammad Akhlaghi
-#
-# This program is free software: you can redistribute it and/or modify
-# it under the terms of the GNU General Public License as published by
-# the Free Software Foundation, either version 3 of the License, or
-# (at your option) any later version.
-#
-# This program is distributed in the hope that it will be useful,
-# but WITHOUT ANY WARRANTY; without even the implied warranty of
-# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
-# GNU General Public License for more details.
-#
-# You should have received a copy of the GNU General Public License
-# along with this program. If not, see .
-
-
-
-
-
-# Script settings
-# ---------------
-# Stop the script if there are any errors.
-set -e
-
-
-
-
-
-# Input arguments and necessary sanity checks. Note that the 5th argument
-# (backup servers) isn't mandatory.
-inurl="$3"
-outname="$4"
-lockfile="$2"
-downloader="$1"
-backupservers="$5"
-if [ "x$downloader" = x ]; then
-    echo "$0: downloader (first argument) not given."; exit 1;
-fi
-if [ "x$lockfile" = x ]; then
-    echo "$0: lock file (second argument) not given."; exit 1;
-fi
-if [ "x$inurl" = x ]; then
-    echo "$0: full input URL (third argument) not given."; exit 1;
-fi
-if [ "x$outname" = x ]; then
-    echo "$0: output name (fourth argument) not given."; exit 1;
-fi
-
-
-
-
-
-# Separate the actual filename, to possibly use backup server.
-urlfile=$(echo "$inurl" | awk -F "/" '{print $NF}')
-
-
-
-
-
-# Try downloading multiple times before crashing.
-counter=0
-maxcounter=10
-while [ ! -f "$outname" ]; do
-
-    # Increment the counter.
-    counter=$(echo $counter | awk '{print $1+1}')
-
-    # If we have passed a maximum number of trials, just exit with
-    # a failed code.
-    reachedmax=$(echo $counter \
-                     | awk '{if($1>'$maxcounter') print "yes"; else print "no";}')
-    if [ x$reachedmax = xyes ]; then
-        echo ""
-        echo "Failed $maxcounter download attempts: $outname"
-        echo ""
-        exit 1
-    fi
-
-    # If this isn't the first attempt print a notice and wait a little for
-    # the next trail.
-    if [ x$counter = x1 ]; then
-        just_a_place_holder=1
-    else
-        tstep=$(echo $counter | awk '{print $1*5}')
-        echo "Download trial $counter for '$outname' in $tstep seconds."
-        sleep $tstep
-    fi
-
-    # Attempt downloading the file. Note that the 'downloader' ends with
-    # the respective option to specify the output name. For example "wget
-    # -O" (so 'outname', that comes after it) will be the name of the
-    # downloaded file.
-    if [ x"$lockfile" = xnolock ]; then
-        if ! $downloader $outname $inurl; then rm -f $outname; fi
-    else
-        # Try downloading from the requested URL.
-        flock "$lockfile" sh -c \
-              "if ! $downloader $outname \"$inurl\"; then rm -f $outname; fi"
-    fi
-
-    # If the download failed, try the backup server(s).
-    if [ ! -f "$outname" ]; then
-        if [ x"$backupservers" != x ]; then
-            for bs in $backupservers; do
-
-                # Use this backup server.
-                if [ x"$lockfile" = xnolock ]; then
-                    if ! $downloader $outname $bs/$urlfile; then rm -f $outname; fi
-                else
-                    flock "$lockfile" sh -c \
-                          "if ! $downloader $outname $bs/$urlfile; then rm -f $outname; fi"
-                fi
-
-                # If the file was downloaded, break out of the loop that
-                # parses over the backup servers.
-                if [ -f "$outname" ]; then break; fi
-            done
-        fi
-    fi
-done
-
-
-
-
-
-# Return successfully
-exit 0
diff --git a/reproduce/analysis/bash/download-multi-try.sh b/reproduce/analysis/bash/download-multi-try.sh
new file mode 100755
index 0000000..bea88d5
--- /dev/null
+++ b/reproduce/analysis/bash/download-multi-try.sh
@@ -0,0 +1,160 @@
+#!/usr/bin/env sh
+#
+# Attempt downloading multiple times before crashing whole project. From
+# the top project directory (for the shebang above), this script must be
+# run like this:
+#
+#   $ $SHELL /path/to/download-multi-try.sh downloader lockfile \
+#            input-url downloaded-name
+#
+# NOTE:
+#  - This script doesn't have a Shebang because in different stages it
+#    should be built with different shells ('/bin/sh' before Maneage
+#    installs its own shell and afterwards with Maneage's own shell).
+#  - The 'downloader' must contain the option to specify the output name
+#    in its end. For example "wget -O". Any other option can also be placed in
+#    the middle.
+#
+# Due to temporary network problems, a download may fail suddenly, but
+# succeed in a second try a few seconds later. Without this script that
+# temporary glitch in the network will permanently crash the project and
+# it can't continue. The job of this script is to be patient and try the
+# download multiple times before crashing the whole project.
+#
+# LOCK FILE: Since there is usually only one network port to the outside
+# world, downloading is done much faster in serial, not in parallel. But
+# the project's processing may be done in parallel (with multiple threads
+# needing to download different files at the same time). Therefore, this
+# script uses the 'flock' program to only do one download at a time. To
+# benefit from it, any call to this script must be given the same lock
+# file. If your system has multiple ports to the internet, or for any
+# reason, you don't want to use a lock file, set the 'lockfile' name to
+# 'nolock'.
+#
+# Copyright (C) 2019-2025 Mohammad Akhlaghi
+#
+# This program is free software: you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation, either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program. If not, see .
+
+
+
+
+
+# Script settings
+# ---------------
+# Stop the script if there are any errors.
+set -e
+
+
+
+
+
+# Input arguments and necessary sanity checks. Note that the 5th argument
+# (backup servers) isn't mandatory.
+inurl="$3"
+outname="$4"
+lockfile="$2"
+downloader="$1"
+backupservers="$5"
+if [ "x$downloader" = x ]; then
+    echo "$0: downloader (first argument) not given."; exit 1;
+fi
+if [ "x$lockfile" = x ]; then
+    echo "$0: lock file (second argument) not given."; exit 1;
+fi
+if [ "x$inurl" = x ]; then
+    echo "$0: full input URL (third argument) not given."; exit 1;
+fi
+if [ "x$outname" = x ]; then
+    echo "$0: output name (fourth argument) not given."; exit 1;
+fi
+
+
+
+
+
+# Separate the actual filename, to possibly use backup server.
+urlfile=$(echo "$inurl" | awk -F "/" '{print $NF}')
+
+
+
+
+
+# Try downloading multiple times before crashing.
+counter=0
+maxcounter=10
+while [ ! -f "$outname" ]; do
+
+    # Increment the counter.
+    counter=$(echo $counter | awk '{print $1+1}')
+
+    # If we have passed a maximum number of trials, just exit with
+    # a failed code.
+    reachedmax=$(echo $counter \
+                     | awk '{if($1>'$maxcounter') print "yes"; else print "no";}')
+    if [ x$reachedmax = xyes ]; then
+        echo ""
+        echo "Failed $maxcounter download attempts: $outname"
+        echo ""
+        exit 1
+    fi
+
+    # If this isn't the first attempt print a notice and wait a little for
+    # the next trail.
+    if [ x$counter = x1 ]; then
+        just_a_place_holder=1
+    else
+        tstep=$(echo $counter | awk '{print $1*5}')
+        echo "Download trial $counter for '$outname' in $tstep seconds."
+        sleep $tstep
+    fi
+
+    # Attempt downloading the file. Note that the 'downloader' ends with
+    # the respective option to specify the output name. For example "wget
+    # -O" (so 'outname', that comes after it) will be the name of the
+    # downloaded file.
+    if [ x"$lockfile" = xnolock ]; then
+        if ! $downloader $outname $inurl; then rm -f $outname; fi
+    else
+        # Try downloading from the requested URL.
+        flock "$lockfile" sh -c \
+              "if ! $downloader $outname \"$inurl\"; then rm -f $outname; fi"
+    fi
+
+    # If the download failed, try the backup server(s).
+    if [ ! -f "$outname" ]; then
+        if [ x"$backupservers" != x ]; then
+            for bs in $backupservers; do
+
+                # Use this backup server.
+                if [ x"$lockfile" = xnolock ]; then
+                    if ! $downloader $outname $bs/$urlfile; then rm -f $outname; fi
+                else
+                    flock "$lockfile" sh -c \
+                          "if ! $downloader $outname $bs/$urlfile; then rm -f $outname; fi"
+                fi
+
+                # If the file was downloaded, break out of the loop that
+                # parses over the backup servers.
+                if [ -f "$outname" ]; then break; fi
+            done
+        fi
+    fi
+done
+
+
+
+
+
+# Return successfully
+exit 0
diff --git a/reproduce/analysis/config/INPUTS.conf b/reproduce/analysis/config/INPUTS.conf
index 1090e44..5860806 100644
--- a/reproduce/analysis/config/INPUTS.conf
+++ b/reproduce/analysis/config/INPUTS.conf
@@ -70,9 +70,10 @@
 #                       file). Don't use this if you give the 'fitsdatasum'
 #                       keyvalue.
 #
-#  INPUT-%-fitsdatasum: The FITS standard DATASUM value for HDU number 1
-#                       of the FITS file (counting from 0). Don't use this
-#                       if you give the 'sha256' keyword.
+#  INPUT-%-fitsdatasum: The FITS standard DATASUM value for the HDU given
+#                       to '-fitshdu' (below) of the FITS file (counting
+#                       from 0). Don't use this if you give the 'sha256'
+#                       keyword.
 #
 #  INPUT-%-fitshdu: The HDU identifier (counter from 0, or name) to use
 #                   for the verification. This is only relevant in the
@@ -104,7 +105,7 @@
 # also called '%' (if your local copy of the input dataset and the only
 # repository names are the same, be sure to set '%' accordingly).
 #
-# Copyright (C) 2018-2023 Mohammad Akhlaghi
+# Copyright (C) 2018-2025 Mohammad Akhlaghi
 #
 # Copying and distribution of this file, with or without modification, are
 # permitted in any medium without royalty provided the copyright notice and
diff --git a/reproduce/analysis/config/delete-me-squared-num.conf b/reproduce/analysis/config/delete-me-squared-num.conf
index c2fa79c..ba8c960 100644
--- a/reproduce/analysis/config/delete-me-squared-num.conf
+++ b/reproduce/analysis/config/delete-me-squared-num.conf
@@ -1,6 +1,6 @@
 # Number of samples in the demonstration analysis (to be deleted).
 #
-# Copyright (C) 2019-2023 Mohammad Akhlaghi
+# Copyright (C) 2019-2025 Mohammad Akhlaghi
 #
 # Copying and distribution of this file, with or without modification, are
 # permitted in any medium without royalty provided the copyright notice and
diff --git a/reproduce/analysis/config/metadata.conf b/reproduce/analysis/config/metadata.conf
index 7bafc4b..1ab7de1 100644
--- a/reproduce/analysis/config/metadata.conf
+++ b/reproduce/analysis/config/metadata.conf
@@ -15,7 +15,7 @@
 # and the copyright license name and standard link to the fully copyright
 # license.
 #
-# Copyright (C) 2020-2023 Mohammad Akhlaghi
+# Copyright (C) 2020-2025 Mohammad Akhlaghi
 #
 # Copying and distribution of this file, with or without modification, are
 # permitted in any medium without royalty provided the copyright notice and
diff --git a/reproduce/analysis/config/pdf-build.conf b/reproduce/analysis/config/pdf-build.conf
index 2aa2e9a..7821306 100644
--- a/reproduce/analysis/config/pdf-build.conf
+++ b/reproduce/analysis/config/pdf-build.conf
@@ -12,7 +12,7 @@
 # LaTeX. Otherwise, a notice will just printed that, no PDF will be
 # created.
 #
-# Copyright (C) 2018-2023 Mohammad Akhlaghi
+# Copyright (C) 2018-2025 Mohammad Akhlaghi
 #
 # Copying and distribution of this file, with or without modification, are
 # permitted in any medium without royalty provided the copyright notice and
diff --git a/reproduce/analysis/config/verify-outputs.conf b/reproduce/analysis/config/verify-outputs.conf
index db7751d..031085d 100644
--- a/reproduce/analysis/config/verify-outputs.conf
+++ b/reproduce/analysis/config/verify-outputs.conf
@@ -1,6 +1,6 @@
 # To enable verification of output datasets set this variable to 'yes'.
 #
-# Copyright (C) 2019-2023 Mohammad Akhlaghi
+# Copyright (C) 2019-2025 Mohammad Akhlaghi
 #
 # Copying and distribution of this file, with or without modification, are
 # permitted in any medium without royalty provided the copyright notice and
diff --git a/reproduce/analysis/make/delete-me.mk b/reproduce/analysis/make/delete-me.mk
index 325280d..a20abc6 100644
--- a/reproduce/analysis/make/delete-me.mk
+++ b/reproduce/analysis/make/delete-me.mk
@@ -1,6 +1,6 @@
 # Dummy Makefile to create a random dataset for plotting.
 #
-# Copyright (C) 2018-2023 Mohammad Akhlaghi
+# Copyright (C) 2018-2025 Mohammad Akhlaghi
 #
 # This Makefile is free software: you can redistribute it and/or modify
 # it under the terms of the GNU General Public License as published by
diff --git a/reproduce/analysis/make/initialize.mk b/reproduce/analysis/make/initialize.mk
index 9e8db4a..92e5eff 100644
--- a/reproduce/analysis/make/initialize.mk
+++ b/reproduce/analysis/make/initialize.mk
@@ -1,6 +1,6 @@
 # Project initialization.
 #
-# Copyright (C) 2018-2023 Mohammad Akhlaghi
+# Copyright (C) 2018-2025 Mohammad Akhlaghi
 #
 # This Makefile is free software: you can redistribute it and/or modify
 # it under the terms of the GNU General Public License as published by
@@ -469,7 +469,7 @@ dist-software:
 # process with a file and make sure that only one downloading event is in
 # progress at every moment.
 $(indir):; mkdir $@
-downloadwrapper = $(bashdir)/download-multi-try
+downloadwrapper = $(bashdir)/download-multi-try.sh
 inputdatasets := $(foreach i, \
                    $(patsubst INPUT-%-sha256,%, \
                      $(filter INPUT-%-sha256,$(.VARIABLES))) \
@@ -672,7 +672,7 @@ print-general-metadata = \
 # for the final PDF. Since these are not version controlled, it must be
 # calculated everytime the project is run. So even though this file
 # actually exists, it is also aded as a '.PHONY' target above.
-$(mtexdir)/initialize.tex: | $(mtexdir)
+$(mtexdir)/initialize.tex:
 
 # Version and title of project. About the starting '@': since these
 # commands are run every time with './project make', it is annoying
diff --git a/reproduce/analysis/make/paper.mk b/reproduce/analysis/make/paper.mk
index 791108b..66c6859 100644
--- a/reproduce/analysis/make/paper.mk
+++ b/reproduce/analysis/make/paper.mk
@@ -1,6 +1,6 @@
 # Build the final PDF paper/report.
 #
-# Copyright (C) 2018-2023 Mohammad Akhlaghi
+# Copyright (C) 2018-2025 Mohammad Akhlaghi
 #
 # This Makefile is free software: you can redistribute it and/or modify
 # it under the terms of the GNU General Public License as published by
@@ -128,6 +128,11 @@ $(texbdir)/paper.bbl: tex/src/references.tex $(mtexdir)/dependencies-bib.tex \
 # use PGFPlots, then you should remove the '-shell-escape' option
 # for better security. See https://savannah.nongnu.org/task/?15694
 # for details.
+#
+# We need the modification to 'LD_LIBRARY_PATH' because we do not
+# build LaTeX from source and it uses '/bin/sh' (among other
+# possible system-wide things).
+	export LD_LIBRARY_PATH="$(sys_library_sh_path):$$LD_LIBRARY_PATH"
 	pdflatex -shell-escape -halt-on-error "$$p"/paper.tex
 	biber paper
 
@@ -158,6 +163,11 @@ paper.pdf: $(mtexdir)/project.tex paper.tex $(texbdir)/paper.bbl
 
 # See above for a warning and brief discussion on the the pdflatex
 # option '-shell-escape'.
+#
+# We need the modification to 'LD_LIBRARY_PATH' because we do not
+# build LaTeX from source and it uses '/bin/sh' (among other
+# possible system-wide things).
+	export LD_LIBRARY_PATH="$(sys_library_sh_path):$$LD_LIBRARY_PATH"
 	pdflatex -shell-escape -halt-on-error "$$p"/paper.tex
 
 # Come back to the top project directory and copy the built PDF
diff --git a/reproduce/analysis/make/prepare.mk b/reproduce/analysis/make/prepare.mk
index 36f5294..2cc1187 100644
--- a/reproduce/analysis/make/prepare.mk
+++ b/reproduce/analysis/make/prepare.mk
@@ -1,6 +1,6 @@
 # Basic preparations, called by './project make'.
 #
-# Copyright (C) 2019-2023 Mohammad Akhlaghi
+# Copyright (C) 2019-2025 Mohammad Akhlaghi
 #
 # This Makefile is free software: you can redistribute it and/or modify
 # it under the terms of the GNU General Public License as published by
diff --git a/reproduce/analysis/make/top-make.mk b/reproduce/analysis/make/top-make.mk
index 460433b..2689e64 100644
--- a/reproduce/analysis/make/top-make.mk
+++ b/reproduce/analysis/make/top-make.mk
@@ -1,6 +1,6 @@
 # Top-level Makefile (first to be loaded).
 #
-# Copyright (C) 2018-2023 Mohammad Akhlaghi
+# Copyright (C) 2018-2025 Mohammad Akhlaghi
 #
 # This Makefile is free software: you can redistribute it and/or modify
 # it under the terms of the GNU General Public License as published by
diff --git a/reproduce/analysis/make/top-prepare.mk b/reproduce/analysis/make/top-prepare.mk
index 930c2a9..7d92d72 100644
--- a/reproduce/analysis/make/top-prepare.mk
+++ b/reproduce/analysis/make/top-prepare.mk
@@ -4,7 +4,7 @@
 # are not included here. Please see that file for thorough comments on each
 # step.
 #
-# Copyright (C) 2019-2023 Mohammad Akhlaghi
+# Copyright (C) 2019-2025 Mohammad Akhlaghi
 #
 # This Makefile is free software: you can redistribute it and/or modify
 # it under the terms of the GNU General Public License as published by
diff --git a/reproduce/analysis/make/verify.mk b/reproduce/analysis/make/verify.mk
index aa026b5..c74f8ca 100644
--- a/reproduce/analysis/make/verify.mk
+++ b/reproduce/analysis/make/verify.mk
@@ -1,6 +1,6 @@
 # Verify the project outputs before building the paper.
 #
-# Copyright (C) 2020-2023 Mohammad Akhlaghi
+# Copyright (C) 2020-2025 Mohammad Akhlaghi
 #
 # This Makefile is free software: you can redistribute it and/or modify
 # it under the terms of the GNU General Public License as published by
-- 
cgit v1.2.1
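
Reviewer's note (not part of the patch): the retry logic of
'download-multi-try.sh' above can be reduced to the minimal sketch below.
The function name 'retry' and its arguments are hypothetical. Unlike the
real script, it judges success by the command's exit status instead of the
existence of the output file, and it leaves out the 'flock' serialization
and the backup servers. With a step of 5 it waits 10, 15, 20, ... seconds
between attempts, as the script does.

```shell
# Hypothetical, simplified retry helper with a linearly growing wait.
#   $1: maximum number of attempts.
#   $2: wait step in seconds (attempt N is preceded by N*step seconds).
#   rest: the command to run, with its arguments.
retry () {
    max="$1"; step="$2"; shift 2
    n=1

    # A command tested by 'while !' does not trigger 'set -e', so the
    # loop also works in scripts that stop on the first error.
    while ! "$@"; do
        if [ "$n" -ge "$max" ]; then
            echo "Failed $max attempts: $*" >&2
            return 1
        fi
        n=$((n + 1))
        echo "Trial $n in $((n * step)) seconds." >&2
        sleep $((n * step))
    done
}
```

For example (placeholder URL): retry 10 5 wget -O out.fits https://example.org/in.fits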