From df9e291826fbc7e717b40d2d07f1d7607a2f2455 Mon Sep 17 00:00:00 2001
From: Giacomo Lorenzetti
Date: Thu, 3 Apr 2025 15:21:16 +0200
Subject: IMPORTANT: software configuration optimized and better modularized

Summary: after merging this commit into your project, it should be
re-configured since the location of software installation files like
'LOCAL.conf' or the LaTeX macros of the software environment have changed.
But it should not affect the analysis phase of your project.

Until this commit, it was not possible to run a pre-built Maneage'd project
(in a container) on a newly cloned Maneage'd project source. This was
because the containers should be read-only, but during the various checks
of the configuration (to verify that we are using the same software
environment in the container and the source), we were writing/testing many
things in the build directory, and 'LOCAL.conf' which was actually in the
source directory! Furthermore, the '.local' and '.build' were built at
configure time, making it hard to run the same container from a newly
cloned Maneage'd project.

To make things harder for the scenario above, the 'configure.sh' script
would pause on every message and didn't have a quiet mode (making it
practically impossible to run './project configure' before './project make'
on every container run).

With this commit, all these issues have been addressed and it is now
possible to simply get a built container, clone a Maneage'd project and run
the analysis (using the built environment of the container that is verified
on every run). The respective changes/additions are described below:

- The high-level container scripts ('apptainer.sh' and 'docker.sh', along
  with their READMEs) have been moved to the 'reproduce/software/shell'
  directory and the old 'reproduce/software/containers' directory has been
  deleted. This is because we have classified the software files by their
  language/format and the container scripts are scripts in the end.
- The './project' script:

  - Now has two extra options: '--quiet' and '--no-pause'. Both are
    directly passed to the 'configure.sh' script. They will respectively
    disable any informative printed message or any pause after that message
    (if it is printed).

  - The '--build-dir' option is now also relevant for './project make':
    when it is given, it will re-create the two '.build' and '.local'
    symbolic links at the top source directory in all scenarios
    ('configure', 'make' or 'shell'). This will allow both the
    configuration, analysis and shell phases to safely assume they exist
    and match the user's desire at run-time.

  - The build/analysis directory's sub-directories that need to be built
    before 'top-make.mk' are now built in a separate function to help in
    readability.

- The 'configure.sh' script:

  - For developers: a new 'check_elapsed' variable has been defined that
    will enable the newly added 'elapsed_time_from_prev_step' function.
    This function should be used from now on at the end of every major step
    to help find bottlenecks.

  - The targets of the software in 'pre-make-build.sh' now also have the
    version of the software in their file name. Until now, they didn't have
    the version, so there was no way to detect if the software has been
    updated or not in the source. For Lzip and Make (that also get built
    after GCC), the ones in this script have a '-pre-make' suffix also.

  - 'Local.conf.in' now has descriptions for every variable.

  - The '-std=gnu17' option is now used instead of '-std=c17' for basic
    software that cannot be built without specifying the C standard in GCC
    15.1 (described in previous commit: 2881fc0a6205). See [1] for more
    details; in summary: '-std=gnu17' is also supported on macOS's Clang
    and has some features that 'pkg-config' needs

  - Generally: some longer code lines have been broken or indentation
    decreased to fit the 75 character line length. This has not reduced
    readability however.
    For example the long 'echo' commands are now replaced by multiple
    'printf's, or the indentation is still clearly visible.

The seeds of the work on this commit started by a branch containing three
commits by Giacomo Lorenzetti (133 insertions, 100 deletions). Upon merging
with the main 'maneage' branch, they were generalized and re-organized to
become this commit.

The following issues have also been addressed with this commit:

- The LaTeX calls (during the building of 'paper.pdf') do not contain
  Maneage'd dynamic libraries. This is because we don't build the LaTeX
  binaries from source, and the TeXLive manager uses the host environment.

- The 'docker.sh' script:

  - Adds the '--project-name' option: its internal variable existed, but
    the option for the user to define it at run-time did not.

  - Ported to macOS: it does not check being a member of the 'docker'
    group, and finds the number of threads using macOS-specific tools.

- The 'apptainer.sh' script:

  - Now installs 'wget' in the base container also (necessary when the
    user doesn't have the tarballs).

[1] https://savannah.nongnu.org/bugs/?67068#comment2
---
 reproduce/software/shell/configure.sh | 1693 ++++++++++++++++++---------------
 1 file changed, 905 insertions(+), 788 deletions(-)

(limited to 'reproduce/software/shell/configure.sh')

diff --git a/reproduce/software/shell/configure.sh b/reproduce/software/shell/configure.sh
index e291f7b..4887816 100755
--- a/reproduce/software/shell/configure.sh
+++ b/reproduce/software/shell/configure.sh
@@ -40,6 +40,14 @@ set -e
 # had the chance to implement it yet (please help if you can!). Until then,
 # please set them based on your project (if they differ from the core
 # branch).
+
+# If equals 1, a message will be printed, showing the nano-seconds since
+# previous step: useful with '-e --offline --nopause --quiet' to find
+# bottlenecks for speed optimization. Speed is important because this
+# script is called automatically every time by the container scripts.
+check_elapsed=0
+
+# In case a fortran compiler is necessary to check.
 need_gfortran=0
@@ -52,14 +60,12 @@ need_gfortran=0
 # These are defined to help make this script more readable.
 topdir="$(pwd)"
 optionaldir="/optional/path"
-adir=reproduce/analysis/config
 cdir=reproduce/software/config
-pconf=$cdir/LOCAL.conf
-ptconf=$cdir/LOCAL_tmp.conf
-poconf=$cdir/LOCAL_old.conf
-depverfile=$cdir/versions.conf
-depshafile=$cdir/checksums.conf
+
+
+
+
@@ -73,14 +79,21 @@ depshafile=$cdir/checksums.conf
 # that their changes are not going to be permenant.
 create_file_with_notice ()
 {
-    if echo "# IMPORTANT: file can be RE-WRITTEN after './project configure'" > "$1"
+    if printf "# IMPORTANT: " > "$1"
     then
-        echo "#" >> "$1"
-        echo "# This file was created during configuration" >> "$1"
-        echo "# ('./project configure'). Therefore, it is not under" >> "$1"
-        echo "# version control and any manual changes to it will be" >> "$1"
-        echo "# over-written if the project re-configured." >> "$1"
-        echo "#" >> "$1"
+        # These commands may look messy, but the produced comments in the
+        # file are the main goal and they are readable. (without having to
+        # break our source-code line length).
+        printf "file can be RE-WRITTEN after './project " >> "$1"
+        printf "configure'.\n" >> "$1"
+        printf "#\n" >> "$1"
+        printf "# This file was created during configuration " >> "$1"
+        printf "('./project configure').\n" >> "$1"
+        printf "# Therefore, it is not under version control " >> "$1"
+        printf "and any manual changes\n" >> "$1"
+        printf "# to it will be over-written when the " >> "$1"
+        printf "project is re-configured.\n" >> "$1"
+        printf "#\n" >> "$1"
     else
         echo; echo "Can't write to $1"; echo;
         exit 1
@@ -102,7 +115,7 @@ absolute_dir ()
     if stat "$address" 1> /dev/null; then
         echo "$(cd "$(dirname "$1")" && pwd )/$(basename "$1")"
     else
-        exit 1;
+        echo "$optionaldir"
     fi
 }
@@ -200,30 +213,113 @@ free_space_warning()
-# See if we are on a Linux-based system
-# --------------------------------------
+# Function to empty the temporary software building directory. This can
+# either be a symbolic link (to RAM) or an actual directory, so we can't
+# simply use 'rm -r' (because a symbolic link is not a directory for 'rm').
+empty_build_tmp() {
+
+    # 'ls -A' does not print the '.' and '..' and the '-z' option of '['
+    # checks if the string is empty or not. This allows us to only attempt
+    # deleting the directory's contents if it actually has anything inside
+    # of it. Otherwise, '*' will not expand and we'll get an 'rm' error
+    # complaining that '$tmpblddir/*' doesn't exist. We also don't want to
+    # use 'rm -rf $tmpblddir/*' because in case of a typo or while
+    # debugging (if '$tmpblddir' becomes an empty string), this can
+    # accidentally delete the whole root partition (or a least the '/home'
+    # partition of the user).
+    if ! [ x"$( ls -A $tmpblddir )" = x ]; then
+        rm -r "$tmpblddir"/*
+    fi
+    rm -r "$tmpblddir"
+}
+
+
+
+
+
+# Function to report the elapsed time between steps (if it was activated
+# above with 'check_elapsed').
+elapsed_time_from_prev_step() {
+    if [ $check_elapsed = 1 ]; then
+        chel_now=$(date +"%N");
+        chel_delta=$(echo $chel_prev $chel_now \
+                         | awk '{ delta=($2-$1)/1e6; \
+                                  if(delta>0) d=delta; else d=0; \
+                                  print d}')
+        chel_dsum=$(echo $chel_dsum $chel_delta | awk '{print $1+$2}')
+        echo $chel_counter $chel_delta "$1" \
+            | awk '{ printf "Step %02d: %-6.2f [millisec]; %s\n", \
+                     $1, $2, $3}'
+        chel_counter=$((chel_counter+1))
+        chel_prev=$(date +"%N")
+    fi
+}
+
+
+
+
+
+
+
+
+
+# In already-built container
+# --------------------------
+#
+# We need to run './project configure' at the start of every run of Maneage
+# within a container (with 'shell' or 'make'). This is because we need to
+# ensure the versions of all software are correct. However, the container
+# filesystem (where the build/software directory is located) should be run
+# as read-only when doing the analysis. So we will not be able to run some
+# of the tests that require writing files or are generally not relevant
+# when the container is already built (we want the configure command to be
+# as fast as possible).
+#
+# The project source in Maneage'd containers is '/home/maneager/source'.
+built_container=0
+if [ "$topdir" = /home/maneager/source ] \
+       && [ -f .build/software/config/hardware-parameters.tex ]; then
+    built_container=1;
+fi
+
+# Initialize the elapsed time measurement parameters.
+if [ $check_elapsed = 1 ]; then
+    chel_dsum=0.00
+    chel_counter=1
+    chel_prev=$(date +"%N")
+    chel_start=$(date +"%N")
+fi
+
+
+
+
+# Identify the running OS
+# -----------------------
 #
 # Some features are tailored to GNU/Linux systems, while the BSD-based
 # behavior is different. Initially we only tested macOS (hence the name of
 # the variable), but as FreeBSD is also being inlucded in our tests. As
 # more systems get used, we need to tailor these kinds of things better.
-kernelname=$(uname -s) -if [ x$kernelname = xLinux ]; then - on_mac_os=no - - # Don't forget to add the respective C++ compiler below (leave 'cc' in - # the end). - c_compiler_list="gcc clang cc" -elif [ x$kernelname = xDarwin ]; then - host_cc=1 - on_mac_os=yes - - # Don't forget to add the respective C++ compiler below (leave 'cc' in - # the end). - c_compiler_list="clang gcc cc" -else - on_mac_os=no - cat < /dev/null 2>/dev/null; then - export CC=$c; - if type $cplus > /dev/null 2>/dev/null; then - export CXX=$cplus - has_compilers=yes - break + # Check if they exist. + if type $c > /dev/null 2>/dev/null; then + export CC=$c; + if type $cplus > /dev/null 2>/dev/null; then + export CXX=$cplus + has_compilers=yes + break + fi fi - fi -done -if [ x$has_compilers = xno ]; then - cat < $testsource < $testsource < #include -int main(void){printf("...C compiler works.\n"); - return EXIT_SUCCESS;} +int main(void){printf("Good!\n"); return EXIT_SUCCESS;} EOF -if $CC $noccwarnings $testsource -o$testprog && $testprog; then - rm $testsource $testprog -else - rm $testsource - cat < /dev/null; then + if [ $quiet = 0 ]; then echo "... yes"; fi + rm $testsource $testprog + else + rm $testsource + cat < $testsource < $testsource < #include int @@ -502,17 +614,17 @@ main(void) { return 0; } EOF -if $CC $testsource -o$testprog 2>/dev/null > /dev/null; then - needs_ldl=no; -else - needs_ldl=yes; + if $CC $testsource -o$testprog 2>/dev/null > /dev/null; then + needs_ldl=no; + else + needs_ldl=yes; + fi + elapsed_time_from_prev_step compiler-needs-dynamic-linker fi - - # See if the C compiler can build static libraries # ------------------------------------------------ # @@ -528,32 +640,30 @@ fi # the library came from the system or our build. static_build=no - - - - # Print warning if the host CC is to be used. 
-if [ x$host_cc = x1 ]; then +if [ $built_container = 0 ] && [ x$host_cc = x1 ]; then cat < $testsource < $testsource < #include #include -int main(void){printf("...yes\n"); - return EXIT_SUCCESS;} +int main(void){printf("...yes\n"); return EXIT_SUCCESS;} EOF - cc_call="$CC $testsource $CPPFLAGS $LDFLAGS -o$testprog -static -lc" - if $cc_call && $testprog; then - gccwarning=0 - rm $testsource $testprog - else - echo; echo "Compilation command:"; echo "$cc_call" - rm $testsource - gccwarning=1 - host_cc=1 - cat < /dev/null; then + gccwarning=0 + rm $testsource $testprog + if [ $quiet = 0 ]; then echo "... yes"; fi + else + echo; echo "Compilation command:"; echo "$cc_call" + rm $testsource + gccwarning=1 + host_cc=1 + cat < $testsourcef - echo " END" >> $testsourcef + echo " PRINT *, \"... Fortran Compiler works.\"" \ + > $testsourcef + echo " END" >> $testsourcef if gfortran $testsourcef -o$testprog && $testprog; then rm $testsourcef $testprog else @@ -732,6 +849,68 @@ EOF exit 1 fi fi + elapsed_time_from_prev_step compiler-fortran +fi + + + + + +# See if the linker accepts -Wl,-rpath-link +# ----------------------------------------- +# +# '-rpath-link' is used to write the information of the linked shared +# library into the shared object (library or program). But some versions of +# LLVM's linker don't accept it an can cause problems. +# +# IMPORTANT NOTE: This test has to be done **AFTER** the definition of +# 'instdir', otherwise, it is going to be used as an empty string. +if [ $built_container = 0 ]; then + cat > $testsource < +#include +int main(void) {return EXIT_SUCCESS;} +EOF + if $CC $testsource -o$testprog -Wl,-rpath-link 2>/dev/null \ + > /dev/null; then + export rpath_command="-Wl,-rpath-link=$instdir/lib" + else + export rpath_command="" + fi + + # Delete the temporary directory for compiler checking. 
+ rm -f $testprog $testsource + rm -r $compilertestdir + elapsed_time_from_prev_step compiler-rpath +fi + + + + + +# Paths needed by the host compiler (only for 'basic.mk') +# ------------------------------------------------------- +# +# At the end of the basic build, we need to build GCC. But GCC will build +# in multiple phases, making its own simple compiler in order to build +# itself completely. The intermediate/simple compiler doesn't recognize +# some system specific locations like '/usr/lib/ARCHITECTURE' that some +# operating systems use. We thus need to tell the intermediate compiler +# where its necessary libraries and headers are. +if [ $built_container = 0 ]; then + if [ x"$sys_library_path" != x ]; then + if [ x"$LIBRARY_PATH" = x ]; then + export LIBRARY_PATH="$sys_library_path" + else + export LIBRARY_PATH="$LIBRARY_PATH:$sys_library_path" + fi + if [ x"$CPATH" = x ]; then + export CPATH="$sys_cpath" + else + export CPATH="$CPATH:$sys_cpath" + fi + fi + elapsed_time_from_prev_step compiler-paths fi @@ -743,7 +922,8 @@ fi # # Print some basic information so the user gets a feeling of what is going # on and is prepared on what will happen next. -cat < /dev/null 2>/dev/null; then - - # 'which' isn't in POSIX, so we are using 'command -v' instead. - name=$(command -v wget) - - # See if the host wget has the '--no-use-server-timestamps' option - # (for example wget 1.12 doesn't have it). If not, we'll have to - # remove it. This won't affect the analysis of Maneage in anyway, - # its just to avoid re-downloading if the server timestamps are - # bad; at the worst case, it will just cause a re-download of an - # input software source code (for data inputs, we will use our own - # wget that has this option). 
- tsname="no-use-server-timestamps" - tscheck=$(wget --help | grep $tsname || true) - if [ x"$tscheck" = x ]; then wgetts="" - else wgetts="--$tsname"; - fi - - # By default Wget keeps the remote file's timestamp, so we'll have - # to disable it manually. - downloader="$name $wgetts -O"; - elif type curl > /dev/null 2>/dev/null; then - name=$(command -v curl) - - # - cURL doesn't keep the remote file's timestamp by default. - # - With the '-L' option, we tell cURL to follow redirects. - downloader="$name -L -o" - else - cat <> $pconf -else - # Read the values from existing configuration file. Note that the build - # directory may have space characters. Even though we currently check - # against it, we hope to be able to remove this condition in the - # future. - inbdir=$(awk '$1=="BDIR" { for(i=3; i /dev/null 2>/dev/null; then -################################################################# -######## ERORR reading existing configuration file ############ -################################################################# -EOF - if [ $verr = 1 ]; then - cat < /dev/null 2>/dev/null; then + name=$(command -v curl) + # - cURL doesn't keep the remote file's timestamp by default. + # - With the '-L' option, we tell cURL to follow redirects. + downloader="$name -L -o" + else cat <> $lconf +fi +elapsed_time_from_prev_step LOCAL-write @@ -1217,99 +1430,58 @@ rm -f "$finaltarget" # avoid too many directory dependencies throughout the software and # analysis Makefiles (thus making them hard to read), we are just building # them here -# Software tarballs tardir="$sdir"/tarballs -if ! [ -d "$tardir" ]; then mkdir "$tardir"; fi - -# Installed software instdir="$sdir"/installed -if ! [ -d "$instdir" ]; then mkdir "$instdir"; fi +tmpblddir="$sdir"/build-tmp -# To record software versions and citation. +# Second-level directories. +instlibdir="$instdir"/lib +instbindir="$instdir"/bin verdir="$instdir"/version-info -if ! 
[ -d "$verdir" ]; then mkdir "$verdir"; fi
-
-# Program and library versions and citation.
-ibidir="$verdir"/proglib
-if ! [ -d "$ibidir" ]; then mkdir "$ibidir"; fi
-# Python module versions and citation.
+# Sub-directories of version-info
+itidir="$verdir"/tex
+ictdir="$verdir"/cite
 ipydir="$verdir"/python
-if ! [ -d "$ipydir" ]; then mkdir "$ipydir"; fi
-
-# R module versions and citation.
+ibidir="$verdir"/proglib
 ircrandir="$verdir"/r-cran
-if ! [ -d "$ircrandir" ]; then mkdir "$ircrandir"; fi
-
-# Used software BibTeX entries.
-ictdir="$verdir"/cite
-if ! [ -d "$ictdir" ]; then mkdir "$ictdir"; fi
-
-# TeXLive versions.
-itidir="$verdir"/tex
-if ! [ -d "$itidir" ]; then mkdir "$itidir"; fi
-
-# Some software install their libraries in '$(idir)/lib64'. But all other
-# libraries are in '$(idir)/lib'. Since Maneage's build is only for a
-# single architecture, we can set the '$(idir)/lib64' as a symbolic link to
-# '$(idir)/lib' so all the libraries are always available in the same
-# place.
-instlibdir="$instdir"/lib
-if ! [ -d "$instlibdir" ]; then mkdir "$instlibdir"; fi
-ln -fs "$instlibdir" "$instdir"/lib64
-
-# Wrapper over Make as a single command so it does not default to '/bin/sh'
-# during installation (needed by some programs like CMake).
-instbindir=$instdir/bin
-if ! [ -d $instbindir ]; then mkdir $instbindir; fi
-makewshell="$instbindir/make-with-shell"
-echo "$instbindir/make SHELL=$instbindir/bash \$@" > $makewshell
-chmod +x $makewshell
-
-
-
-
-
-# Project's top-level built analysis directories
-# ----------------------------------------------
+if [ $built_container = 0 ]; then
+
+    # Top-level directories.
+    if ! [ -d "$tardir" ]; then mkdir "$tardir"; fi
+    if ! [ -d "$instdir" ]; then mkdir "$instdir"; fi
+
+    # Second-level directories.
+    if ! [ -d "$verdir" ]; then mkdir "$verdir"; fi
+    if ! [ -d "$instbindir" ]; then mkdir "$instbindir"; fi
+
+    # Sub-directories of version-info
+    if ! [ -d "$itidir" ]; then mkdir "$itidir"; fi
+    if !
[ -d "$ictdir" ]; then mkdir "$ictdir"; fi
+    if ! [ -d "$ipydir" ]; then mkdir "$ipydir"; fi
+    if ! [ -d "$ibidir" ]; then mkdir "$ibidir"; fi
+    if ! [ -d "$ircrandir" ]; then mkdir "$ircrandir"; fi
+
+    # Some software install their libraries in '$(idir)/lib64'. But all
+    # other libraries are in '$(idir)/lib'. Since Maneage's build is only
+    # for a single architecture, we can set the '$(idir)/lib64' as a
+    # symbolic link to '$(idir)/lib' so all the libraries are always
+    # available in the same place.
+    if ! [ -d "$instlibdir" ]; then mkdir "$instlibdir"; fi
+    ln -fs "$instlibdir" "$instdir"/lib64
+
+    # Wrapper over Make as a single command so it does not default to
+    # '/bin/sh' during installation (needed by some programs like CMake).
+    makewshell="$instbindir/make-with-shell"
+    if ! [ -f "$makewshell" ]; then
+        echo "$instbindir/make SHELL=$instbindir/bash \$@" > $makewshell
+        chmod +x $makewshell
+    fi
-# Top-level LaTeX.
-texdir="$sdir"/tex
-if ! [ -d "$texdir" ]; then mkdir "$texdir"; fi
-
-# If 'tex/build' and 'tex/tikz' are symbolic links then 'rm -f' will delete
-# them and we can continue. However, when the project is being built from
-# the tarball, these two are not symbolic links but actual directories with
-# the necessary built-components to build the PDF in them. In this case,
-# because 'tex/build' is a directory, 'rm -f' will fail, so we'll just
-# rename the two directories (as backup) and let the project build the
-# proper symbolic links afterwards.
-if rm -f tex/build; then
-    rm -f tex/tikz
-else
-    mv tex/tikz tex/tikz-from-tarball
-    mv tex/build tex/build-from-tarball
+    # Report the execution time of this step.
+    elapsed_time_from_prev_step subdirectories-of-build
 fi
-# Set the symbolic links for easy access to the top project build
-# directories. Note that these are put in each user's source/cloned
-# directory, not in the build directory (which can be shared between many
-# users and thus may already exist).
-# -# Note: if we don't delete them first, it can happen that an extra link -# will be created in each directory that points to its parent. So to be -# safe, we are deleting all the links on each re-configure of the -# project. Note that at this stage, we are using the host's 'ln', not our -# own, so its best not to assume anything (like 'ln -sf'). -rm -f .build .local - -ln -s "$bdir" .build -ln -s "$instdir" .local - -# --------- Delete for no Gnuastro --------- -rm -f .gnuastro -# ------------------------------------------ - @@ -1322,120 +1494,116 @@ rm -f .gnuastro # HDDs/SSDs and improve speed, it is therefore better to build them in the # RAM when possible. The RAM of most systems today (>8GB) is large enough # for the parallel building of the software. - +# # Set the top-level shared memory location. Currently there is only one # standard location (for GNU/Linux OSs), so doing this check here and the # main job below may seem redundant. However, it is written separately from # the main code below because later, we expect to add more possible # mounting locations (for other OSs). -if [ -d /dev/shm ]; then shmdir=/dev/shm -else shmdir="" -fi +if [ $built_container = 0 ]; then + if [ -d /dev/shm ]; then shmdir=/dev/shm + else shmdir="" + fi -# If a shared memory mounted directory exists and has the necessary -# conditions, set that directory to build software. -if [ x"$shmdir" != x ]; then - - # Make sure it has enough space. - needed_space=2000000 - available_space=$(df "$shmdir" | awk 'NR==2{print $4}') - if [ $available_space -gt $needed_space ]; then - - # Set the Maneage-specific directory within the shared - # memory. We'll use the names of the two parent directories to the - # current/running directory, separated by a '-' instead of - # '/'. We'll then appended that with the user's name (in case - # multiple users may be working on similar project names). 
- # - # Maybe later, we can use something like 'mktemp' to add random - # characters to this name and make it unique to every run (even for - # a single user). - dirname=$(pwd | sed -e's/\// /g' \ - | awk '{l=NF-1; printf("%s-%s", $l, $NF)}') - tbshmdir="$shmdir"/"$dirname"-$(whoami) - - # Try to make the directory if it does not yet exist. A failed - # directory creation will be tested for a few lines later, when - # testing for the existence and executability of a test file. - if ! [ -d "$tbshmdir" ]; then (mkdir "$tbshmdir" || true); fi - - # Some systems may protect '/dev/shm' against the right to execute - # programs by ordinary users. We thus need to check that the device - # allows execution within this directory by this user. - shmexecfile="$tbshmdir"/shm-execution-check.sh - rm -f $shmexecfile # We also don't want any existing flags. - - # Create the file to be executed, but do not fail fatally if it - # cannot be created. We will check a few lines later if the file - # really exists. - (cat > "$shmexecfile" < "$shmexecfile" < /dev/null' after the execution command - # because it can produce false failures randomly on some systems. - if [ -e "$shmexecfile" ]; then - - # Add the executable flag. - chmod +x "$shmexecfile" - - # The following line tries to execute the file. - if "$shmexecfile"; then - # Successful execution. The colon is a "no-op" (no - # operation) shell command. - : + # If the file was successfully created, then make the file + # executable and see if it runs. If not, set 'tbshmdir' to an + # empty string so it is not used in later steps. In any case, + # delete the temporary file afterwards. + # + # We aren't adding '&> /dev/null' after the execution command + # because it can produce false failures randomly on some + # systems. + if [ -e "$shmexecfile" ]; then + + # Add the executable flag. + chmod +x "$shmexecfile" + + # The following line tries to execute the file. + if "$shmexecfile"; then + # Successful execution. 
The colon is a "no-op" (no + # operation) shell command. + : + else + tbshmdir="" + fi + rm "$shmexecfile" else tbshmdir="" fi - rm "$shmexecfile" - else - tbshmdir="" fi + else + tbshmdir="" fi -else - tbshmdir="" -fi - - - + # If a shared memory directory was created, set the software building + # directory to be a symbolic link to it. Otherwise, just build the + # temporary build directory under the project's build directory. + # + # If it is a link, we need to empty its contents first, then itself. + if [ -d "$tmpblddir" ]; then empty_build_tmp; fi + + # Now that we are sure it doesn't exist, we'll make it (either as a + # directory or as a symbolic link). + if [ x"$tbshmdir" = x ]; then mkdir "$tmpblddir"; + else ln -s "$tbshmdir" "$tmpblddir"; + fi -# If a shared memory directory was created, set the software building -# directory to be a symbolic link to it. Otherwise, just build the -# temporary build directory under the project's build directory. -tmpblddir="$sdir"/build-tmp -rm -rf "$tmpblddir"/* "$tmpblddir" # If it is a link, we need to empty - # its contents first, then itself. -if [ x"$tbshmdir" = x ]; then mkdir "$tmpblddir"; -else ln -s "$tbshmdir" "$tmpblddir"; + # Report the time this step took. + elapsed_time_from_prev_step temporary-software-building-dir fi -# Make sure the temporary build directory is empty (un-finished -# source/build files from previous builds can remain there during debugging -# or software updates). -rm -rf $tmpblddir/* - - - - - # Inform the user that the build process is starting # ------------------------------------------------- # # Everything is ready, let the user know that the building is going to # start. 
-if [ $printnotice = yes ]; then - tsec=10 +if [ $quiet = 0 ]; then cat < /dev/null 2> /dev/null; then - numthreads=$(nproc --all); - else - numthreads=$(sysctl -a | awk '/^hw\.ncpu/{print $2}') - if [ x"$numthreads" = x ]; then numthreads=1; fi - fi -else - numthreads=$jobs -fi - - - - - -# See if the linker accepts -Wl,-rpath-link -# ----------------------------------------- -# -# '-rpath-link' is used to write the information of the linked shared -# library into the shared object (library or program). But some versions of -# LLVM's linker don't accept it an can cause problems. -# -# IMPORTANT NOTE: This test has to be done **AFTER** the definition of -# 'instdir', otherwise, it is going to be used as an empty string. -cat > $testsource < -#include -int main(void) {return EXIT_SUCCESS;} -EOF -if $CC $testsource -o$testprog -Wl,-rpath-link 2>/dev/null > /dev/null; then - export rpath_command="-Wl,-rpath-link=$instdir/lib" -else - export rpath_command="" -fi - - - - - -# Delete the compiler testing directory -# ------------------------------------- -# -# This directory was made above to make sure the necessary compilers can be -# run. -rm -f $testprog $testsource -rm -rf $compilertestdir - - - - - -# Paths needed by the host compiler (only for 'basic.mk') -# ------------------------------------------------------- # -# At the end of the basic build, we need to build GCC. But GCC will build -# in multiple phases, making its own simple compiler in order to build -# itself completely. The intermediate/simple compiler doesn't recognize -# some system specific locations like '/usr/lib/ARCHITECTURE' that some -# operating systems use. We thus need to tell the intermediate compiler -# where its necessary libraries and headers are. 
-if [ x"$sys_library_path" != x ]; then
-    if [ x"$LIBRARY_PATH" = x ]; then
-        export LIBRARY_PATH="$sys_library_path"
-    else
-        export LIBRARY_PATH="$LIBRARY_PATH:$sys_library_path"
-    fi
-    if [ x"$CPATH" = x ]; then
-        export CPATH="$sys_cpath"
+# This check is also used in 'reproduce/software/shell/docker.sh'.
+if [ $built_container = 0 ]; then
+    if [ $jobs = 0 ]; then
+        if type nproc > /dev/null 2> /dev/null; then
+            numthreads=$(nproc --all);
+        else
+            numthreads=$(sysctl -a | awk '/^hw\.ncpu/{print $2}')
+            if [ x"$numthreads" = x ]; then numthreads=1; fi
+        fi
     else
-        export CPATH="$CPATH:$sys_cpath"
+        numthreads=$jobs
     fi
+    elapsed_time_from_prev_step num-threads
 fi
-
 # Libraries necessary for the system's shell
 # ------------------------------------------
 #
@@ -1579,29 +1689,31 @@ fi
 # [1] https://savannah.nongnu.org/bugs/index.php?66847
 # [2] https://stackoverflow.com/questions/34428037/how-to-interpret-the-output-of-the-ldd-program
 # [3] man vdso
-if [ x"$on_mac_os" = xyes ]; then
-    sys_library_sh_path=$(otool -L /bin/sh \
-                              | awk '/\/lib/{print $1}' \
-                              | sed 's#/[^/]*$##' \
-                              | sort \
-                              | uniq \
-                              | awk '{if (NR==1) printf "%s", $1; \
-                                      else printf ":%s", $1}')
-else
-    sys_library_sh_path=$(ldd /bin/sh \
-                              | awk '{if($3!="") print $3}' \
-                              | sed 's#/[^/]*$##' \
-                              | grep -v "(0x[^)]*)" \
-                              | sort \
-                              | uniq \
-                              | awk '{if (NR==1) printf "%s", $1; \
-                                      else printf ":%s", $1}')
+if [ $built_container = 0 ]; then
+    if [ x"$on_mac_os" = xyes ]; then
+        sys_library_sh_path=$(otool -L /bin/sh \
+                                  | awk '/\/lib/{print $1}' \
+                                  | sed 's#/[^/]*$##' \
+                                  | sort \
+                                  | uniq \
+                                  | awk '{if (NR==1) printf "%s", $1; \
+                                          else printf ":%s", $1}')
+    else
+        sys_library_sh_path=$(ldd /bin/sh \
+                                  | awk '{if($3!="") print $3}' \
+                                  | sed 's#/[^/]*$##' \
+                                  | grep -v "(0x[^)]*)" \
+                                  | sort \
+                                  | uniq \
+                                  | awk '{if (NR==1) printf "%s", $1; \
+                                          else printf ":%s", $1}')
+    fi
+    elapsed_time_from_prev_step sys-library-sh-path
 fi
-
 # Find Zenodo URL for software downloading
 #
----------------------------------------
 #
@@ -1619,42 +1731,32 @@ fi
 # which will download the DOI-resolved webpage, and extract the Zenodo-URL
 # of the most recent version from there (using the 'coreutils' tarball as
 # an example, the directory part of the URL for all the other software are
-# the same). This is not done if the options '--debug' or `--offline` are used.
+# the same). This is not done if the options '--debug' or `--offline` are
+# used.
 zenodourl=""
 user_backup_urls=""
-zenodocheck=.build/software/zenodo-check.html
-if [ x$debug = x ] && [ x$offline = x ]; then
-    if $downloader $zenodocheck https://doi.org/10.5281/zenodo.3883409; then
-        zenodourl=$(sed -n -e'/coreutils/p' $zenodocheck \
-                        | sed -n -e'/http/p' \
-                        | tr ' ' '\n' \
-                        | grep http \
-                        | sed -e 's/href="//' -e 's|/coreutils| |' \
-                        | awk 'NR==1{print $1}')
-    fi
+zenodocheck="$bdir"/software/zenodo-check.html
+if [ $built_container = 0 ]; then
+    if [ x$debug = x ] && [ x$offline = x ]; then
+        if $downloader $zenodocheck \
+                       https://doi.org/10.5281/zenodo.3883409; then
+            zenodourl=$(sed -n -e'/coreutils/p' $zenodocheck \
+                            | sed -n -e'/http/p' \
+                            | tr ' ' '\n' \
+                            | grep http \
+                            | sed -e 's/href="//' -e 's|/coreutils| |' \
+                            | awk 'NR==1{print $1}')
+        fi
+    fi
+    rm -f $zenodocheck
+
+    # Add the Zenodo URL to the user's given back software URLs. Since the
+    # user can specify 'user_backup_urls' (not yet implemented as an option
+    # in './project'), we'll give preference to their specified servers,
+    # then add the Zenodo URL afterwards.
+    user_backup_urls="$user_backup_urls $zenodourl"
+    elapsed_time_from_prev_step zenodo-url
 fi
-rm -f $zenodocheck
-
-# Add the Zenodo URL to the user's given back software URLs. Since the user
-# can specify 'user_backup_urls' (not yet implemented as an option in
-# './project'), we'll give preference to their specified servers, then add
-# the Zenodo URL afterwards.
-user_backup_urls="$user_backup_urls $zenodourl" - - - - - -# Build core tools for project -# ---------------------------- -# -# Here we build the core tools that 'basic.mk' depends on: Lzip -# (compression program), GNU Make (that 'basic.mk' is written in), Dash -# (minimal Bash-like shell) and Flock (to lock files and enable serial -# download). -export on_mac_os -./reproduce/software/shell/pre-make-build.sh \ - "$bdir" "$ddir" "$downloader" "$user_backup_urls" @@ -1682,13 +1784,29 @@ fi -# Build other basic tools our own GNU Make -# ---------------------------------------- +# Core software +# ------------- # -# When building these software we don't have our own un-packing software, -# Bash, Make, or AWK. In this step, we'll install such low-level basic -# tools, but we have to be very portable (and use minimal features in all). -echo; echo "Building necessary software (if necessary)..." +# Here we build the core tools that 'basic.mk' depends on: Lzip +# (compression program), GNU Make (that 'basic.mk' is written in), Dash +# (minimal Bash-like shell) and Flock (to lock files and enable serial +# operations where necessary: mostly in download). +export on_mac_os +if [ $quiet = 0 ]; then echo "Building/validating software: pre-make"; fi +./reproduce/software/shell/pre-make-build.sh \ + "$bdir" "$ddir" "$downloader" "$user_backup_urls" +elapsed_time_from_prev_step make-software-pre-make + + + + + +# Basic software +# -------------- +# +# Having built the core tools, we are now ready to build GCC and all its +# dependencies (the "basic" software). +if [ $quiet = 0 ]; then echo "Building/validating software: basic"; fi .local/bin/make $keepgoing -f reproduce/software/make/basic.mk \ sys_library_sh_path=$sys_library_sh_path \ user_backup_urls="$user_backup_urls" \ @@ -1700,23 +1818,19 @@ echo; echo "Building necessary software (if necessary)..." 
on_mac_os=$on_mac_os \ host_cc=$host_cc \ -j$numthreads +elapsed_time_from_prev_step make-software-basic -# All other software -# ------------------ +# High-level software +# ------------------- # -# We will be making all the dependencies before running the top-level -# Makefile. To make the job easier, we'll do it in a Makefile, not a -# script. Bash and Make were the tools we need to run Makefiles, so we had -# to build them in this script. But after this, we can rely on Makefiles. -if [ $jobs = 0 ]; then - numthreads=$(.local/bin/nproc --all) -else - numthreads=$jobs -fi +# Having our custom GCC in place, we can now build the high-level (science) +# software: we are using our custom-built 'env' to ensure that nothing from +# the host environment leaks into the high-level software environment. +if [ $quiet = 0 ]; then echo "Building/validating software: high-level"; fi .local/bin/env -i HOME=$bdir \ .local/bin/make $keepgoing \ -f reproduce/software/make/high-level.mk \ @@ -1732,16 +1846,7 @@ fi host_cc=$host_cc \ offline=$offline \ -j$numthreads - - - - - -# Delete the temporary Make wrapper -# --------------------------------- -# -# See above for its description. -rm $makewshell +elapsed_time_from_prev_step make-software-high-level @@ -1756,17 +1861,17 @@ rm $makewshell # will just stop at the stage when all the processing is complete and it is # only necessary to build the PDF. So we don't want to stop the project's # configuration and building if its not present. -if [ -f $itidir/texlive-ready-tlmgr ]; then - texlive_result=$(cat $itidir/texlive-ready-tlmgr) -else - texlive_result="NOT!" -fi -if [ x"$texlive_result" = x"NOT!" ]; then - cat < $pkgver -.local/bin/echo "$thank_progs_libs $proglibs. " >> $pkgver -if [ x"$pymodules" != x ]; then - .local/bin/echo "$thank_python $pymodules. " >> $pkgver -fi -.local/bin/echo "$thank_latex $texpkg. 
" >> $pkgver -.local/bin/echo "$thank_software_conclude" >> $pkgver - -# Prepare the BibTeX entries for the used software (if there are any). -hasentry=0 -bibfiles="$ictdir/*" -for f in $bibfiles; do if [ -f $f ]; then hasentry=1; break; fi; done; - -# Make sure we start with an empty output file. -pkgbib=$texdir/dependencies-bib.tex -echo "" > $pkgbib - -# Fill it in with all the BibTeX entries in this directory. We'll just -# avoid writing any comments (usually copyright notices) and also put an -# empty line after each file's contents to make the output more readable. -if [ $hasentry = 1 ]; then - for f in $bibfiles; do - awk '!/^%/{print} END{print ""}' $f >> $pkgbib - done -fi - +# Relevant files +pkgver=$sconfdir/dependencies.tex +pkgbib=$sconfdir/dependencies-bib.tex +# Build the software LaTeX source but only when not in a container. +if [ $built_container = 0 ]; then + # Import the context/sentences for placing between the list of software + # names during their acknowledgment. + . $cdir/software_acknowledge_context.sh + # Report the different software in separate contexts (separating Python + # and TeX packages from the C/C++ programs and libraries). + proglibs=$(prepare_name_version $verdir/proglib/*) + pymodules=$(prepare_name_version $verdir/python/*) + texpkg=$(prepare_name_version $verdir/tex/texlive) -# Report machine architecture -# --------------------------- -# -# Report hardware -hwparam="$texdir/hardware-parameters.tex" + # Acknowledge these software packages in a LaTeX paragraph. + .local/bin/echo "$thank_software_introduce " > $pkgver + .local/bin/echo "$thank_progs_libs $proglibs. " >> $pkgver + if [ x"$pymodules" != x ]; then + .local/bin/echo "$thank_python $pymodules. " >> $pkgver + fi + .local/bin/echo "$thank_latex $texpkg. " >> $pkgver + .local/bin/echo "$thank_software_conclude" >> $pkgver + + # Prepare the BibTeX entries for the used software (if there are any). 
+ hasentry=0 + bibfiles="$ictdir/*" + for f in $bibfiles; do if [ -f $f ]; then hasentry=1; break; fi; done; + + # Fill it in with all the BibTeX entries in this directory. We'll just + # avoid writing any comments (usually copyright notices) and also put an + # empty line after each file's contents to make the output more readable. + echo "" > $pkgbib # We don't want to inherit any pre-existing content. + if [ $hasentry = 1 ]; then + for f in $bibfiles; do + awk '!/^%/{print} END{print ""}' $f >> $pkgbib + done + fi -# Add the text to the ${hwparam} file. Since harware class might include -# underscore, it must be replaced with '\_', otherwise pdftex would -# complain and break the build process when doing ./project make. -hw_class_fixed="$(echo $hw_class | sed -e 's/_/\\_/')" -.local/bin/echo "\\newcommand{\\machinearchitecture}{$hw_class_fixed}" > $hwparam -.local/bin/echo "\\newcommand{\\machinebyteorder}{$byte_order}" >> $hwparam -.local/bin/echo "\\newcommand{\\machineaddresssizes}{$address_sizes}" >> $hwparam + # Report the time that this operation took. + elapsed_time_from_prev_step tex-macros +fi -# Clean the temporary build directory -# --------------------------------- +# Report machine architecture (has to be final created file) +# ---------------------------------------------------------- # -# By the time the script reaches here the temporary software build -# directory should be empty, so just delete it. Note 'tmpblddir' may be a -# symbolic link to shared memory. So, to work in any scenario, first delete -# the contents of the directory (if it has any), then delete 'tmpblddir'. -.local/bin/rm -rf $tmpblddir/* $tmpblddir - - - - - -# Register successful completion -# ------------------------------ -echo `.local/bin/date` > $finaltarget - +# This is the final file that is created in the configuration phase: it is +# used by the high-level project script to verify that configuration has +# been completed. 
If any other files should be created in the final stages +# of configuration, be sure to add them before this. +# +# Since hardware class might include underscore, it must be replaced with +# '\_', otherwise pdftex would complain and break the build process when +# doing ./project make. +if [ $built_container = 0 ]; then + hw_class=$(uname -m) + hwparam="$sconfdir/hardware-parameters.tex" + hw_class_fixed="$(echo $hw_class | sed -e 's/_/\\_/')" + .local/bin/echo "\\newcommand{\\machinearchitecture}{$hw_class_fixed}" \ + > $hwparam + .local/bin/echo "\\newcommand{\\machinebyteorder}{$byte_order}" \ + >> $hwparam + .local/bin/echo "\\newcommand{\\machineaddresssizes}{$address_sizes}" \ + >> $hwparam + elapsed_time_from_prev_step hardware-params +fi -# Final notice -# ------------ +# Clean up and final notice +# ------------------------- # -# The configuration is now complete, we can inform the user on the next -# step(s) to take. -if [ x$maneage_group_name = x ]; then - buildcommand="./project make -j8" -else - buildcommand="./project make --group=$maneage_group_name -j8" -fi -cat <