From c151eddbcc5f4208b40dc3037a8ae8adb0ff9173 Mon Sep 17 00:00:00 2001
From: Mohammad Akhlaghi
Date: Wed, 10 Jun 2020 23:44:13 +0100
Subject: IMPORTANT: many improvements to low-level software building phase

POSSIBLE EFFECT ON YOUR PROJECT: The changes in this commit may only cause
conflicts to your project if you have changed the software building
Makefiles in your project's branch (e.g., 'basic.mk', 'high-level.mk' and
'python.mk'). If your project has only added analysis, it shouldn't be
affected.

This is a large commit, involving a long series of corrections in a
different branch which is now finally being merged into the core Maneage
branch. All changes were related and came up naturally as the low-level
infrastructure was improved. So separating them in the end for the final
merge would have been very time consuming and we are merging them as one
commit.

In general, the software building Makefiles are now much easier to read,
modify and use, along with several new features that have been added. See
below for the full list.

 - Until now, Maneage needed the host to have a 'make' implementation
   because Make was necessary to build Lzip. Lzip is then used to
   uncompress the source of our own GNU Make. However, in the
   minimalist/slim versions of operating systems (for example used to build
   Docker images) Make isn't included by default. Since Lzip was the only
   program built before our own GNU Make was installed, we consulted
   Antonio Diaz Diaz (creator of Lzip) and he kindly added the necessary
   functionality to a new version of Lzip, which we are using now. Hence we
   don't need to assume a Make implementation on the host any more. With
   this commit, Lzip and GNU Make are built without Make, allowing
   everything else to be safely built with our own custom version of GNU
   Make and not using the host's 'make' at all.

 - Until recently (Commit 3d8aa5953c4) GNU Make was built in 'basic.mk'.
   Therefore 'basic.mk' was written in a way that it can be used with other
   'make' implementations also (i.e., important shell commands starting
   with '&&' and ending in '\' without any comments between them!).
   Furthermore, to help in style uniformity, the rules in 'high-level.mk'
   and 'python.mk' also followed a similar structure. But due to the point
   above, we can now guarantee that GNU Make is used from the very first
   Makefile, so this hard-to-read structure has been removed in the
   software build recipes and they are much more readable and edit-friendly
   now.

 - Until now, the default backup servers were at some fixed URLs, on our
   own pages or on Gitlab. But recently we uploaded all the necessary
   software to Zenodo (https://doi.org/10.5281/zenodo.3883409) which is
   more suitable for this task (it promises longevity, has a fixed DOI,
   while allowing us to add new content, or new software tarball
   versions). With this commit, a small script has been written to extract
   the most recent Zenodo upload link from the Zenodo DOI and use it for
   downloading the software source codes.

 - Until now, we primarily used the webpage of each software for
   downloading its tarball. But this caused many problems: 1) some of them
   needed Javascript before the download, 2) some URLs had a complex
   dependency on the version number, 3) some servers would be randomly down
   for maintenance, etc. So thanks to the point above, we now use the
   Zenodo server as the primary download location. However, if a user wants
   to use custom software that is not (yet!) in Zenodo, the download script
   gives priority to a custom URL that the users can give as Make
   variables. If that variable is defined, then the script will use that
   URL before going onto Zenodo. We now have a special place for such URLs:
   'reproduce/software/config/urls.conf'. The old URLs (which are a good
   documentation themselves) are preserved here, but are commented out by
   default.

 - The software source code downloading and checksum verification step has
   been moved into a Make function called 'import-source' (defined in the
   'build-rules.mk' and loaded in all software Makefiles). Having taken all
   the low-level steps there, I noticed that there is no more need for
   having the tarball as a separate target! So with this commit, a single
   rule is the only place that needs to be edited/added (greatly
   simplifying the software building Makefiles).

 - Following task #15272, a new option has been added to the './project'
   script called '--all-highlevel'. When this option is given, the contents
   of 'TARGETS.conf' are ignored and all the software in Maneage are built
   (selected by parsing the 'versions.conf' file). This new option was
   added to confirm the extensive changes made in all the software building
   recipes and is great for development/testing purposes.

 - Many of the software hadn't been tested for a long time! So after using
   the newly added '--all-highlevel', we noticed that some need to be
   updated. In general, with this commit, 'libpaper' and 'pcre' were added
   as new software, and the versions of the following software were
   updated: 'boost', 'flex', 'libtirpc', 'openblas' and 'lzip'. A
   'run-parts.in' shell script was added in 'reproduce/software/shell/'
   which is installed with 'libpaper'.

 - Even though we intentionally add the necessary flags to add RPATH inside
   the built executable at compilation time, some software don't do it
   (different software on different operating systems!). Until now, for
   historical reasons this check was done in different ways for different
   software on GNU/Linux systems. But now it is unified: if 'patchelf' is
   present we apply it. Because of this, 'patchelf' has been put as a
   top-level prerequisite, right after Tar and is installed before anything
   else.

 - In 'versions.conf', GNU Libtool is recognized as 'libtool', but in
   'basic.mk', it was 'glibtool'!
   This caused much confusion and is corrected with this commit (in
   'basic.mk', it is also 'libtool').

 - A new argument is added to the './project' script to allow easy loading
   of the project's shell and environment for fast/temporary testing of
   things in the same environment as the project. Before activating the
   project's shell, we completely remove all host environment variables to
   simulate the project's environment. It can be called with this command:
   './project shell'. A simple prompt has also been added to highlight that
   the user is using the Maneage shell!
---
 reproduce/software/shell/configure.sh      | 67 ++++++++++++++++++++----
 reproduce/software/shell/pre-make-build.sh | 81 ++++++++++++++++++++++--------
 reproduce/software/shell/run-parts.in      | 71 ++++++++++++++++++++++++++
 3 files changed, 188 insertions(+), 31 deletions(-)
 create mode 100755 reproduce/software/shell/run-parts.in

(limited to 'reproduce/software/shell')

diff --git a/reproduce/software/shell/configure.sh b/reproduce/software/shell/configure.sh
index 6694e17..323aed1 100755
--- a/reproduce/software/shell/configure.sh
+++ b/reproduce/software/shell/configure.sh
@@ -1280,6 +1280,48 @@ fi
 
 
 
+# Find Zenodo URL for software downloading
+# ----------------------------------------
+#
+# All free-software source tarballs that are potentially used in Maneage
+# are also archived in Zenodo with a certain concept-DOI. A concept-DOI is
+# a Zenodo terminology, meaning a fixed DOI of the project (that can have
+# many sub-DOIs for different versions). By default, the concept-DOI points
+# to the most recently uploaded version. However, the concept-DOI itself is
+# not directly usable for downloading files. The concept-DOI will just take
+# us to the top webpage of the most recent version of the upload.
+#
+# The problem is that as more software are added (as new Zenodo versions),
+# the most recent Zenodo-URL that the concept-DOI points to, also
+# changes.
+# The most reliable solution was found to be the tiny script below
+# which will download the DOI-resolved webpage, and extract the Zenodo-URL
+# of the most recent version from there (using the 'coreutils' tarball as
+# an example, the directory part of the URL for all the other software are
+# the same).
+user_backup_urls=""
+zenodocheck=.build/software/zenodo-check.html
+if $downloader $zenodocheck https://doi.org/10.5281/zenodo.3883409; then
+    zenodourl=$(sed -n -e'/coreutils/p' $zenodocheck \
+                    | sed -n -e'/http/p' \
+                    | tr ' ' '\n' \
+                    | grep http \
+                    | sed -e 's/href="//' -e 's|/coreutils| |' \
+                    | awk 'NR==1{print $1}')
+else
+    zenodourl=""
+fi
+rm -f $zenodocheck
+
+# Add the Zenodo URL to the user's given back software URLs. Since the user
+# can specify 'user_backup_urls' (not yet implemented as an option in
+# './project'), we'll give preference to their specified servers, then add
+# the Zenodo URL afterwards.
+user_backup_urls="$user_backup_urls $zenodourl"
+
+
+
+
+
 # Build core tools for project
 # ----------------------------
 #
@@ -1288,7 +1330,7 @@ fi
 # (minimal Bash-like shell) and Flock (to lock files and enable serial
 # download).
 ./reproduce/software/shell/pre-make-build.sh \
-    "$bdir" "$ddir" "$downloader"
+    "$bdir" "$ddir" "$downloader" "$user_backup_urls"
 
 
 
@@ -1302,6 +1344,7 @@ fi
 # tools, but we have to be very portable (and use minimal features in all).
 echo; echo "Building necessary software (if necessary)..."
 .local/bin/make -k -f reproduce/software/make/basic.mk \
+                user_backup_urls="$user_backup_urls" \
                 sys_library_path=$sys_library_path \
                 rpath_command=$rpath_command \
                 static_build=$static_build \
@@ -1323,20 +1366,22 @@ echo; echo "Building necessary software (if necessary)..."
 # script. Bash and Make were the tools we need to run Makefiles, so we had
 # to build them in this script. But after this, we can rely on Makefiles.
 if [ $jobs = 0 ]; then
-    numthreads=$($instdir/bin/nproc --all)
+    numthreads=$(.local/bin/nproc --all)
 else
     numthreads=$jobs
 fi
 .local/bin/env -i HOME=$bdir \
-               .local/bin/make -k -f reproduce/software/make/high-level.mk \
-               sys_library_path=$sys_library_path \
-               rpath_command=$rpath_command \
-               static_build=$static_build \
-               numthreads=$numthreads \
-               on_mac_os=$on_mac_os \
-               sys_cpath=$sys_cpath \
-               host_cc=$host_cc \
-               -j$numthreads
+               .local/bin/make -k -f reproduce/software/make/high-level.mk \
+               user_backup_urls="$user_backup_urls" \
+               sys_library_path=$sys_library_path \
+               rpath_command=$rpath_command \
+               all_highlevel=$all_highlevel \
+               static_build=$static_build \
+               numthreads=$numthreads \
+               on_mac_os=$on_mac_os \
+               sys_cpath=$sys_cpath \
+               host_cc=$host_cc \
+               -j$numthreads
 
 
 
diff --git a/reproduce/software/shell/pre-make-build.sh b/reproduce/software/shell/pre-make-build.sh
index e2ac789..05a4143 100755
--- a/reproduce/software/shell/pre-make-build.sh
+++ b/reproduce/software/shell/pre-make-build.sh
@@ -34,6 +34,7 @@ set -e
 bdir=$1
 ddir=$2
 downloader="$3"
+user_backup_urls="$4"
 
 
 
@@ -51,6 +52,7 @@ downloadwrapper=reproduce/analysis/bash/download-multi-try
 
 # Derived directories
 bindir=$instdir/bin
+urlfile=$confdir/urls.conf
 versionsfile=$confdir/versions.conf
 checksumsfile=$confdir/checksums.conf
 backupfile=$confdir/servers-backup.conf
@@ -65,8 +67,18 @@ export PATH="$bindir:$PATH"
 
 
 
-# Load the backup servers
-backupservers=$(awk '!/^#/{printf "%s ", $1}' $backupfile)
+# Load the backup servers, but separate the first one.
+backupservers=""
+topbackupserver=""
+maneage_backup_urls=$(awk '!/^#/{printf "%s ", $1}' $backupfile)
+backupservers_all="$user_backup_urls $maneage_backup_urls"
+for b in $backupservers_all; do
+  if [ x$topbackupserver = x ]; then
+    topbackupserver=$b
+  else
+    backupservers="$backupservers $b"
+  fi
+done
 
 
 
@@ -83,12 +95,21 @@ download_tarball() {
   else
     ucname=$tardir/$tarball.unchecked
 
+    # If the URL is empty, use the top backup server
+    if [ x$w = x ]; then
+      bservers="$backupservers"
+      tarballurl=$topbackupserver/$tarball
+    else
+      bservers="$backupservers_all"
+      tarballurl=$url/$tarball
+    fi
+
     # See if it is in the input software directory.
     if [ -f "$ddir/$tarball" ]; then
       cp $ddir/$tarball $ucname
     else
-      $downloadwrapper "$downloader" nolock $url/$tarball $ucname \
-                       "$backupservers"
+      $downloadwrapper "$downloader" nolock $tarballurl $ucname \
+                       "$bservers"
     fi
 
     # Make sure this is the correct tarball.
@@ -119,10 +140,14 @@ download_tarball() {
 
 
 
-# Build the program from the tarball
+# Build the program from the tarball. This function takes one argument
+# which is the configure-time options.
 build_program() {
   if ! [ -f $ibidir/$progname ]; then
 
+    # Options
+    configoptions=$1
+
     # Go into the temporary building directory.
     cd $tmpblddir
     unpackdir="$progname"-"$version"
@@ -140,13 +165,31 @@ build_program() {
       intar=$tardir/$tarball
     fi
 
-    # Unpack the tarball and build the program.
+    # Unpack the tarball and go into it.
     tar xf $intar
     if [ x$intarrm = x1 ]; then rm $intar; fi
     cd $unpackdir
-    ./configure --prefix=$instdir
-    make
-    make install
+
+    # build the project, either with Make and either without it.
+    if [ x$progname = xlzip ]; then
+      ./configure --build --check --installdir=$instdir/bin $configoptions
+    else
+      # All others accept the configure script.
+      ./configure --prefix=$instdir $configoptions
+
+      # To build GNU Make, we don't want to assume the existance of a
+      # Make program, so we use its 'build.sh' script and its own built
+      # 'make' program to install itself.
+      if [ x$progname = xmake ]; then
+        /bin/sh build.sh
+        ./make install
+      else
+        make
+        make install
+      fi
+    fi
+
+    # Clean up the source directory
     cd $topdir
     rm -rf $tmpblddir/$unpackdir
     echo "$progname_tex $version" > $ibidir/$progname
@@ -167,7 +210,7 @@ build_program() {
 # won't rely on the host's compression tools at all.
 progname="lzip"
 progname_tex="Lzip"
-url=http://akhlaghi.org/src
+url=$(awk '/^'$progname'-url/{print $3}' $urlfile)
 version=$(awk '/^'$progname'-version/{print $3}' $versionsfile)
 tarball=$progname-$version.tar
 download_tarball
@@ -180,19 +223,17 @@ build_program
 # GNU Make
 # --------
 #
-# The job orchestrator of Maneage is GNU Make. Although it is not
-# impossible to account for all the differences between various Make
-# implementations, its much easier (for reading the code and
-# writing/debugging it) if we can count on a special implementation. So
-# before going into the complex job orchestration in building high-level
-# software, we start by building GNU Make.
+# The job orchestrator of Maneage is GNU Make. The
+# '--disable-dependency-tracking' configure-time option is necessary so
+# Make doesn't check for an existing 'make' implementation (recall that we
+# aren't assuming any 'make' on the host).
 progname="make"
 progname_tex="GNU Make"
-url=http://akhlaghi.org/src
+url=$(awk '/^'$progname'-url/{print $3}' $urlfile)
 version=$(awk '/^'$progname'-version/{print $3}' $versionsfile)
 tarball=$progname-$version.tar.lz
 download_tarball
-build_program
+build_program --disable-dependency-tracking
 
 
 
@@ -206,7 +247,7 @@ build_program
 # (which builds GNU Bash).
 progname="dash"
 progname_tex="Dash"
-url=http://akhlaghi.org/src
+url=$(awk '/^'$progname'-url/{print $3}' $urlfile)
 version=$(awk '/^'$progname'-version/{print $3}' $versionsfile)
 tarball=$progname-$version.tar.lz
 download_tarball
@@ -235,7 +276,7 @@ fi
 # many simultaneous download commands are called.
 progname="flock"
 progname_tex="Discoteq flock"
-url=http://akhlaghi.org/src
+url=$(awk '/^'$progname'-url/{print $3}' $urlfile)
 version=$(awk '/^'$progname'-version/{print $3}' $versionsfile)
 tarball=$progname-$version.tar.lz
 download_tarball
diff --git a/reproduce/software/shell/run-parts.in b/reproduce/software/shell/run-parts.in
new file mode 100755
index 0000000..9213585
--- /dev/null
+++ b/reproduce/software/shell/run-parts.in
@@ -0,0 +1,71 @@
+#!MANEAGESHELL
+# run-parts: Runs all the scripts found in a directory.
+# from Slackware, by Patrick J. Volkerding with ideas borrowed
+# from the Red Hat and Debian versions of this utility.
+#
+# USAGE IN MANEAGE: this script is built with the 'libpaper' package.
+#
+# The original file was taken from Linux From Scratch:
+# http://www.linuxfromscratch.org/blfs/view/svn/general/libpaper.html
+# However, it didn't have a copyright statement. So one is being added
+# here.
+#
+# Copyright (C) 2020 Authors mentioned above.
+# Copyright (C) 2020 Mohammad Akhlaghi
+#
+# This script is free software: you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation, either version 3 of the License, or
+# (at your option) any later version.
+#
+# This script is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this script. If not, see .
+
+# keep going when something fails
+set +e
+
+if [ $# -lt 1 ]; then
+  echo "Usage: run-parts "
+  exit 1
+fi
+
+if [ ! -d $1 ]; then
+  echo "Not a directory: $1"
+  echo "Usage: run-parts "
+  exit 1
+fi
+
+# There are several types of files that we would like to
+# ignore automatically, as they are likely to be backups
+# of other scripts:
+IGNORE_SUFFIXES="~ ^ , .bak .new .rpmsave .rpmorig .rpmnew .swp"
+
+# Main loop:
+for SCRIPT in $1/* ; do
+  # If this is not a regular file, skip it:
+  if [ ! -f $SCRIPT ]; then
+    continue
+  fi
+  # Determine if this file should be skipped by suffix:
+  SKIP=false
+  for SUFFIX in $IGNORE_SUFFIXES ; do
+    if [ ! "$(basename $SCRIPT $SUFFIX)" = "$(basename $SCRIPT)" ]; then
+      SKIP=true
+      break
+    fi
+  done
+  if [ "$SKIP" = "true" ]; then
+    continue
+  fi
+  # If we've made it this far, then run the script if it's executable:
+  if [ -x $SCRIPT ]; then
+    $SCRIPT || echo "$SCRIPT failed."
+  fi
+done
+
+exit 0
-- 
cgit v1.2.1
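
The Zenodo lookup that this patch adds to 'configure.sh' can be tried in isolation. What follows is only a minimal sketch and not part of the commit: it runs the same sed/tr/grep/awk pipeline on a made-up HTML fragment, so the value that ends up in 'zenodourl' can be inspected without network access. The 'example.org' links, the '0.0' version numbers and the temporary file are assumptions for illustration and do not reflect the real layout of the Zenodo page.

```shell
#!/bin/sh
# Hypothetical stand-in for the page that the DOI resolves to. The real
# page is downloaded by '$downloader' in configure.sh; this one is made up.
page=$(mktemp)
cat > "$page" <<'EOF'
<html><body>
<a class="filename" href="https://example.org/record/0000/files/bash-0.0.tar.lz">bash</a>
<a class="filename" href="https://example.org/record/0000/files/coreutils-0.0.tar.lz">coreutils</a>
</body></html>
EOF

# Same pipeline as in the patch: keep the line mentioning 'coreutils', break
# it into words, keep the word holding the link, strip the 'href="' prefix,
# and cut everything from '/coreutils' onwards so only the directory part
# of the URL remains.
zenodourl=$(sed -n -e '/coreutils/p' "$page" \
                | sed -n -e '/http/p' \
                | tr ' ' '\n' \
                | grep http \
                | sed -e 's/href="//' -e 's|/coreutils| |' \
                | awk 'NR==1{print $1}')
rm -f "$page"

# User-given servers (empty here) come first, the extracted URL is appended.
user_backup_urls=""
user_backup_urls="$user_backup_urls $zenodourl"
echo "zenodourl: $zenodourl"
```

Because only the directory part is kept, the same value can serve as a server prefix for every other tarball name, which is how the patch passes it on through 'user_backup_urls'.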
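
The download priority described in the commit message (custom URL from 'urls.conf' first, otherwise the top backup server) can be sketched in a few lines. This is a hedged illustration under stated assumptions, not the code of 'pre-make-build.sh': the 'name-url = value' file layout is inferred from the awk command in the patch, the server addresses and the 'demo' program are hypothetical, and the helper names 'lookup_url' and 'tarball_url' are invented here. Note also that the patch tests a variable called 'w', which is not set anywhere in the hunks shown; this sketch tests the looked-up URL itself, which is an assumption about the intent.

```shell
#!/bin/sh
# Hypothetical 'urls.conf': one entry commented out, one active.
urlfile=$(mktemp)
cat > "$urlfile" <<'EOF'
# lzip-url = https://example.org/old/lzip
demo-url = https://example.org/custom
EOF

# Hypothetical server lists: user-given (or Zenodo) first, then Maneage's.
user_backup_urls="https://example.org/zenodo-files"
maneage_backup_urls="https://example.net/mirror1 https://example.net/mirror2"
backupservers_all="$user_backup_urls $maneage_backup_urls"

# Separate the first server from the rest, as the patch does.
topbackupserver=""
backupservers=""
for b in $backupservers_all; do
    if [ -z "$topbackupserver" ]; then
        topbackupserver=$b
    else
        backupservers="$backupservers $b"
    fi
done

# Print the third field of the 'NAME-url = VALUE' line (empty if the line
# is commented out or absent). Helper name is invented for this sketch.
lookup_url() {
    awk '/^'"$1"'-url/{print $3}' "$urlfile"
}

# Print the first URL to try for a tarball: $1 is the program name, $2 the
# tarball file name. Helper name is invented for this sketch.
tarball_url() {
    url=$(lookup_url "$1")
    if [ -z "$url" ]; then
        echo "$topbackupserver/$2"
    else
        echo "$url/$2"
    fi
}

lzip_first=$(tarball_url lzip lzip-0.0.tar)
demo_first=$(tarball_url demo demo-0.0.tar.lz)
rm -f "$urlfile"
echo "lzip: $lzip_first"
echo "demo: $demo_first"
```

With the commented-out entry, 'lzip' falls back to the top backup server, while 'demo' uses its custom URL; the remaining servers stay available as fallbacks for the download wrapper.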