From df9e291826fbc7e717b40d2d07f1d7607a2f2455 Mon Sep 17 00:00:00 2001
From: Giacomo Lorenzetti
Date: Thu, 3 Apr 2025 15:21:16 +0200
Subject: IMPORTANT: software configuration optimized and better modularized

Summary: after merging this commit into your project, it should be
re-configured since the locations of software installation files (like
'LOCAL.conf' or the LaTeX macros of the software environment) have changed.
But it should not affect the analysis phase of your project.

Until this commit, it was not possible to run a pre-built Maneage'd project
(in a container) on a newly cloned Maneage'd project source. This was
because the containers should be read-only, but during the various checks
of the configuration (to verify that we are using the same software
environment in the container and the source), we were writing/testing many
things in the build directory, and in 'LOCAL.conf', which was actually in
the source directory! Furthermore, the '.local' and '.build' were built at
configure time, making it hard to run the same container from a newly
cloned Maneage'd project.

To make things harder for the scenario above, the 'configure.sh' script
would pause on every message and didn't have a quiet mode (making it
practically impossible to run './project configure' before './project make'
on every container run).

With this commit, all these issues have been addressed and it is now
possible to simply get a built container, clone a Maneage'd project and run
the analysis (using the built environment of the container that is verified
on every run). The respective changes/additions are described below:

 - The high-level container scripts ('apptainer.sh' and 'docker.sh', along
   with their READMEs) have been moved to the 'reproduce/software/shell'
   directory and the old 'reproduce/software/containers' directory has been
   deleted. This is because we have classified the software files by their
   language/format and the container scripts are scripts in the end.

 - The './project' script:

   - Now has two extra options: '--quiet' and '--no-pause'. Both are
     directly passed to the 'configure.sh' script. They will respectively
     disable any informative printed message or any pause after that
     message (if it is printed).

   - The '--build-dir' option is now also relevant for './project make':
     when it is given, it will re-create the two '.build' and '.local'
     symbolic links at the top source directory in all scenarios
     ('configure', 'make' or 'shell'). This allows the configuration,
     analysis and shell phases to safely assume they exist and match the
     user's desire at run-time.

   - The build/analysis directory's sub-directories that need to be built
     before 'top-make.mk' are now built in a separate function to help
     readability.

 - The 'configure.sh' script:

   - For developers: a new 'check_elapsed' variable has been defined that
     will enable the newly added 'elapsed_time_from_prev_step'
     function. This function should be used from now on at the end of every
     major step to help find bottlenecks.

   - The targets of the software in 'pre-make-build.sh' now also have the
     version of the software in their file name. Until now, they didn't
     have the version, so there was no way to detect if the software has
     been updated or not in the source. For Lzip and Make (that also get
     built after GCC), the ones in this script also have a '-pre-make'
     suffix.

 - 'LOCAL.conf.in' now has descriptions for every variable.

 - The '-std=gnu17' option is now used instead of '-std=c17' for basic
   software that cannot be built without specifying the C standard in GCC
   15.1 (described in previous commit: 2881fc0a6205). See [1] for more
   details; in summary: '-std=gnu17' is also supported on macOS's Clang and
   has some features that 'pkg-config' needs.

 - Generally: some longer code lines have been broken or indentation
   decreased to fit the 75 character line length. This has not reduced
   readability however. For example the long 'echo' commands are now
   replaced by multiple 'printf's, or the indentation is still clearly
   visible.

The seeds of the work on this commit started with a branch containing three
commits by Giacomo Lorenzetti (133 insertions, 100 deletions). Upon merging
with the main 'maneage' branch, they were generalized and re-organized to
become this commit.

The following issues have also been addressed with this commit:

 - The LaTeX calls (during the building of 'paper.pdf') do not contain
   Maneage'd dynamic libraries. This is because we don't build the LaTeX
   binaries from source, and the TeXLive manager uses the host environment.

 - The 'docker.sh' script:

   - Adds the '--project-name' option: its internal variable existed, but
     the option for the user to define it at run-time did not.

   - Ported to macOS: it does not check being a member of the 'docker'
     group, and finds the number of threads using macOS-specific tools.

 - The 'apptainer.sh' script:

   - Now installs 'wget' in the base container also (necessary when the
     user doesn't have the tarballs).

[1] https://savannah.nongnu.org/bugs/?67068#comment2
---
 reproduce/software/containers/docker.sh | 486 --------------------------------
 1 file changed, 486 deletions(-)
 delete mode 100755 reproduce/software/containers/docker.sh

(limited to 'reproduce/software/containers/docker.sh')

diff --git a/reproduce/software/containers/docker.sh b/reproduce/software/containers/docker.sh
deleted file mode 100755
index d5b5682..0000000
--- a/reproduce/software/containers/docker.sh
+++ /dev/null
@@ -1,486 +0,0 @@
-#!/bin/sh
-#
-# Create a Docker container from an existing image of the built software
-# environment, but with the source, data and build (analysis) directories
-# directly within the host file system. This script is assumed to be run in
-# the top project source directory (that has 'README.md' and
-# 'paper.tex'). If not, use the '--source-dir' option to specify where the
-# Maneage'd project source is located.
-#
-# Usage:
-#
-#   - When you are at the top Maneage'd project directory, you can run this
-#     script like the example below. Just set all the '/PATH/TO/...'
-#     directories (see below for '--tmp-dir'). See the items below for
-#     optional values.
-#
-#       ./reproduce/software/containers/docker.sh --shm-size=15gb \
-#                   --software-dir=/PATH/TO/SOFTWARE/TARBALLS \
-#                   --build-dir=/PATH/TO/BUILD/DIRECTORY
-#
-#   - Non-mandatory options:
-#
-#       - If you already have the input data that is necessary for your
-#         project's, use the '--input-dir' option to specify its location
-#         on your host file system. Otherwise the necessary analysis
-#         files will be downloaded directly into the build
-#         directory. Note that this is only necessary when '--build-only'
-#         is not given.
-#
-#       - The '--software-dir' is only useful if you want to build a
-#         container. Even in that case, it is not mandatory: if not
-#         given, the software tarballs will be downloaded (thus requiring
-#         internet).
-#
-#   - To avoid having to set the directory(s) every time you want to
-#     start the docker environment, you can put this command (with the
-#     proper directories) into a 'run.sh' script in the top Maneage'd
-#     project source directory and simply execute that. The special name
-#     'run.sh' is in Maneage's '.gitignore', so it will not be included
-#     in your git history by mistake.
-#
-# Known problems:
-#
-#   - As of 2025-04-06 the log file containing the output of the 'docker
-#     build' command that configures the Maneage'd project does not keep
-#     all the output (which gets clipped by Docker). with a "[output
-#     clipped, log limit 2MiB reached]" message. We need to find a way to
-#     fix this (so nothing gets clipped: useful for debugging).
-#
-# Copyright (C) 2021-2025 Mohammad Akhlaghi
-#
-# This script is free software: you can redistribute it and/or modify it
-# under the terms of the GNU General Public License as published by the
-# Free Software Foundation, either version 3 of the License, or (at your
-# option) any later version.
-#
-# This script is distributed in the hope that it will be useful, but
-# WITHOUT ANY WARRANTY; without even the implied warranty of
-# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General
-# Public License for more details.
-#
-# You should have received a copy of the GNU General Public License along
-# with this script. If not, see .
-
-
-
-
-
-# Script settings
-# ---------------
-# Stop the script if there are any errors.
-set -e
-
-
-
-
-
-# Default option values
-jobs=
-quiet=0
-source_dir=
-build_only=
-image_file=""
-shm_size=20gb
-scriptname="$0"
-project_shell=0
-container_shell=0
-project_name=maneaged
-base_name=maneage-base
-base_os=debian:stable-slim
-
-print_help() {
-    # Print the output.
-    cat < /dev/null; then
-    if [ $quiet = 0 ]; then
-        printf "$scriptname: info: base OS docker image ('$base_name') "
-        printf "already exists and will be used. If you want to build a "
-        printf "new base OS image, give a new name to '--base-name'. "
-        printf "To remove this message run with '--quiet'\n"
-    fi
-else
-
-    # In case an image file is given, load the environment from that (no
-    # need to build the environment from scratch).
-    if ! [ x"$image_file" = x ] && [ -f "$image_file" ]; then
-        docker load --input $image_file
-    else
-
-        # Build the temporary directory.
-        tmp_dir_check
-
-        # Build the Dockerfile.
-        uid=$(id -u)
-        cat <<EOF > $tmp_dir/Dockerfile
-FROM $base_os
-RUN useradd -ms /bin/sh --uid $uid maneager; \\
-    printf '123\n123' | passwd maneager; \\
-    printf '456\n456' | passwd root
-RUN apt update; apt install -y gcc g++ wget; echo 'export PS1="[\[\033[01;31m\]\u@\h \W\[\033[32m\]\[\033[00m\]]# "' >> ~/.bashrc
-USER maneager
-WORKDIR /home/maneager
-RUN mkdir build; mkdir build/analysis; echo 'export PS1="[\[\033[01;35m\]\u@\h \W\[\033[32m\]\[\033[00m\]]$ "' >> ~/.bashrc
-EOF
-
-        # Build the base-OS container and delete the temporary directory.
-        curdir="$(pwd)"
-        cd $tmp_dir
-        docker build ./ \
-               -t $base_name \
-               --shm-size=$shm_size
-        cd "$curdir"
-        rm -rf $tmp_dir
-    fi
-fi
-
-
-
-
-
-# Maneage software configuration
-# ------------------------------
-#
-# Having the base operating system in place, we can now construct the
-# project's docker file.
-if docker image list | grep $project_name &> /dev/null; then
-    if [ $quiet = 0 ]; then
-        printf "$scriptname: info: project's image ('$project_name') "
-        printf "already exists and will be used. If you want to build a "
-        printf "new project image, give a new name to '--project-name'. "
-        printf "To remove this message run with '--quiet'\n"
-    fi
-else
-
-    # Build the temporary directory.
-    tmp_dir_check
-    df=$tmp_dir/Dockerfile
-
-    # The only way to mount a directory inside the Docker build environment
-    # is the 'RUN --mount' command. But Docker doesn't recognize things
-    # like symbolic links. So we need to copy the project's source under
-    # this temporary directory.
-    sdir=source
-    mkdir $tmp_dir/$sdir
-    dsr=/home/maneager/source-raw
-    cp -r $source_dir/* $source_dir/.git $tmp_dir/$sdir
-
-    # Start constructing the Dockerfile.
-    #
-    # Note on the printf's '\x5C\n' part: this will print out as a
-    # backslash at the end of the line to allow easy human readability of
-    # the Dockerfile (necessary for debugging!).
-    echo "FROM $base_name" > $df
-    printf "RUN --mount=type=bind,source=$sdir,target=$dsr \x5C\n" >> $df
-
-    # If a software directory was given, copy it and add its line.
-    tsdir=tarballs-software
-    dts=/home/maneager/tarballs-software
-    if ! [ x"$software_dir" = x ]; then
-
-        # Make the directory to host the software and copy the contents
-        # that the user gave there.
-        mkdir $tmp_dir/$tsdir
-        cp -r "$software_dir"/* $tmp_dir/$tsdir/
-        printf " --mount=type=bind,source=$tsdir,target=$dts \x5C\n" >> $df
-    fi
-
-    # Construct the rest of the 'RUN' command.
-    printf " cp -r $dsr /home/maneager/source; \x5C\n" >> $df
-    printf " cd /home/maneager/source; \x5C\n" >> $df
-    printf " ./project configure --jobs=$jobs \x5C\n" >> $df
-    printf " --build-dir=/home/maneager/build \x5C\n" >> $df
-    printf " --input-dir=/home/maneager/input \x5C\n" >> $df
-    printf " --software-dir=$dts; \x5C\n" >> $df
-
-    # We are deleting the '.build/software/tarballs' directory because this
-    # directory is not relevant for the analysis of the project. But in
-    # case any tarball was downloaded, it will consume space within the
-    # container.
-    printf " rm -rf .build/software/tarballs; \x5C\n" >> $df
-
-    # We are deleting the source directory becaues later (at 'docker run'
-    # time), the 'source' will be mounted directly from the host operating
-    # system.
-    printf " cd /home/maneager; \x5C\n" >> $df
-    printf " rm -rf source\n" >> $df
-
-    # Build the Maneage container and delete the temporary directory. The
-    # '--progress plain' option is for Docker to print all the outputs
-    # (otherwise, it will only print a very small part!).
-    cd $tmp_dir
-    docker build ./ -t $project_name \
-           --progress=plain \
-           --shm-size=$shm_size \
-           --no-cache \
-           2>&1 | tee build.log
-    cd ..
-    rm -rf $tmp_dir
-fi
-
-# If the user wants to save the container (into a file that does not
-# exist), do it here. If the file exists, it will only be used for creating
-# the container in the previous stages.
-if ! [ x"$image_file" = x ] && ! [ -f "$image_file" ]; then
-
-    # Save the image into a tarball
-    tarname=$(echo $image_file | sed -e's|.gz$||')
-    if [ $quiet = 0 ]; then
-        printf "$scriptname: info: saving docker image to '$tarname'"
-    fi
-    docker save -o $tarname $project_name
-
-    # Compress the saved image
-    if [ $quiet = 0 ]; then
-        printf "$scriptname: info: compressing to '$image_file' (can "
-        printf "take +10 minutes, but volume decreases by more than half!)"
-    fi
-    gzip --best $tarname
-fi
-
-# If the user just wanted to build the base operating system, abort the
-# script here.
-if ! [ x"$build_only" = x ]; then
-    if [ $quiet = 0 ]; then
-        printf "$scriptname: info: Maneaged project has been configured "
-        printf "successfully in the '$project_name' image"
-    fi
-    exit 0
-fi
-
-
-
-
-
-# Run the analysis within the Maneage'd container
-# -----------------------------------------------
-#
-# The startup command of the container is managed though the 'shellopt'
-# variable that starts here.
-shellopt=""
-if [ $container_shell = 1 ] || [ $project_shell = 1 ]; then
-
-    # If the user wants to start the project shell within the container,
-    # add the necessary command.
-    if [ $project_shell = 1 ]; then
-        shellopt="/bin/bash -c 'cd source; ./project shell;'"
-    fi
-
-    # Finish the 'shellop' string with a single quote (necessary in any
-    # case) and run Docker.
-    interactiveopt="-it"
-
-# No interactive shell requested, just run the project.
-else
-    interactiveopt=""
-    shellopt="/bin/bash -c 'cd source; ./project make --jobs=$jobs;'"
-fi
-
-# Execute Docker. The 'eval' is because the 'shellopt' variable contains a
-# single-quote that the shell should "evaluate".
-eval docker run \
-     -v "$analysis_dir":/home/maneager/build/analysis \
-     -v "$source_dir":/home/maneager/source \
-     $input_dir_mnt \
-     $shm_mnt \
-     $interactiveopt \
-     $project_name \
-     $shellopt
--
cgit v1.2.1
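
Editorial note on the removed script (a sketch, not part of the patch): the
deleted 'docker.sh' derives the name of the uncompressed tarball from the
'--image-file' value with the sed expression 's|.gz$||', where the unescaped
'.' matches any character. Assuming a POSIX shell, the same step could
possibly be written with a parameter expansion that strips only a literal
'.gz' suffix. The function name 'tarname_from_image_file' is hypothetical
and does not exist in Maneage.

```shell
#!/bin/sh
# Hypothetical helper (an assumption for illustration, not Maneage code):
# print the given file name with a literal '.gz' suffix removed.
tarname_from_image_file() {
    # '%.gz' removes the shortest matching suffix. In a shell pattern the
    # '.' is a literal dot, unlike in the sed regular expression.
    printf '%s\n' "${1%.gz}"
}

# Example call with an illustrative file name.
tarname=$(tarname_from_image_file "maneaged.tar.gz")
printf 'docker save would write to: %s\n' "$tarname"
```

With this form a name that merely ends in the letters 'gz' (without the dot)
is left unchanged, while the sed expression would remove its last three
characters.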