Diffstat (limited to 'reproduce/software/containers')
-rw-r--r-- | reproduce/software/containers/README-apptainer.md | 69
-rw-r--r-- | reproduce/software/containers/README-docker.md | 180
-rwxr-xr-x | reproduce/software/containers/apptainer.sh | 441
-rwxr-xr-x | reproduce/software/containers/docker.sh | 486
4 files changed, 0 insertions, 1176 deletions
diff --git a/reproduce/software/containers/README-apptainer.md b/reproduce/software/containers/README-apptainer.md deleted file mode 100644 index 9608dc8..0000000 --- a/reproduce/software/containers/README-apptainer.md +++ /dev/null @@ -1,69 +0,0 @@ -# Maneage'd projects in Apptainer - -Copyright (C) 2025-2025 Mohammad Akhlaghi <mohammad@akhlaghi.org>\ -Copyright (C) 2025-2025 Giacomo Lorenzetti <glorenzetti@cefca.es>\ -See the end of the file for license conditions. - -For an introduction on containers, see the "Building in containers" section -of the `README.md` file within the top-level directory of this -project. Here, we focus on Apptainer with a simple checklist on how to use -the `apptainer-run.sh` script that we have already prepared in this -directory for easy usage in a Maneage'd project. - - - - - -## Building your Maneage'd project in Apptainer - -Through the steps below, you will create an Apptainer image that will only -contain the software environment and keep the project source and built -analysis files (data and PDF) on your host operating system. This enables -you to keep the size of the image to a minimum (only containing the built -software environment) to easily move it from one computer to another. - - 1. Using your favorite text editor, create a `apptainer-local.sh` in your - project's top directory that contains the usage command shown at the - top of the 'apptainer.sh' script and take the following steps: - * Set the respective directories based on your own preferences. - * The `--software-dir` is optional (if you don't have the source - tarballs, Maneage will download them automatically. But that requires - internet (which may not always be available). If you regularly build - Maneage'd projects, you can clone the repository containing all the - tarballs at https://gitlab.cefca.es/maneage/tarballs-software - * Add an extra `--build-only` for the first run so it doesn't go onto - doing the analysis and just builds the image. 
After it has completed, - remove the `--build-only` and it will only run the analysis of your - project. - - 2. Once step one finishes, the build directory will contain two - Singularity Image Format (SIF) files listed below. You can move them to - any other (more permanent) positions in your filesystem or to other - computers as needed. - * `maneage-base.sif`: image containing the base operating system that - was used to build your project. You can safely delete this unless you - need to keep it for future builds without internet (you can give it - to the `--base-name` option of this script). If you want a different - name for this, put the same option in your - * `maneaged.sif`: image with the full software environment of your - project. This file is necessary for future runs of your project - within the container. - - - - - -## Copyright information - -This file is free software: you can redistribute it and/or modify it under -the terms of the GNU General Public License as published by the Free -Software Foundation, either version 3 of the License, or (at your option) -any later version. - -This file is distributed in the hope that it will be useful, but WITHOUT -ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or -FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for -more details. - -You should have received a copy of the GNU General Public License along -with this file. If not, see <https://www.gnu.org/licenses/>. diff --git a/reproduce/software/containers/README-docker.md b/reproduce/software/containers/README-docker.md deleted file mode 100644 index f86dceb..0000000 --- a/reproduce/software/containers/README-docker.md +++ /dev/null @@ -1,180 +0,0 @@ -# Maneage'd projects in Docker - -Copyright (C) 2021-2025 Mohammad Akhlaghi <mohammad@akhlaghi.org>\ -See the end of the file for license conditions. 
- -For an introduction on containers, see the "Building in containers" section -of the `README.md` file within the top-level directory of this -project. Here, we focus on Docker with a simple checklist on how to use the -`docker.sh` script that we have already prepared in this directory for easy -usage in a Maneage'd project. - - - - - -## Building your Maneage'd project in Docker - -Through the steps below, you will create a Docker image that will only -contain the software environment and keep the project source and built -analysis files (data and PDF) on your host operating system. This enables -you to keep the size of the image to a minimum (only containing the built -software environment) to easily move it from one computer to another. - - 0. Add your user to the `docker` group: `usermod -aG docker - USERNAME`. This is only necessary once on an operating system. - - 1. Start the Docker daemon (root permissions required). If the operating - system uses systemd you can use the command below. If you want the - Docker daemon to be available after a reboot also (so you don't have to - restart it after turning off your computer), run this command again but - replacing `start` with `enable` (this is not recommended if you don't - regularly use Docker: it will slow the boot time of your OS). - - ```shell - systemctl start docker - ``` - - 2. Using your favorite text editor, create a `docker-local.sh` in your top - Maneage directory (as described in the comments at the start of the - `docker.sh` script in this directory). Just activate `--build-only` on - the first run so it doesn't go onto doing the analysis and just sets up - the software environment. - - 3. After the setup is complete, run the following command to confirm that - the `maneage-base` (the OS of the container) and `maneaged` (your - project's full Maneage'd environment) images are available. 
If you want - different names for these images, add the `--project-name` and - `--base-name` options to the `docker.sh` call. - - ```shell - docker image list - ``` - - 4. You are now ready to do your analysis by removing the `--build-only` - option. - - - - - -## Script usage tips - -The `docker.sh` script introduced above has many options allowing certain -customizations that you can see when running it with the `--help` -option. The tips below are some of the more useful scenarios that we have -encountered so far. - -### Docker image in a single file - -In case you want to store the image as a single file as backup or to move -to another computer. For such cases, run the `docker.sh` script with the -`--image-file` option (for example `--image-file=myproj.tar.gz`). After -moving the file to the other system, run `docker.sh` with the same option. - -When the given file to `docker.sh` already exists, it will only be used for -loading the environment. When it doesn't exist, the script will save the -image into it. - - - - - -## Docker usage tips - -Below are some useful Docker usage scenarios that have proved to be -relevant for us in Maneage'd projects. - -### Cleaning up - -Docker has stored many large files in your operating system that can drain -valuable storage space. The storage of the cached files are usually orders -of magnitudes larger than what you see in `docker image list`! So after -doing your work, it is best to clean up all those files. If you feel you -may need the image later, you can save it in a single file as mentioned -above and delete all the un-necessary cached files. Afterwards, when you -load the image, only that image will be present with nothing extra. - -The easiest and most powerful way to clean up everything in Docker is the -two commands below. The first will close all open containers. 
The second -will remove all stopped containers, all networks not used by at least one -container, all images without at least one container associated to them, -and all build cache. - -```shell -docker ps -a -q | xargs docker rm -docker system prune -a -``` - -If you only want to delete the existing used images, run the command -below. But be careful that the cache is the largest storage consumer! So -the command above is the solution if your OS's root partition is close to -getting filled. - -```shell -docker images -a -q | xargs docker rmi -f -``` - - -### Preserving the state of an open container - -All interactive changes in a container will be deleted as soon as you exit -it. This is a very good feature of Docker in general! If you want to make -persistent changes, you should do it in the project's plain-text source and -commit them into your project's online Git repository. But in certain -situations, it is necessary to preserve the state of an interactive -container. To do this, you need to `commit` the container (and thus save it -as a Docker "image"). To do this, while the container is still running, -open another terminal and run these commands: - -```shell -# These two commands should be done in another terminal -docker container list - -# Get the 'XXXXXXX' of your desired container from the first column above. -# Give the new image a name by replacing 'NEW-IMAGE-NAME'. -docker commit XXXXXXX NEW-IMAGE-NAME -``` - - -### Interactive tests on built container - -If you later want to start a container with the built image and enter it in -interactive mode (for example for temporary tests), run the following -command. Just replace `NAME` with the same name you specified when building -the project. You can always exit the container with the `exit` command -(note that all your changes will be discarded once you exit, see below if -you want to preserve your changes after you exit). 
- -```shell -docker run -it NAME -``` - - -### Copying files from the Docker image to host operating system - -Except for the mounted directories, the Docker environment's file system is -indepenent of your host operating system. One easy way to copy files to and -from an open container is to use the `docker cp` command (very similar to -the shell's `cp` command). - -```shell -docker cp CONTAINER:/file/path/within/container /host/path/target -``` - - - -## Copyright information - -This file is free software: you can redistribute it and/or modify it under -the terms of the GNU General Public License as published by the Free -Software Foundation, either version 3 of the License, or (at your option) -any later version. - -This file is distributed in the hope that it will be useful, but WITHOUT -ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or -FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for -more details. - -You should have received a copy of the GNU General Public License along -with this file. If not, see <https://www.gnu.org/licenses/>. diff --git a/reproduce/software/containers/apptainer.sh b/reproduce/software/containers/apptainer.sh deleted file mode 100755 index 52315f6..0000000 --- a/reproduce/software/containers/apptainer.sh +++ /dev/null @@ -1,441 +0,0 @@ -#!/bin/sh -# -# Create a Apptainer container from an existing image of the built software -# environment, but with the source, data and build (analysis) directories -# directly within the host file system. This script is assumed to be run in -# the top project source directory (that has 'README.md' and -# 'paper.tex'). If not, use the '--source-dir' option to specify where the -# Maneage'd project source is located. -# -# Usage: -# -# - When you are at the top Maneage'd project directory, you can run this -# script like the example below. Just set all the '/PATH/TO/...' -# directories. See the items below for optional values. 
-# -# ./reproduce/software/containers/apptainer.sh \ -# --build-dir=/PATH/TO/BUILD/DIRECTORY \ -# --software-dir=/PATH/TO/SOFTWARE/TARBALLS -# -# - Non-mandatory options: -# -# - If you already have the input data that is necessary for your -# project's, use the '--input-dir' option to specify its location -# on your host file system. Otherwise the necessary analysis -# files will be downloaded directly into the build -# directory. Note that this is only necessary when '--build-only' -# is not given. -# -# - The '--software-dir' is only useful if you want to build a -# container. Even in that case, it is not mandatory: if not -# given, the software tarballs will be downloaded (thus requiring -# internet). -# -# - To avoid having to set them every time you want to start the -# apptainer environment, you can put this command (with the proper -# directories) into a 'run.sh' script in the top Maneage'd project -# source directory and simply execute that. The special name 'run.sh' -# is in Maneage's '.gitignore', so it will not be included in your -# git history by mistake. -# -# Known problems: -# -# Copyright (C) 2025-2025 Mohammad Akhlaghi <mohammad@akhlaghi.org> -# Copyright (C) 2025-2025 Giacomo Lorenzetti <glorenzetti@cefca.es> -# -# This script is free software: you can redistribute it and/or modify it -# under the terms of the GNU General Public License as published by the -# Free Software Foundation, either version 3 of the License, or (at your -# option) any later version. -# -# This script is distributed in the hope that it will be useful, but -# WITHOUT ANY WARRANTY; without even the implied warranty of -# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General -# Public License for more details. -# -# You should have received a copy of the GNU General Public License along -# with this script. If not, see <http://www.gnu.org/licenses/>. - - - - - -# Script settings -# --------------- -# Stop the script if there are any errors. 
-set -e - - - - - -# Default option values -jobs= -quiet=0 -source_dir= -build_only= -base_name="" -shm_size=20gb -scriptname="$0" -project_name="" -project_shell=0 -container_shell=0 -base_os=debian:stable-slim - -print_help() { - # Print the output. - cat <<EOF -Usage: $scriptname [OPTIONS] - -Top-level script to build and run a Maneage'd project within Apptainer. - - Host OS directories (to be mounted in the container): - -b, --build-dir=STR Dir. to build in (only analysis in host). - -i, --input-dir=STR Dir. of input datasets (optional). - -s, --software-dir=STR Directory of necessary software tarballs. - --source-dir=STR Directory of source code (default: 'pwd -P'). - - Apptainer images - --base-os=STR Base OS name (default: '$base_os'). - --base-name=STR Base OS apptainer image (a '.sif' file). - --project-name=STR Project's apptainer image (a '.sif' file). - - Interactive shell - --project-shell Open the project's shell within the container. - --container-shell Open the container shell. - - Operating mode: - --quiet Do not print informative statements. - -?, --help Give this help list. - -j, --jobs=INT Number of threads to use in each phase. - --build-only Just build the container, don't run it. - -Mandatory or optional arguments to long options are also mandatory or -optional for any corresponding short options. - -Maneage URL: https://maneage.org - -Report bugs to mohammad@akhlaghi.org -EOF -} - -on_off_option_error() { - if [ "x$2" = x ]; then - echo "$scriptname: '$1' doesn't take any values" - else - echo "$scriptname: '$1' (or '$2') doesn't take any values" - fi - exit 1 -} - -check_v() { - if [ x"$2" = x ]; then - printf "$scriptname: option '$1' requires an argument. 
" - printf "Try '$scriptname --help' for more information\n" - exit 1; - fi -} - -while [ $# -gt 0 ] -do - case $1 in - - # OS directories - -b|--build-dir) build_dir="$2"; check_v "$1" "$build_dir"; shift;shift;; - -b=*|--build-dir=*) build_dir="${1#*=}"; check_v "$1" "$build_dir"; shift;; - -b*) build_dir=$(echo "$1" | sed -e's/-b//'); check_v "$1" "$build_dir"; shift;; - -i|--input-dir) input_dir="$2"; check_v "$1" "$input_dir"; shift;shift;; - -i=*|--input-dir=*) input_dir="${1#*=}"; check_v "$1" "$input_dir"; shift;; - -i*) input_dir=$(echo "$1" | sed -e's/-i//'); check_v "$1" "$input_dir"; shift;; - -s|--software-dir) software_dir="$2"; check_v "$1" "$software_dir"; shift;shift;; - -s=*|--software-dir=*) software_dir="${1#*=}"; check_v "$1" "$software_dir"; shift;; - -s*) software_dir=$(echo "$1" | sed -e's/-s//'); check_v "$1" "$software_dir"; shift;; - --source-dir) source_dir="$2"; check_v "$1" "$source_dir"; shift;shift;; - --source-dir=*) source_dir="${1#*=}"; check_v "$1" "$source_dir"; shift;; - - # Container options. - --base-name) base_name="$2"; check_v "$1" "$base_name"; shift;shift;; - --base-name=*) base_name="${1#*=}"; check_v "$1" "$base_name"; shift;; - --project-name) project_name="$2"; check_v "$1" "$project_name"; shift;shift;; - --project-name=*) project_name="${1#*=}"; check_v "$1" "$project_name"; shift;; - - # Interactive shell. 
- --project-shell) project_shell=1; shift;; - --project_shell=*) on_off_option_error --project-shell;; - --container-shell) container_shell=1; shift;; - --container_shell=*) on_off_option_error --container-shell;; - - # Operating mode - --quiet) quiet=1; shift;; - --quiet=*) on_off_option_error --quiet;; - -j|--jobs) jobs="$2"; check_v "$1" "$jobs"; shift;shift;; - -j=*|--jobs=*) jobs="${1#*=}"; check_v "$1" "$jobs"; shift;; - -j*) jobs=$(echo "$1" | sed -e's/-j//'); check_v "$1" "$jobs"; shift;; - --build-only) build_only=1; shift;; - --build-only=*) on_off_option_error --build-only;; - --shm-size) shm_size="$2"; check_v "$1" "$shm_size"; shift;shift;; - --shm-size=*) shm_size="${1#*=}"; check_v "$1" "$shm_size"; shift;; - -'?'|--help) print_help; exit 0;; - -'?'*|--help=*) on_off_option_error --help -?;; - - # Unrecognized option: - -*) echo "$scriptname: unknown option '$1'"; exit 1;; - esac -done - - - - - -# Sanity checks -# ------------- -# -# Make sure that the build directory is given and that it exists. -if [ x$build_dir = x ]; then - printf "$scriptname: '--build-dir' not provided, this is the location " - printf "that all built analysis files will be kept on the host OS\n" - exit 1; -else - if ! [ -d $build_dir ]; then - printf "$scriptname: '$build_dir' (value to '--build-dir') doesn't " - printf "exist\n" - exit 1; - fi -fi - -# Set the default project and base-OS image names (inside the build -# directory). -if [ x"$base_name" = x ]; then base_name=$build_dir/maneage-base.sif; fi -if [ x"$project_name" = x ]; then project_name=$build_dir/maneaged.sif; fi - - - - - -# Directory preparations -# ---------------------- -# -# If the host operating system has '/dev/shm', then give Apptainer access -# to it also for improved speed in some scenarios (like configuration). 
-if [ -d /dev/shm ]; then - shm_mnt="--mount type=bind,src=/dev/shm,dst=/dev/shm"; -else shm_mnt=""; -fi - -# If the following directories do not exist within the build directory, -# create them to make sure the '--mount' commands always work and -# that any file. Ideally, the 'input' directory should not be under the 'build' -# directory, but if the user hasn't given it then they don't care about -# potentially deleting it later (Maneage will download the inputs), so put -# it in the build directory. -analysis_dir="$build_dir"/analysis -if ! [ -d $analysis_dir ]; then mkdir $analysis_dir; fi -analysis_dir_mnt="--mount type=bind,src=$analysis_dir,dst=/home/maneager/build/analysis" - -# If no '--source-dir' was given, set it to the output of 'pwd -P' (to get -# the direct path without potential symbolic links) in the running directory. -if [ x"$source_dir" = x ]; then source_dir=$(pwd -P); fi -source_dir_mnt="--mount type=bind,src=$source_dir,dst=/home/maneager/source" - -# Only when an an input directory is given, we need the respective 'mount' -# option for the 'apptainer run' command. -input_dir_mnt="" -if ! [ x"$input_dir" = x ]; then - input_dir_mnt="--mount type=bind,src=$input_dir,dst=/home/maneager/input" -fi - -# If no '--jobs' has been specified, use the maximum available jobs to the -# operating system. -if [ x$jobs = x ]; then jobs=$(nproc); fi - -# [APPTAINER-ONLY] Optional mounting option for the software directory. -software_dir_mnt="" -if ! [ x"$software_dir" = x ]; then - software_dir_mnt="--mount type=bind,src=$software_dir,dst=/home/maneager/tarballs-software" -fi - -# [APPTAINER-ONLY] Since the container is read-only and is run with the -# '--contain' option (which makes an empty '/tmp'), we need to make a -# dedicated directory for the container to be able to write to. This is -# necessary because some software (Biber in particular on the default -# branch) need to write there! See https://github.com/plk/biber/issues/494. 
-# We'll keep the directory on the host OS within the build directory, but -# as a hidden file (since it is not necessary in other types of build and -# ultimately only contains temporary files of programs that need it). -toptmp=$build_dir/.apptainer-tmp-$(whoami) -if ! [ -d $toptmp ]; then mkdir $toptmp; fi -rm -rf $toptmp/* # So previous runs don't affect this run. - - - - - -# Maneage'd Apptainer SIF container -# --------------------------------- -# -# Build the base operating system using Maneage's './project configure' -# step. -if [ -f $project_name ]; then - if [ $quiet = 0 ]; then - printf "$scriptname: info: project's image ('$project_name') " - printf "already exists and will be used. If you want to build a " - printf "new project image, give a new name to '--project-name'. " - printf "To remove this message run with '--quiet'\n" - fi -else - - # Build the basic definition, with just Debian and gcc/g++ - if [ -f $base_name ]; then - if [ $quiet = 0 ]; then - printf "$scriptname: info: base OS docker image ('$base_name') " - printf "already exists and will be used. If you want to build a " - printf "new base OS image, give a new name to '--base-name'. " - printf "To remove this message run with '--quiet'\n" - fi - else - - base_def=$build_dir/base.def - cat <<EOF > $base_def -Bootstrap: docker -From: $base_os - -%post - apt-get update && apt-get install -y gcc g++ -EOF - # Build the base operating system container and delete the - # temporary definition file. - apptainer build $base_name $base_def - rm $base_def - fi - - # Build the Maneage definition file. - # - About the '$jobs' variable: this definition file is temporarily - # built and deleted immediately after the SIF file is created. So - # instead of using Apptainer's more complex '{{ jobs }}' format to - # pass an argument, we simply write the value of the configure - # script's '--jobs' option as a shell variable here when we are - # building that file. 
- # - About the removal of Maneage'd tarballs: we are doing this so if - # Maneage has downloaded tarballs during the build they do not - # unecessarily bloat the container. Even when the user has given a - # software tarball directory, they will all be symbolic links that - # aren't valid when the user runs the container (since we only - # mount the software tarballs at build time). - maneage_def=$build_dir/maneage.def - cat <<EOF > $maneage_def -Bootstrap: localimage -From: $base_name - -%setup - mkdir -p \${APPTAINER_ROOTFS}/home/maneager/input - mkdir -p \${APPTAINER_ROOTFS}/home/maneager/source - mkdir -p \${APPTAINER_ROOTFS}/home/maneager/build/analysis - mkdir -p \${APPTAINER_ROOTFS}/home/maneager/tarballs-software - -%post - cd /home/maneager/source - ./project configure --jobs=$jobs \\ - --input-dir=/home/maneager/input \\ - --build-dir=/home/maneager/build \\ - --software-dir=/home/maneager/tarballs-software - rm /home/maneager/build/software/tarballs/* - -%runscript - cd /home/maneager/source - if [ x"\$maneage_apptainer_stat" = xshell ]; then \\ - ./project shell; \\ - elif [ x"\$maneage_apptainer_stat" = xrun ]; then \\ - if [ x"\$maneage_jobs" = x ]; then \\ - ./project make; \\ - else \\ - ./project make --jobs=\$maneage_jobs; \\ - fi; \\ - else \\ - printf "$scriptname: '\$maneage_apptainer_stat' (value "; \\ - printf "to 'maneage_apptainer_stat' environment variable) "; \\ - printf "is not recognized: should be either 'shell' or 'run'"; \\ - exit 1; \\ - fi -EOF - - # Build the maneage container. The last two are arguments (where order - # matters). The first few are options where order does not matter (so - # we have sorted them by line length). - apptainer build \ - $shm_mnt \ - $input_dir_mnt \ - $source_dir_mnt \ - $analysis_dir_mnt \ - $software_dir_mnt \ - --ignore-fakeroot-command \ - \ - $project_name \ - $maneage_def - - # Clean up. 
- rm $maneage_def -fi - -# If the user just wanted to build the base operating system, abort the -# script here. -if ! [ x"$build_only" = x ]; then - if [ $quiet = 0 ]; then - printf "$scriptname: info: Maneaged project has been configured " - printf "successfully in the '$project_name' image" - fi - exit 0 -fi - - - - - -# Run the Maneage'd container -# --------------------------- -# -# Set the high-level Apptainer operational mode. -if [ $container_shell = 1 ]; then - aopt="shell" -elif [ $project_shell = 1 ]; then - aopt="run --env maneage_apptainer_stat=shell" -else - aopt="run --env maneage_apptainer_stat=run --env maneage_jobs=$jobs" -fi - -# Build the hostname from the name of the SIF file of the project name. -hstname=$(echo "$project_name" \ - | awk 'BEGIN{FS="/"}{print $NF}' \ - | sed -e's|.sif$||') - -# Execute Apptainer: -# -# - We are not using '--unsquash' (to run within a sandbox) because it -# loads the full multi-gigabyte container into RAM (which we usually -# need for data processing). The container is read-only and we are -# using the following two options instead to ensure that we have no -# influence from outside the container. (description of each is from -# the Apptainer manual) -# --contain: use minimal /dev and empty other directories (e.g. /tmp -# and $HOME) instead of sharing filesystems from your host. -# --cleanenv: clean environment before running container". -# -# - We are not mounting '/dev/shm' since Apptainer prints a warning that -# it is already mounted (apparently does not need it at run time). -# -# --no-home and --home: the first ensures that the 'HOME' variable is -# different from the user's home on the host operating system, the -# second sets it to a directory we specify (to keep things like -# '.bash_history'). 
-apptainer $aopt \ - --no-home \ - --contain \ - --cleanenv \ - --home $toptmp \ - $input_dir_mnt \ - $source_dir_mnt \ - $analysis_dir_mnt \ - --workdir $toptmp \ - --hostname $hstname \ - --cwd /home/maneager/source \ - \ - $project_name diff --git a/reproduce/software/containers/docker.sh b/reproduce/software/containers/docker.sh deleted file mode 100755 index d5b5682..0000000 --- a/reproduce/software/containers/docker.sh +++ /dev/null @@ -1,486 +0,0 @@ -#!/bin/sh -# -# Create a Docker container from an existing image of the built software -# environment, but with the source, data and build (analysis) directories -# directly within the host file system. This script is assumed to be run in -# the top project source directory (that has 'README.md' and -# 'paper.tex'). If not, use the '--source-dir' option to specify where the -# Maneage'd project source is located. -# -# Usage: -# -# - When you are at the top Maneage'd project directory, you can run this -# script like the example below. Just set all the '/PATH/TO/...' -# directories (see below for '--tmp-dir'). See the items below for -# optional values. -# -# ./reproduce/software/containers/docker.sh --shm-size=15gb \ -# --software-dir=/PATH/TO/SOFTWARE/TARBALLS \ -# --build-dir=/PATH/TO/BUILD/DIRECTORY -# -# - Non-mandatory options: -# -# - If you already have the input data that is necessary for your -# project's, use the '--input-dir' option to specify its location -# on your host file system. Otherwise the necessary analysis -# files will be downloaded directly into the build -# directory. Note that this is only necessary when '--build-only' -# is not given. -# -# - The '--software-dir' is only useful if you want to build a -# container. Even in that case, it is not mandatory: if not -# given, the software tarballs will be downloaded (thus requiring -# internet). 
-# -# - To avoid having to set the directory(s) every time you want to -# start the docker environment, you can put this command (with the -# proper directories) into a 'run.sh' script in the top Maneage'd -# project source directory and simply execute that. The special name -# 'run.sh' is in Maneage's '.gitignore', so it will not be included -# in your git history by mistake. -# -# Known problems: -# -# - As of 2025-04-06 the log file containing the output of the 'docker -# build' command that configures the Maneage'd project does not keep -# all the output (which gets clipped by Docker). with a "[output -# clipped, log limit 2MiB reached]" message. We need to find a way to -# fix this (so nothing gets clipped: useful for debugging). -# -# Copyright (C) 2021-2025 Mohammad Akhlaghi <mohammad@akhlaghi.org> -# -# This script is free software: you can redistribute it and/or modify it -# under the terms of the GNU General Public License as published by the -# Free Software Foundation, either version 3 of the License, or (at your -# option) any later version. -# -# This script is distributed in the hope that it will be useful, but -# WITHOUT ANY WARRANTY; without even the implied warranty of -# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General -# Public License for more details. -# -# You should have received a copy of the GNU General Public License along -# with this script. If not, see <http://www.gnu.org/licenses/>. - - - - - -# Script settings -# --------------- -# Stop the script if there are any errors. -set -e - - - - - -# Default option values -jobs= -quiet=0 -source_dir= -build_only= -image_file="" -shm_size=20gb -scriptname="$0" -project_shell=0 -container_shell=0 -project_name=maneaged -base_name=maneage-base -base_os=debian:stable-slim - -print_help() { - # Print the output. - cat <<EOF -Usage: $scriptname [OPTIONS] - -Top-level script to build and run a Maneage'd project within Docker. 
-
-  Host OS directories (to be mounted in the container):
-  -b, --build-dir=STR      Dir. to build in (only analysis in host).
-  -i, --input-dir=STR      Dir. of input datasets (optional).
-  -s, --software-dir=STR   Directory of necessary software tarballs.
-      --source-dir=STR     Directory of source code (default: 'pwd -P').
-
-  Docker images
-      --base-os=STR        Base OS name (default: '$base_os').
-      --base-name=STR      Base OS docker image (default: $base_name).
-      --project-name=STR   Project's docker image (default: $project_name).
-      --image-file=STR     [Docker only] Load (if given file exists), or
-                           save (if given file does not exist), the image.
-                           For saving, the given name has to have an
-                           '.tar.gz' suffix.
-
-  Interactive shell
-      --project-shell      Open the project's shell within the container.
-      --container-shell    Open the container shell.
-
-  Operating mode:
-      --quiet              Do not print informative statements.
-  -?, --help               Give this help list.
-      --shm-size=STR       Passed to 'docker build' (default: $shm_size).
-  -j, --jobs=INT           Number of threads to use in each phase.
-      --build-only         Just build the container, don't run it.
-
-Mandatory or optional arguments to long options are also mandatory or
-optional for any corresponding short options.
-
-Maneage URL: https://maneage.org
-
-Report bugs to mohammad@akhlaghi.org
-EOF
-}
-
-on_off_option_error() {
-    if [ "x$2" = x ]; then
-        echo "$scriptname: '$1' doesn't take any values"
-    else
-        echo "$scriptname: '$1' (or '$2') doesn't take any values"
-    fi
-    exit 1
-}
-
-check_v() {
-    if [ x"$2" = x ]; then
-        printf "$scriptname: option '$1' requires an argument. "
-        printf "Try '$scriptname --help' for more information\n"
-        exit 1;
-    fi
-}
-
-while [ $# -gt 0 ]
-do
-    case $1 in
-
-        # OS directories
-        -b|--build-dir) build_dir="$2"; check_v "$1" "$build_dir"; shift;shift;;
-        -b=*|--build-dir=*) build_dir="${1#*=}"; check_v "$1" "$build_dir"; shift;;
-        -b*) build_dir=$(echo "$1" | sed -e's/-b//'); check_v "$1" "$build_dir"; shift;;
-        -i|--input-dir) input_dir="$2"; check_v "$1" "$input_dir"; shift;shift;;
-        -i=*|--input-dir=*) input_dir="${1#*=}"; check_v "$1" "$input_dir"; shift;;
-        -i*) input_dir=$(echo "$1" | sed -e's/-i//'); check_v "$1" "$input_dir"; shift;;
-        -s|--software-dir) software_dir="$2"; check_v "$1" "$software_dir"; shift;shift;;
-        -s=*|--software-dir=*) software_dir="${1#*=}"; check_v "$1" "$software_dir"; shift;;
-        -s*) software_dir=$(echo "$1" | sed -e's/-s//'); check_v "$1" "$software_dir"; shift;;
-        --source-dir) source_dir="$2"; check_v "$1" "$source_dir"; shift;shift;;
-        --source-dir=*) source_dir="${1#*=}"; check_v "$1" "$source_dir"; shift;;
-
-        # Container options.
-        --base-name) base_name="$2"; check_v "$1" "$base_name"; shift;shift;;
-        --base-name=*) base_name="${1#*=}"; check_v "$1" "$base_name"; shift;;
-
-        # Interactive shell.
-        --project-shell) project_shell=1; shift;;
-        --project_shell=*) on_off_option_error --project-shell;;
-        --container-shell) container_shell=1; shift;;
-        --container_shell=*) on_off_option_error --container-shell;;
-
-        # Operating mode
-        --quiet) quiet=1; shift;;
-        --quiet=*) on_off_option_error --quiet;;
-        -j|--jobs) jobs="$2"; check_v "$1" "$jobs"; shift;shift;;
-        -j=*|--jobs=*) jobs="${1#*=}"; check_v "$1" "$jobs"; shift;;
-        -j*) jobs=$(echo "$1" | sed -e's/-j//'); check_v "$1" "$jobs"; shift;;
-        --build-only) build_only=1; shift;;
-        --build-only=*) on_off_option_error --build-only;;
-        --shm-size) shm_size="$2"; check_v "$1" "$shm_size"; shift;shift;;
-        --shm-size=*) shm_size="${1#*=}"; check_v "$1" "$shm_size"; shift;;
-        -'?'|--help) print_help; exit 0;;
-        -'?'*|--help=*) on_off_option_error --help -?;;
-
-        # Output file
-        --image-file) image_file="$2"; check_v "$1" "$image_file"; shift;shift;;
-        --image-file=*) image_file="${1#*=}"; check_v "$1" "$image_file"; shift;;
-
-        # Unrecognized option:
-        -*) echo "$scriptname: unknown option '$1'"; exit 1;;
-    esac
-done
-
-
-
-
-
-# Sanity checks
-# -------------
-#
-# Make sure that the build directory is given and that it exists.
-if [ x$build_dir = x ]; then
-    printf "$scriptname: '--build-dir' not provided, this is the location "
-    printf "that all built analysis files will be kept on the host OS\n"
-    exit 1;
-else
-    if ! [ -d $build_dir ]; then
-        printf "$scriptname: '$build_dir' (value to '--build-dir') doesn't "
-        printf "exist\n"; exit 1;
-    fi
-fi
-
-# The temporary directory to place the Dockerfile.
-tmp_dir="$build_dir"/temporary-docker-container-dir
-
-
-
-
-# Directory preparations
-# ----------------------
-#
-# If the host operating system has '/dev/shm', then give Docker access
-# to it also for improved speed in some scenarios (like configuration).
-if [ -d /dev/shm ]; then shm_mnt="-v /dev/shm:/dev/shm";
-else shm_mnt=""; fi
-
-# If the following directories do not exist within the build directory,
-# create them to make sure the '--mount' commands always work and
-# that any file. Ideally, the 'input' directory should not be under the 'build'
-# directory, but if the user hasn't given it then they don't care about
-# potentially deleting it later (Maneage will download the inputs), so put
-# it in the build directory.
-analysis_dir="$build_dir"/analysis
-if ! [ -d $analysis_dir ]; then mkdir $analysis_dir; fi
-
-# If no '--source-dir' was given, set it to the output of 'pwd -P' (to get
-# the path without potential symbolic links) in the running directory.
-if [ x"$source_dir" = x ]; then source_dir=$(pwd -P); fi
-
-# Only when an an input directory is given, we need the respective 'mount'
-# option for the 'docker run' command.
-input_dir_mnt=""
-if ! [ x"$input_dir" = x ]; then
-    input_dir_mnt="-v $input_dir:/home/maneager/input"
-fi
-
-# If no '--jobs' has been specified, use the maximum available jobs to the
-# operating system.
-if [ x$jobs = x ]; then jobs=$(nproc); fi
-
-# [DOCKER-ONLY] Make sure the user is a member of the 'docker' group:
-glist=$(groups $(whoami) | awk '/docker/')
-if [ x"$glist" = x ]; then
-    printf "$scriptname: you are not a member of the 'docker' group "
-    printf "You can run the following command as root to fix this: "
-    printf "'usermod -aG docker $(whoami)'\n"
-    exit 1
-fi
-
-# [DOCKER-ONLY] Function to check the temporary directory for building the
-# base operating system docker image. It is necessary that this directory
-# be empty because Docker will inherit the sub-directories of the directory
-# that the Dockerfile is located in.
-tmp_dir_check () {
-    if [ -d $tmp_dir ]; then
-        printf "$scriptname: '$tmp_dir' already exists, please "
-        printf "delete it and re-run this script. This is a temporary "
-        printf "directory only necessary when building a Docker image "
-        printf "and gets deleted automatically after a successful "
-        printf "build. The fact that it remains hints at a problem "
-        printf "in a previous attempt to build a Docker image\n"
-        exit 1
-    else
-        mkdir $tmp_dir
-    fi
-}
-
-
-
-
-
-# Base operating system
-# ---------------------
-#
-# If the base image does not exist, then create it. If it does, inform the
-# user that it will be used.
-if docker image list | grep $base_name &> /dev/null; then
-    if [ $quiet = 0 ]; then
-        printf "$scriptname: info: base OS docker image ('$base_name') "
-        printf "already exists and will be used. If you want to build a "
-        printf "new base OS image, give a new name to '--base-name'. "
-        printf "To remove this message run with '--quiet'\n"
-    fi
-else
-
-    # In case an image file is given, load the environment from that (no
-    # need to build the environment from scratch).
-    if ! [ x"$image_file" = x ] && [ -f "$image_file" ]; then
-        docker load --input $image_file
-    else
-
-        # Build the temporary directory.
-        tmp_dir_check
-
-        # Build the Dockerfile.
-        uid=$(id -u)
-        cat <<EOF > $tmp_dir/Dockerfile
-FROM $base_os
-RUN useradd -ms /bin/sh --uid $uid maneager; \\
-    printf '123\n123' | passwd maneager; \\
-    printf '456\n456' | passwd root
-RUN apt update; apt install -y gcc g++ wget; echo 'export PS1="[\[\033[01;31m\]\u@\h \W\[\033[32m\]\[\033[00m\]]# "' >> ~/.bashrc
-USER maneager
-WORKDIR /home/maneager
-RUN mkdir build; mkdir build/analysis; echo 'export PS1="[\[\033[01;35m\]\u@\h \W\[\033[32m\]\[\033[00m\]]$ "' >> ~/.bashrc
-EOF
-
-        # Build the base-OS container and delete the temporary directory.
-        curdir="$(pwd)"
-        cd $tmp_dir
-        docker build ./ \
-               -t $base_name \
-               --shm-size=$shm_size
-        cd "$curdir"
-        rm -rf $tmp_dir
-    fi
-fi
-
-
-
-
-
-# Maneage software configuration
-# ------------------------------
-#
-# Having the base operating system in place, we can now construct the
-# project's docker file.
-if docker image list | grep $project_name &> /dev/null; then
-    if [ $quiet = 0 ]; then
-        printf "$scriptname: info: project's image ('$project_name') "
-        printf "already exists and will be used. If you want to build a "
-        printf "new project image, give a new name to '--project-name'. "
-        printf "To remove this message run with '--quiet'\n"
-    fi
-else
-
-    # Build the temporary directory.
-    tmp_dir_check
-    df=$tmp_dir/Dockerfile
-
-    # The only way to mount a directory inside the Docker build environment
-    # is the 'RUN --mount' command. But Docker doesn't recognize things
-    # like symbolic links. So we need to copy the project's source under
-    # this temporary directory.
-    sdir=source
-    mkdir $tmp_dir/$sdir
-    dsr=/home/maneager/source-raw
-    cp -r $source_dir/* $source_dir/.git $tmp_dir/$sdir
-
-    # Start constructing the Dockerfile.
-    #
-    # Note on the printf's '\x5C\n' part: this will print out as a
-    # backslash at the end of the line to allow easy human readability of
-    # the Dockerfile (necessary for debugging!).
-    echo "FROM $base_name" > $df
-    printf "RUN --mount=type=bind,source=$sdir,target=$dsr \x5C\n" >> $df
-
-    # If a software directory was given, copy it and add its line.
-    tsdir=tarballs-software
-    dts=/home/maneager/tarballs-software
-    if ! [ x"$software_dir" = x ]; then
-
-        # Make the directory to host the software and copy the contents
-        # that the user gave there.
-        mkdir $tmp_dir/$tsdir
-        cp -r "$software_dir"/* $tmp_dir/$tsdir/
-        printf " --mount=type=bind,source=$tsdir,target=$dts \x5C\n" >> $df
-    fi
-
-    # Construct the rest of the 'RUN' command.
-    printf " cp -r $dsr /home/maneager/source; \x5C\n" >> $df
-    printf " cd /home/maneager/source; \x5C\n" >> $df
-    printf " ./project configure --jobs=$jobs \x5C\n" >> $df
-    printf " --build-dir=/home/maneager/build \x5C\n" >> $df
-    printf " --input-dir=/home/maneager/input \x5C\n" >> $df
-    printf " --software-dir=$dts; \x5C\n" >> $df
-
-    # We are deleting the '.build/software/tarballs' directory because this
-    # directory is not relevant for the analysis of the project. But in
-    # case any tarball was downloaded, it will consume space within the
-    # container.
-    printf " rm -rf .build/software/tarballs; \x5C\n" >> $df
-
-    # We are deleting the source directory becaues later (at 'docker run'
-    # time), the 'source' will be mounted directly from the host operating
-    # system.
-    printf " cd /home/maneager; \x5C\n" >> $df
-    printf " rm -rf source\n" >> $df
-
-    # Build the Maneage container and delete the temporary directory. The
-    # '--progress plain' option is for Docker to print all the outputs
-    # (otherwise, it will only print a very small part!).
-    cd $tmp_dir
-    docker build ./ -t $project_name \
-           --progress=plain \
-           --shm-size=$shm_size \
-           --no-cache \
-           2>&1 | tee build.log
-    cd ..
-    rm -rf $tmp_dir
-fi
-
-# If the user wants to save the container (into a file that does not
-# exist), do it here. If the file exists, it will only be used for creating
-# the container in the previous stages.
-if ! [ x"$image_file" = x ] && ! [ -f "$image_file" ]; then
-
-    # Save the image into a tarball
-    tarname=$(echo $image_file | sed -e's|.gz$||')
-    if [ $quiet = 0 ]; then
-        printf "$scriptname: info: saving docker image to '$tarname'"
-    fi
-    docker save -o $tarname $project_name
-
-    # Compress the saved image
-    if [ $quiet = 0 ]; then
-        printf "$scriptname: info: compressing to '$image_file' (can "
-        printf "take +10 minutes, but volume decreases by more than half!)"
-    fi
-    gzip --best $tarname
-fi
-
-# If the user just wanted to build the base operating system, abort the
-# script here.
-if ! [ x"$build_only" = x ]; then
-    if [ $quiet = 0 ]; then
-        printf "$scriptname: info: Maneaged project has been configured "
-        printf "successfully in the '$project_name' image"
-    fi
-    exit 0
-fi
-
-
-
-
-
-# Run the analysis within the Maneage'd container
-# -----------------------------------------------
-#
-# The startup command of the container is managed though the 'shellopt'
-# variable that starts here.
-shellopt=""
-if [ $container_shell = 1 ] || [ $project_shell = 1 ]; then
-
-    # If the user wants to start the project shell within the container,
-    # add the necessary command.
-    if [ $project_shell = 1 ]; then
-        shellopt="/bin/bash -c 'cd source; ./project shell;'"
-    fi
-
-    # Finish the 'shellop' string with a single quote (necessary in any
-    # case) and run Docker.
-    interactiveopt="-it"
-
-# No interactive shell requested, just run the project.
-else
-    interactiveopt=""
-    shellopt="/bin/bash -c 'cd source; ./project make --jobs=$jobs;'"
-fi
-
-# Execute Docker. The 'eval' is because the 'shellopt' variable contains a
-# single-quote that the shell should "evaluate".
-eval docker run \
-     -v "$analysis_dir":/home/maneager/build/analysis \
-     -v "$source_dir":/home/maneager/source \
-     $input_dir_mnt \
-     $shm_mnt \
-     $interactiveopt \
-     $project_name \
-     $shellopt |
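
Note on the option parsing in the removed script: each value-taking option is accepted in three spellings (`--build-dir DIR`, `--build-dir=DIR` and `-bDIR`), and `check_v` rejects an empty value. The following is a minimal, stand-alone sketch of that pattern, not the script's actual code. It is reduced to two options and wrapped in a hypothetical `parse_opts` function that returns non-zero instead of calling `exit`, so it can be sourced and tested. The final `*)` branch for non-option arguments is an addition of this sketch; the removed script has no such branch.

```shell
#!/bin/sh
# Sketch only: 'parse_opts' and 'parse-sketch' are illustrative names.
scriptname=parse-sketch

# Fail when an option that needs a value received an empty one.
check_v() {
    if [ x"$2" = x ]; then
        printf "%s: option '%s' requires an argument\n" "$scriptname" "$1" >&2
        return 1
    fi
}

# Parse '-b/--build-dir' and '-j/--jobs' in all three spellings.
parse_opts() {
    build_dir=""; jobs=""
    while [ $# -gt 0 ]; do
        case $1 in
            # Value in the next argument: consume two arguments.
            -b|--build-dir) build_dir="$2"
                            check_v "$1" "$build_dir" || return 1
                            shift; shift;;
            # Value after '=': strip everything up to the first '='.
            -b=*|--build-dir=*) build_dir="${1#*=}"
                                check_v "$1" "$build_dir" || return 1
                                shift;;
            # Value glued to the short option: strip the leading '-b'.
            -b*) build_dir="${1#-b}"
                 check_v "$1" "$build_dir" || return 1
                 shift;;
            -j|--jobs) jobs="$2"
                       check_v "$1" "$jobs" || return 1
                       shift; shift;;
            -j=*|--jobs=*) jobs="${1#*=}"
                           check_v "$1" "$jobs" || return 1
                           shift;;
            -j*) jobs="${1#-j}"
                 check_v "$1" "$jobs" || return 1
                 shift;;
            # Anything else starting with '-' is unknown.
            -*) echo "$scriptname: unknown option '$1'" >&2; return 1;;
            # Sketch-only addition: skip non-option arguments so the
            # loop always makes progress.
            *) shift;;
        esac
    done
}

# Example call mixing two of the spellings.
parse_opts --build-dir=/tmp/build -j4
echo "build_dir=$build_dir jobs=$jobs"
```

The order of the `case` patterns matters: the exact `-b` and the `-b=*` forms have to come before the catch-all `-b*`, otherwise `-b=DIR` would keep its leading `=`. The sketch uses the `${1#-b}` parameter expansion where the removed script pipes through `sed`; both strip the option letter.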