diff options
-rw-r--r-- | .gitignore | 10 | ||||
-rw-r--r-- | README.md | 630 | ||||
-rwxr-xr-x | project | 64 | ||||
-rw-r--r-- | reproduce/analysis/make/initialize.mk | 6 | ||||
-rw-r--r-- | reproduce/analysis/make/paper.mk | 41 | ||||
-rw-r--r-- | reproduce/analysis/make/prepare.mk | 2 | ||||
-rw-r--r-- | reproduce/analysis/make/top-prepare.mk | 2 | ||||
-rw-r--r-- | reproduce/software/containers/README-apptainer.md | 69 | ||||
-rw-r--r-- | reproduce/software/containers/README-docker.md | 180 | ||||
-rwxr-xr-x | reproduce/software/containers/apptainer.sh | 441 | ||||
-rwxr-xr-x | reproduce/software/containers/docker.sh | 486 | ||||
-rw-r--r-- | reproduce/software/make/basic.mk | 46 | ||||
-rw-r--r-- | reproduce/software/make/build-rules.mk | 2 | ||||
-rw-r--r-- | reproduce/software/make/high-level.mk | 54 | ||||
-rw-r--r-- | reproduce/software/make/python.mk | 4 | ||||
-rwxr-xr-x | reproduce/software/shell/configure.sh | 33 | ||||
-rwxr-xr-x | reproduce/software/shell/pre-make-build.sh | 2 |
17 files changed, 1416 insertions, 656 deletions
@@ -18,13 +18,14 @@ *~ *\# -*.txt *.aux *.log -*.pdf *.out -*.zip +*.pdf +*.sif *.swp +*.txt +*.zip .nfs* mmap_* *.tar.gz @@ -32,6 +33,7 @@ mmap_* .tex build +run.sh .local .build Makefile @@ -40,7 +42,7 @@ tex/tikz .DS_Store .texlive* LOCAL.conf -docker-run +timing.txt tex/pipeline LOCAL_tmp.mk LOCAL_old.mk @@ -292,571 +292,81 @@ light and should be very fast. -### Building in Docker containers - -Docker containers are a common way to build projects in an independent -filesystem, and an almost independent operating system. Containers thus -allow using GNU/Linux operating systems within proprietary operating -systems like macOS or Windows. But without the overhead and huge file size -of virtual machines. Furthermore containers allow easy movement of built -projects from one system to another without rebuilding. Just note that -Docker images are large binary files (+1 Gigabytes) and may not be usable -in the future (for example with new Docker versions not reading old -images). Containers are thus good for temporary/testing phases of a -project, but shouldn't be what you archive for the long term! - -Hence if you want to save and move your maneaged project within a Docker -image, be sure to commit all your project's source files and push them to -your external Git repository (you can do these within the Docker image as -explained below). This way, you can always recreate the container with -future technologies too. Generally, if you are developing within a -container, its good practice to recreate it from scratch every once in a -while, to make sure you haven't forgot to include parts of your work in -your project's version-controlled source. In the sections below we also -describe how you can use the container **only for the software -environment** and keep your data and project source on your host. - -#### Dockerfile for a Maneaged project, and building a Docker image - -Below is a series of recommendations on the various components of a -`Dockerfile` optimized to store the *built state of a maneaged project* as -a Docker image. Each component is also accompanied with -explanations. Simply copy the code blocks under each item into a plain-text -file called `Dockerfile`, in the same order of the items. Don't forget to -implement the suggested corrections (in particular step 4). - -**NOTE: Internet for TeXLive installation:** If you have the project -software tarballs and input data (optional features described below) you -can disable internet. In this situation, the configuration and analysis -will be exactly reproduced, the final LaTeX macros will be created, and all -results will be verified successfully. However, no final `paper.pdf` will -be created to visualize/combine everything in one easy-to-read file. Until -[task 15267](https://savannah.nongnu.org/task/?15267) is complete, we need -internet to install TeXLive packages (using TeXLive's own package manager -`tlmgr`) in the `./project configure` phase. This won't stop the -configuration, and it will finish successfully (since all the analysis can -still be reproduced). We are working on completing this task as soon as -possible, but until then, if you want to disable internet *and* you want to -build the final PDF, please disable internet after the configuration -phase. Note that only the necessary TeXLive packages are installed (~350 -MB), not the full TeXLive collection! - - 0. **Summary:** If you are already familiar with Docker, then the full - Dockerfile to get the project environment setup is shown here (without - any comments or explanations, because explanations are done in the next - items). Note that the last two `COPY` lines (to copy the directory - containing software tarballs used by the project and the possible input - databases) are optional because they will be downloaded if not - available. You can also avoid copying over all, and simply mount your - host directories within the image, we have a separate section on doing - this below ("Only software environment in the Docker image"). Once you - build the Docker image, your project's environment is setup and you can - go into it to run `./project make` manually. - - ```shell - FROM debian:stable-slim - RUN apt update && apt install -y gcc g++ wget - RUN useradd -ms /bin/sh maneager - RUN printf '123\n123' | passwd root - USER maneager - WORKDIR /home/maneager - RUN mkdir build - RUN mkdir software - COPY --chown=maneager:maneager ./project-source /home/maneager/source - COPY --chown=maneager:maneager ./software-dir /home/maneager/software - COPY --chown=maneager:maneager ./data-dir /home/maneager/data - RUN cd /home/maneager/source \ - && ./project configure --build-dir=/home/maneager/build \ - --software-dir=/home/maneager/software \ - --input-dir=/home/maneager/data - ``` - - 1. **Choose the base operating system:** The first step is to select the - operating system that will be used in the docker image. Note that your - choice of operating system also determines the commands of the next - step to install core software. - - ```shell - FROM debian:stable-slim - ``` - - 2. **Maneage dependencies:** By default the "slim" versions of the - operating systems don't contain a compiler (needed by Maneage to - compile precise versions of all the tools). You thus need to use the - selected operating system's package manager to import them (below is - the command for Debian). Optionally, if you don't have the project's - software tarballs, and want the project to download them automatically, - you also need a downloader. - - ```shell - # C and C++ compiler. - RUN apt update && apt install -y gcc g++ - - # Uncomment this if you don't have 'software-XXXX.tar.gz' (below). - #RUN apt install -y wget - ``` - - 3. **Define a user:** Some core software packages will complain if you try - to install them as the default (root) user. Generally, it is also good - practice to avoid being the root user. Hence with the commands below we - define a `maneager` user and activate it for the next steps. But just - in case root access is necessary temporarily, with the `passwd` - command, we are setting the root password to `123`. - - ```shell - RUN useradd -ms /bin/sh maneager - RUN printf '123\n123' | passwd root - USER maneager - WORKDIR /home/maneager - ``` - - 4. **Copy project files into the container:** these commands make the - assumptions listed below. IMPORTANT: you can also avoid copying over - all, and simply mount your host directories within the image, we have a - separate section on doing this below ("Only software environment in the - Docker image"). - - * The project's source is in the `maneaged/` sub-directory and this - directory is in the same directory as the `Dockerfile`. The source - can either be from cloned from Git (highly recommended!) or from a - tarball. Both are described above (note that arXiv's tarball needs to - be corrected as mentioned above). - - * (OPTIONAL) By default the project's necessary software source - tarballs will be downloaded when necessary during the `./project - configure` phase. But if you already have the sources, its better to - use them and not waste network traffic (and resulting carbon - footprint!). Maneaged projects usually come with a - `software-XXXX.tar.gz` file that is published on Zenodo (link above). - If you have this file, put it in the same directory as your - `Dockerfile` and include the relevant lines below. - - * (OPTIONAL) The project's input data. The `INPUT-FILES` depends on the - project, please look into the project's - `reproduce/analysis/config/INPUTS.conf` for the URLs and the file - names of input data. Similar to the software source files mentioned - above, if you don't have them, the project will attempt to download - its necessary data automatically in the `./project make` phase. - - ```shell - # Make the project's build directory and copy the project source - RUN mkdir build - COPY --chown=maneager:maneager ./maneaged /home/maneager/source - - # Optional (for software) - COPY --chown=maneager:maneager ./software-XXXX.tar.gz /home/maneager/ - RUN tar xf software-XXXX.tar.gz && mv software-XXXX software && rm software-XXXX.tar.gz - - # Optional (for data) - RUN mkdir data - COPY --chown=maneager:maneager ./INPUT-FILES /home/maneager/data - ``` - - 5. **Configure the project:** With this line, the Docker image will - configure the project (build all its necessary software). This will - usually take about an hour on an 8-core system. You can also optionally - avoid putting this step (and the next) in the `Dockerfile` and simply - execute them in the Docker image in interactive mode (as explained in - the sub-section below, in this case don't forget to preserve the build - container after you are done). - - ```shell - # Configure project (build full software environment). - RUN cd /home/maneager/source \ - && ./project configure --build-dir=/home/maneager/build \ - --software-dir=/home/maneager/software \ - --input-dir=/home/maneager/data - ``` - - 6. **Project's analysis:** With this line, the Docker image will do the - project's analysis and produce the final `paper.pdf`. The time it takes - for this step to finish, and the storage/memory requirements highly - depend on the particular project. - - ```shell - # Run the project's analysis - RUN cd /home/maneager/source && ./project make - ``` - - 7. **Build the Docker image:** The `Dockerfile` is now ready! In the - terminal, go to its directory and run the command below to build the - Docker image. We recommend to keep the `Dockerfile` in **an empty - directory** and run it from inside that directory too. This is because - Docker considers that directories contents to be part of the - environment. Finally, just set a `NAME` for your project and note that - Docker only runs as root. - - ```shell - sudo su - docker build -t NAME ./ - ``` - - - -#### Interactive tests on built container - -If you later want to start a container with the built image and enter it in -interactive mode (for example for temporary tests), please run the -following command. Just replace `NAME` with the same name you specified -when building the project. You can always exit the container with the -`exit` command (note that all your changes will be discarded once you exit, -see below if you want to preserve your changes after you exit). - -```shell -docker run -it NAME -``` - - - -#### Running your own project's shell for same analysis environment - -The default operating system only has minimal features: not having many of -the tools you are accustomed to in your daily command-line operations. But -your maneaged project has a very complete (for the project!) environment -which is fully built and ready to use interactively with the commands -below. For example the project also builds Git within itself, as well as -many other high-level tools that are used in your project and aren't -present in the container's operating system. - -```shell -# Once you are in the docker container -cd source -./project shell -``` - - - -#### Preserving the state of a built container - -All interactive changes in a container will be deleted as soon as you exit -it. THIS IS A VERY GOOD FEATURE IN GENERAL! If you want to make persistent -changes, you should do it in the project's plain-text source and commit -them into your project's online Git repository. As described in the Docker -introduction above, we strongly recommend to **not rely on a built container -for archival purposes**. - -But for temporary tests it is sometimes good to preserve the state of an -interactive container. To do this, you need to `commit` the container (and -thus save it as a Docker "image"). To do this, while the container is still -running, open another terminal and run these commands: - -```shell -# These two commands should be done in another terminal -docker container list - -# Get 'XXXXXXX' of your desired container from the first column above. -# Give the new image a name by replacing 'NEW-IMAGE-NAME'. -docker commit XXXXXXX NEW-IMAGE-NAME -``` - - - -#### Copying files from the Docker image to host operating system - -The Docker environment's file system is completely indepenent of your host -operating system. One easy way to copy files to and from an open container -is to use the `docker cp` command (very similar to the shell's `cp` -command). - -```shell -docker cp CONTAINER:/file/path/within/container /host/path/target -``` - - - - - -#### Only software environment in the Docker image - -You can set the docker image to only contain the software environment and -keep the project source and built analysis files (data and PDF) on your -host operating system. This enables you to keep the size of the Docker -image to a minimum (only containing the built software environment) to -easily move it from one computer to another. Below we'll summarize the -steps. - - 1. Get your user ID with this command: `id -u`. - - 2. Make a new (empty) directory called `docker` temporarily (will be - deleted later). - - ```shell - mkdir docker-tmp - cd docker-tmp - ``` - - 3. Make a `Dockerfile` (within the new/empty directory) with the - following contents. Just replace `UID` with your user ID (found in - step 1 above). Note that we are manually setting the `maneager` (user) - password to `123` and the root password to '456' (both should be - repeated because they must be confirmed by `passwd`). To install other - operating systems, just change the contents on the `FROM` line. For - example, for CentOS 7 you can use `FROM centos:centos7`, for the - latest CentOS, you can use `FROM centos:latest` (you may need to add - this line `RUN yum install -y passwd` before the `RUN useradd ...` - line.). - - ``` - FROM debian:stable-slim - RUN useradd -ms /bin/sh --uid UID maneager; \ - printf '123\n123' | passwd maneager; \ - printf '456\n456' | passwd root - USER maneager - WORKDIR /home/maneager - RUN mkdir build; mkdir build/analysis - ``` - - 4. Create a Docker image based on the `Dockerfile` above. Just replace - `MANEAGEBASE` with your desired name (this won't be your final image, - so you can safely use a name like `maneage-base`). Note that you need - to have root/administrator previlages when running it, so - - ```shell - sudo docker build -t MANEAGEBASE ./ - ``` - - 5. You don't need the temporary directory any more (the docker image is - saved in Docker's own location, and accessible from anywhere). - - ```shell - cd .. - rm -rf docker-tmp - ``` - - 6. Put the following contents into a newly created plain-text file called - `docker-run`, while setting the mandatory variables based on your - system. The name `docker-run` is already inside Maneage's `.gitignore` - file, so you don't have to worry about mistakenly commiting this file - (which contains private information: directories in this computer). - - ``` - #!/bin/sh - # - # Create a Docker container from an existing image of the built - # software environment, but with the source, data and build (analysis) - # directories directly within the host file system. This script should - # be run in the top project source directory (that has 'README.md' and - # 'paper.tex'). If not, replace the '$(pwd)' part with the project - # source directory. - - # MANDATORY: Name of Docker container - docker_name=MANEAGEBASE - - # MANDATORY: Location of "build" directory on this system (to host the - # 'analysis' sub-directory for output data products and possibly others). - build_dir=/PATH/TO/THIS/PROJECT/S/BUILD/DIR - - # OPTIONAL: Location of project's input data in this system. If not - # present, a 'data' directory under the build directory will be created. - data_dir=/PATH/TO/THIS/PROJECT/S/DATA/DIR - - # OPTIONAL: Location of software tarballs to use in building Maneage's - # internal software environment. - software_dir=/PATH/TO/SOFTWARE/TARBALL/DIR - - - - - - # Internal proceessing - # -------------------- - # - # Sanity check: Make sure that the build directory actually exists. - if ! [ -d $build_dir ]; then - echo "ERROR: '$build_dir' doesn't exist"; exit 1; - fi - - # If the host operating system has '/dev/shm', then give Docker access - # to it also for improved speed in some scenarios (like configuration). - if [ -d /dev/shm ]; then shmopt="-v /dev/shm:/dev/shm"; - else shmopt=""; fi - - # If the 'analysis' and 'data' directories (that are mounted), don't exist, - # then create them (otherwise Docker will create them as 'root' before - # creating the container, and we won't have permission to write in them. - analysis_dir="$build_dir"/analysis - if ! [ -d $analysis_dir ]; then mkdir $analysis_dir; fi - - # If the data or software directories don't exist, put them in the build - # directory (they will remain empty, but this helps in simplifiying the - # mounting command!). - if ! [ x$data_dir = x ]; then - data_dir="$build_dir"/data - if ! [ -d $data_dir ]; then mkdir $data_dir; fi - fi - if ! [ x$software_dir = x ]; then - software_dir="$build_dir"/tarballs-software - if ! [ -d $software_dir ]; then mkdir $software_dir; fi - fi - - # Run the Docker image while setting up the directories. - sudo docker run -v "$software_dir":/home/maneager/tarballs-software \ - -v "$analysis_dir":/home/maneager/build/analysis \ - -v "$data_dir":/home/maneager/data \ - -v "$(pwd)":/home/maneager/source \ - $shmopt -it $docker_name - ``` - - 7. Make the `docker-run` script executable. - - ```shell - chmod +x docker-run - ``` - - 8. Start the Docker daemon (root permissions required). If the operating - system uses systemd you can use the command below. If you want the - Docker daemon to be available after a reboot also (so you don't have - to restart it after turning off your computer), run this command again - but replacing `start` with `enable`. - - ```shell - systemctl start docker - ``` - - 9. You can now start the Docker image by executing your newly added - script like below (it will ask for your root password). You will - notice that you are in the Docker container with the changed prompt. - - ```shell - ./docker-run - ``` - - 10. You are now within the container. First, we'll add the GNU C and C++ - compilers (which are necessary to build our own programs in Maneage) - and the GNU WGet downloader (which may be necessary if you don't have - a core software's tarball already). Maneage will build pre-defined - versions of both and will use them. But for the very first packages, - they are necessary. In the process, by setting the `PS1` environment - variable, we'll define a color-coding for the interactive shell prompt - (red for root and purple for the user). If you build another operating - system, replace the `apt` commands accordingly (for example on CentOS, - you don't need the `apt update` line and you should use `yum install - -y gcc gcc-c++ wget glibc-static` to install the three basic - dependencies). - - ```shell - su - echo 'export PS1="[\[\033[01;31m\]\u@\h \W\[\033[32m\]\[\033[00m\]]# "' >> ~/.bashrc - source ~/.bashrc - apt update - apt install -y gcc g++ wget - exit - echo 'export PS1="[\[\033[01;35m\]\u@\h \W\[\033[32m\]\[\033[00m\]]$ "' >> ~/.bashrc - source ~/.bashrc - ``` - - 11. Now that the compiler is ready, we can start Maneage's - configuration. So let's go into the project source directory and run - these commands to build the software environment. - - ```shell - cd source - ./project configure --input-dir=/home/maneager/data \ - --build-dir=/home/maneager/build \ - --software-dir=/home/maneager/tarballs-software - ``` - - 12. After the configuration finishes successfully, it will say so. It will - then ask you to run `./project make`. **But don't do that - yet**. Keep this Docker container open and don't exit the container or - terminal. Open a new terminal, and follow the steps described in the - sub-section above to preserve (or "commit") the built container as a - Docker image. Let's assume you call it `MY-PROJECT-ENV`. After the new - image is made, you should be able to see the new image in the list of - images with this command (in yet another terminal): - - ```shell - docker image list # In the other terminal. - ``` - - 13. Now that you have safely "committed" your current Docker container - into a separate Docker image, you can **exit the container** safely - with the `exit` command. Don't worry, you won't loose the built - software environment: it is all now saved separately within the Docker - image. - - 14. Re-open your `docker-run` script and change `MANEAGEBASE` to - `MY-PROJECT-ENV` (or any other name you set for the environment you - committed above). - - ```shell - emacs docker-run - ``` - - 15. That is it! You can now always easily enter your container (only for - the software environemnt) with the command below. Within the - container, any file you save/edit in the `source` directory of the - docker container is the same file on your host OS and any file you - build in your `build/analysis` directory (within the Maneage'd - project) will be on your host OS. You can even use your container's - Git to store the history of your project in your host OS. See the next - step in case you want to move your built software environment to - another computer. - - ```shell - ./docker-run - ``` - - 16. In case you want to store the image as a single file as backup or to - move to another computer, you can run the commands below. They will - produce a single `project-env.tar.gz` file. - - ```shell - docker save -o my-project-env.tar MY-PROJECT-ENV - gzip --best project-env.tar - ``` - - 17. To load the tarball above into a clean docker environment (for example - on another system) copy the `my-project-env.tar.gz` file there and run - the command below. You can then create the `docker-run` script for - that system and run it to enter. Just don't forget that if your - `analysis_dir` directory is empty on the new/clean system. So you - should first run the same `./project configure ...` command above in - the docker image so it connects the environment to your source. Don't - worry, it won't build any software and should finish in a second or - two. Afterwards, you can safely run `./project make` and continue - working like you did on the old system. - - ```shell - docker load --input my-project-env.tar.gz - ``` - - - - - -#### Deleting all Docker images - -After doing your tests/work, you may no longer need the multi-gigabyte -files images, so its best to just delete them. To do this, just run the two -commands below to first stop all running containers and then to delete all -the images: - -```shell -docker ps -a -q | xargs docker rm -docker images -a -q | xargs docker rmi -f -``` - - - - - -### Copyright information - -This file and `.file-metadata` (a binary file, used by Metastore to store -file dates when doing Git checkouts) are part of the reproducible project -mentioned above and share the same copyright notice (at the start of this -file) and license notice (below). - -This project is free software: you can redistribute it and/or modify it -under the terms of the GNU General Public License as published by the Free +### Building in containers + +Containers are a common way to build projects in an independent filesystem +and an almost independent operating system without the overhead (in size +and speed) of a virtual machine. As a result, containers allow easy +movement of built projects from one system to another without +rebuilding. However, they are still large binary files (+1 Gigabytes) and +may not be usable in the future (for example with new software versions not +reading old images or old/new kernel issues). Containers are thus good for +execution/testing phases of a project, but shouldn't be what you archive +for the long term! + +It is therefore very important that if you want to save and move your +maneaged project within containers, be sure to commit all your project's +source files and push them to your external Git repository (you can do +these within the container as explained below). This way, you can always +recreate the container with future technologies too. Generally, if you are +developing within a container, its good practice to recreate it from +scratch every once in a while, to make sure you haven't forgot to include +parts of your work in your project's version-controlled source. In the +sections below we also describe how you can use the container **only for +the software environment** and keep your data and project source on your +host. + +If you have the necessary software tarballs and input data (optional +features described below) you can disable internet. In this situation, the +configuration and analysis will be exactly reproduced, the final LaTeX +macros will be created, and all results will be verified +successfully. However, no final `paper.pdf` will be created to +visualize/combine everything in one easy-to-read file. Until [task +15267](https://savannah.nongnu.org/task/?15267) is complete, Maneage only +needs internet to install TeXLive packages (using TeXLive's own package +manager `tlmgr`) in the `./project configure` phase. This won't stop the +configuration (since all the analysis can still be reproduced). We are +working on completing this task as soon as possible, but until then, if you +want to disable internet *and* you want to build the final PDF, please +disable internet after the configuration phase. Note that only the +necessary TeXLive packages are installed (~350 MB), not the full TeXLive +collection! + +The container technologies that Maneage has been tested on an documentation +exists in this project (with the `reproduce/software/containers` directory) +are listed below. See the respective `README-*.md` file in that directory +for the details: + + - [Apptainer](https://apptainer.org): useful in high performance + computing (HPC) facilities (where you do not have root + permissions). Apptainer is fully free and open source software. + Apptainer containers can only be created and used on GNU/Linux + operating systems, but are stored as files (easy to manage). + + - [Docker](https://www.docker.com): requires root access, but useful on + virtual private servers (VPSs). Docker images are stored and managed by + a root-level daemon, so you can only manage them through its own + interface. A docker container build on a GNU/Linux host can also be + executed on Windows or macOS. However, while the Docker engine and its + command-line interface on GNU/Linux are free and open source software, + its desktop application (with a GUI and components necessary for + Windows or macOS) is not (requires payment for large companies). + + + + + +## Copyright information + +This file is free software: you can redistribute it and/or modify it under +the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. -This project is distributed in the hope that it will be useful, but WITHOUT +This file is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along -with this project. If not, see <https://www.gnu.org/licenses/>. +with this file. If not, see <https://www.gnu.org/licenses/>. @@ -33,6 +33,7 @@ set -e jobs=0 # 0 is for the default for the 'configure.sh' script. group= debug= +timing=0 host_cc=0 offline= operation= @@ -89,7 +90,7 @@ RECOMMENDATION: If this is the first time you are configuring this template, please don't use the options and let the script explain each parameter in full detail by simply running './project configure'. -Project 'make' special features. +Project 'make' special tagets ./project make Build the project on one thread ./project make -jN Built the project in parallel on N threads. ./project make clean Clean all files generated by 'make' (not software). @@ -127,6 +128,7 @@ Configure and Make options: Make (analysis) options: -p, --prepare-redo Re-do preparation (only done automatically once). + -t, --timing Starting and ending times written in 'timing.txt'. Make (final PDF) options: --refresh-bib Force refresh the bibliography. @@ -216,6 +218,8 @@ do # Make options (analysis): -p|--prepare-redo) prepare_redo=1; shift;; -p=*|--prepare-redo=*) on_off_option_error --prepare-redo; shift;; + -t|--timing) timing=1; shift;; + -t=*|--timing=*) on_off_option_error --timing; shift;; # Make options (final PDF): --refresh-bib) [ -f tex/src/references.tex ] && touch tex/src/references.tex; shift;; @@ -389,6 +393,7 @@ EOF # Run operations in controlled environment # ---------------------------------------- +perms="u+r,u+w,g+r,g+w,o-r,o-w,o-x" controlled_env() { # Get the full address of the build directory: @@ -423,7 +428,6 @@ controlled_env() { # Do requested operation # ---------------------- -perms="u+r,u+w,g+r,g+w,o-r,o-w,o-x" configscript=./reproduce/software/shell/configure.sh case $operation in @@ -444,8 +448,11 @@ case $operation in # to make sure they have them, we are activating the executable # flags by default here every time './project configure' is run. If # any other file in your project needs such flags, add them here. - chmod +x reproduce/software/shell/* reproduce/software/config/*.sh \ - reproduce/analysis/bash/* + if ! [ -x reproduce/software/shell/configure.sh ]; then + chmod +x reproduce/analysis/bash/* \ + reproduce/software/shell/* \ + reproduce/software/config/*.sh + fi # If the user requested, clean the TeX directory from the extra # (to-be-built) directories that may already be there (and will not @@ -499,22 +506,62 @@ case $operation in configuration_necessary fi + # Make sure that the necessary analysis directories directory exist + # in the build directory. These will be necessary in various phases + # of hte analysis and having them inside the lower-level Make steps + # will require setting them as prerequisites for many basic jobs + # (thus making the Makefiles harder to read and add potentials for + # bugs: forgetting to add them for example). Also, we don't want + # the configure phase to make any edits in the analysis directory, + # so they are not built there. + badir=.build/analysis + texdir=$badir/tex + mtexdir=$texdir/macros + if ! [ -d $badir ]; then mkdir $badir; fi + if ! [ -d $texdir ]; then mkdir $texdir; fi + if ! [ -d $mtexdir ]; then mkdir $mtexdir; fi + + # TeX build directory. If built in a group scenario, the TeX build + # directory must be separate for each member (so they can work on their + # relevant parts of the paper without conflicting with each other). + if [ "x$maneage_group_name" = x ]; then + texbdir="$texdir"/build + else + user=$(whoami) + texbdir="$texdir"/build-$user + fi + tikzdir="$texbdir"/tikz + if ! [ -L tex/build ]; then ln -s "$(pwd -P)/$texdir" tex/build; fi + if ! [ -L tex/tikz ]; then ln -s "$(pwd -P)/$tikzdir" tex/tikz; fi + + # Register the start of this run (we are appending the new + # information so previous information is preserved until the user + # intentionally deletes/cleans it). + if [ $timing = 1 ]; then echo "start: $(date)" >> timing.txt; fi + # Run data preparation phase (optionally build Makefiles with # special values for optimizing the main 'top-make.mk'). But note # that data preparation is only done automatically the first time - # the project is built (when '.build/software/preparation-done.mk' + # the project is built (when '.build/analysis/preparation-done.mk' # doesn't yet exist). After that, if the user wants to re-do the # preparation they have to use the '--prepare-redo' option. - if ! [ -f .build/software/preparation-done.mk ] \ + if ! [ -f .build/analysis/preparation-done.mk ] \ || [ x"$prepare_redo" = x1 ]; then controlled_env reproduce/analysis/make/top-prepare.mk fi - # Run the actual project. + # Call top-make (highest level analysis Makefile). controlled_env reproduce/analysis/make/top-make.mk + + # Register the time of the project's ending. + if [ $timing = 1 ]; then echo "end: $(date)" >> timing.txt; fi ;; + + + + # Interactive shell of Maneage. shell) # Make sure the configure script has been completed properly @@ -550,6 +597,9 @@ case $operation in ;; + + + # Operation not specified. *) cat <<EOF diff --git a/reproduce/analysis/make/initialize.mk b/reproduce/analysis/make/initialize.mk index 92e5eff..c51f910 100644 --- a/reproduce/analysis/make/initialize.mk +++ b/reproduce/analysis/make/initialize.mk @@ -41,7 +41,7 @@ bsdir=$(BDIR)/software # Derived directories (the locks directory can be shared with software # which already has this directory.). texdir = $(badir)/tex -lockdir = $(bsdir)/locks +lockdir = $(badir)/.locks indir = $(badir)/inputs prepdir = $(badir)/prepare mtexdir = $(texdir)/macros @@ -73,7 +73,7 @@ pconfdir = reproduce/analysis/config ifeq (x$(project-phase),xprepare) $(prepdir):; mkdir $@ else --include $(bsdir)/preparation-done.mk +-include $(badir)/preparation-done.mk ifeq (x$(include-prepare-results),xyes) -include $(prepdir)/*.mk $(prepdir)/*.conf endif @@ -274,7 +274,7 @@ clean: rm -rf $(texdir)/macros/!(dependencies.tex|dependencies-bib.tex|hardware-parameters.tex) rm -rf $(badir)/!(tex) $(texdir)/!(macros|$(texbtopdir)) rm -rf $(texdir)/build/!(tikz) $(texdir)/build/tikz/* - rm -rf $(bsdir)/preparation-done.mk + rm -rf $(badir)/preparation-done.mk distclean: clean # Without cleaning the Git hooks, we won't be able to easily commit diff --git a/reproduce/analysis/make/paper.mk b/reproduce/analysis/make/paper.mk index 66c6859..3c06ce3 100644 --- a/reproduce/analysis/make/paper.mk +++ b/reproduce/analysis/make/paper.mk @@ -18,6 +18,7 @@ + # LaTeX macros for paper # ---------------------- # @@ -92,6 +93,40 @@ $(mtexdir)/project.tex: $(mtexdir)/verify.tex +# TeX build directory +# ------------------- +# +# If built in a group scenario, the TeX build directory must be separate +# for each member (so they can work on their relevant parts of the paper +# without conflicting with each other). +ifeq ($(strip $(maneage_group_name)),) +texbdir:=$(texdir)/build +else +texbdir:=$(texdir)/build-$(shell whoami) +endif +tikzdir:=$(texbdir)/tikz +$(texbdir):; mkdir $@ +$(tikzdir): | $(texbdir); mkdir $@ + + + + + +# Software info in TeX +# -------------------- +# +# The information of the installed software is placed in the +# '.build/software' directory (which the TeX build should not depend +# on). Therefore, we should copy those macros here in the LaTeX build +# directory, so the TeX directory is completely independent from each +# other. +$(mtexdir)/dependencies.tex: $(bsdir)/tex/dependencies.tex + cp $(bsdir)/tex/*.tex $(mtexdir)/ + + + + + # The bibliography # ---------------- # @@ -104,8 +139,9 @@ $(mtexdir)/project.tex: $(mtexdir)/verify.tex # recipe and the 'paper.pdf' recipe. But if 'tex/src/references.tex' hasn't # been modified, we don't want to re-build the bibliography, only the final # PDF. -$(texbdir)/paper.bbl: tex/src/references.tex $(mtexdir)/dependencies-bib.tex \ - | $(mtexdir)/project.tex +$(texbdir)/paper.bbl: tex/src/references.tex $(mtexdir)/dependencies.tex \ + | $(mtexdir)/project.tex $(tikzdir) + # If '$(mtexdir)/project.tex' is empty, don't build PDF. @macros=$$(cat $(mtexdir)/project.tex) if [ x"$$macros" != x ]; then @@ -135,7 +171,6 @@ $(texbdir)/paper.bbl: tex/src/references.tex $(mtexdir)/dependencies-bib.tex \ export LD_LIBRARY_PATH="$(sys_library_sh_path):$$LD_LIBRARY_PATH" pdflatex -shell-escape -halt-on-error "$$p"/paper.tex biber paper - fi diff --git a/reproduce/analysis/make/prepare.mk b/reproduce/analysis/make/prepare.mk index 2cc1187..ffb2a3c 100644 --- a/reproduce/analysis/make/prepare.mk +++ b/reproduce/analysis/make/prepare.mk @@ -25,7 +25,7 @@ # # We need to remove the 'prepare' word from the list of 'makesrc'. prepare-dep = $(filter-out prepare, $(makesrc)) -$(bsdir)/preparation-done.mk: \ +$(badir)/preparation-done.mk: \ $(foreach s, $(prepare-dep), $(mtexdir)/$(s).tex) # If you need to add preparations (mainly automatically generated diff --git a/reproduce/analysis/make/top-prepare.mk b/reproduce/analysis/make/top-prepare.mk index 7d92d72..ea40f39 100644 --- a/reproduce/analysis/make/top-prepare.mk +++ b/reproduce/analysis/make/top-prepare.mk @@ -36,7 +36,7 @@ include reproduce/software/config/LOCAL.conf # # See 'top-make.mk' for complete explanation. ifeq (x$(maneage_group_name),x$(GROUP-NAME)) -all: $(BDIR)/software/preparation-done.mk +all: $(BDIR)/analysis/preparation-done.mk @echo "Project preparation is complete."; else all: diff --git a/reproduce/software/containers/README-apptainer.md b/reproduce/software/containers/README-apptainer.md new file mode 100644 index 0000000..9608dc8 --- /dev/null +++ b/reproduce/software/containers/README-apptainer.md @@ -0,0 +1,69 @@ +# Maneage'd projects in Apptainer + +Copyright (C) 2025-2025 Mohammad Akhlaghi <mohammad@akhlaghi.org>\ +Copyright (C) 2025-2025 Giacomo Lorenzetti <glorenzetti@cefca.es>\ +See the end of the file for license conditions. + +For an introduction on containers, see the "Building in containers" section +of the `README.md` file within the top-level directory of this +project. Here, we focus on Apptainer with a simple checklist on how to use +the `apptainer-run.sh` script that we have already prepared in this +directory for easy usage in a Maneage'd project. + + + + + +## Building your Maneage'd project in Apptainer + +Through the steps below, you will create an Apptainer image that will only +contain the software environment and keep the project source and built +analysis files (data and PDF) on your host operating system. This enables +you to keep the size of the image to a minimum (only containing the built +software environment) to easily move it from one computer to another. + + 1. Using your favorite text editor, create a `apptainer-local.sh` in your + project's top directory that contains the usage command shown at the + top of the 'apptainer.sh' script and take the following steps: + * Set the respective directories based on your own preferences. + * The `--software-dir` is optional (if you don't have the source + tarballs, Maneage will download them automatically. But that requires + internet (which may not always be available). If you regularly build + Maneage'd projects, you can clone the repository containing all the + tarballs at https://gitlab.cefca.es/maneage/tarballs-software + * Add an extra `--build-only` for the first run so it doesn't go onto + doing the analysis and just builds the image. After it has completed, + remove the `--build-only` and it will only run the analysis of your + project. + + 2. Once step one finishes, the build directory will contain two + Singularity Image Format (SIF) files listed below. You can move them to + any other (more permanent) positions in your filesystem or to other + computers as needed. + * `maneage-base.sif`: image containing the base operating system that + was used to build your project. You can safely delete this unless you + need to keep it for future builds without internet (you can give it + to the `--base-name` option of this script). If you want a different + name for this, put the same option in your + * `maneaged.sif`: image with the full software environment of your + project. This file is necessary for future runs of your project + within the container. + + + + + +## Copyright information + +This file is free software: you can redistribute it and/or modify it under +the terms of the GNU General Public License as published by the Free +Software Foundation, either version 3 of the License, or (at your option) +any later version. + +This file is distributed in the hope that it will be useful, but WITHOUT +ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or +FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for +more details. + +You should have received a copy of the GNU General Public License along +with this file. If not, see <https://www.gnu.org/licenses/>. diff --git a/reproduce/software/containers/README-docker.md b/reproduce/software/containers/README-docker.md new file mode 100644 index 0000000..f86dceb --- /dev/null +++ b/reproduce/software/containers/README-docker.md @@ -0,0 +1,180 @@ +# Maneage'd projects in Docker + +Copyright (C) 2021-2025 Mohammad Akhlaghi <mohammad@akhlaghi.org>\ +See the end of the file for license conditions. + +For an introduction on containers, see the "Building in containers" section +of the `README.md` file within the top-level directory of this +project. Here, we focus on Docker with a simple checklist on how to use the +`docker.sh` script that we have already prepared in this directory for easy +usage in a Maneage'd project. + + + + + +## Building your Maneage'd project in Docker + +Through the steps below, you will create a Docker image that will only +contain the software environment and keep the project source and built +analysis files (data and PDF) on your host operating system. This enables +you to keep the size of the image to a minimum (only containing the built +software environment) to easily move it from one computer to another. + + 0. Add your user to the `docker` group: `usermod -aG docker + USERNAME`. This is only necessary once on an operating system. + + 1. Start the Docker daemon (root permissions required). If the operating + system uses systemd you can use the command below. If you want the + Docker daemon to be available after a reboot also (so you don't have to + restart it after turning off your computer), run this command again but + replacing `start` with `enable` (this is not recommended if you don't + regularly use Docker: it will slow the boot time of your OS). + + ```shell + systemctl start docker + ``` + + 2. Using your favorite text editor, create a `docker-local.sh` in your top + Maneage directory (as described in the comments at the start of the + `docker.sh` script in this directory). Just activate `--build-only` on + the first run so it doesn't go onto doing the analysis and just sets up + the software environment. + + 3. After the setup is complete, run the following command to confirm that + the `maneage-base` (the OS of the container) and `maneaged` (your + project's full Maneage'd environment) images are available. If you want + different names for these images, add the `--project-name` and + `--base-name` options to the `docker.sh` call. + + ```shell + docker image list + ``` + + 4. You are now ready to do your analysis by removing the `--build-only` + option. + + + + + +## Script usage tips + +The `docker.sh` script introduced above has many options allowing certain +customizations that you can see when running it with the `--help` +option. The tips below are some of the more useful scenarios that we have +encountered so far. + +### Docker image in a single file + +In case you want to store the image as a single file as backup or to move +to another computer. For such cases, run the `docker.sh` script with the +`--image-file` option (for example `--image-file=myproj.tar.gz`). After +moving the file to the other system, run `docker.sh` with the same option. + +When the given file to `docker.sh` already exists, it will only be used for +loading the environment. When it doesn't exist, the script will save the +image into it. + + + + + +## Docker usage tips + +Below are some useful Docker usage scenarios that have proved to be +relevant for us in Maneage'd projects. + +### Cleaning up + +Docker has stored many large files in your operating system that can drain +valuable storage space. The storage of the cached files are usually orders +of magnitudes larger than what you see in `docker image list`! So after +doing your work, it is best to clean up all those files. If you feel you +may need the image later, you can save it in a single file as mentioned +above and delete all the un-necessary cached files. Afterwards, when you +load the image, only that image will be present with nothing extra. + +The easiest and most powerful way to clean up everything in Docker is the +two commands below. The first will close all open containers. The second +will remove all stopped containers, all networks not used by at least one +container, all images without at least one container associated to them, +and all build cache. + +```shell +docker ps -a -q | xargs docker rm +docker system prune -a +``` + +If you only want to delete the existing used images, run the command +below. But be careful that the cache is the largest storage consumer! So +the command above is the solution if your OS's root partition is close to +getting filled. + +```shell +docker images -a -q | xargs docker rmi -f +``` + + +### Preserving the state of an open container + +All interactive changes in a container will be deleted as soon as you exit +it. This is a very good feature of Docker in general! If you want to make +persistent changes, you should do it in the project's plain-text source and +commit them into your project's online Git repository. But in certain +situations, it is necessary to preserve the state of an interactive +container. To do this, you need to `commit` the container (and thus save it +as a Docker "image"). To do this, while the container is still running, +open another terminal and run these commands: + +```shell +# These two commands should be done in another terminal +docker container list + +# Get the 'XXXXXXX' of your desired container from the first column above. +# Give the new image a name by replacing 'NEW-IMAGE-NAME'. +docker commit XXXXXXX NEW-IMAGE-NAME +``` + + +### Interactive tests on built container + +If you later want to start a container with the built image and enter it in +interactive mode (for example for temporary tests), run the following +command. Just replace `NAME` with the same name you specified when building +the project. You can always exit the container with the `exit` command +(note that all your changes will be discarded once you exit, see below if +you want to preserve your changes after you exit). + +```shell +docker run -it NAME +``` + + +### Copying files from the Docker image to host operating system + +Except for the mounted directories, the Docker environment's file system is +indepenent of your host operating system. One easy way to copy files to and +from an open container is to use the `docker cp` command (very similar to +the shell's `cp` command). + +```shell +docker cp CONTAINER:/file/path/within/container /host/path/target +``` + + + +## Copyright information + +This file is free software: you can redistribute it and/or modify it under +the terms of the GNU General Public License as published by the Free +Software Foundation, either version 3 of the License, or (at your option) +any later version. + +This file is distributed in the hope that it will be useful, but WITHOUT +ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or +FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for +more details. + +You should have received a copy of the GNU General Public License along +with this file. If not, see <https://www.gnu.org/licenses/>. diff --git a/reproduce/software/containers/apptainer.sh b/reproduce/software/containers/apptainer.sh new file mode 100755 index 0000000..52315f6 --- /dev/null +++ b/reproduce/software/containers/apptainer.sh @@ -0,0 +1,441 @@ +#!/bin/sh +# +# Create a Apptainer container from an existing image of the built software +# environment, but with the source, data and build (analysis) directories +# directly within the host file system. This script is assumed to be run in +# the top project source directory (that has 'README.md' and +# 'paper.tex'). If not, use the '--source-dir' option to specify where the +# Maneage'd project source is located. +# +# Usage: +# +# - When you are at the top Maneage'd project directory, you can run this +# script like the example below. Just set all the '/PATH/TO/...' +# directories. See the items below for optional values. +# +# ./reproduce/software/containers/apptainer.sh \ +# --build-dir=/PATH/TO/BUILD/DIRECTORY \ +# --software-dir=/PATH/TO/SOFTWARE/TARBALLS +# +# - Non-mandatory options: +# +# - If you already have the input data that is necessary for your +# project's, use the '--input-dir' option to specify its location +# on your host file system. Otherwise the necessary analysis +# files will be downloaded directly into the build +# directory. Note that this is only necessary when '--build-only' +# is not given. +# +# - The '--software-dir' is only useful if you want to build a +# container. Even in that case, it is not mandatory: if not +# given, the software tarballs will be downloaded (thus requiring +# internet). +# +# - To avoid having to set them every time you want to start the +# apptainer environment, you can put this command (with the proper +# directories) into a 'run.sh' script in the top Maneage'd project +# source directory and simply execute that. The special name 'run.sh' +# is in Maneage's '.gitignore', so it will not be included in your +# git history by mistake. +# +# Known problems: +# +# Copyright (C) 2025-2025 Mohammad Akhlaghi <mohammad@akhlaghi.org> +# Copyright (C) 2025-2025 Giacomo Lorenzetti <glorenzetti@cefca.es> +# +# This script is free software: you can redistribute it and/or modify it +# under the terms of the GNU General Public License as published by the +# Free Software Foundation, either version 3 of the License, or (at your +# option) any later version. +# +# This script is distributed in the hope that it will be useful, but +# WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General +# Public License for more details. +# +# You should have received a copy of the GNU General Public License along +# with this script. If not, see <http://www.gnu.org/licenses/>. + + + + + +# Script settings +# --------------- +# Stop the script if there are any errors. +set -e + + + + + +# Default option values +jobs= +quiet=0 +source_dir= +build_only= +base_name="" +shm_size=20gb +scriptname="$0" +project_name="" +project_shell=0 +container_shell=0 +base_os=debian:stable-slim + +print_help() { + # Print the output. + cat <<EOF +Usage: $scriptname [OPTIONS] + +Top-level script to build and run a Maneage'd project within Apptainer. + + Host OS directories (to be mounted in the container): + -b, --build-dir=STR Dir. to build in (only analysis in host). + -i, --input-dir=STR Dir. of input datasets (optional). + -s, --software-dir=STR Directory of necessary software tarballs. + --source-dir=STR Directory of source code (default: 'pwd -P'). + + Apptainer images + --base-os=STR Base OS name (default: '$base_os'). + --base-name=STR Base OS apptainer image (a '.sif' file). + --project-name=STR Project's apptainer image (a '.sif' file). + + Interactive shell + --project-shell Open the project's shell within the container. + --container-shell Open the container shell. + + Operating mode: + --quiet Do not print informative statements. + -?, --help Give this help list. + -j, --jobs=INT Number of threads to use in each phase. + --build-only Just build the container, don't run it. + +Mandatory or optional arguments to long options are also mandatory or +optional for any corresponding short options. + +Maneage URL: https://maneage.org + +Report bugs to mohammad@akhlaghi.org +EOF +} + +on_off_option_error() { + if [ "x$2" = x ]; then + echo "$scriptname: '$1' doesn't take any values" + else + echo "$scriptname: '$1' (or '$2') doesn't take any values" + fi + exit 1 +} + +check_v() { + if [ x"$2" = x ]; then + printf "$scriptname: option '$1' requires an argument. " + printf "Try '$scriptname --help' for more information\n" + exit 1; + fi +} + +while [ $# -gt 0 ] +do + case $1 in + + # OS directories + -b|--build-dir) build_dir="$2"; check_v "$1" "$build_dir"; shift;shift;; + -b=*|--build-dir=*) build_dir="${1#*=}"; check_v "$1" "$build_dir"; shift;; + -b*) build_dir=$(echo "$1" | sed -e's/-b//'); check_v "$1" "$build_dir"; shift;; + -i|--input-dir) input_dir="$2"; check_v "$1" "$input_dir"; shift;shift;; + -i=*|--input-dir=*) input_dir="${1#*=}"; check_v "$1" "$input_dir"; shift;; + -i*) input_dir=$(echo "$1" | sed -e's/-i//'); check_v "$1" "$input_dir"; shift;; + -s|--software-dir) software_dir="$2"; check_v "$1" "$software_dir"; shift;shift;; + -s=*|--software-dir=*) software_dir="${1#*=}"; check_v "$1" "$software_dir"; shift;; + -s*) software_dir=$(echo "$1" | sed -e's/-s//'); check_v "$1" "$software_dir"; shift;; + --source-dir) source_dir="$2"; check_v "$1" "$source_dir"; shift;shift;; + --source-dir=*) source_dir="${1#*=}"; check_v "$1" "$source_dir"; shift;; + + # Container options. + --base-name) base_name="$2"; check_v "$1" "$base_name"; shift;shift;; + --base-name=*) base_name="${1#*=}"; check_v "$1" "$base_name"; shift;; + --project-name) project_name="$2"; check_v "$1" "$project_name"; shift;shift;; + --project-name=*) project_name="${1#*=}"; check_v "$1" "$project_name"; shift;; + + # Interactive shell. + --project-shell) project_shell=1; shift;; + --project_shell=*) on_off_option_error --project-shell;; + --container-shell) container_shell=1; shift;; + --container_shell=*) on_off_option_error --container-shell;; + + # Operating mode + --quiet) quiet=1; shift;; + --quiet=*) on_off_option_error --quiet;; + -j|--jobs) jobs="$2"; check_v "$1" "$jobs"; shift;shift;; + -j=*|--jobs=*) jobs="${1#*=}"; check_v "$1" "$jobs"; shift;; + -j*) jobs=$(echo "$1" | sed -e's/-j//'); check_v "$1" "$jobs"; shift;; + --build-only) build_only=1; shift;; + --build-only=*) on_off_option_error --build-only;; + --shm-size) shm_size="$2"; check_v "$1" "$shm_size"; shift;shift;; + --shm-size=*) shm_size="${1#*=}"; check_v "$1" "$shm_size"; shift;; + -'?'|--help) print_help; exit 0;; + -'?'*|--help=*) on_off_option_error --help -?;; + + # Unrecognized option: + -*) echo "$scriptname: unknown option '$1'"; exit 1;; + esac +done + + + + + +# Sanity checks +# ------------- +# +# Make sure that the build directory is given and that it exists. +if [ x$build_dir = x ]; then + printf "$scriptname: '--build-dir' not provided, this is the location " + printf "that all built analysis files will be kept on the host OS\n" + exit 1; +else + if ! [ -d $build_dir ]; then + printf "$scriptname: '$build_dir' (value to '--build-dir') doesn't " + printf "exist\n" + exit 1; + fi +fi + +# Set the default project and base-OS image names (inside the build +# directory). +if [ x"$base_name" = x ]; then base_name=$build_dir/maneage-base.sif; fi +if [ x"$project_name" = x ]; then project_name=$build_dir/maneaged.sif; fi + + + + + +# Directory preparations +# ---------------------- +# +# If the host operating system has '/dev/shm', then give Apptainer access +# to it also for improved speed in some scenarios (like configuration). +if [ -d /dev/shm ]; then + shm_mnt="--mount type=bind,src=/dev/shm,dst=/dev/shm"; +else shm_mnt=""; +fi + +# If the following directories do not exist within the build directory, +# create them to make sure the '--mount' commands always work and +# that any file. Ideally, the 'input' directory should not be under the 'build' +# directory, but if the user hasn't given it then they don't care about +# potentially deleting it later (Maneage will download the inputs), so put +# it in the build directory. +analysis_dir="$build_dir"/analysis +if ! [ -d $analysis_dir ]; then mkdir $analysis_dir; fi +analysis_dir_mnt="--mount type=bind,src=$analysis_dir,dst=/home/maneager/build/analysis" + +# If no '--source-dir' was given, set it to the output of 'pwd -P' (to get +# the direct path without potential symbolic links) in the running directory. +if [ x"$source_dir" = x ]; then source_dir=$(pwd -P); fi +source_dir_mnt="--mount type=bind,src=$source_dir,dst=/home/maneager/source" + +# Only when an an input directory is given, we need the respective 'mount' +# option for the 'apptainer run' command. +input_dir_mnt="" +if ! [ x"$input_dir" = x ]; then + input_dir_mnt="--mount type=bind,src=$input_dir,dst=/home/maneager/input" +fi + +# If no '--jobs' has been specified, use the maximum available jobs to the +# operating system. +if [ x$jobs = x ]; then jobs=$(nproc); fi + +# [APPTAINER-ONLY] Optional mounting option for the software directory. +software_dir_mnt="" +if ! [ x"$software_dir" = x ]; then + software_dir_mnt="--mount type=bind,src=$software_dir,dst=/home/maneager/tarballs-software" +fi + +# [APPTAINER-ONLY] Since the container is read-only and is run with the +# '--contain' option (which makes an empty '/tmp'), we need to make a +# dedicated directory for the container to be able to write to. This is +# necessary because some software (Biber in particular on the default +# branch) need to write there! See https://github.com/plk/biber/issues/494. +# We'll keep the directory on the host OS within the build directory, but +# as a hidden file (since it is not necessary in other types of build and +# ultimately only contains temporary files of programs that need it). +toptmp=$build_dir/.apptainer-tmp-$(whoami) +if ! [ -d $toptmp ]; then mkdir $toptmp; fi +rm -rf $toptmp/* # So previous runs don't affect this run. + + + + + +# Maneage'd Apptainer SIF container +# --------------------------------- +# +# Build the base operating system using Maneage's './project configure' +# step. +if [ -f $project_name ]; then + if [ $quiet = 0 ]; then + printf "$scriptname: info: project's image ('$project_name') " + printf "already exists and will be used. If you want to build a " + printf "new project image, give a new name to '--project-name'. " + printf "To remove this message run with '--quiet'\n" + fi +else + + # Build the basic definition, with just Debian and gcc/g++ + if [ -f $base_name ]; then + if [ $quiet = 0 ]; then + printf "$scriptname: info: base OS docker image ('$base_name') " + printf "already exists and will be used. If you want to build a " + printf "new base OS image, give a new name to '--base-name'. " + printf "To remove this message run with '--quiet'\n" + fi + else + + base_def=$build_dir/base.def + cat <<EOF > $base_def +Bootstrap: docker +From: $base_os + +%post + apt-get update && apt-get install -y gcc g++ +EOF + # Build the base operating system container and delete the + # temporary definition file. + apptainer build $base_name $base_def + rm $base_def + fi + + # Build the Maneage definition file. + # - About the '$jobs' variable: this definition file is temporarily + # built and deleted immediately after the SIF file is created. So + # instead of using Apptainer's more complex '{{ jobs }}' format to + # pass an argument, we simply write the value of the configure + # script's '--jobs' option as a shell variable here when we are + # building that file. + # - About the removal of Maneage'd tarballs: we are doing this so if + # Maneage has downloaded tarballs during the build they do not + # unecessarily bloat the container. Even when the user has given a + # software tarball directory, they will all be symbolic links that + # aren't valid when the user runs the container (since we only + # mount the software tarballs at build time). + maneage_def=$build_dir/maneage.def + cat <<EOF > $maneage_def +Bootstrap: localimage +From: $base_name + +%setup + mkdir -p \${APPTAINER_ROOTFS}/home/maneager/input + mkdir -p \${APPTAINER_ROOTFS}/home/maneager/source + mkdir -p \${APPTAINER_ROOTFS}/home/maneager/build/analysis + mkdir -p \${APPTAINER_ROOTFS}/home/maneager/tarballs-software + +%post + cd /home/maneager/source + ./project configure --jobs=$jobs \\ + --input-dir=/home/maneager/input \\ + --build-dir=/home/maneager/build \\ + --software-dir=/home/maneager/tarballs-software + rm /home/maneager/build/software/tarballs/* + +%runscript + cd /home/maneager/source + if [ x"\$maneage_apptainer_stat" = xshell ]; then \\ + ./project shell; \\ + elif [ x"\$maneage_apptainer_stat" = xrun ]; then \\ + if [ x"\$maneage_jobs" = x ]; then \\ + ./project make; \\ + else \\ + ./project make --jobs=\$maneage_jobs; \\ + fi; \\ + else \\ + printf "$scriptname: '\$maneage_apptainer_stat' (value "; \\ + printf "to 'maneage_apptainer_stat' environment variable) "; \\ + printf "is not recognized: should be either 'shell' or 'run'"; \\ + exit 1; \\ + fi +EOF + + # Build the maneage container. The last two are arguments (where order + # matters). The first few are options where order does not matter (so + # we have sorted them by line length). + apptainer build \ + $shm_mnt \ + $input_dir_mnt \ + $source_dir_mnt \ + $analysis_dir_mnt \ + $software_dir_mnt \ + --ignore-fakeroot-command \ + \ + $project_name \ + $maneage_def + + # Clean up. + rm $maneage_def +fi + +# If the user just wanted to build the base operating system, abort the +# script here. +if ! [ x"$build_only" = x ]; then + if [ $quiet = 0 ]; then + printf "$scriptname: info: Maneaged project has been configured " + printf "successfully in the '$project_name' image" + fi + exit 0 +fi + + + + + +# Run the Maneage'd container +# --------------------------- +# +# Set the high-level Apptainer operational mode. +if [ $container_shell = 1 ]; then + aopt="shell" +elif [ $project_shell = 1 ]; then + aopt="run --env maneage_apptainer_stat=shell" +else + aopt="run --env maneage_apptainer_stat=run --env maneage_jobs=$jobs" +fi + +# Build the hostname from the name of the SIF file of the project name. +hstname=$(echo "$project_name" \ + | awk 'BEGIN{FS="/"}{print $NF}' \ + | sed -e's|.sif$||') + +# Execute Apptainer: +# +# - We are not using '--unsquash' (to run within a sandbox) because it +# loads the full multi-gigabyte container into RAM (which we usually +# need for data processing). The container is read-only and we are +# using the following two options instead to ensure that we have no +# influence from outside the container. (description of each is from +# the Apptainer manual) +# --contain: use minimal /dev and empty other directories (e.g. /tmp +# and $HOME) instead of sharing filesystems from your host. +# --cleanenv: clean environment before running container". +# +# - We are not mounting '/dev/shm' since Apptainer prints a warning that +# it is already mounted (apparently does not need it at run time). +# +# --no-home and --home: the first ensures that the 'HOME' variable is +# different from the user's home on the host operating system, the +# second sets it to a directory we specify (to keep things like +# '.bash_history'). +apptainer $aopt \ + --no-home \ + --contain \ + --cleanenv \ + --home $toptmp \ + $input_dir_mnt \ + $source_dir_mnt \ + $analysis_dir_mnt \ + --workdir $toptmp \ + --hostname $hstname \ + --cwd /home/maneager/source \ + \ + $project_name diff --git a/reproduce/software/containers/docker.sh b/reproduce/software/containers/docker.sh new file mode 100755 index 0000000..d5b5682 --- /dev/null +++ b/reproduce/software/containers/docker.sh @@ -0,0 +1,486 @@ +#!/bin/sh +# +# Create a Docker container from an existing image of the built software +# environment, but with the source, data and build (analysis) directories +# directly within the host file system. This script is assumed to be run in +# the top project source directory (that has 'README.md' and +# 'paper.tex'). If not, use the '--source-dir' option to specify where the +# Maneage'd project source is located. +# +# Usage: +# +# - When you are at the top Maneage'd project directory, you can run this +# script like the example below. Just set all the '/PATH/TO/...' +# directories (see below for '--tmp-dir'). See the items below for +# optional values. +# +# ./reproduce/software/containers/docker.sh --shm-size=15gb \ +# --software-dir=/PATH/TO/SOFTWARE/TARBALLS \ +# --build-dir=/PATH/TO/BUILD/DIRECTORY +# +# - Non-mandatory options: +# +# - If you already have the input data that is necessary for your +# project's, use the '--input-dir' option to specify its location +# on your host file system. Otherwise the necessary analysis +# files will be downloaded directly into the build +# directory. Note that this is only necessary when '--build-only' +# is not given. +# +# - The '--software-dir' is only useful if you want to build a +# container. Even in that case, it is not mandatory: if not +# given, the software tarballs will be downloaded (thus requiring +# internet). +# +# - To avoid having to set the directory(s) every time you want to +# start the docker environment, you can put this command (with the +# proper directories) into a 'run.sh' script in the top Maneage'd +# project source directory and simply execute that. The special name +# 'run.sh' is in Maneage's '.gitignore', so it will not be included +# in your git history by mistake. +# +# Known problems: +# +# - As of 2025-04-06 the log file containing the output of the 'docker +# build' command that configures the Maneage'd project does not keep +# all the output (which gets clipped by Docker). with a "[output +# clipped, log limit 2MiB reached]" message. We need to find a way to +# fix this (so nothing gets clipped: useful for debugging). +# +# Copyright (C) 2021-2025 Mohammad Akhlaghi <mohammad@akhlaghi.org> +# +# This script is free software: you can redistribute it and/or modify it +# under the terms of the GNU General Public License as published by the +# Free Software Foundation, either version 3 of the License, or (at your +# option) any later version. +# +# This script is distributed in the hope that it will be useful, but +# WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General +# Public License for more details. +# +# You should have received a copy of the GNU General Public License along +# with this script. If not, see <http://www.gnu.org/licenses/>. + + + + + +# Script settings +# --------------- +# Stop the script if there are any errors. +set -e + + + + + +# Default option values +jobs= +quiet=0 +source_dir= +build_only= +image_file="" +shm_size=20gb +scriptname="$0" +project_shell=0 +container_shell=0 +project_name=maneaged +base_name=maneage-base +base_os=debian:stable-slim + +print_help() { + # Print the output. + cat <<EOF +Usage: $scriptname [OPTIONS] + +Top-level script to build and run a Maneage'd project within Docker. + + Host OS directories (to be mounted in the container): + -b, --build-dir=STR Dir. to build in (only analysis in host). + -i, --input-dir=STR Dir. of input datasets (optional). + -s, --software-dir=STR Directory of necessary software tarballs. + --source-dir=STR Directory of source code (default: 'pwd -P'). + + Docker images + --base-os=STR Base OS name (default: '$base_os'). + --base-name=STR Base OS docker image (default: $base_name). + --project-name=STR Project's docker image (default: $project_name). + --image-file=STR [Docker only] Load (if given file exists), or + save (if given file does not exist), the image. + For saving, the given name has to have an + '.tar.gz' suffix. + + Interactive shell + --project-shell Open the project's shell within the container. + --container-shell Open the container shell. + + Operating mode: + --quiet Do not print informative statements. + -?, --help Give this help list. + --shm-size=STR Passed to 'docker build' (default: $shm_size). + -j, --jobs=INT Number of threads to use in each phase. + --build-only Just build the container, don't run it. + +Mandatory or optional arguments to long options are also mandatory or +optional for any corresponding short options. + +Maneage URL: https://maneage.org + +Report bugs to mohammad@akhlaghi.org +EOF +} + +on_off_option_error() { + if [ "x$2" = x ]; then + echo "$scriptname: '$1' doesn't take any values" + else + echo "$scriptname: '$1' (or '$2') doesn't take any values" + fi + exit 1 +} + +check_v() { + if [ x"$2" = x ]; then + printf "$scriptname: option '$1' requires an argument. " + printf "Try '$scriptname --help' for more information\n" + exit 1; + fi +} + +while [ $# -gt 0 ] +do + case $1 in + + # OS directories + -b|--build-dir) build_dir="$2"; check_v "$1" "$build_dir"; shift;shift;; + -b=*|--build-dir=*) build_dir="${1#*=}"; check_v "$1" "$build_dir"; shift;; + -b*) build_dir=$(echo "$1" | sed -e's/-b//'); check_v "$1" "$build_dir"; shift;; + -i|--input-dir) input_dir="$2"; check_v "$1" "$input_dir"; shift;shift;; + -i=*|--input-dir=*) input_dir="${1#*=}"; check_v "$1" "$input_dir"; shift;; + -i*) input_dir=$(echo "$1" | sed -e's/-i//'); check_v "$1" "$input_dir"; shift;; + -s|--software-dir) software_dir="$2"; check_v "$1" "$software_dir"; shift;shift;; + -s=*|--software-dir=*) software_dir="${1#*=}"; check_v "$1" "$software_dir"; shift;; + -s*) software_dir=$(echo "$1" | sed -e's/-s//'); check_v "$1" "$software_dir"; shift;; + --source-dir) source_dir="$2"; check_v "$1" "$source_dir"; shift;shift;; + --source-dir=*) source_dir="${1#*=}"; check_v "$1" "$source_dir"; shift;; + + # Container options. + --base-name) base_name="$2"; check_v "$1" "$base_name"; shift;shift;; + --base-name=*) base_name="${1#*=}"; check_v "$1" "$base_name"; shift;; + + # Interactive shell. + --project-shell) project_shell=1; shift;; + --project_shell=*) on_off_option_error --project-shell;; + --container-shell) container_shell=1; shift;; + --container_shell=*) on_off_option_error --container-shell;; + + # Operating mode + --quiet) quiet=1; shift;; + --quiet=*) on_off_option_error --quiet;; + -j|--jobs) jobs="$2"; check_v "$1" "$jobs"; shift;shift;; + -j=*|--jobs=*) jobs="${1#*=}"; check_v "$1" "$jobs"; shift;; + -j*) jobs=$(echo "$1" | sed -e's/-j//'); check_v "$1" "$jobs"; shift;; + --build-only) build_only=1; shift;; + --build-only=*) on_off_option_error --build-only;; + --shm-size) shm_size="$2"; check_v "$1" "$shm_size"; shift;shift;; + --shm-size=*) shm_size="${1#*=}"; check_v "$1" "$shm_size"; shift;; + -'?'|--help) print_help; exit 0;; + -'?'*|--help=*) on_off_option_error --help -?;; + + # Output file + --image-file) image_file="$2"; check_v "$1" "$image_file"; shift;shift;; + --image-file=*) image_file="${1#*=}"; check_v "$1" "$image_file"; shift;; + + # Unrecognized option: + -*) echo "$scriptname: unknown option '$1'"; exit 1;; + esac +done + + + + + +# Sanity checks +# ------------- +# +# Make sure that the build directory is given and that it exists. +if [ x$build_dir = x ]; then + printf "$scriptname: '--build-dir' not provided, this is the location " + printf "that all built analysis files will be kept on the host OS\n" + exit 1; +else + if ! [ -d $build_dir ]; then + printf "$scriptname: '$build_dir' (value to '--build-dir') doesn't " + printf "exist\n"; exit 1; + fi +fi + +# The temporary directory to place the Dockerfile. +tmp_dir="$build_dir"/temporary-docker-container-dir + + + + +# Directory preparations +# ---------------------- +# +# If the host operating system has '/dev/shm', then give Docker access +# to it also for improved speed in some scenarios (like configuration). +if [ -d /dev/shm ]; then shm_mnt="-v /dev/shm:/dev/shm"; +else shm_mnt=""; fi + +# If the following directories do not exist within the build directory, +# create them to make sure the '--mount' commands always work and +# that any file. Ideally, the 'input' directory should not be under the 'build' +# directory, but if the user hasn't given it then they don't care about +# potentially deleting it later (Maneage will download the inputs), so put +# it in the build directory. +analysis_dir="$build_dir"/analysis +if ! [ -d $analysis_dir ]; then mkdir $analysis_dir; fi + +# If no '--source-dir' was given, set it to the output of 'pwd -P' (to get +# the path without potential symbolic links) in the running directory. +if [ x"$source_dir" = x ]; then source_dir=$(pwd -P); fi + +# Only when an an input directory is given, we need the respective 'mount' +# option for the 'docker run' command. +input_dir_mnt="" +if ! [ x"$input_dir" = x ]; then + input_dir_mnt="-v $input_dir:/home/maneager/input" +fi + +# If no '--jobs' has been specified, use the maximum available jobs to the +# operating system. +if [ x$jobs = x ]; then jobs=$(nproc); fi + +# [DOCKER-ONLY] Make sure the user is a member of the 'docker' group: +glist=$(groups $(whoami) | awk '/docker/') +if [ x"$glist" = x ]; then + printf "$scriptname: you are not a member of the 'docker' group " + printf "You can run the following command as root to fix this: " + printf "'usermod -aG docker $(whoami)'\n" + exit 1 +fi + +# [DOCKER-ONLY] Function to check the temporary directory for building the +# base operating system docker image. It is necessary that this directory +# be empty because Docker will inherit the sub-directories of the directory +# that the Dockerfile is located in. +tmp_dir_check () { + if [ -d $tmp_dir ]; then + printf "$scriptname: '$tmp_dir' already exists, please " + printf "delete it and re-run this script. This is a temporary " + printf "directory only necessary when building a Docker image " + printf "and gets deleted automatically after a successful " + printf "build. The fact that it remains hints at a problem " + printf "in a previous attempt to build a Docker image\n" + exit 1 + else + mkdir $tmp_dir + fi +} + + + + + +# Base operating system +# --------------------- +# +# If the base image does not exist, then create it. If it does, inform the +# user that it will be used. +if docker image list | grep $base_name &> /dev/null; then + if [ $quiet = 0 ]; then + printf "$scriptname: info: base OS docker image ('$base_name') " + printf "already exists and will be used. If you want to build a " + printf "new base OS image, give a new name to '--base-name'. " + printf "To remove this message run with '--quiet'\n" + fi +else + + # In case an image file is given, load the environment from that (no + # need to build the environment from scratch). + if ! [ x"$image_file" = x ] && [ -f "$image_file" ]; then + docker load --input $image_file + else + + # Build the temporary directory. + tmp_dir_check + + # Build the Dockerfile. + uid=$(id -u) + cat <<EOF > $tmp_dir/Dockerfile +FROM $base_os +RUN useradd -ms /bin/sh --uid $uid maneager; \\ + printf '123\n123' | passwd maneager; \\ + printf '456\n456' | passwd root +RUN apt update; apt install -y gcc g++ wget; echo 'export PS1="[\[\033[01;31m\]\u@\h \W\[\033[32m\]\[\033[00m\]]# "' >> ~/.bashrc +USER maneager +WORKDIR /home/maneager +RUN mkdir build; mkdir build/analysis; echo 'export PS1="[\[\033[01;35m\]\u@\h \W\[\033[32m\]\[\033[00m\]]$ "' >> ~/.bashrc +EOF + + # Build the base-OS container and delete the temporary directory. + curdir="$(pwd)" + cd $tmp_dir + docker build ./ \ + -t $base_name \ + --shm-size=$shm_size + cd "$curdir" + rm -rf $tmp_dir + fi +fi + + + + + +# Maneage software configuration +# ------------------------------ +# +# Having the base operating system in place, we can now construct the +# project's docker file. +if docker image list | grep $project_name &> /dev/null; then + if [ $quiet = 0 ]; then + printf "$scriptname: info: project's image ('$project_name') " + printf "already exists and will be used. If you want to build a " + printf "new project image, give a new name to '--project-name'. " + printf "To remove this message run with '--quiet'\n" + fi +else + + # Build the temporary directory. + tmp_dir_check + df=$tmp_dir/Dockerfile + + # The only way to mount a directory inside the Docker build environment + # is the 'RUN --mount' command. But Docker doesn't recognize things + # like symbolic links. So we need to copy the project's source under + # this temporary directory. + sdir=source + mkdir $tmp_dir/$sdir + dsr=/home/maneager/source-raw + cp -r $source_dir/* $source_dir/.git $tmp_dir/$sdir + + # Start constructing the Dockerfile. + # + # Note on the printf's '\x5C\n' part: this will print out as a + # backslash at the end of the line to allow easy human readability of + # the Dockerfile (necessary for debugging!). + echo "FROM $base_name" > $df + printf "RUN --mount=type=bind,source=$sdir,target=$dsr \x5C\n" >> $df + + # If a software directory was given, copy it and add its line. + tsdir=tarballs-software + dts=/home/maneager/tarballs-software + if ! [ x"$software_dir" = x ]; then + + # Make the directory to host the software and copy the contents + # that the user gave there. + mkdir $tmp_dir/$tsdir + cp -r "$software_dir"/* $tmp_dir/$tsdir/ + printf " --mount=type=bind,source=$tsdir,target=$dts \x5C\n" >> $df + fi + + # Construct the rest of the 'RUN' command. + printf " cp -r $dsr /home/maneager/source; \x5C\n" >> $df + printf " cd /home/maneager/source; \x5C\n" >> $df + printf " ./project configure --jobs=$jobs \x5C\n" >> $df + printf " --build-dir=/home/maneager/build \x5C\n" >> $df + printf " --input-dir=/home/maneager/input \x5C\n" >> $df + printf " --software-dir=$dts; \x5C\n" >> $df + + # We are deleting the '.build/software/tarballs' directory because this + # directory is not relevant for the analysis of the project. But in + # case any tarball was downloaded, it will consume space within the + # container. + printf " rm -rf .build/software/tarballs; \x5C\n" >> $df + + # We are deleting the source directory becaues later (at 'docker run' + # time), the 'source' will be mounted directly from the host operating + # system. + printf " cd /home/maneager; \x5C\n" >> $df + printf " rm -rf source\n" >> $df + + # Build the Maneage container and delete the temporary directory. The + # '--progress plain' option is for Docker to print all the outputs + # (otherwise, it will only print a very small part!). + cd $tmp_dir + docker build ./ -t $project_name \ + --progress=plain \ + --shm-size=$shm_size \ + --no-cache \ + 2>&1 | tee build.log + cd .. + rm -rf $tmp_dir +fi + +# If the user wants to save the container (into a file that does not +# exist), do it here. If the file exists, it will only be used for creating +# the container in the previous stages. +if ! [ x"$image_file" = x ] && ! [ -f "$image_file" ]; then + + # Save the image into a tarball + tarname=$(echo $image_file | sed -e's|.gz$||') + if [ $quiet = 0 ]; then + printf "$scriptname: info: saving docker image to '$tarname'" + fi + docker save -o $tarname $project_name + + # Compress the saved image + if [ $quiet = 0 ]; then + printf "$scriptname: info: compressing to '$image_file' (can " + printf "take +10 minutes, but volume decreases by more than half!)" + fi + gzip --best $tarname +fi + +# If the user just wanted to build the base operating system, abort the +# script here. +if ! [ x"$build_only" = x ]; then + if [ $quiet = 0 ]; then + printf "$scriptname: info: Maneaged project has been configured " + printf "successfully in the '$project_name' image" + fi + exit 0 +fi + + + + + +# Run the analysis within the Maneage'd container +# ----------------------------------------------- +# +# The startup command of the container is managed though the 'shellopt' +# variable that starts here. +shellopt="" +if [ $container_shell = 1 ] || [ $project_shell = 1 ]; then + + # If the user wants to start the project shell within the container, + # add the necessary command. + if [ $project_shell = 1 ]; then + shellopt="/bin/bash -c 'cd source; ./project shell;'" + fi + + # Finish the 'shellop' string with a single quote (necessary in any + # case) and run Docker. + interactiveopt="-it" + +# No interactive shell requested, just run the project. +else + interactiveopt="" + shellopt="/bin/bash -c 'cd source; ./project make --jobs=$jobs;'" +fi + +# Execute Docker. The 'eval' is because the 'shellopt' variable contains a +# single-quote that the shell should "evaluate". +eval docker run \ + -v "$analysis_dir":/home/maneager/build/analysis \ + -v "$source_dir":/home/maneager/source \ + $input_dir_mnt \ + $shm_mnt \ + $interactiveopt \ + $project_name \ + $shellopt diff --git a/reproduce/software/make/basic.mk b/reproduce/software/make/basic.mk index 745aeca..40c5a4e 100644 --- a/reproduce/software/make/basic.mk +++ b/reproduce/software/make/basic.mk @@ -144,10 +144,17 @@ backupservers_all = $(user_backup_urls) $(maneage_backup_urls) topbackupserver = $(word 1, $(backupservers_all)) backupservers = $(filter-out $(topbackupserver),$(backupservers_all)) - - - - +# When building in Apptainer containers, as of 2025-04-18, we need to +# configure Maneage as root (within the container). In such cases, we need +# to activate the 'FORCE_UNSAFE_CONFIGURE' environment variable to build +# some of the software. The 'if' statement is here to make sure we are in +# Apptainer: in other situations, the "unsafe" configure script shouldn't +# be activated. Note that this doesn't happen in Docker (where the Maneage +# source is in the same directory) because we build a non-root ('maneager' +# user there who executes the configure command. +unsafe-config = if [ $$(pwd) = "/home/maneager/source" ] \ + && [ $$(whoami) = root ]; then \ + export FORCE_UNSAFE_CONFIGURE=1; fi @@ -261,11 +268,8 @@ $(ibidir)/low-level-links: $(ibidir)/grep-$(grep-version) \ # # The first set of programs to be built are those that we need to unpack # the source code tarballs of each program. We have already installed Lzip -# before calling 'basic.mk', so it is present and working. Hence we first -# build the Lzipped tarball of Gzip, then use our own Gzip to unpack the -# tarballs of the other compression programs. Once all the compression -# programs/libraries are complete, we build our own GNU Tar and continue -# with other software. +# before calling 'basic.mk', so it is present and working. So the only +# prerequisites of these (until reaching Tar) is the necessary directories. $(lockdir): | $(BDIR); mkdir $@ $(ibidir)/gzip-$(gzip-version): | $(ibdir) $(ildir) $(lockdir) tarball=gzip-$(gzip-version).tar.lz @@ -273,13 +277,13 @@ $(ibidir)/gzip-$(gzip-version): | $(ibdir) $(ildir) $(lockdir) $(call gbuild, gzip-$(gzip-version), static, , V=1) echo "GNU Gzip $(gzip-version)" > $@ -$(ibidir)/xz-$(xz-version): $(ibidir)/gzip-$(gzip-version) +$(ibidir)/xz-$(xz-version): | $(ibdir) $(ildir) $(lockdir) tarball=xz-$(xz-version).tar.lz $(call import-source, $(xz-url), $(xz-checksum)) $(call gbuild, xz-$(xz-version), static) echo "XZ Utils $(xz-version)" > $@ -$(ibidir)/bzip2-$(bzip2-version): $(ibidir)/gzip-$(gzip-version) +$(ibidir)/bzip2-$(bzip2-version): | $(ibdir) $(ildir) $(lockdir) # Download the tarball. tarball=bzip2-$(bzip2-version).tar.lz @@ -308,7 +312,7 @@ $(ibidir)/bzip2-$(bzip2-version): $(ibidir)/gzip-$(gzip-version) fi cd $(ddir) rm -rf $$tdir - tar -xf $(tdir)/$$tarball + tar -xf $(tdir)/$$tarball --no-same-owner --no-same-permissions cd $$tdir $(shsrcdir)/prep-source.sh $(ibdir) sed -e 's@\(ln -s -f \)$$(PREFIX)/bin/@\1@' Makefile \ @@ -330,7 +334,7 @@ $(ibidir)/bzip2-$(bzip2-version): $(ibidir)/gzip-$(gzip-version) # # Note for a static-only build: Zlib's './configure' doesn't use Autoconf's # configure script, it just accepts a direct '--static' option. -$(ibidir)/zlib-$(zlib-version): $(ibidir)/gzip-$(gzip-version) +$(ibidir)/zlib-$(zlib-version): | $(ibdir) $(ildir) $(lockdir) tarball=zlib-$(zlib-version).tar.lz $(call import-source, $(zlib-url), $(zlib-checksum)) $(call gbuild, zlib-$(zlib-version)) @@ -351,6 +355,7 @@ $(ibidir)/tar-$(tar-version): \ # a bottleneck here: only making Tar. So its more efficient to built # it on multiple threads (even when the user's Make doesn't pass down # the number of threads). + $(call unsafe-config) tarball=tar-$(tar-version).tar.lz $(call import-source, $(tar-url), $(tar-checksum)) $(call gbuild, tar-$(tar-version), , , -j$(numthreads) V=1) @@ -615,7 +620,7 @@ $(ibidir)/perl-$(perl-version): $(ibidir)/patchelf-$(patchelf-version) | awk '{printf("%d.%d", $$1, $$2)}') cd $(ddir) rm -rf perl-$(perl-version) - tar -xf $(tdir)/$$tarball + tar -xf $(tdir)/$$tarball --no-same-owner --no-same-permissions cd perl-$(perl-version) $(shsrcdir)/prep-source.sh $(ibdir) ./Configure -des \ @@ -675,14 +680,15 @@ $(ibidir)/coreutils-$(coreutils-version): \ $(ibidir)/perl-$(perl-version) \ $(ibidir)/openssl-$(openssl-version) -# Import the source tarball. +# Import, unpack and enter the source directory. + $(call unsafe-config) tarball=coreutils-$(coreutils-version).tar.lz $(call import-source, $(coreutils-url), $(coreutils-checksum)) # Unpack and enter the source. cd $(ddir) rm -rf coreutils-$(coreutils-version) - tar -xf $(tdir)/$$tarball + tar -xf $(tdir)/$$tarball --no-same-owner --no-same-permissions cd coreutils-$(coreutils-version) $(shsrcdir)/prep-source.sh $(ibdir) @@ -696,7 +702,7 @@ $(ibidir)/coreutils-$(coreutils-version): \ # Fix RPATH if necessary. if [ -f $(ibdir)/patchelf ]; then make SHELL=$(ibdir)/bash install DESTDIR=junkinst - unalias ls || true # avoid decorated 'ls' commands with extra characters + unalias ls || true # Not decorated 'ls' (with extra characters). instprogs=$$(ls junkinst/$(ibdir)) for f in $$instprogs; do $(ibdir)/patchelf --set-rpath $(ildir) $(ibdir)/$$f @@ -721,7 +727,7 @@ $(ibidir)/podlators-$(podlators-version): $(ibidir)/perl-$(perl-version) $(call import-source, $(podlators-url), $(podlators-checksum)) cd $(ddir) rm -rf podlators-$(podlators-version) - tar -xf $(tdir)/$$tarball + tar -xf $(tdir)/$$tarball --no-same-owner --no-same-permissions cd podlators-$(podlators-version) $(shsrcdir)/prep-source.sh $(ibdir) perl Makefile.PL @@ -1400,7 +1406,7 @@ $(ibidir)/gcc-$(gcc-version): $(ibidir)/binutils-$(binutils-version) # Unpack GCC and prepare the 'build' directory inside it for all # the built files. rm -rf gcc-$(gcc-version) - tar -xf $(tdir)/$$tarball + tar -xf $(tdir)/$$tarball --no-same-owner --no-same-permissions if [ $$odir != $(ddir) ]; then ln -s $$odir/gcc-$(gcc-version) $(ddir)/gcc-$(gcc-version) fi @@ -1530,7 +1536,7 @@ $(ibidir)/lzip-$(lzip-version): $(ibidir)/gcc-$(gcc-version) unpackdir=lzip-$(lzip-version) cd $(ddir) rm -rf $$unpackdir - tar -xf $(tdir)/$$tarball + tar -xf $(tdir)/$$tarball --no-same-owner --no-same-permissions cd $$unpackdir $(shsrcdir)/prep-source.sh $(ibdir) ./configure --build --check --installdir="$(ibdir)" diff --git a/reproduce/software/make/build-rules.mk b/reproduce/software/make/build-rules.mk index 62cb6d5..463fbbf 100644 --- a/reproduce/software/make/build-rules.mk +++ b/reproduce/software/make/build-rules.mk @@ -160,7 +160,7 @@ uncompress = csuffix=$$(echo $$utarball \ intarrm=0; \ intar=$$utarball; \ fi; \ - if tar -xf $$intar; then \ + if tar -xf $$intar --no-same-owner --no-same-permissions; then \ if [ x$$intarrm = x1 ]; then rm $$intar; fi; \ else \ echo; echo "Tar error"; exit 1; \ diff --git a/reproduce/software/make/high-level.mk b/reproduce/software/make/high-level.mk index c81f0b9..4ed5d62 100644 --- a/reproduce/software/make/high-level.mk +++ b/reproduce/software/make/high-level.mk @@ -348,7 +348,8 @@ $(ibidir)/atlas-$(atlas-version): # 'rpath_command'. export LDFLAGS=-L$(ildir) cd $(ddir) - tar -xf $(tdir)/atlas-$(atlas-version).tar.lz + tar -xf $(tdir)/atlas-$(atlas-version).tar.lz \ + --no-same-owner --no-same-permissions cd ATLAS $(shsrcdir)/prep-source.sh $(ibdir) rm -rf build @@ -405,7 +406,7 @@ $(ibidir)/boost-$(boost-version): \ rm -rf $(ddir)/$$unpackdir topdir=$(pwd) cd $(ddir) - tar -xf $(tdir)/$$tarball + tar -xf $(tdir)/$$tarball --no-same-owner --no-same-permissions cd $$unpackdir $(shsrcdir)/prep-source.sh $(ibdir) ./bootstrap.sh --prefix=$(idir) --with-libraries=all \ @@ -427,7 +428,8 @@ $(ibidir)/cfitsio-$(cfitsio-version): # systems. So we need to change it to our library installation # path. It doesn't affect GNU/Linux, so we'll just do it in any case # to keep things clean. - topdir=$(pwd); cd $(ddir); tar -xf $(tdir)/$$tarball + topdir=$(pwd); cd $(ddir) + tar -xf $(tdir)/$$tarball --no-same-owner --no-same-permissions customtar=cfitsio-$(cfitsio-version)-custom.tar.gz cd cfitsio-$(cfitsio-version) sed -i -e's|@rpath|$(ildir)|g' configure @@ -467,7 +469,8 @@ $(ibidir)/eigen-$(eigen-version): tarball=eigen-$(eigen-version).tar.lz $(call import-source, $(eigen-url), $(eigen-checksum)) rm -rf $(ddir)/eigen-eigen-* - topdir=$(pwd); cd $(ddir); tar -xf $(tdir)/$$tarball + topdir=$(pwd); cd $(ddir) + tar -xf $(tdir)/$$tarball --no-same-owner --no-same-permissions cd eigen-$(eigen-version) if ! [ -d $(iidir)/eigen3 ]; then mkdir $(iidir)/eigen3; fi cp -r Eigen/* $(iidir)/eigen3/ # Some expect 'eigen3'. @@ -597,7 +600,7 @@ $(ibidir)/healpix-$(healpix-version): $(healpix-python-dep) \ fi rm -rf $(ddir)/Healpix_$(healpix-version) topdir=$(pwd); cd $(ddir); - tar -xf $(tdir)/$$tarball + tar -xf $(tdir)/$$tarball --no-same-owner --no-same-permissions cd Healpix_$(healpix-version) $(shsrcdir)/prep-source.sh $(ibdir) cd src/C/autotools @@ -689,7 +692,7 @@ $(ibidir)/libpaper-$(libpaper-version): \ # Unpack, build the configure system, build and install. cd $(ddir) - tar -xf $(tdir)/$$tarball + tar -xf $(tdir)/$$tarball --no-same-owner --no-same-permissions unpackdir=libpaper-$(libpaper-version) cd $$unpackdir $(shsrcdir)/prep-source.sh $(ibdir) @@ -824,7 +827,7 @@ $(ibidir)/ninjabuild-$(ninjabuild-version): $(ibidir)/cmake-$(cmake-version) tarball=ninjabuild-$(ninjabuild-version).tar.lz $(call import-source, $(ninjabuild-url), $(ninjabuild-checksum)) cd $(ddir) - tar -xf $(tdir)/$$tarball + tar -xf $(tdir)/$$tarball --no-same-owner --no-same-permissions cd ninjabuild-$(ninjabuild-version) cmake -Bbuild-cmake cmake --build build-cmake -j$(numthreads) @@ -839,7 +842,7 @@ $(ibidir)/openblas-$(openblas-version): $(call import-source, $(openblas-url), $(openblas-checksum)) if [ x$(on_mac_os) = xyes ]; then export CC=clang; fi cd $(ddir) - tar -xf $(tdir)/$$tarball + tar -xf $(tdir)/$$tarball --no-same-owner --no-same-permissions cd openblas-$(openblas-version) $(shsrcdir)/prep-source.sh $(ibdir) make -j$(numthreads) @@ -1048,7 +1051,7 @@ $(ibidir)/astrometrynet-$(astrometrynet-version): \ # 'astrometrynet' cd $(ddir) rm -rf astrometry.net-$(astrometrynet-version) - tar -xf $(tdir)/$$tarball + tar -xf $(tdir)/$$tarball --no-same-owner --no-same-permissions cd astrometry.net-$(astrometrynet-version) $(shsrcdir)/prep-source.sh $(ibdir) sed -e 's|cat /proc/cpuinfo|echo "Ignoring CPU info"|' \ @@ -1087,7 +1090,7 @@ $(ibidir)/cdsclient-$(cdsclient-version): tarball=cdsclient-$(cdsclient-version).tar.lz $(call import-source, $(cdsclient-url), $(cdsclient-checksum)) cd $(ddir) - tar -xf $(tdir)/$$tarball + tar -xf $(tdir)/$$tarball --no-same-owner --no-same-permissions cd cdsclient-$(cdsclient-version) $(shsrcdir)/prep-source.sh $(ibdir) touch * @@ -1120,7 +1123,7 @@ $(ibidir)/cmake-$(cmake-version): # Go into the unpacked directory and prepare CMake. cd $(ddir) rm -rf cmake-$(cmake-version) - tar -xf $(tdir)/$$tarball + tar -xf $(tdir)/$$tarball --no-same-owner --no-same-permissions cd cmake-$(cmake-version) $(shsrcdir)/prep-source.sh $(ibdir) @@ -1186,7 +1189,7 @@ $(ibidir)/ghostscript-$(ghostscript-version): \ # '-DPNG_ARM_NEON_OPT=0' prevents an arm64 'neon' library from being # required at compile time. cd $(ddir) - tar -xf $(tdir)/$$tarball + tar -xf $(tdir)/$$tarball --no-same-owner --no-same-permissions cd ghostscript-$(ghostscript-version) $(shsrcdir)/prep-source.sh $(ibdir) ./configure --prefix=$(idir) \ @@ -1209,9 +1212,9 @@ $(ibidir)/ghostscript-$(ghostscript-version): \ # Install the fonts. tar -xvf $(tdir)/ghostscript-fonts-std-$(ghostscript-fonts-std-version).tar.lz \ - -C $(idir)/share/ghostscript + -C $(idir)/share/ghostscript --no-same-owner --no-same-permissions tar -xvf $(tdir)/ghostscript-fonts-gnu-$(ghostscript-fonts-gnu-version).tar.lz \ - -C $(idir)/share/ghostscript + -C $(idir)/share/ghostscript --no-same-owner --no-same-permissions fc-cache -v $(idir)/share/ghostscript/fonts/ echo; echo "Ghostscript fonts added to Fontconfig."; echo; @@ -1248,7 +1251,7 @@ $(ibidir)/icu-$(icu-version): $(ibidir)/python-$(python-version) tarball=icu-$(icu-version).tar.lz $(call import-source, $(icu-url), $(icu-checksum)) cd $(ddir) - tar -xf $(tdir)/$$tarball + tar -xf $(tdir)/$$tarball --no-same-owner --no-same-permissions unpackdir=icu-$(icu-version) cd $$unpackdir $(shsrcdir)/prep-source.sh $(ibdir) @@ -1317,7 +1320,7 @@ $(ibidir)/imfit-$(imfit-version): \ cd $(ddir) unpackdir=imfit-$(imfit-version) rm -rf $$unpackdir - tar -xf $(tdir)/$$tarball + tar -xf $(tdir)/$$tarball --no-same-owner --no-same-permissions cd $$unpackdir $(shsrcdir)/prep-source.sh $(ibdir) sed -i 's|/usr/local|$(idir)|g' SConstruct @@ -1365,7 +1368,8 @@ $(ibidir)/minizip-$(minizip-version): $(ibidir)/automake-$(automake-version) unpackdir=minizip-$(minizip-version) rm -rf $$unpackdir mkdir $$unpackdir - tar -xf $(tdir)/$$tarball -C$$unpackdir --strip-components=1 + tar -xf $(tdir)/$$tarball -C$$unpackdir --strip-components=1 \ + --no-same-owner --no-same-permissions cd $$unpackdir $(shsrcdir)/prep-source.sh $(ibdir) ./configure --prefix=$(idir) @@ -1427,7 +1431,7 @@ $(ibidir)/netpbm-$(netpbm-version): \ cd $(ddir) unpackdir=netpbm-$(netpbm-version) rm -rf $$unpackdir - tar -xf $(tdir)/$$tarball + tar -xf $(tdir)/$$tarball --no-same-owner --no-same-permissions cd $$unpackdir $(shsrcdir)/prep-source.sh $(ibdir) @@ -1547,7 +1551,7 @@ $(ibidir)/scons-$(scons-version): $(ibidir)/python-$(python-version) cd $(ddir) unpackdir=scons-$(scons-version) rm -rf $$unpackdir - tar -xf $(tdir)/$$tarball + tar -xf $(tdir)/$$tarball --no-same-owner --no-same-permissions cd $$unpackdir $(shsrcdir)/prep-source.sh $(ibdir) @@ -1593,7 +1597,7 @@ $(ibidir)/sextractor-$(sextractor-version): \ unpackdir=sextractor-$(sextractor-version) cd $(ddir) rm -rf $$unpackdir - tar -xf $(tdir)/$$tarball + tar -xf $(tdir)/$$tarball --no-same-owner --no-same-permissions cd $$unpackdir # See comment above 'missfits' for '-fcommon'. @@ -1694,7 +1698,7 @@ $(ibidir)/util-linux-$(util-linux-version): \ # explained above). As shown below, later, we'll put a symbolic link # of all the necessary binaries in the main '$(idir)/bin'. cd $(ddir) - tar -xf $(tdir)/$$tarball + tar -xf $(tdir)/$$tarball --no-same-owner --no-same-permissions cd util-linux-$(util-linux-version) $(shsrcdir)/prep-source.sh $(ibdir) @@ -1799,7 +1803,7 @@ $(ibidir)/vim-$(vim-version): tarball=vim-$(vim-version).tar.lz $(call import-source, $(vim-url), $(vim-checksum)) cd $(ddir) - tar -xf $(tdir)/$$tarball + tar -xf $(tdir)/$$tarball --no-same-owner --no-same-permissions unpackdir=vim-$(vim-version) cd $(ddir)/$$unpackdir $(shsrcdir)/prep-source.sh $(ibdir) @@ -1869,7 +1873,8 @@ $(itidir)/texlive-ready-tlmgr: reproduce/software/config/texlive.conf @topdir=$$(pwd) cd $(ddir) rm -rf install-tl-* - tar -xf $(tdir)/install-tl-unx.tar.gz + tar -xf $(tdir)/install-tl-unx.tar.gz \ + --no-same-owner --no-same-permissions cd install-tl-* $(shsrcdir)/prep-source.sh $(ibdir) sed -e's|@installdir[@]|$(idir)|g' \ @@ -1956,7 +1961,8 @@ $(itidir)/texlive-ready-tlmgr: reproduce/software/config/texlive.conf "$(backupservers)"; then cd $(ddir) rm -rf install-tl-* - tar -xf $(tdir)/install-tl-unx.tar.gz + tar -xf $(tdir)/install-tl-unx.tar.gz \ + --no-same-owner --no-same-permissions cd install-tl-* $(shsrcdir)/prep-source.sh $(ibdir) sed -e's|@installdir[@]|$(idir)|g' \ diff --git a/reproduce/software/make/python.mk b/reproduce/software/make/python.mk index c017fd2..c994e3f 100644 --- a/reproduce/software/make/python.mk +++ b/reproduce/software/make/python.mk @@ -99,7 +99,7 @@ $(ibidir)/python-$(python-version): $(ibidir)/libffi-$(libffi-version) # Unpack the tarball (see below for the necessary modification). cd $(ddir) unpackdir=python-$(python-version) - tar -xf $(tdir)/$$tarball + tar -xf $(tdir)/$$tarball --no-same-owner --no-same-permissions cd $$unpackdir $(shsrcdir)/prep-source.sh $(ibdir) @@ -215,7 +215,7 @@ pybuild = cd $(ddir); \ packagedir=$(strip $(2)); \ if (printf "$$packagedir" | grep "[a-z][a-z]"); then rm -rf $$packagedir; fi; \ printf "\nStarting to install python package with maneage pybuild rule: $(4)\n ..."; \ - if ! $(1) $(tdir)/$$tarball; then \ + if ! $(1) $(tdir)/$$tarball --no-same-owner --no-same-permissions; then \ echo; echo "Tar error"; exit 1; \ fi; \ cd $$packagedir; \ diff --git a/reproduce/software/shell/configure.sh b/reproduce/software/shell/configure.sh index 517e1ed..e291f7b 100755 --- a/reproduce/software/shell/configure.sh +++ b/reproduce/software/shell/configure.sh @@ -1273,33 +1273,10 @@ chmod +x $makewshell # Project's top-level built analysis directories # ---------------------------------------------- -# Top-level built analysis directories. -badir="$bdir"/analysis -if ! [ -d "$badir" ]; then mkdir "$badir"; fi - # Top-level LaTeX. -texdir="$badir"/tex +texdir="$sdir"/tex if ! [ -d "$texdir" ]; then mkdir "$texdir"; fi -# LaTeX macros. -mtexdir="$texdir"/macros -if ! [ -d "$mtexdir" ]; then mkdir "$mtexdir"; fi - -# TeX build directory. If built in a group scenario, the TeX build -# directory must be separate for each member (so they can work on their -# relevant parts of the paper without conflicting with each other). -if [ "x$maneage_group_name" = x ]; then - texbdir="$texdir"/build -else - user=$(whoami) - texbdir="$texdir"/build-$user -fi -if ! [ -d "$texbdir" ]; then mkdir "$texbdir"; fi - -# TiKZ (for building figures within LaTeX). -tikzdir="$texbdir"/tikz -if ! [ -d "$tikzdir" ]; then mkdir "$tikzdir"; fi - # If 'tex/build' and 'tex/tikz' are symbolic links then 'rm -f' will delete # them and we can continue. However, when the project is being built from # the tarball, these two are not symbolic links but actual directories with @@ -1328,8 +1305,6 @@ rm -f .build .local ln -s "$bdir" .build ln -s "$instdir" .local -ln -s "$texdir" tex/build -ln -s "$tikzdir" tex/tikz # --------- Delete for no Gnuastro --------- rm -f .gnuastro @@ -1875,7 +1850,7 @@ pymodules=$(prepare_name_version $verdir/python/*) texpkg=$(prepare_name_version $verdir/tex/texlive) # Acknowledge these software packages in a LaTeX paragraph. -pkgver=$mtexdir/dependencies.tex +pkgver=$texdir/dependencies.tex # Add the text to the ${pkgver} file. .local/bin/echo "$thank_software_introduce " > $pkgver @@ -1892,7 +1867,7 @@ bibfiles="$ictdir/*" for f in $bibfiles; do if [ -f $f ]; then hasentry=1; break; fi; done; # Make sure we start with an empty output file. -pkgbib=$mtexdir/dependencies-bib.tex +pkgbib=$texdir/dependencies-bib.tex echo "" > $pkgbib # Fill it in with all the BibTeX entries in this directory. We'll just @@ -1912,7 +1887,7 @@ fi # --------------------------- # # Report hardware -hwparam="$mtexdir/hardware-parameters.tex" +hwparam="$texdir/hardware-parameters.tex" # Add the text to the ${hwparam} file. Since harware class might include # underscore, it must be replaced with '\_', otherwise pdftex would diff --git a/reproduce/software/shell/pre-make-build.sh b/reproduce/software/shell/pre-make-build.sh index 93d3266..28b7385 100755 --- a/reproduce/software/shell/pre-make-build.sh +++ b/reproduce/software/shell/pre-make-build.sh @@ -177,7 +177,7 @@ build_program() { fi # Unpack the tarball and go into it. - tar xf "$intar" + tar xf "$intar" --no-same-owner --no-same-permissions if [ x$intarrm = x1 ]; then rm "$intar"; fi cd "$unpackdir" |