author    Mohammad Akhlaghi <mohammad@akhlaghi.org>  2021-01-09 23:36:58 +0000
committer Mohammad Akhlaghi <mohammad@akhlaghi.org>  2021-01-09 23:44:32 +0000
commit    55d6570aecc5f442399262b7faa441d16ccd4556 (patch)
tree      0e23073eadd7f5169b3ce38e829763a65e71b5cb
parent    e3f4be66020538e3ab641f91405b8c07582e5862 (diff)
parent    d9a6855948fad17fa0fbc2017ab2be0238ca8b72 (diff)
Imported recent changes in Maneage, minor single conflict fixed
There was a single conflict in the comments of one part of 'configure.sh' that has been fixed. There was also a single place in this project that needed 'BDIR' converted to 'badir' (so after the merge, it also built cleanly).
-rw-r--r--   README.md                                145
-rw-r--r--   reproduce/analysis/make/format.mk          2
-rw-r--r--   reproduce/analysis/make/initialize.mk     53
-rw-r--r--   reproduce/analysis/make/prepare.mk         2
-rw-r--r--   reproduce/software/make/basic.mk           4
-rw-r--r--   reproduce/software/make/high-level.mk      2
-rwxr-xr-x   reproduce/software/shell/configure.sh     86
7 files changed, 213 insertions, 81 deletions
diff --git a/README.md b/README.md
index 294efe4..b8248c9 100644
--- a/README.md
+++ b/README.md
@@ -205,15 +205,18 @@ projects from one system to another without rebuilding. Just note that
Docker images are large binary files (+1 Gigabytes) and may not be usable
in the future (for example with new Docker versions not reading old
images). Containers are thus good for temporary/testing phases of a
-project, but shouldn't be what you archive! Hence if you want to save and
-move your maneaged project within a Docker image, be sure to commit all
-your project's source files and push them to your external Git repository
-(you can do these within the Docker image as explained below). This way,
-you can always recreate the container with future technologies
-too. Generally, if you are developing within a container, its good practice
-to recreate it from scratch every once in a while, to make sure you haven't
-forgot to include parts of your work in your project's version-controlled
-source.
+project, but shouldn't be what you archive for the long term!
+
+Hence if you want to save and move your maneaged project within a Docker
+image, be sure to commit all your project's source files and push them to
+your external Git repository (you can do this from within the Docker
+image as explained below). This way, you can always recreate the
+container with future technologies too. Generally, if you are developing
+within a container, it's good practice to recreate it from scratch every
+once in a while, to make sure you haven't forgotten to include parts of
+your work in your project's version-controlled source. In the sections
+below we also describe how you can use the container **only for the
+software environment** and keep your data and project source on your
+host.
#### Dockerfile for a Maneaged project, and building a Docker image
@@ -246,8 +249,11 @@ MB), not the full TeXLive collection!
items). Note that the last two `COPY` lines (to copy the directory
containing software tarballs used by the project and the possible input
databases) are optional because they will be downloaded if not
- available. Once you build the Docker image, your project's environment
- is setup and you can go into it to run `./project make` manually.
+ available. You can also avoid copying them over entirely and simply
+ mount your host directories into the image; we have a separate section
+ on doing this below ("Only software environment in the Docker
+ image"). Once you build the Docker image, your project's environment is
+ set up and you can go into it to run `./project make` manually.
```shell
FROM debian:stable-slim
@@ -306,7 +312,10 @@ MB), not the full TeXLive collection!
```
4. **Copy project files into the container:** these commands make the
- following assumptions:
+ assumptions listed below. IMPORTANT: you can also avoid copying them
+ over entirely and simply mount your host directories into the image; we
+ have a separate section on doing this below ("Only software environment
+ in the Docker image").
* The project's source is in the `maneaged/` sub-directory and this
directory is in the same directory as the `Dockerfile`. The source
@@ -383,6 +392,8 @@ MB), not the full TeXLive collection!
docker build -t NAME ./
```
+
+
#### Interactive tests on built container
If you later want to start a container with the built image and enter it in
@@ -396,6 +407,8 @@ see below if you want to preserve your changes after you exit).
docker run -it NAME
```
+
+
#### Running your own project's shell for same analysis environment
The default operating system only has minimal features: not having many of
@@ -412,6 +425,8 @@ cd source
./project shell
```
+
+
#### Preserving the state of a built container
All interactive changes in a container will be deleted as soon as you exit
@@ -435,6 +450,8 @@ docker container list
docker commit XXXXXXX NEW-IMAGE-NAME
```
+
+
#### Copying files from the Docker image to host operating system
The Docker environment's file system is completely independent of your host
@@ -446,6 +463,110 @@ command).
docker cp CONTAINER:/file/path/within/container /host/path/target
```
+
+
+#### Only software environment in the Docker image
+
+You can set up the Docker image to contain only the software environment
+and keep the project source and built analysis files (data and PDF) on
+your host operating system. This keeps the Docker image as small as
+possible (containing only the built software environment), so it is easy
+to move from one computer to another. The steps below summarize how to do
+this.
+
+1. Get your user ID with this command: `id -u`.
+
+2. Put the following lines into a `Dockerfile` in an otherwise empty
+directory, replacing `UID` with your user ID (found in the step
+above). This will build the basic directory structure for the next steps.
+
+```shell
+FROM debian:stable-slim
+RUN apt-get update && apt-get install -y gcc g++ wget
+RUN useradd -ms /bin/sh --uid UID maneager
+USER maneager
+WORKDIR /home/maneager
+RUN mkdir build
+```
+
+3. Create an image based on the `Dockerfile` above. Just replace `PROJECT`
+with your desired name.
+
+```shell
+docker build -t PROJECT ./
+```
+
+4. Run the command below to create a container based on the image and mount
+the desired directories on your host into the special directories of your
+container. Just don't forget to replace `PROJECT` and set the `/PATH`s to
+the respective paths in your host operating system.
+
+```shell
+docker run -v /PATH/TO/PROJECT/SOURCE:/home/maneager/source \
+ -v /PATH/TO/PROJECT/ANALYSIS/OUTPUTS:/home/maneager/build/analysis \
+ -v /PATH/TO/SOFTWARE/SOURCE/CODE/DIR:/home/maneager/software \
+ -v /PATH/TO/RAW/INPUT/DATA:/home/maneager/data \
+ -it PROJECT
+```
+
+5. After running the command above, you are within the container. Go into
+the project source directory and run these commands to build the software
+environment.
+
+```shell
+cd /home/maneager/source
+./project configure --build-dir=/home/maneager/build \
+ --software-dir=/home/maneager/software \
+ --input-dir=/home/maneager/data
+```
+
+6. After the configuration finishes successfully, it will say so and ask
+you to run `./project make`. But don't do that yet. Keep this Docker
+container open and don't exit the container or its terminal. Open a new
+terminal, and follow the steps described in the sub-section above to
+preserve the built container as a Docker image. Let's assume you call it
+`PROJECT-ENV`. After the new image is made, you should be able to see it
+in the list of images with this command (in the same terminal in which
+you created the image):
+
+```shell
+docker image list # In the other terminal.
+```
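+
+If it helps, that preservation step (run in the new terminal while the
+first container is still open) is roughly the sketch below; `XXXXXXX`
+stands for whatever ID `docker container list` reports for the running
+container, and `PROJECT-ENV` is just the name we assumed above.
+
+```shell
+docker container list              # Find the running container's ID.
+docker commit XXXXXXX PROJECT-ENV  # Preserve it as a new image.
+```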
+
+7. Now you can run `./project make` in the initial container. You will see
+that all the built products (temporary or final datasets or PDFs) will be
+written to the `/PATH/TO/PROJECT/ANALYSIS/OUTPUTS` directory of your
+host. You can even change your project's source on your host operating
+system and re-run Make to see the effect on the outputs, and add/commit
+the changes to your Git history on your host. You can also exit the
+container at any time, and later load the `PROJECT-ENV` environment image
+into a new container with the same `docker run -v ...` command above;
+just use `PROJECT-ENV` instead of `PROJECT`.
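+
+For example, inside the initial container (a minimal sketch; the source
+path is the same mount point used in the `docker run` command of step 4):
+
+```shell
+cd /home/maneager/source
+./project make
+```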
+
+8. In case you want to store the image as a single file, as a backup or
+to move it to another computer, you can run the commands below. They will
+produce a single `project-env.tar.gz` file.
+
+```shell
+docker save -o project-env.tar PROJECT-ENV
+gzip --best project-env.tar
+```
+
+9. To load the tarball above into a clean Docker environment (either on
+the same system or on another system), run the command below, then create
+a new container from the image as above (the `docker run -v ...`
+command). Just don't forget that if your
+`/PATH/TO/PROJECT/ANALYSIS/OUTPUTS` directory is empty on the new/clean
+system, you should first run `./project configure -e` in the container so
+it builds the core file structure there. Don't worry, it won't build any
+software and should finish in a second or two. Afterwards, you can safely
+run `./project make`.
+
+```shell
+docker load --input project-env.tar.gz
+```
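+
+Once a container is running from the loaded image (with the same
+`docker run -v ...` command as above), that first run on a clean system
+is roughly the following sketch:
+
+```shell
+cd /home/maneager/source
+./project configure -e   # Only rebuilds the core file structure.
+./project make
+```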
+
+
+
#### Deleting all Docker images
After doing your tests/work, you may no longer need the multi-gigabyte
diff --git a/reproduce/analysis/make/format.mk b/reproduce/analysis/make/format.mk
index efd9918..fd4060a 100644
--- a/reproduce/analysis/make/format.mk
+++ b/reproduce/analysis/make/format.mk
@@ -23,7 +23,7 @@
# Save the "Table 3" spreadsheet from the downloaded `.xlsx' file into a
# simple plain-text file that is easy to use.
-a1dir = $(BDIR)/analysis1
+a1dir = $(badir)/analysis1
mk20tab3 = $(a1dir)/table-3.txt
$(a1dir):; mkdir $@
$(mk20tab3): $(indir)/menke20.xlsx | $(a1dir)
diff --git a/reproduce/analysis/make/initialize.mk b/reproduce/analysis/make/initialize.mk
index 8f1769e..168010f 100644
--- a/reproduce/analysis/make/initialize.mk
+++ b/reproduce/analysis/make/initialize.mk
@@ -30,14 +30,24 @@
# parallel. Also, some programs may not be thread-safe, therefore it will
# be necessary to put a lock on them. This project uses the `flock' program
# to achieve this.
-texdir = $(BDIR)/tex
-lockdir = $(BDIR)/locks
-indir = $(BDIR)/inputs
-prepdir = $(BDIR)/prepare
+#
+# To help with modularity and clarity of the build directory (not mixing
+# software-environment built-products with products built by the analysis),
+# it is recommended to put all your analysis outputs in the 'analysis'
+# subdirectory of the top-level build directory.
+badir=$(BDIR)/analysis
+bsdir=$(BDIR)/software
+
+# Derived directories (the locks directory can be shared with the
+# software-building phase, which already has this directory).
+texdir = $(badir)/tex
+lockdir = $(bsdir)/locks
+indir = $(badir)/inputs
+prepdir = $(badir)/prepare
mtexdir = $(texdir)/macros
+installdir = $(bsdir)/installed
bashdir = reproduce/analysis/bash
pconfdir = reproduce/analysis/config
-installdir = $(BDIR)/software/installed
@@ -56,7 +66,7 @@ installdir = $(BDIR)/software/installed
ifeq (x$(project-phase),xprepare)
$(prepdir):; mkdir $@
else
-include $(BDIR)/software/preparation-done.mk
+include $(bsdir)/preparation-done.mk
ifeq (x$(include-prepare-results),xyes)
include $(prepdir)/*.mk
endif
@@ -184,7 +194,7 @@ export MPI_PYTHON3_SITEARCH :=
# option: they add too many extra checks that make it hard to find what you
# are looking for in the outputs.
.SUFFIXES:
-$(lockdir): | $(BDIR); mkdir $@
+$(lockdir): | $(bsdir); mkdir $@
@@ -215,8 +225,8 @@ project-package-contents = $(texdir)/$(project-package-name)
texclean:
rm *.pdf
- rm -rf $(BDIR)/tex/build/*
- mkdir $(BDIR)/tex/build/tikz # 'tikz' is assumed to already exist.
+ rm -rf $(texdir)/build/*
+ mkdir $(texdir)/build/tikz # 'tikz' is assumed to already exist.
clean:
# Delete the top-level PDF file.
@@ -228,10 +238,10 @@ clean:
# features like ignoring the listing of a file with `!()' that we
# are using afterwards.
shopt -s extglob
- rm -rf $(BDIR)/tex/macros/!(dependencies.tex|dependencies-bib.tex|hardware-parameters.tex)
- rm -rf $(BDIR)/!(software|tex) $(BDIR)/tex/!(macros|$(texbtopdir))
- rm -rf $(BDIR)/tex/build/!(tikz) $(BDIR)/tex/build/tikz/*
- rm -rf $(BDIR)/software/preparation-done.mk
+ rm -rf $(texdir)/macros/!(dependencies.tex|dependencies-bib.tex|hardware-parameters.tex)
+ rm -rf $(badir)/!(tex) $(texdir)/!(macros|$(texbtopdir))
+ rm -rf $(texdir)/build/!(tikz) $(texdir)/build/tikz/*
+ rm -rf $(bsdir)/preparation-done.mk
distclean: clean
# Without cleaning the Git hooks, we won't be able to easily
@@ -398,14 +408,15 @@ dist-zip: $(project-package-contents)
dist-software:
curdir=$$(pwd)
dirname=software-$(project-commit-hash)
- cd $(BDIR)
+ cd $(bsdir)
+ if [ -d $$dirname ]; then rm -rf $$dirname; fi
mkdir $$dirname
- cp -L software/tarballs/* $$dirname/
+ cp -L tarballs/* $$dirname/
tar -cf $$dirname.tar $$dirname
gzip -f --best $$dirname.tar
rm -rf $$dirname
cd $$curdir
- mv $(BDIR)/$$dirname.tar.gz ./
+ mv $(bsdir)/$$dirname.tar.gz ./
@@ -422,9 +433,11 @@ dist-software:
#
# 1. Those data that also go into LaTeX (for example to give to LateX's
# PGFPlots package to create the plot internally) should be under the
-# '$(BDIR)/tex' directory (because other LaTeX producers may also need
-# it for example when using './project make dist'). The contents of
-# this directory are directly taken into the tarball.
+# '$(texdir)' directory (because other LaTeX producers may also need it
+# for example when using './project make dist', or you may want to
+# publish the raw data behind the plots, like:
+# https://zenodo.org/record/4291207/files/tools-per-year.txt). The
+# contents of this directory are also directly taken into the tarball.
#
# 2. The data that aren't included directly in the LaTeX run of the paper,
# can be seen as supplements. A good place to keep them is under your
@@ -436,7 +449,7 @@ dist-software:
# (or paper's tex/appendix), you will put links to the dataset on servers
# like Zenodo (see the "Publication checklist" in 'README-hacking.md').
tex-publish-dir = $(texdir)/to-publish
-data-publish-dir = $(BDIR)/data-to-publish
+data-publish-dir = $(badir)/data-to-publish
$(tex-publish-dir):; mkdir $@
$(data-publish-dir):; mkdir $@
diff --git a/reproduce/analysis/make/prepare.mk b/reproduce/analysis/make/prepare.mk
index 995132c..d0b61d9 100644
--- a/reproduce/analysis/make/prepare.mk
+++ b/reproduce/analysis/make/prepare.mk
@@ -23,7 +23,7 @@
#
# Without this file, `./project make' won't work.
prepare-dep = $(subst prepare, ,$(makesrc))
-$(BDIR)/software/preparation-done.mk: \
+$(bsdir)/preparation-done.mk: \
$(foreach s, $(prepare-dep), $(mtexdir)/$(s).tex)
# If you need to add preparations define targets above to do the
diff --git a/reproduce/software/make/basic.mk b/reproduce/software/make/basic.mk
index 2a28e76..9217ee9 100644
--- a/reproduce/software/make/basic.mk
+++ b/reproduce/software/make/basic.mk
@@ -48,7 +48,7 @@ include reproduce/software/config/checksums.conf
include reproduce/software/config/urls.conf
# Basic directories
-lockdir = $(BDIR)/locks
+lockdir = $(BDIR)/software/locks
tdir = $(BDIR)/software/tarballs
ddir = $(BDIR)/software/build-tmp
idir = $(BDIR)/software/installed
@@ -1274,7 +1274,7 @@ $(ibidir)/binutils-$(binutils-version): \
if ! [ x"$(sys_library_path)" = x ]; then
for f in $(sys_library_path)/*crt*.o; do
b=$$($(ibdir)/basename $$f)
- ln -s $$f $(ildir)/$$b
+ ln -sf $$f $(ildir)/$$b
done
fi
diff --git a/reproduce/software/make/high-level.mk b/reproduce/software/make/high-level.mk
index 948b23a..d69722e 100644
--- a/reproduce/software/make/high-level.mk
+++ b/reproduce/software/make/high-level.mk
@@ -43,7 +43,7 @@ include reproduce/software/config/TARGETS.conf
include reproduce/software/config/texlive-packages.conf
# Basic directories (similar to 'basic.mk').
-lockdir = $(BDIR)/locks
+lockdir = $(BDIR)/software/locks
tdir = $(BDIR)/software/tarballs
ddir = $(BDIR)/software/build-tmp
idir = $(BDIR)/software/installed
diff --git a/reproduce/software/shell/configure.sh b/reproduce/software/shell/configure.sh
index 219b335..812f3d3 100755
--- a/reproduce/software/shell/configure.sh
+++ b/reproduce/software/shell/configure.sh
@@ -44,8 +44,8 @@ need_gfortran=0
-# Internal directories
-# --------------------
+# Internal source directories
+# ---------------------------
#
# These are defined to help make this script more readable.
topdir="$(pwd)"
@@ -679,14 +679,14 @@ EOF
fi
# Then, see if the Fortran compiler works
- testsource=$compilertestdir/test.f
+ testsourcef=$compilertestdir/test.f
echo; echo; echo "Checking host Fortran compiler...";
- echo " PRINT *, \"... Fortran Compiler works.\"" > $testsource
- echo " END" >> $testsource
- if gfortran $testsource -o$testprog && $testprog; then
- rm $testsource $testprog
+ echo " PRINT *, \"... Fortran Compiler works.\"" > $testsourcef
+ echo " END" >> $testsourcef
+ if gfortran $testsourcef -o$testprog && $testprog; then
+ rm $testsourcef $testprog
else
- rm $testsource
+ rm $testsourcef
cat <<EOF
______________________________________________________
@@ -1165,8 +1165,8 @@ rm -f "$finaltarget"
-# Project's top-level directories
-# -------------------------------
+# Project's top-level built software directories
+# ----------------------------------------------
#
# These directories are possibly needed by many steps of process, so to
# avoid too many directory dependencies throughout the software and
@@ -1200,15 +1200,41 @@ if ! [ -d "$ictdir" ]; then mkdir "$ictdir"; fi
itidir="$verdir"/tex
if ! [ -d "$itidir" ]; then mkdir "$itidir"; fi
+# Temporary software un-packing/build directory: if the host has the
+# standard `/dev/shm' mounting-point, we'll do it in shared memory (on the
+# RAM), to avoid harming/over-using the HDDs/SSDs. The RAM of most systems
+# today (>8GB) is large enough for the parallel building of the software.
+#
+# For the name of the directory under `/dev/shm' (for this project), we'll
+# use the names of the two parent directories to the current/running
+# directory, separated by a `-' instead of `/'. We'll then append that
+# with the user's name (in case multiple users may be working on similar
+# project names). Maybe later, we can use something like `mktemp' to add
+# random characters to this name and make it unique to every run (even for
+# a single user).
+tmpblddir="$sdir"/build-tmp
+rm -rf "$tmpblddir"/* "$tmpblddir" # If it's a link, we need to empty its
+ # contents first, then itself.
+
+
+
+
+
+# Project's top-level built analysis directories
+# ----------------------------------------------
+
+# Top-level built analysis directories.
+badir="$bdir"/analysis
+if ! [ -d "$badir" ]; then mkdir "$badir"; fi
+
# Top-level LaTeX.
-texdir="$bdir"/tex
+texdir="$badir"/tex
if ! [ -d "$texdir" ]; then mkdir "$texdir"; fi
# LaTeX macros.
mtexdir="$texdir"/macros
if ! [ -d "$mtexdir" ]; then mkdir "$mtexdir"; fi
-
# TeX build directory. If built in a group scenario, the TeX build
# directory must be separate for each member (so they can work on their
# relevant parts of the paper without conflicting with each other).
@@ -1224,20 +1250,6 @@ if ! [ -d "$texbdir" ]; then mkdir "$texbdir"; fi
tikzdir="$texbdir"/tikz
if ! [ -d "$tikzdir" ]; then mkdir "$tikzdir"; fi
-# If 'tex/build' and 'tex/tikz' aren't symbolic links, then we are in the
-# tarball (not the Git repository), so we'll give them another name and let
-# the script continue normally.
-if rm -f tex/build; then
- rm -f tex/tikz
-else
- mv tex/tikz tex/tikz-from-tarball
- mv tex/build tex/build-from-tarball
-fi
-
-
-
-
-
# If 'tex/build' and 'tex/tikz' are symbolic links then 'rm -f' will delete
# them and we can continue. However, when the project is being built from
# the tarball, these two are not symbolic links but actual directories with
@@ -1252,7 +1264,6 @@ else
mv tex/build tex/build-from-tarball
fi
-
# Set the symbolic links for easy access to the top project build
# directories. Note that these are put in each user's source/cloned
# directory, not in the build directory (which can be shared between many
@@ -1260,7 +1271,9 @@ fi
#
# Note: if we don't delete them first, it can happen that an extra link
# will be created in each directory that points to its parent. So to be
-# safe, we are deleting all the links on each re-configure of the project.
+# safe, we are deleting all the links on each re-configure of the
+# project. Note that at this stage, we are using the host's 'ln', not our
+# own, so it's best not to assume anything (like 'ln -sf').
rm -f .build .local
ln -s "$bdir" .build
@@ -1273,21 +1286,6 @@ rm -f .gnuastro
# ------------------------------------------
-# Temporary software un-packing/build directory: if the host has the
-# standard `/dev/shm' mounting-point, we'll do it in shared memory (on the
-# RAM), to avoid harming/over-using the HDDs/SSDs. The RAM of most systems
-# today (>8GB) is large enough for the parallel building of the software.
-#
-# For the name of the directory under `/dev/shm' (for this project), we'll
-# use the names of the two parent directories to the current/running
-# directory, separated by a `-' instead of `/'. We'll then appended that
-# with the user's name (in case multiple users may be working on similar
-# project names). Maybe later, we can use something like `mktemp' to add
-# random characters to this name and make it unique to every run (even for
-# a single user).
-tmpblddir="$sdir"/build-tmp
-rm -rf "$tmpblddir"/* "$tmpblddir" # If its a link, we need to empty its
- # contents first, then itself.
# Set the top-level shared memory location.
if [ -d /dev/shm ]; then shmdir=/dev/shm
@@ -1313,7 +1311,7 @@ fi
# symbolic link to it. Otherwise, just build the temporary build
# directory under the project build directory.
if [ x"$tbshmdir" = x ]; then mkdir "$tmpblddir";
-else ln -s "$tbshmdir" "$tmpblddir";
+else ln -s "$tbshmdir" "$tmpblddir";
fi