author | Mohammad Akhlaghi <mohammad@akhlaghi.org> | 2021-01-09 01:34:15 +0000 |
---|---|---|
committer | Mohammad Akhlaghi <mohammad@akhlaghi.org> | 2021-01-09 03:00:15 +0000 |
commit | d9a6855948fad17fa0fbc2017ab2be0238ca8b72 (patch) | |
tree | 790a2e5f6474958bd5865ecb35a2d61edcd74adf /reproduce | |
parent | b91af98dada5a33215d87325a651f3e836c02ebd (diff) |
IMPORTANT: analysis outputs written in BDIR/analysis

Until now, the build directory contained a 'software/' directory (that
hosted all the built software), a 'tex/' subdirectory for the final
building of the paper, and many other directories containing
intermediate/final data of the specific project. But this mixing of built
software and data is against our modularity and minimal complexity
principles: built software and built data are separate things and keeping
them separate will enable many optimizations.

With this commit, the build directory of the core Maneage branch will only
contain two sub-directories: 'software/' and 'analysis/'. The 'software/'
directory has the same contents as before and is not touched in this
commit. However, the 'analysis/' directory is new and everything created in
the './project make' phase of the project will be created inside of this
directory.

To facilitate easy access to these top-level built directories, two new
variables are defined at the top of 'initialize.mk': 'badir', which is
short for "built-analysis directory" and 'bsdir', which is short for
"built-software directory".

HOW TO IMPLEMENT THIS CHANGE IN YOUR PROJECT. It is easy: simply replace
all occurrences of '$(BDIR)' in your project's subMakefiles (except the ones
below) with '$(badir)'. To confirm that everything is fine before building
your project from scratch after merging, you can run the following command
to see where 'BDIR' is used and confirm that only the cases below remain.

    $ grep -r BDIR reproduce/analysis/*
    --> make/verify.mk: innobdir=$$(echo $$infile | sed -e's|$(BDIR)/||g'); \
    --> make/initialize.mk:badir=$(BDIR)/analysis
    --> make/initialize.mk:bsdir=$(BDIR)/software
    --> make/initialize.mk: $$sys_rm -rf $(BDIR)
    --> make/top-prepare.mk:all: $(BDIR)/software/preparation-done.mk

'BDIR' should only be present in the lines of the files above. If you see
'$(BDIR)' used anywhere else, simply change it to '$(badir)'. Of course, if
your project assumes BDIR in other contexts, feel free to keep it; it will
not conflict. If anything unexpected happens, please post a comment on the
link below (you need to be registered on Savannah to post a comment):
https://savannah.nongnu.org/task/?15855
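The replacement described above could be scripted roughly as follows. This is only a sketch: the helper names ('list_bdir_uses', 'replace_bdir', 'migrate_dir') are hypothetical and not part of Maneage, and it assumes your subMakefiles end in '.mk' and sit directly in one directory. It skips the three files where 'BDIR' is expected to remain, and demonstrates itself on a throw-away directory.

```shell
#!/bin/sh
# Sketch only: hypothetical helpers, not part of Maneage. Review the
# result (for example with 'git diff') before committing anything.

# Print every line under the given directory that still mentions 'BDIR'.
list_bdir_uses () {
    grep -rn 'BDIR' "$1" || true
}

# Replace the literal string '$(BDIR)' with '$(badir)' in one file. A
# temporary file is used because in-place 'sed -i' is not portable.
replace_bdir () {
    tmp=$(mktemp)
    sed -e 's|\$(BDIR)|$(badir)|g' "$1" > "$tmp" && cat "$tmp" > "$1"
    rm -f "$tmp"
}

# Apply the replacement to every '*.mk' file directly inside a directory,
# skipping the files where 'BDIR' should stay as it is.
migrate_dir () {
    for f in "$1"/*.mk; do
        [ -f "$f" ] || continue
        case "$(basename "$f")" in
            verify.mk|initialize.mk|top-prepare.mk) continue ;;
        esac
        replace_bdir "$f"
    done
}

# Demonstration on a throw-away directory with two made-up Makefiles.
demo=$(mktemp -d)
printf 'out = $(BDIR)/result.txt\n' > "$demo"/my-analysis.mk
printf 'badir=$(BDIR)/analysis\n'   > "$demo"/initialize.mk
migrate_dir "$demo"
list_bdir_uses "$demo"     # Only 'initialize.mk' should still be listed.
rm -rf "$demo"
```

In a real project, something like 'migrate_dir reproduce/analysis/make' would be the equivalent call, followed by the 'grep' check above.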
One consequence of this change is that the 'analysis/' subdirectory can
optionally be mounted on a separate partition. The need for this came up
for some new users of Maneage in a Docker image. Docker can fix
portability problems on systems that we haven't yet supported (even
Windows!), or where we haven't had a chance to fix low-level issues.
However, Docker doesn't have a GUI. So to see the built PDF or intermediate
data, it was necessary to copy the built data to the host system after
every change, which is annoying while working on a project. It would also
need two copies of the source: one in the host, one in the container. All
these frustrations can be fixed with this new feature.

To describe this scenario, README.md now has a new section titled "Only
software environment in the Docker image". It explains step-by-step how you
can make a Docker image that only hosts the built software environment,
while your project's source, software tarballs and 'BDIR/analysis'
directory stay on your host operating system. It has been tested before
this commit and works very nicely.
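The exact steps are in README.md; the sketch below only illustrates the general idea of keeping the source and 'BDIR/analysis' on the host while the container provides the software. The image name and the container-side paths are hypothetical placeholders (they are not taken from README.md), and the helper only prints a 'docker run' command with two bind mounts instead of running it.

```shell
#!/bin/sh
# Sketch only: 'docker_run_command' is a hypothetical helper. The image
# name and the '/project/...' paths inside the container are placeholders.
docker_run_command () {
    host_source="$1"      # Project source directory on the host.
    host_analysis="$2"    # Host directory that will hold 'BDIR/analysis'.
    image="$3"            # Image that only hosts the built software.
    echo docker run -it --rm \
         -v "$host_source":/project/source \
         -v "$host_analysis":/project/build/analysis \
         "$image"
}

# Only print the command; nothing is executed here.
docker_run_command "$(pwd)" /tmp/analysis-demo maneage-software-demo
```

With bind mounts like these, files written under the analysis directory inside the container appear directly on the host, so the built PDF can be opened there without copying.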
Diffstat (limited to 'reproduce')
-rw-r--r-- | reproduce/analysis/make/initialize.mk | 53 |
-rw-r--r-- | reproduce/analysis/make/prepare.mk | 2 |
-rw-r--r-- | reproduce/software/make/basic.mk | 2 |
-rw-r--r-- | reproduce/software/make/high-level.mk | 2 |
-rwxr-xr-x | reproduce/software/shell/configure.sh | 73 |
5 files changed, 78 insertions, 54 deletions
diff --git a/reproduce/analysis/make/initialize.mk b/reproduce/analysis/make/initialize.mk
index a5d5b92..3b1ffe5 100644
--- a/reproduce/analysis/make/initialize.mk
+++ b/reproduce/analysis/make/initialize.mk
@@ -30,14 +30,24 @@
 # parallel. Also, some programs may not be thread-safe, therefore it will
 # be necessary to put a lock on them. This project uses the `flock' program
 # to achieve this.
-texdir = $(BDIR)/tex
-lockdir = $(BDIR)/locks
-indir = $(BDIR)/inputs
-prepdir = $(BDIR)/prepare
+#
+# To help with modularity and clarity of the build directory (not mixing
+# software-environment built-products with products built by the analysis),
+# it is recommended to put all your analysis outputs in the 'analysis'
+# subdirectory of the top-level build directory.
+badir=$(BDIR)/analysis
+bsdir=$(BDIR)/software
+
+# Derived directories (the locks directory can be shared with software
+# which already has this directory.).
+texdir = $(badir)/tex
+lockdir = $(bsdir)/locks
+indir = $(badir)/inputs
+prepdir = $(padir)/prepare
 mtexdir = $(texdir)/macros
+installdir = $(bsdir)/installed
 bashdir = reproduce/analysis/bash
 pconfdir = reproduce/analysis/config
-installdir = $(BDIR)/software/installed
 
 
 
@@ -56,7 +66,7 @@ installdir = $(BDIR)/software/installed
 ifeq (x$(project-phase),xprepare)
 $(prepdir):; mkdir $@
 else
-include $(BDIR)/software/preparation-done.mk
+include $(bsdir)/preparation-done.mk
 ifeq (x$(include-prepare-results),xyes)
 include $(prepdir)/*.mk
 endif
@@ -193,7 +203,7 @@ export MPI_PYTHON3_SITEARCH :=
 # option: they add too many extra checks that make it hard to find what you
 # are looking for in the outputs.
 .SUFFIXES:
-$(lockdir): | $(BDIR); mkdir $@
+$(lockdir): | $(bsdir); mkdir $@
 
 
 
@@ -228,8 +238,8 @@ clean-mmap:; rm -f reproduce/config/gnuastro/mmap*
 
 texclean:
 	rm *.pdf
-	rm -rf $(BDIR)/tex/build/*
-	mkdir $(BDIR)/tex/build/tikz # 'tikz' is assumed to already exist.
+	rm -rf $(texdir)/build/*
+	mkdir $(texdir)/build/tikz # 'tikz' is assumed to already exist.
 
 clean: clean-mmap
 	# Delete the top-level PDF file.
@@ -241,10 +251,10 @@ clean: clean-mmap
 	# features like ignoring the listing of a file with `!()' that we
 	# are using afterwards.
 	shopt -s extglob
-	rm -rf $(BDIR)/tex/macros/!(dependencies.tex|dependencies-bib.tex|hardware-parameters.tex)
-	rm -rf $(BDIR)/!(software|tex) $(BDIR)/tex/!(macros|$(texbtopdir))
-	rm -rf $(BDIR)/tex/build/!(tikz) $(BDIR)/tex/build/tikz/*
-	rm -rf $(BDIR)/software/preparation-done.mk
+	rm -rf $(texdir)/macros/!(dependencies.tex|dependencies-bib.tex|hardware-parameters.tex)
+	rm -rf $(badir)/!(tex) $(texdir)/!(macros|$(texbtopdir))
+	rm -rf $(texdir)/build/!(tikz) $(texdir)/build/tikz/*
+	rm -rf $(bsdir)/preparation-done.mk
 
 distclean: clean
 	# Without cleaning the Git hooks, we won't be able to easily
@@ -403,14 +413,15 @@ dist-zip: $(project-package-contents)
 dist-software:
 	curdir=$$(pwd)
 	dirname=software-$(project-commit-hash)
-	cd $(BDIR)
+	cd $(bsdir)
+	if [ -d $$dirname ]; then rm -rf $$dirname; fi
 	mkdir $$dirname
-	cp -L software/tarballs/* $$dirname/
+	cp -L tarballs/* $$dirname/
 	tar -cf $$dirname.tar $$dirname
 	gzip -f --best $$dirname.tar
 	rm -rf $$dirname
 	cd $$curdir
-	mv $(BDIR)/$$dirname.tar.gz ./
+	mv $(bsdir)/$$dirname.tar.gz ./
 
 
 
@@ -427,9 +438,11 @@ dist-software:
 #
 # 1. Those data that also go into LaTeX (for example to give to LateX's
 # PGFPlots package to create the plot internally) should be under the
-# '$(BDIR)/tex' directory (because other LaTeX producers may also need
-# it for example when using './project make dist'). The contents of
-# this directory are directly taken into the tarball.
+# '$(texdir)' directory (because other LaTeX producers may also need it
+# for example when using './project make dist', or you may want to
+# publish the raw data behind the plots, like:
+# https://zenodo.org/record/4291207/files/tools-per-year.txt). The
+# contents of this directory are also directly taken into the tarball.
 #
 # 2. The data that aren't included directly in the LaTeX run of the paper,
 # can be seen as supplements. A good place to keep them is under your
@@ -441,7 +454,7 @@ dist-software:
 # (or paper's tex/appendix), you will put links to the dataset on servers
 # like Zenodo (see the "Publication checklist" in 'README-hacking.md').
 tex-publish-dir = $(texdir)/to-publish
-data-publish-dir = $(BDIR)/data-to-publish
+data-publish-dir = $(badir)/data-to-publish
 $(tex-publish-dir):; mkdir $@
 $(data-publish-dir):; mkdir $@
 
diff --git a/reproduce/analysis/make/prepare.mk b/reproduce/analysis/make/prepare.mk
index 995132c..d0b61d9 100644
--- a/reproduce/analysis/make/prepare.mk
+++ b/reproduce/analysis/make/prepare.mk
@@ -23,7 +23,7 @@
 #
 # Without this file, `./project make' won't work.
 prepare-dep = $(subst prepare, ,$(makesrc))
-$(BDIR)/software/preparation-done.mk: \
+$(bsdir)/preparation-done.mk: \
                 $(foreach s, $(prepare-dep), $(mtexdir)/$(s).tex)
 
 	# If you need to add preparations define targets above to do the
diff --git a/reproduce/software/make/basic.mk b/reproduce/software/make/basic.mk
index 58ebdb2..9217ee9 100644
--- a/reproduce/software/make/basic.mk
+++ b/reproduce/software/make/basic.mk
@@ -48,7 +48,7 @@ include reproduce/software/config/checksums.conf
 include reproduce/software/config/urls.conf
 
 # Basic directories
-lockdir = $(BDIR)/locks
+lockdir = $(BDIR)/software/locks
 tdir = $(BDIR)/software/tarballs
 ddir = $(BDIR)/software/build-tmp
 idir = $(BDIR)/software/installed
diff --git a/reproduce/software/make/high-level.mk b/reproduce/software/make/high-level.mk
index 948b23a..d69722e 100644
--- a/reproduce/software/make/high-level.mk
+++ b/reproduce/software/make/high-level.mk
@@ -43,7 +43,7 @@ include reproduce/software/config/TARGETS.conf
 include reproduce/software/config/texlive-packages.conf
 
 # Basic directories (similar to 'basic.mk').
-lockdir = $(BDIR)/locks
+lockdir = $(BDIR)/software/locks
 tdir = $(BDIR)/software/tarballs
 ddir = $(BDIR)/software/build-tmp
 idir = $(BDIR)/software/installed
diff --git a/reproduce/software/shell/configure.sh b/reproduce/software/shell/configure.sh
index 24e8409..812f3d3 100755
--- a/reproduce/software/shell/configure.sh
+++ b/reproduce/software/shell/configure.sh
@@ -44,8 +44,8 @@ need_gfortran=0
 
 
 
-# Internal directories
-# --------------------
+# Internal source directories
+# ---------------------------
 #
 # These are defined to help make this script more readable.
 topdir="$(pwd)"
@@ -679,14 +679,14 @@ EOF
     fi
 
     # Then, see if the Fortran compiler works
-    testsource=$compilertestdir/test.f
+    testsourcef=$compilertestdir/test.f
     echo; echo; echo "Checking host Fortran compiler...";
-    echo "      PRINT *, \"... Fortran Compiler works.\"" > $testsource
-    echo "      END" >> $testsource
-    if gfortran $testsource -o$testprog && $testprog; then
-        rm $testsource $testprog
+    echo "      PRINT *, \"... Fortran Compiler works.\"" > $testsourcef
+    echo "      END" >> $testsourcef
+    if gfortran $testsourcef -o$testprog && $testprog; then
+        rm $testsourcef $testprog
     else
-        rm $testsource
+        rm $testsourcef
         cat <<EOF
 
 ______________________________________________________
@@ -1165,8 +1165,8 @@ rm -f "$finaltarget"
 
 
 
-# Project's top-level directories
-# -------------------------------
+# Project's top-level built software directories
+# ----------------------------------------------
 #
 # These directories are possibly needed by many steps of process, so to
 # avoid too many directory dependencies throughout the software and
@@ -1200,15 +1200,41 @@ if ! [ -d "$ictdir" ]; then mkdir "$ictdir"; fi
 itidir="$verdir"/tex
 if ! [ -d "$itidir" ]; then mkdir "$itidir"; fi
 
+# Temporary software un-packing/build directory: if the host has the
+# standard `/dev/shm' mounting-point, we'll do it in shared memory (on the
+# RAM), to avoid harming/over-using the HDDs/SSDs. The RAM of most systems
+# today (>8GB) is large enough for the parallel building of the software.
+#
+# For the name of the directory under `/dev/shm' (for this project), we'll
+# use the names of the two parent directories to the current/running
+# directory, separated by a `-' instead of `/'. We'll then appended that
+# with the user's name (in case multiple users may be working on similar
+# project names). Maybe later, we can use something like `mktemp' to add
+# random characters to this name and make it unique to every run (even for
+# a single user).
+tmpblddir="$sdir"/build-tmp
+rm -rf "$tmpblddir"/* "$tmpblddir" # If its a link, we need to empty its
+                                   # contents first, then itself.
+
+
+
+
+
+# Project's top-level built analysis directories
+# ----------------------------------------------
+
+# Top-level built analysis directories.
+badir="$bdir"/analysis
+if ! [ -d "$badir" ]; then mkdir "$badir"; fi
+
 # Top-level LaTeX.
-texdir="$bdir"/tex
+texdir="$badir"/tex
 if ! [ -d "$texdir" ]; then mkdir "$texdir"; fi
 
 # LaTeX macros.
 mtexdir="$texdir"/macros
 if ! [ -d "$mtexdir" ]; then mkdir "$mtexdir"; fi
 
-
 # TeX build directory. If built in a group scenario, the TeX build
 # directory must be separate for each member (so they can work on their
 # relevant parts of the paper without conflicting with each other).
@@ -1224,7 +1250,6 @@ if ! [ -d "$texbdir" ]; then mkdir "$texbdir"; fi
 tikzdir="$texbdir"/tikz
 if ! [ -d "$tikzdir" ]; then mkdir "$tikzdir"; fi
 
-
 # If 'tex/build' and 'tex/tikz' are symbolic links then 'rm -f' will delete
 # them and we can continue. However, when the project is being built from
 # the tarball, these two are not symbolic links but actual directories with
@@ -1239,7 +1264,6 @@ else
     mv tex/build tex/build-from-tarball
 fi
 
-
 # Set the symbolic links for easy access to the top project build
 # directories. Note that these are put in each user's source/cloned
 # directory, not in the build directory (which can be shared between many
@@ -1247,7 +1271,9 @@ fi
 #
 # Note: if we don't delete them first, it can happen that an extra link
 # will be created in each directory that points to its parent. So to be
-# safe, we are deleting all the links on each re-configure of the project.
+# safe, we are deleting all the links on each re-configure of the
+# project. Note that at this stage, we are using the host's 'ln', not our
+# own, so its best not to assume anything (like 'ln -sf').
 rm -f .build .local
 
 ln -s "$bdir" .build
@@ -1260,21 +1286,6 @@ rm -f .gnuastro
 # ------------------------------------------
 
 
-# Temporary software un-packing/build directory: if the host has the
-# standard `/dev/shm' mounting-point, we'll do it in shared memory (on the
-# RAM), to avoid harming/over-using the HDDs/SSDs. The RAM of most systems
-# today (>8GB) is large enough for the parallel building of the software.
-#
-# For the name of the directory under `/dev/shm' (for this project), we'll
-# use the names of the two parent directories to the current/running
-# directory, separated by a `-' instead of `/'. We'll then appended that
-# with the user's name (in case multiple users may be working on similar
-# project names). Maybe later, we can use something like `mktemp' to add
-# random characters to this name and make it unique to every run (even for
-# a single user).
-tmpblddir="$sdir"/build-tmp
-rm -rf "$tmpblddir"/* "$tmpblddir" # If its a link, we need to empty its
-                                   # contents first, then itself.
 
 # Set the top-level shared memory location.
 if [ -d /dev/shm ]; then shmdir=/dev/shm
@@ -1300,7 +1311,7 @@ fi
 # symbolic link to it. Otherwise, just build the temporary build
 # directory under the project build directory.
 if [ x"$tbshmdir" = x ]; then mkdir "$tmpblddir";
-else ln -s "$tbshmdir" "$tmpblddir";
+else ln -s "$tbshmdir" "$tmpblddir";
 fi
 
 