author: Mohammad Akhlaghi <mohammad@akhlaghi.org> 2019-10-01 16:17:59 +0100
committer: Mohammad Akhlaghi <mohammad@akhlaghi.org> 2019-10-01 16:23:39 +0100
commit: 7caa2845304c40540a336f840b3ca468bf6c8697
tree: 2ee7942a848f6880e5e2f9c2252e365bc20b7e65
parent: 6f86ba0c1f84b9c349666254c2a9716ba2058a3b
Preparation phase added before final building
In many real-world scenarios, `./project make' can benefit greatly from having some basic information about the data before it runs, for example when querying a server. If we know how many datasets were downloaded and their general properties, we can greatly optimize the solution that is run in `./project make'. Therefore with this commit, a new phase has been added to the template's design: `./project prepare'. In the raw template this phase is empty, because the simple analysis done in the template doesn't warrant it. But everything is ready for projects using the template to add preparation phases prior to the analysis.
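The ordering that this commit introduces can be sketched in shell. This is a minimal, self-contained illustration and not the `project' script itself: the `phase_allowed' helper and the temporary directory are hypothetical, while the two sentinel file names (`configuration-done.txt' and `preparation-done.txt') are the ones checked in the diff of the `project' script.

```shell
# Hypothetical stand-in for the build directory (the real one is `.build').
bdir=$(mktemp -d)
mkdir -p "$bdir/software"

# phase_allowed PHASE: succeed only if the previous phase left its
# sentinel file behind (helper name is illustrative, not in the template).
phase_allowed() {
    case $1 in
        configure) return 0 ;;
        prepare)   [ -f "$bdir/software/configuration-done.txt" ] ;;
        make)      [ -f "$bdir/software/preparation-done.txt" ] ;;
        *)         return 1 ;;
    esac
}

# Before anything has run, `make' must be refused.
if phase_allowed make; then before=yes; else before=no; fi

# Simulate a successful `configure', then a successful `prepare'.
touch "$bdir/software/configuration-done.txt"
if phase_allowed prepare; then
    touch "$bdir/software/preparation-done.txt"
fi

# Now `make' is allowed to start.
if phase_allowed make; then after=yes; else after=no; fi
echo "make allowed before: $before, after: $after"

# Clean up the temporary directory.
rm -rf "$bdir"
```

Each phase only checks for the file written by the phase before it, so the chain stays simple: configure, then prepare, then make.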
-rw-r--r--  .file-metadata  bin 6250 -> 6573 bytes
-rw-r--r--  README-hacking.md  88
-rw-r--r--  README.md  23
-rwxr-xr-x  project  128
-rw-r--r--  reproduce/analysis/make/prepare.mk  35
-rw-r--r--  reproduce/analysis/make/top-make.mk (renamed from reproduce/analysis/make/top.mk)  0
-rw-r--r--  reproduce/analysis/make/top-prepare.mk  91
-rwxr-xr-x  reproduce/software/bash/configure.sh  6
8 files changed, 297 insertions, 74 deletions
diff --git a/.file-metadata b/.file-metadata
index b9fb074..f77bb41 100644
--- a/.file-metadata
+++ b/.file-metadata
Binary files differ
diff --git a/README-hacking.md b/README-hacking.md
index 30065c2..338f03a 100644
--- a/README-hacking.md
+++ b/README-hacking.md
@@ -245,11 +245,11 @@ In order to customize this template to your research, it is important to
first understand its architecture so you can navigate your way in the
directories and understand how to implement your research project within
its framework: where to add new files and which existing files to modify
-for what purpose. But before reading this theoretical discussion, please
-run the template (described in `README.md`: first run `./project
-configure`, then `./project make -j8`) without any change, just to see how
-it works (note that the configure step builds all necessary software, so it
-can take long, but you can read along while its working).
+for what purpose. But if this is the first time you are using this template,
+before reading this theoretical discussion, please run the template once
+from scratch without any changes (described in `README.md`). You will see
+how it works (note that the configure step builds all necessary software,
+so it can take long, but you can continue reading while it's working).
The project has two top-level directories: `reproduce` and
`tex`. `reproduce` hosts all the software building and analysis
@@ -266,28 +266,44 @@ do your project's analysis.
After it finishes, `./project configure` will create the following symbolic
links in the project's top source directory: `.build` which points to the
top build directory and `.local` for easy access to the custom built
-software installation directory.
-
-Once the project is configured for your system, `./project make` will doing
-the project's analysis with its own custom version of software. The process
-is managed through Make and `./project make` will start with
-`reproduce/analysis/make/top.mk` (called `top.mk` from now on).
-
-Let's continue the template's architecture with this file. `top.mk` is
-relatively short and heavily commented so hopefully the descriptions in
-each comment will be enough to understand the general details. As you read
-this section, please also look at the contents of the mentioned files and
-directories to fully understand what is going on.
-
-Before starting to look into the top `Makefile`, it is important to recall
-that Make defines dependencies by files. Therefore, the input/prerequisite
-and output of every step/rule must be a file. Also recall that Make will
-use the modification date of the prerequisite(s) and target files to see if
-the target must be re-built or not. Therefore during the processing, _many_
-intermediate files will be created (see the tips section below on a good
-strategy to deal with large/huge files).
-
-To keep the source and (intermediate) built files separate, you _must_
+software installation directory. With these you can easily access the build
+directory and project-specific software from your top source directory. For
+example if you run `.local/bin/ls` you will be using the `ls` of the
+template, which is probably different from your system's `ls` (run them
+both with `--version` to check).
+
+Once the project is configured for your system, `./project prepare` and
+`./project make` will do the basic preparations and run the project's
+analysis with the custom version of software. The `project` script is just
+a wrapper, and with the commands above, it will call `top-prepare.mk` and
+`top-make.mk` (both are in the `reproduce/analysis/make` directory).
+
+In the template, no particular preparation is necessary, so it will
+immediately finish and instruct you to run `./project make`. But in some
+projects, it can be very useful to do some very basic preparatory steps on
+the input data that can greatly optimize running of `./project make`. For
+example, you may need to query a server, to find how many input files there
+are. Once that number is known in the preparation phase, `./project make`
+can parallelize the analysis much more effectively.
+
+In terms of organization, `top-prepare.mk` and `top-make.mk` have a
+nearly identical design, with only minor differences. So, let's continue
+the template's architecture with `top-make.mk`. Once you understand that,
+you'll clearly understand `top-prepare.mk` also. These very high-level
+files are relatively short and heavily commented so hopefully the
+descriptions in each comment will be enough to understand the general
+details. As you read this section, please also look at the contents of the
+mentioned files and directories to fully understand what is going on.
+
+Before starting to look into `top-make.mk`, it is important to
+recall that Make defines dependencies by files. Therefore, the
+input/prerequisite and output of every step/rule must be a file. Also
+recall that Make will use the modification date of the prerequisite(s) and
+target files to see if the target must be re-built or not. Therefore during
+the processing, _many_ intermediate files will be created (see the tips
+section below on a good strategy to deal with large/huge files).
+
+To keep the source and (intermediate) built files separate, the user _must_
define a top-level build directory variable (or `$(BDIR)`) to host all the
intermediate files (you defined it during `./project configure`). This
directory doesn't need to be version controlled or even synchronized, or
@@ -295,7 +311,9 @@ backed-up in other servers: its contents are all products, and can be
easily re-created any time. As you define targets for your new rules, it is
thus important to place them all under sub-directories of `$(BDIR)`. As
mentioned above, you always have fast access to this "build"-directory with
-the `.build` symbolic link.
+the `.build` symbolic link. Also, take care to *never* manually change
+the files in the build-directory: just delete them (so they are
+re-built).
In this architecture, we have two types of Makefiles that are loaded into
the top `Makefile`: _configuration-Makefiles_ (only independent
@@ -350,10 +368,10 @@ other users access to the contents. Therefore the `./project configure` and
`./project make` steps must be called with special conditions which are
managed in the `--group` option.
-Let's see how this design is implemented. Please open and inspect `top.mk`
-it as we go along here. The first step (un-commented line) is to import the
-local configuration (your answers to the questions of `./project
-configure`). They are defined in the configuration-Makefile
+Let's see how this design is implemented. Please open and inspect
+`top-make.mk` as we go along here. The first step (un-commented line) is
+to import the local configuration (your answers to the questions of
+`./project configure`). They are defined in the configuration-Makefile
`reproduce/software/config/installation/LOCAL.mk` which was also built by
`./project configure` (based on the `LOCAL.mk.in` template of the same
directory).
@@ -607,9 +625,9 @@ First custom commit
grants. Since you are using it in your work, it is necessary to
acknowledge them in your work also.
- - `reproduce/analysis/make/top.mk`: Delete the `delete-me` line in the
- `makesrc` definition. Just make sure there is no empty line between
- the `download \` and `paper` lines.
+ - `reproduce/analysis/make/top-make.mk`: Delete the `delete-me` line
+ in the `makesrc` definition. Just make sure there is no empty line
+ between the `download \` and `paper` lines.
- Delete all `delete-me*` files in the following directories:
diff --git a/README.md b/README.md
index f0f6acc..7b319aa 100644
--- a/README.md
+++ b/README.md
@@ -21,6 +21,7 @@ received this source from arXiv, please see the respective section below.
$ git clone XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
$ cd XXXXXXXXXXXXXXXXXX
$ ./project configure
+$ ./project prepare
$ ./project make
```
@@ -76,11 +77,23 @@ requiring root/administrator permissions.
$ ./project configure
```
-3. Run the following command (local build of the Make software) to
- reproduce all the analysis and build the final `paper.pdf` on `8`
- threads. If your CPU has a different number of threads, change the
- number (you can see the number of threads available to your operating
- system by running `./.local/bin/nproc`)
+3. In some cases, the project's analysis may need some preparations to
+ optimize its processing. This is mainly related to the input data,
+ and to some very basic calculations that can help the management of
+ the overall project in the main/next step. To do the basic
+ preparations on `8` threads, please run the command below. If your
+ CPU has a different number of threads, change the number (you can see
+ the number of threads available to your operating system by running
+ `./.local/bin/nproc`).
+
+ ```shell
+ $ ./project prepare -j8
+ ```
+
+4. Run the following command to reproduce all the analysis and build the
+ final `paper.pdf` on `8` threads. If your CPU has a different number of
+ threads, change the number (you can see the number of threads available
+ to your operating system by running `./.local/bin/nproc`)
```shell
$ ./project make -j8
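Instead of hard-coding `8`, the thread count can be detected before calling the two phases. The sketch below is only an assumption about how one might do this from the system shell: it uses the system's `nproc` (the README points to the template's own `./.local/bin/nproc`) and falls back to `getconf _NPROCESSORS_ONLN`, then to `1`. It only prints the commands and does not run them.

```shell
# Detect the number of available threads; fall back to 1 if neither
# tool is available (both fallbacks are assumptions of this sketch).
jobs=$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)

# Print the two commands that would be run with that thread count.
echo "./project prepare -j$jobs"
echo "./project make -j$jobs"
```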
diff --git a/project b/project
index 14fc272..fcf32fd 100755
--- a/project
+++ b/project
@@ -65,12 +65,14 @@ print_help() {
# Print the output.
cat <<EOF
Usage: $scriptname configure [OPTIONS]
+ $scriptname prepare [OPTIONS]
$scriptname make [OPTIONS]
Top-level script to manage the reproducible project. The high-level
operation is defined by the (mandatory) second argument:
configure - Configure project for this machine (e.g., build software).
+ prepare - Low-level preparations to optimize building with 'make'.
make - Run the project (do analysis and build outputs).
RECOMMENDATION: If this is the first time you are configuring this
@@ -147,6 +149,7 @@ do
case $1 in
# Main operation.
configure) func_operation_set $1; shift;;
+ prepare) func_operation_set $1; shift;;
make) func_operation_set $1; shift;;
@@ -218,6 +221,74 @@ fi
+# Run operations in controlled environment
+# ----------------------------------------
+controlled_env() {
+ # Get the full address of the build directory:
+ bdir=`.local/bin/realpath .build`
+
+ # Remove all existing environment variables (with `env -i') and only
+ # use some pre-defined environment variables, then build the project.
+ envmake=".local/bin/env -i HOME=$bdir sys_rm=$(which rm) $gopt"
+ envmake="$envmake .local/bin/make -f $1"
+ if ! [ x"$debug" = x ]; then envmake="$envmake --debug=$debug"; fi
+
+ # Set the number of jobs. Note that for the `configure.sh' script the
+ # default value has to be 0, so the default is the maximum number of
+ # threads. But here, the default value is 1.
+ if ! [ x"$jobs" = x0 ]; then envmake="$envmake -j$jobs"; fi
+
+ # Run the project
+ if [ x"$group" = x ]; then
+ $envmake $make_targets
+ else
+ # Set the group and permission flags.
+ sg "$group" "umask $perms && $envmake $make_targets"
+ fi
+}
+
+
+
+
+
+# Error messages
+# --------------
+#
+# Having the error messages here helps the over-all process be more
+# readable.
+print_error_abort() {
+ case $1 in
+ prepare)
+ cat <<EOF
+
+The project isn't configured for this system, or the configuration wasn't
+successful. To configure the project, please use this command:
+
+ $ ./project configure
+
+(TIP: if you have already run this command once, run it with '-e' to use
+the previous configuration, run with '--help' for more info)
+
+EOF
+ exit 1;
+ ;;
+ make)
+ cat <<EOF
+
+The project preparation hasn't been completed, or it wasn't successful. To
+prepare the project prior to building it, please use this command:
+
+ $ ./project prepare
+
+EOF
+ exit 1;
+ ;;
+ esac
+}
+
+
+
+
# Do requested operation
# ----------------------
perms="u+r,u+w,g+r,g+w,o-r,o-w,o-x"
@@ -280,54 +351,49 @@ case $operation in
fi
;;
- # Run the project
- make)
+
+
+
+
+ # Run the input management.
+ prepare)
# Make sure the configure script has been completed properly
# (`configuration-done.txt' exists).
if ! [ -f .build/software/configuration-done.txt ]; then
- cat <<EOF
+ print_error_abort $operation
+ fi
-The project isn't configured for this system, or the configuration wasn't
-successful. To configure the project, please use this command:
+ # Run input-preparations in the controlled environment
+ controlled_env reproduce/analysis/make/top-prepare.mk
+ ;;
- '$ ./project configure'
-(TIP: if you have already ran this command once, run it with '-e' to use
-the previous configuration, run with '--help' for more info)
-EOF
- exit 1
+
+
+ # Run the project
+ make)
+
+ # Make sure the preparation phase has been completed properly
+ # (`preparation-done.txt' exists).
+ if ! [ -f .build/software/preparation-done.txt ]; then
+ print_error_abort $operation
fi
- # Get the full address of the build directory:
- bdir=`.local/bin/realpath .build`
+ # Run the actual project.
+ controlled_env reproduce/analysis/make/top-make.mk
+ ;;
- # Remove all existing environment variables (with `env -i') and
- # only use some pre-defined environment variables, then build the
- # project.
- envmake=".local/bin/env -i HOME=$bdir sys_rm=$(which rm) $gopt"
- envmake="$envmake .local/bin/make -f reproduce/analysis/make/top.mk"
- if ! [ x"$debug" = x ]; then envmake="$envmake --debug=$debug"; fi
- # Set the number of jobs. Note that for the `configure.sh' script
- # the default value has to be 0, so the default is the maximum
- # number of threads. But here, the default value is 1.
- if ! [ x"$jobs" = x0 ]; then envmake="$envmake -j$jobs"; fi
- # Run the project
- if [ x"$group" = x ]; then
- $envmake $make_targets
- else
- # Set the group and permission flags.
- sg "$group" "umask $perms && $envmake $make_targets"
- fi
- ;;
# Operation not specified.
*)
- echo "No operation defined (you can give 'configure' or 'make')."
+ echo "No operation defined."
+ echo "Please run with '--help' for more information."
+ echo "Available operations are: 'configure', 'prepare', or 'make'."
exit 1
;;
esac
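The `controlled_env` function in this diff relies on `env -i` to start Make with an empty environment. The snippet below is only a sketch of that effect, using the system's `env` and `/bin/sh` rather than the template's `.local/bin/env`. The `LEAKY_SETTING` variable and the temporary directory are made up for the demonstration.

```shell
# A throwaway directory stands in for the build directory (assumption).
bdir=$(mktemp -d)

# A variable exported in the caller's shell...
export LEAKY_SETTING="from-the-host"

# ...is invisible inside `env -i': only variables that are passed
# explicitly on the command line (here HOME) reach the child process.
inside=$(env -i HOME="$bdir" /bin/sh -c \
             'printf "%s|%s" "$HOME" "${LEAKY_SETTING:-unset}"')
printf '%s\n' "$inside"

# Clean up the temporary directory.
rm -rf "$bdir"
```

The child shell reports the temporary directory as its `HOME` and sees `LEAKY_SETTING` as unset, which is why the analysis does not silently depend on the host's settings.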
diff --git a/reproduce/analysis/make/prepare.mk b/reproduce/analysis/make/prepare.mk
new file mode 100644
index 0000000..3e41fa9
--- /dev/null
+++ b/reproduce/analysis/make/prepare.mk
@@ -0,0 +1,35 @@
+# Basic preparations, called by `./project prepare'.
+#
+# Copyright (C) 2019 Mohammad Akhlaghi <mohammad@akhlaghi.org>
+#
+# This Makefile is free software: you can redistribute it and/or modify it
+# under the terms of the GNU General Public License as published by the
+# Free Software Foundation, either version 3 of the License, or (at your
+# option) any later version.
+#
+# This Makefile is distributed in the hope that it will be useful, but
+# WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General
+# Public License for more details. See <http://www.gnu.org/licenses/>.
+
+
+
+
+
+# Final-target
+#
+# Without this file, `./project make' won't work.
+$(BDIR)/software/preparation-done.txt:
+
+ # If you need to add preparations, define targets above to do the
+ # preparations. Recall that before this file, `top-prepare.mk'
+ # loads `initialize.mk' and `download.mk', so you can safely use
+ # everything that is defined there in this Makefile.
+ #
+ # TIP: the targets can actually be automatically generated
+ # Makefiles that are used by `./project make'. They can include
+ # variables, or actual rules. Just make sure that those Makefiles
+ # aren't written in the source directory! Even though they are
+ # Makefiles, they are automatically built, so they should be
+ # somewhere under $(BDIR).
+ @touch $@
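As an illustration of the TIP in the comment above, a preparation recipe could count the inputs and write the result as a Make variable into an automatically generated Makefile under the build directory. The following shell sketch uses hypothetical names throughout (the `inputs` directory, `prepare-results.mk` and the `num-inputs` variable are not part of the template) and a temporary directory in place of `$(BDIR)`.

```shell
# Temporary directory standing in for $(BDIR) (assumption for this demo).
bdir=$(mktemp -d)
mkdir -p "$bdir/inputs" "$bdir/prepare"

# Pretend three datasets were downloaded during preparation.
for i in 1 2 3; do : > "$bdir/inputs/data-$i.fits"; done

# Count the inputs and record the number as a Make variable in a
# generated Makefile that lives under the build directory.
num=$(ls "$bdir/inputs" | wc -l | tr -d ' ')
generated="$bdir/prepare/prepare-results.mk"
printf 'num-inputs = %s\n' "$num" > "$generated"

# Show what was written, then clean up.
content=$(cat "$generated")
printf '%s\n' "$content"
rm -rf "$bdir"
```

A Makefile loaded later by `./project make' could then `include' such a file and use the variable, for example to define one target per input.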
diff --git a/reproduce/analysis/make/top.mk b/reproduce/analysis/make/top-make.mk
index 7d20800..7d20800 100644
--- a/reproduce/analysis/make/top.mk
+++ b/reproduce/analysis/make/top-make.mk
diff --git a/reproduce/analysis/make/top-prepare.mk b/reproduce/analysis/make/top-prepare.mk
new file mode 100644
index 0000000..3353638
--- /dev/null
+++ b/reproduce/analysis/make/top-prepare.mk
@@ -0,0 +1,91 @@
+# Do basic preparations to optimize the project's running.
+#
+# NOTE: This file is very similar to `top-make.mk', so the large comments
+# are not included here. Please see that file for thorough comments on each
+# step.
+#
+# Copyright (C) 2019 Mohammad Akhlaghi <mohammad@akhlaghi.org>
+#
+# This Makefile is free software: you can redistribute it and/or modify it
+# under the terms of the GNU General Public License as published by the
+# Free Software Foundation, either version 3 of the License, or (at your
+# option) any later version.
+#
+# This Makefile is distributed in the hope that it will be useful, but
+# WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General
+# Public License for more details.
+#
+# A copy of the GNU General Public License is available at
+# <http://www.gnu.org/licenses/>.
+
+
+
+
+
+# Load the local configuration (created after running
+# `./project configure').
+include reproduce/software/config/installation/LOCAL.mk
+
+
+
+
+
+# Ultimate target of this project
+# -------------------------------
+#
+# See `top-make.mk' for complete explanation.
+ifeq (x$(reproducible_paper_group_name),x$(GROUP-NAME))
+all: $(BDIR)/software/preparation-done.txt
+ @echo "";
+ echo "----------------"
+ echo "Project preparation has been completed without any errors."
+ echo ""
+ echo "Please run the following command to start building the project."
+ echo "(Replace '8' with the number of CPU threads on your system)"
+ echo ""
+ if [ "x$(GROUP-NAME)" = x ]; then
+ echo " $$ ./project make -j8"
+ else
+ echo " $$ ./project make --group=$(GROUP-NAME) -j8"
+ fi
+ echo ""
+else
+all:
+ @if [ "x$(GROUP-NAME)" = x ]; then
+ echo "Project is NOT configured for groups, please run"
+ echo " $$ ./project prepare"
+ else
+ echo "Project is configured for groups, please run"
+ echo " $$ ./project prepare --group=$(GROUP-NAME) -j8"
+ fi
+endif
+
+
+
+
+
+# Define source Makefiles
+# -----------------------
+#
+# See `top-make.mk' for complete explanation.
+#
+# To ensure that `prepare' and `make' have the same basic definitions and
+# environment and that all `downloads' are managed in one place, both
+# `./project prepare' and `./project make' will first read `initialize.mk'
+# and `download.mk'.
+makesrc = initialize \
+ download \
+ prepare
+
+
+
+
+
+# Include all analysis Makefiles
+# ------------------------------
+#
+# See `top-make.mk' for complete explanation.
+include reproduce/analysis/config/*.mk
+include reproduce/software/config/installation/versions.mk
+include $(foreach s,$(makesrc), reproduce/analysis/make/$(s).mk)
diff --git a/reproduce/software/bash/configure.sh b/reproduce/software/bash/configure.sh
index 5c46496..7ef576a 100755
--- a/reproduce/software/bash/configure.sh
+++ b/reproduce/software/bash/configure.sh
@@ -1387,9 +1387,9 @@ echo `.local/bin/date` > $finaltarget
# The configuration is now complete, we can inform the user on the next
# step(s) to take.
if [ x$reproducible_paper_group_name = x ]; then
- buildcommand="./project make -j8"
+ buildcommand="./project prepare -j8"
else
- buildcommand="./project make --group=$reproducible_paper_group_name -j8"
+ buildcommand="./project prepare --group=$reproducible_paper_group_name -j8"
fi
cat <<EOF
@@ -1397,7 +1397,7 @@ cat <<EOF
The project and its environment are configured with no errors.
Please run the following command to start.
-(Replace '8' with the number of CPU threads)
+(Replace '8' with the number of CPU threads on your system)
$buildcommand