Suggestion on good usage of /dev/shm in README-hacking.md

When you are working with large files on a system with plenty of RAM (large/powerful computers), it is beneficial to work in the shared-memory directory instead of the actual persistent storage (like an HDD or SSD). With this commit, a fully working demo has been added to `README-hacking.md` (under the tips on "Make programming") to show how to work effectively in situations like this.
author: Mohammad Akhlaghi <mohammad@akhlaghi.org> 2019-10-29 13:23:42 +0000
committer: Mohammad Akhlaghi <mohammad@akhlaghi.org> 2019-10-29 13:27:51 +0000
commit: 97b511e71f4f63dfe2551460a6d3a04729948613 (patch)
tree: 612cb2218c55a64e1e95e30e3a3284d08dbe28cc /README-hacking.md
parent: 1d4c587489cbb17c7ae9f6c0d0d5d919690fafde (diff)
1 files changed, 57 insertions, 6 deletions
diff --git a/README-hacking.md b/README-hacking.md
index 338f03a..9ecda33 100644
--- a/README-hacking.md
+++ b/README-hacking.md
@@ -913,12 +913,13 @@ for the benefit of others.
 
    - *Large files*: If you are dealing with very large files (thus having
       multiple copies of them for intermediate steps is not possible), one
-      solution is the following strategy. Set a small plain text file as
-      the actual target and delete the large file when it is no longer
-      needed by the project (in the last rule that needs it). Below is a
-      simple demonstration of doing this. In it, we use Gnuastro's
-      Arithmetic program to add all pixels of the input image with 2 and
-      create `large1.fits`. We then subtract 2 from `large1.fits` to create
+      solution is the following strategy (also see the next item on "Fast
+      access to temporary files"). Set a small plain text file as the
+      actual target and delete the large file when it is no longer needed
+      by the project (in the last rule that needs it). Below is a simple
+      demonstration of doing this. In it, we use Gnuastro's Arithmetic
+      program to add 2 to all pixels of the input image and create
+      `large1.fits`. We then subtract 2 from `large1.fits` to create
       `large2.fits` and delete `large1.fits` in the same rule (when its no
       longer needed). We can later do the same with `large2.fits` when it
       is no longer needed and so on.
@@ -938,6 +939,56 @@ for the benefit of others.
      possible to greatly simplify this repetitive statement and make the
      code even more readable throughout the whole project.
 
+   - *Fast access to temporary files*: Many Unix-like operating systems
+      (GNU/Linux in particular) provide a special shared-memory directory
+      called `/dev/shm`. It is actually in your RAM, not in persistent
+      storage like the HDD or SSD. Reading from and writing to RAM is much
+      faster than persistent storage, so if you have enough RAM available,
+      it can be very beneficial for large temporary files. You can make
+      random file names in `/dev/shm` (that don't exist at the time they
+      are created) with the `mktemp` command and use text files as targets
+      to keep the temporary name (as described in the item above under
+      "Large files") for later deletion. For example, see this fully
+      working Makefile (which you can actually put in a `Makefile` and run
+      if you have an `input.fits` in the same directory):
+        ```
+        .ONESHELL:
+        .SHELLFLAGS = -ec
+        all: mean-std.txt
+        shm-template = /dev/shm/$(shell whoami)-XXXXXXXXXX
+        large1.txt: input.fits
+                out=$$(mktemp $(shm-template))
+                astarithmetic $< 2 + --output=$$out.fits
+                echo "$$out" > $@
+        large2.txt: large1.txt
+                input=$$(cat $<)
+                out=$$(mktemp $(shm-template))
+                astarithmetic $$input.fits 2 - --output=$$out.fits
+                rm $$input.fits $$input
+                echo "$$out" > $@
+        mean-std.txt: large2.txt
+                input=$$(cat $<)
+                aststatistics $$input.fits --mean --std > $@
+                rm $$input.fits $$input
+        ```
+      The important point here is that the template has no suffix, so you
+      can add the suffix corresponding to your desired format. But more
+      importantly, when `mktemp` sets the random name, it also checks that
+      no file exists with that name and creates a file with that exact name
+      at that moment. So each recipe above that calls `mktemp` leaves two
+      files in `/dev/shm`: one empty file with no suffix and one with a
+      suffix. The role of the file without a suffix is just to ensure that
+      the same string of random characters will not be chosen by other
+      calls to `mktemp`, and it should be deleted together with the file
+      containing a suffix. This is the reason behind the
+      `rm $$input.fits $$input` command above: to make sure that first the
+      file with a suffix is deleted, then the core random file (note that
+      when working in parallel on powerful systems, other things can
+      happen even in the time between deleting two files of a single `rm`
+      command). When using this template, you can put the definition of
+      `shm-template` in `reproduce/analysis/make/initialize.mk` to be
+      usable in all the different Makefiles of your analysis.
+
 
  - **Software tarballs and raw inputs**: It is critically important to
      document the raw inputs to your project (software tarballs and raw
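
Note on the `mktemp` pattern added in this commit: the reservation behaviour described in the new README item can be tried outside of Make with a short shell sketch. It is only an illustration under stated assumptions: the `.dat` suffix, the `payload` text, the `-name-` name file and the fallback to `/tmp` (for systems without a writable `/dev/shm`) are hypothetical and are not part of the README's Makefile.

```shell
#!/bin/sh
# Sketch of the suffix-less mktemp pattern. The `.dat` suffix and the
# `payload` text are hypothetical stand-ins for a real large file.
set -eu

# Use /dev/shm when it is a writable directory, otherwise fall back to
# /tmp (the fallback is an assumption of this sketch only).
if [ -d /dev/shm ] && [ -w /dev/shm ]; then
    shmdir=/dev/shm
else
    shmdir=/tmp
fi

# mktemp replaces the X's and creates an empty file with that exact
# name, which reserves the random string against other mktemp calls.
base=$(mktemp "$shmdir/$(whoami)-XXXXXXXXXX")

# Write the actual data next to the reservation, with a chosen suffix.
printf 'payload\n' > "$base.dat"

# Keep the base name in a small text file, as the Makefile targets do.
namefile=$(mktemp "$shmdir/$(whoami)-name-XXXXXXXXXX")
echo "$base" > "$namefile"

# Later step: recover the name, use the data, then delete the suffixed
# file first and the empty reservation file second.
input=$(cat "$namefile")
content=$(cat "$input.dat")
rm "$input.dat" "$input"
rm "$namefile"
echo "$content"
```

Between the `mktemp` call and the final `rm`, both the empty base file and the suffixed file exist side by side, which is the state the README text describes at the end of a recipe that calls `mktemp`.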