From 991f4c25729ac2526f30abf3d68111dd820fbfed Mon Sep 17 00:00:00 2001
From: Mohammad Akhlaghi
Date: Sun, 28 Jun 2020 03:18:54 +0100
Subject: README.md now has descriptions to build a Dockerfile

Docker is a very commonly used program these days for building projects
in an almost independent operating system. So the instructions to build
a Dockerfile for the project were added in README.md.
---
 README.md | 183 ++++++++++++++++++++++++++++++++++++++++++++++++++++----------
 1 file changed, 154 insertions(+), 29 deletions(-)

(limited to 'README.md')

diff --git a/README.md b/README.md
index bb82d3c..3de9300 100644
--- a/README.md
+++ b/README.md
@@ -7,21 +7,30 @@ See the end of the file for license conditions.
 This is the reproducible project source for the paper titled "**Towards
 Long-term and Archivable Reproducibility**", by Mohammad Akhlaghi, Raúl
 Infante-Sainz, Boudewijn F. Roukema, David Valls-Gabaud, Roberto
-Baena-Gallé.
-
-To reproduce the results and final paper, the only dependency is a minimal
-Unix-based building environment including a C compiler (already available
-on your system if you have ever built and installed a software from source)
-and a downloader (Wget or cURL). Note that **Git is not mandatory**: if you
-don't have Git to run the first command below, go to the URL given in the
-command on your browser, and download the project's source (there is a
-button to download a compressed tarball of the project). If you have
-received this source from arXiv, please see the respective section below.
-
-*IMPORTANT NOTE*: If you want to build using a distributed tarball that
-isn't under Git's version control, see the points below under building
-project tarball, a few minor modifications need to be made before starting
-the project configuration and build.
+Baena-Gallé, see [arXiv:2006.03018](https://arxiv.org/abs/2006.03018) or
+[zenodo.3872247](https://doi.org/10.5281/zenodo.3872247).
+
+To learn more about the purpose, principles and technicalities of this
+reproducible paper, please see `README-hacking.md`. In the "Quick start"
+section below we show a minimal set of commands to clone and reproduce the
+full project using Git. In the next section the commands are explained
+more. The following section describes how to deal with a tarball of the
+project's source (not using Git). In the last section building the project
+within a Docker container is described.
+
+
+
+
+
+### Quick start (using Git, with internet access)
+
+Run these commands to clone this project's history, enter it, configure it
+(let it build and install its own software) and "make" it (let it
+reproduce its analysis). If you already have the project on your system,
+you can ignore the first step (cloning). In the core Maneage branch, all
+operations will be done in the build-directory that you specify at
+configure time, no root permissions are required and no other part of your
+filesystem is affected.
 
 ```shell
 $ git clone https://gitlab.com/makhlaghi/maneage-paper
@@ -30,12 +39,6 @@ $ ./project configure
 $ ./project make
 ```
 
-To learn more about the purpose, principles and technicalities of this
-reproducible paper, please see `README-hacking.md`. For a general
-introduction to reproducible science as implemented in this project
-(through Maneage), please see Maneage project's webpage at
-https://maneage.org.
-
 
 
 
@@ -47,11 +50,16 @@ requiring root/administrator permissions.
 
  1. Necessary dependencies:
 
-   1.1: Minimal software building tools like C compiler, Make, and other
-        tools found on any Unix-like operating system (GNU/Linux, BSD, Mac
-        OS, and others). All necessary dependencies will be built from
-        source (for use only within this project) by the `./project
-        configure` script (next step).
+   1.1: Minimal software building tools like a C compiler and other very
+        basic POSIX tools found on any Unix-like operating system
+        (GNU/Linux, BSD, Mac OS, and others). All necessary dependencies
+        will be built from source (for use only within this project) by the
+        `./project configure` script (next step). Note that **Git is not
+        mandatory**: if you don't have Git to run the first command above,
+        go to the URL given in the command on your browser, and download
+        the project's source (there is a button to download a compressed
+        tarball of the project). You can also get the project's source as a
+        tarball from arXiv or Zenodo.
 
    1.2: (OPTIONAL) Tarball of dependencies. If they are already present (in
         a directory given at configuration time), they will be
@@ -60,8 +68,7 @@ requiring root/administrator permissions.
         collected in the archived project on Zenodo (link below). Just
         unpack that tarball, and when `./project configure` asks for the
         "software tarball directory", give the address of the unpacked
-        directory that has all the tarballs.
-        https://doi.org/10.5281/zenodo.3872248
+        directory: https://doi.org/10.5281/zenodo.3911395
 
  2. Configure the environment (top-level directories in particular) and
     build all the necessary software for use in the next step. It is
@@ -93,7 +100,7 @@ requiring root/administrator permissions.
 
 
 
-### Building project tarball (possibly from arXiv)
+### Building project tarball (without Git)
 
 If the paper is also published on arXiv, it is highly likely that the
 authors also uploaded/published the full project there along with the LaTeX
@@ -186,6 +193,124 @@ finally create the final paper).
 
 
 
+### Building in Docker containers
+
+Docker containers are a common way to build projects in an almost
+independent filesystem and operating system. They also allow using a
+minimal GNU/Linux operating system for each project within proprietary
+operating systems like macOS or Windows. The steps below describe the
+necessary components of a `Dockerfile` to build this project in a Docker
+image with some explanations on each. You can just copy the code parts of
+each item into a plain-text file called `Dockerfile` and apply the
+necessary corrections in the copying phase (step 4), then run this command
+to build the Docker image (note that Docker only runs as root!):
+
+```shell
+docker build ./
+```
+
+**NOTE: Internet necessary for TeXLive:** With the commands below in your
+`Dockerfile`, you can disable the image's internet just after downloading
+the necessary packages (step 2). However, until [task
+15267](https://savannah.nongnu.org/task/?15267) is complete, the project
+will need internet access to download the necessary TeXLive packages (in
+the `./project configure` phase) to build the final PDF. Without TeXLive,
+the analysis will be exactly reproduced, LaTeX macros will be created and
+everything will be verified successfully (all in the build directory),
+however, no PDF will be built to visualize/combine them in one file.
+
+ 1. **Choose the base operating system:** The first step is to select the
+    operating system that will be used in the Docker image. Note that your
+    choice of operating system also determines the commands of the next
+    step.
+
+    ```
+    FROM debian:stable-slim
+    ```
+
+ 2. **Necessary packages:** By default the "slim" versions of the operating
+    systems don't contain a compiler, so you need to use their package
+    managers to get them. Also, currently (until [task
+    15481](https://savannah.nongnu.org/task/?15481) is complete), Maneage
+    doesn't yet build Xorg libraries that are necessary in tools like
+    Ghostscript to build PDFs (not related to the project's analysis).
+
+    ```
+    # C and C++ compiler.
+    RUN apt-get update && apt-get install -y gcc g++
+
+    # Necessary Xorg libraries (which aren't yet installed, see task 15481).
+    RUN apt-get install -y libxext-dev libxt-dev libsm-dev libice-dev
+
+    # Uncomment this for a text editor (to modify files after image is built).
+    #RUN apt-get install -y nano
+    ```
+
+ 3. **Define a user:** Some packages will complain if you try to install
+    them as the default (root) container user. Generally, it's also good
+    practice to avoid being the root user. After building the Docker image,
+    you can always run it as root with this command: `docker run -u 0 -it
+    XXXXXXX` (where `XXXXXXX` is the image identifier).
+
+    ```
+    RUN useradd -ms /bin/sh maneager
+    USER maneager
+    WORKDIR /home/maneager
+    ```
+
+ 4. **Copy project files into the container:** these commands make the
+    following assumptions:
+
+    * The project's source is in the `maneaged-project/` subdirectory of
+      the directory that you will run `docker build` in. The source can be
+      either from Git or from a tarball, both described above (note that
+      arXiv's tarball needs to be corrected as mentioned above).
+
+    * (OPTIONAL, with internet) The project's software tarball (packaged in
+      `software-XXXX.tar.gz` and downloadable from the Zenodo link above,
+      just correct the `XXXX` part manually) is in the same directory that
+      you will run `docker build` in. This is not mandatory: if you have
+      internet, the project will download its necessary software
+      automatically.
+
+    * (OPTIONAL, with internet) The project's input data. The `INPUT-FILES`
+      depends on the project, please look into the project's
+      `reproduce/analysis/config/INPUTS.conf` for the URLs and file
+      names. This is not mandatory: if you have internet, the project will
+      download its necessary input data automatically.
+
+    ```
+    RUN mkdir build
+    COPY --chown=maneager:maneager ./maneaged-project /home/maneager/source
+
+    # Optional (for software and data, if internet is available)
+    RUN mkdir data
+    COPY --chown=maneager:maneager ./INPUT-FILES /home/maneager/data
+    COPY --chown=maneager:maneager ./software-XXXX.tar.gz /home/maneager/
+    RUN tar xf software-XXXX.tar.gz && mv software-XXXX software && rm software-XXXX.tar.gz
+    ```
+
+ 5. **Configure the project:** The Docker image will configure the project
+    (let the project build all its necessary software).
+
+    ```
+    RUN cd /home/maneager/source \
+        && ./project configure --build-dir=/home/maneager/build \
+                               --software-dir=/home/maneager/software \
+                               --input-dir=/home/maneager/data
+    ```
+
+ 6. **Do the project's analysis:** You are now ready to add the instruction
+    to automatically reproduce the project's analysis.
+
+    ```
+    RUN cd /home/maneager/source && ./project make
+    ```
+
+
+
+
+
 ### Copyright information
 
 This file and `.file-metadata` (a binary file, used by Metastore to store
-- 
cgit v1.2.1