author | Mohammad Akhlaghi <mohammad@akhlaghi.org> | 2021-01-04 03:47:07 +0000 |
---|---|---|
committer | Mohammad Akhlaghi <mohammad@akhlaghi.org> | 2021-01-04 03:47:07 +0000 |
commit | 402070be9b15e731f463be1758ca8eaab56cd7cc (patch) | |
tree | 4c94373cf3f3b8dc229b06b78417bddc3acdbaf6 | |
parent | 624ccc1326a5b9e86561bedcb97a9f04851e7067 (diff) | |
parent | a1a966a598eb3693463aa5b0153f37ba22cfee6d (diff) |
Imported recent updates in Maneage, no conflicts
There weren't any conflicts in this merge: neither technical conflicts that
can be found by Git, nor logical conflicts (that would cause a crash in the
project).
-rw-r--r-- | README-hacking.md | 63 |
-rw-r--r-- | README.md | 26 |
-rw-r--r-- | reproduce/software/make/basic.mk | 5 |
3 files changed, 70 insertions, 24 deletions
diff --git a/README-hacking.md b/README-hacking.md
index 656a965..475f2ca 100644
--- a/README-hacking.md
+++ b/README-hacking.md
@@ -1010,14 +1010,17 @@ future.
 
      * *In plain-text*: If the data are in tabular form (for example the X
        and Y values in your plots), store them as a simple plain-text file
-       (for example with columns separated by white-space characters or in
+       (for example with columns separated by white-space characters) or in
        the more formal [Comma-separated
-       values](https://en.wikipedia.org/wiki/Comma-separated_values), or CSV,
-       format). If you have other types of data (for example images, or very
-       large tables with millions of rows/columns that can be inconvenient in
-       plain-text), feel free to use custom binary formats, but later, in the
-       description of your project on the server, tell people what software
-       they should use to open them.
+       values](https://en.wikipedia.org/wiki/Comma-separated_values) or CSV,
+       format). In the former case, its best to set the suffixes to `.txt`
+       (because most browsers/OSs will automatically know they are plain-text
+       and open them without needing any other software. If you have other
+       types of data (for example images, or very large tables with millions
+       of rows/columns that can be inconvenient in plain-text), feel free to
+       use custom binary formats, but later, in the description of your
+       project on the server, add a note, explaining what software they
+       should use to open them.
 
      * *Descriptive names*: In some papers there are many files and having
        cryptic names will only confuse your readers (actually, yourself in
@@ -1052,7 +1055,16 @@ future.
        is defined in `initialize.mk`. So you can use it anywhere in your
        project.
 
-     * *Copyright as metadata*: people need to know if they can use the
+     * *Same commit hashes*: each dataset may have been created at
+       different phases of your project's history. If you simply upload the
+       produced datasets, they may therefore have different commits on
+       them. To avoid confusing your readers (and your self in the future),
+       it is best that they all have the same commit hash (which will also
+       be the commit hash printed in the paper). So upon publication, we
+       recommend deleting all of them and running `./project make` to build
+       them all with the same commit hash.
+
+     * *Copyright as metadata*: people need to know if they can "use" the
        dataset (i.e., modify it), or possibly re-distribute it and their
        derived products. They also need to know how they can contact the
        creator of the datset (who is usually also the copyright owner). So
@@ -1065,10 +1077,11 @@ future.
    the plots should be uploaded directly to Zenodo so they can be
    viewed/downloaded with a simple link in the caption. For example see
    the last sentence of the caption of Figure 1 in
-   [arXiv:2006.03018](https://arxiv.org/pdf/2006.03018.pdf), it points to
-   [the data](https://zenodo.org/record/3872248/files/tools-per-year.txt)
-   that was used to create that figure's top plot. As you see, this will
-   allow your paper's readers (again, most probably your future-self!) to
+   [arXiv:2006.03018v1](https://arxiv.org/pdf/2006.03018v1.pdf), it points
+   to [the
+   data](https://zenodo.org/record/3872248/files/tools-per-year.txt) that
+   was used to create that figure's top plot. As you see, this will allow
+   your paper's readers (again, most probably your future-self!) to
    directly access the numbers of each visualization (plot/figure) with a
    simple click in a trusted server. This also shows the major advantage
    of having your data as simple plain-text where possible, as described
@@ -1104,20 +1117,24 @@ future.
 
  - **Fill `README.md`**: The `README.md` is *the first place* your readers
    are going to look into. It already has a default text with place-holders
-   in the form of `XXXXXX`. Please go through it and replace the
-   place-holders with the relevant information/links or feel free to
-   add/remove anything else. Just don't forget to tell your readers in
-   `README.md` that they can learn about this system in the
-   `README-hacking.md` file (ideally close to the top, like it is now).
+   in the form of `XXXXXX`. Please go through its first few paragraphs and
+   replace the place-holders with the relevant information/links or feel
+   free to add/remove anything else. The rest is just basic information
+   that is useful for any Maneage'd project. Just don't forget to tell your
+   readers in `README.md` that they can learn about this system in the
+   `README-hacking.md` file (ideally close to the top).
 
  - **Confirm if your project builds from scratch**: Before publishing
    anything, you should see if your project can indeed reproduce itself!
-   So, go to a temporary directory, clone your project from its repository
-   and try configuring and building it from scratch in a new-temporary
-   build-directory. It is important to ignore the directory you developed
-   your project on (source and build): you may have files there that you
-   forgot to import into Git or depended on in the build (it
-   happens!). Ideally, it would be good to try it on a different computer.
+   You may be mistakenly using temporarily created files that aren't built
+   when teh project is built from scratch (this happens a lot and is very
+   dangerous for the integrity of your project!). So, go to a temporary
+   directory, clone your project from its repository and try configuring
+   and building it from scratch in a new-temporary build-directory. It is
+   important to ignore the original directory you developed your project on
+   (source and build): you may have files there that you forgot to import
+   into Git or depended on in the build (it happens!). Ideally, it would be
+   good to try it on a different computer.
 
  - **Confirm if `./project make dist` works**: The special target `dist`
    tells the project to build a tarball that is ready to compile the LaTeX
@@ -240,6 +240,32 @@ build the final PDF, please disable internet after the configuration phase.
 Note that only the necessary TeXLive packages are installed (~350 MB),
 not the full TeXLive collection!
 
+   0. **Summary:** If you are already familiar with Docker, then the full
+      Dockerfile to get the project environment setup is shown here (without
+      any comments or explanations, because explanations are done in the next
+      items). Note that the last two `COPY` lines (to copy the directory
+      containing software tarballs used by the project and the possible input
+      databases) are optional because they will be downloaded if not
+      available. Once you build the Docker image, your project's environment
+      is setup and you can go into it to run `./project make` manually.
+
+      ```shell
+      FROM debian:stable-slim
+      RUN apt-get update && apt-get install -y gcc g++ wget
+      RUN useradd -ms /bin/sh maneager
+      USER maneager
+      WORKDIR /home/maneager
+      RUN mkdir build
+      RUN mkdir software
+      COPY --chown=maneager:maneager ./project-source /home/maneager/source
+      COPY --chown=maneager:maneager ./software-dir /home/maneager/software
+      COPY --chown=maneager:maneager ./data-dir /home/maneager/data
+      RUN cd /home/maneager/source \
+          && ./project configure --build-dir=/home/maneager/build \
+                                 --software-dir=/home/maneager/software \
+                                 --input-dir=/home/maneager/data
+      ```
+
    1. **Choose the base operating system:** The first step is to select the
       operating system that will be used in the docker image. Note that your
       choice of operating system also determines the commands of the next
diff --git a/reproduce/software/make/basic.mk b/reproduce/software/make/basic.mk
index c4f0a16..2a28e76 100644
--- a/reproduce/software/make/basic.mk
+++ b/reproduce/software/make/basic.mk
@@ -974,10 +974,13 @@ $(ibidir)/gmp-$(gmp-version): \
 # Less is useful with Git (to view the diffs within a minimal container)
 # and generally to view large files easily when the project is built in a
 # container with a minimal OS.
-$(ibidir)/less-$(less-version): $(ibidir)/patchelf-$(patchelf-version)
+$(ibidir)/less-$(less-version): $(ibidir)/ncurses-$(ncurses-version)
 	tarball=less-$(less-version).tar.gz
 	$(call import-source, $(less-url), $(less-checksum))
 	$(call gbuild, less-$(less-version), static,,-j$(numthreads))
+	if [ -f $(ibdir)/patchelf ]; then
+	  $(ibdir)/patchelf --set-rpath $(ildir) $(ibdir)/less;
+	fi
 	echo "Less $(less-version)" > $@
 
 # On Mac OS, libtool does different things, so to avoid confusion, we'll
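
The new `less` recipe in `basic.mk` no longer depends on PatchELF and only calls `patchelf` when it has already been installed. A minimal shell sketch of that guard, outside of Make, might look like the following. The function name `set_rpath_if_possible` and its three arguments are hypothetical (they stand in for the Makefile's `$(ibdir)`, `$(ildir)` and the program name); only the `-f` test and the `--set-rpath` option are taken from the diff.

```shell
# Hypothetical helper (not part of Maneage): set the RPATH of an installed
# program only when a 'patchelf' executable is present in the same directory.
#   $1: directory of installed binaries (stand-in for $(ibdir))
#   $2: directory of installed libraries (stand-in for $(ildir))
#   $3: name of the program to patch
set_rpath_if_possible () {
    bindir="$1"
    libdir="$2"
    prog="$3"
    if [ -f "$bindir/patchelf" ]; then
        # Same call as in the recipe: point the program to the project's
        # own library directory.
        "$bindir/patchelf" --set-rpath "$libdir" "$bindir/$prog"
    else
        # No PatchELF (yet): leave the binary as it was built.
        echo "patchelf not available; leaving $prog untouched"
    fi
}
```

Under these assumptions, calling `set_rpath_if_possible "$bindir" "$libdir" less` on a system without PatchELF only prints the notice, so the build of `less` does not fail just because PatchELF is missing.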