-rw-r--r-- | .file-metadata | bin | 3721 -> 4007 bytes
-rw-r--r-- | README-hacking.md | 73
-rw-r--r-- | reproduce/src/make/dependencies-basic.mk | 63
3 files changed, 75 insertions, 61 deletions
diff --git a/.file-metadata b/.file-metadata
Binary files differ
index 1176014..d58756a 100644
--- a/.file-metadata
+++ b/.file-metadata
diff --git a/README-hacking.md b/README-hacking.md
index 073138b..e7a3f44 100644
--- a/README-hacking.md
+++ b/README-hacking.md
@@ -926,38 +926,51 @@ Future improvements
 ===================
 This is an evolving project and as time goes on, it will evolve and become
-more robust. Here are the list of features that we plan to add in the
-future.
-
- - *Containers*: It is important to have better/full control of the
- environment of the reproduction pipeline. Our current reproducible
- paper pipeline builds the higher-level programs (for example GNU Bash,
- GNU Make, GNU AWK and etc) it needs and sets `PATH` to prefer its own
- builds. It currently doesn't build and use its own version of
- lower-level tools (like the C library and compiler). We plan to add the
- build steps of these low level tools so the system's `PATH` can be
- completely ignored within the pipeline and we are in full control of
- the whole build process. Another solution is based on [an interesting
- tutorial](https://mozillafoundation.github.io/2017-fellows-sf/re-papers/index.html)
- by the Mozilla science lab to build reproducible papers. It suggests
- using the [Nix package manager](https://nixos.org/nix/about.html) to
- build the necessary software for the pipeline and run the pipeline in
- its completely closed environment. This is an interesting solution
- because using Nix or [Guix](https://www.gnu.org/software/guix/) (which
- is based on Nix, but uses the [Scheme
- language](https://en.wikipedia.org/wiki/Scheme_(programming_language)),
- not a custom language for the management) will allow a fully working
- closed environment on the host system which contains the instructions
- on how to build the environment. The availability of the instructions
- to build the programs and environment with Nix or Guix, makes them a
- better solution than binary containers like
- [docker](https://www.docker.com/) which are essentially just a binary
- (not human readable) black box and only usable on the given CPU
- architecture. However, one limitation of using these is their own
- installation (which usually requires root access).
-
+more robust. Some of the most prominent issues we plan to implement in the
+future are listed below, please join us if you are interested.
+
+Package management
+------------------
+
+It is important to have control of the environment of the reproduction
+pipeline. The current reproducible paper template builds the higher-level
+programs (for example GNU Bash, GNU Make, GNU AWK and domain-specific
+software) it needs, then sets `PATH` so the analysis is done only with the
+pipeline's built software. But currently the configuration of each program
+is in the Makefile rules that build it. This is not good because a change
+in the build configuration does not automatically cause a re-build. Also,
+each separate project on a system needs to have its own built tools (that
+can waste a lot of space).
+
+A good solution is based on the [Nix package
+manager](https://nixos.org/nix/about.html): a separate file is present for
+each software, containing all the necessary info to build it (including its
+URL, its tarball MD5 hash, dependencies, configuration parameters, build
+steps and etc). Using this file, a script can automatically generate the
+Make rules to download, build and install program and its dependencies
+(along with the dependencies of those dependencies and etc).
+
+All the software are installed in a "store". Each installed file (library
+or executable) is prefixed by a hash of this configuration (and the OS
+architecture) and the standard program name. For example (from the Nix
+webpage):
+```
+/nix/store/b6gvzjyb2pg0kjfwrjmg1vfhh54ad73z-firefox-33.1/
+```
+The important thing is that the "store" is *not* in the pipeline's search
+path. After the complete installation of the software, symbolic links are
+made to populate the pipeline's program and library search paths without a
+hash. This hash will be unique to that particular software and its
+particular configuration. So simply by searching for this hash in the
+installed directory, we can find the installed files of that software to
+generate the links.
+
+This scenario has several advantages: 1) a change in a software's build
+configuration triggers a rebuild. 2) a single "store" can be used in many
+projects, thus saving space and configuration time for new projects (that
+commonly have large overlaps in lower-level programs).
diff --git a/reproduce/src/make/dependencies-basic.mk b/reproduce/src/make/dependencies-basic.mk
index fefda6f..cb3a7f2 100644
--- a/reproduce/src/make/dependencies-basic.mk
+++ b/reproduce/src/make/dependencies-basic.mk
@@ -718,7 +718,9 @@ $(ibdir)/ld: $(tdir)/binutils-$(binutils-version).tar.lz
 # We want to build GCC after building all the basic tools that are often
 # used in a configure script to enable GCC's configure script to work as
 # smoothly/robustly as possible.
-# Including `objc, obj-c++' is necessary for installing matplotlib.
+#
+# Including Objective C and Objective C++ is necessary for installing
+# `matplotlib'.
 $(ibdir)/gcc: $(tdir)/gcc-$(gcc-version).tar.xz \
 $(ibdir)/ls \
 $(ibdir)/sed \
@@ -737,34 +739,33 @@ $(ibdir)/gcc: $(tdir)/gcc-$(gcc-version).tar.xz \
 $(ildir)/libgfortran* $(ildir)/libstdc*
 rm $(idir)/x86_64*
 # Un-pack all the necessary tools in the top building directory
- cd $(ddir); \
- rm -rf gcc-build gcc-$(gcc-version); \
- tar xf $< && \
- mkdir $(ddir)/gcc-build && \
- cd $(ddir)/gcc-build && \
- ../gcc-$(gcc-version)/configure SHELL=$(ibdir)/bash \
- --prefix=$(idir) \
- --with-mpc=$(idir) \
- --with-mpfr=$(idir) \
- --with-gmp=$(idir) \
- --with-isl=$(idir) \
- --with-build-time-tools=$(idir) \
- --enable-shared \
- --disable-multilib \
- --disable-multiarch \
- --enable-threads=posix \
- --with-local-prefix=$(idir) \
- --enable-linker-build-id \
- --enable-lto \
- --enable-languages=c,c++,fortran,objc,obj-c++ \
- --disable-libada \
- --disable-nls \
- --enable-default-pie \
- --enable-default-ssp \
- --enable-cet=auto \
- --enable-decimal-float && \
- make SHELL=$(ibdir)/bash -j$$(nproc) && \
- make SHELL=$(ibdir)/bash install && \
- cd .. && \
+ cd $(ddir); \
+ rm -rf gcc-build gcc-$(gcc-version); \
+ tar xf $< && \
+ mkdir $(ddir)/gcc-build && \
+ cd $(ddir)/gcc-build && \
+ ../gcc-$(gcc-version)/configure SHELL=$(ibdir)/bash \
+ --prefix=$(idir) \
+ --with-mpc=$(idir) \
+ --with-mpfr=$(idir) \
+ --with-gmp=$(idir) \
+ --with-isl=$(idir) \
+ --with-build-time-tools=$(idir) \
+ --enable-shared \
+ --disable-multilib \
+ --disable-multiarch \
+ --enable-threads=posix \
+ --with-local-prefix=$(idir) \
+ --enable-linker-build-id \
+ --enable-lto \
+ --enable-languages=c,c++,fortran,objc,obj-c++ \
+ --disable-libada \
+ --disable-nls \
+ --enable-default-pie \
+ --enable-default-ssp \
+ --enable-cet=auto \
+ --enable-decimal-float && \
+ make SHELL=$(ibdir)/bash -j$$(nproc) && \
+ make SHELL=$(ibdir)/bash install && \
+ cd .. && \
 rm -rf gcc-build gcc-$(gcc-version)
-
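
Note on the "Package management" text added to README-hacking.md: the store-and-symlink idea it describes could be sketched roughly as below. This is only an illustration under stated assumptions, not the project's implementation. The function names (`store_name`, `install_and_link`), the choice of SHA-256 truncated to 32 characters, the `bin` sub-directory and the placeholder "build" step are all hypothetical. The README only says that a hash of the build configuration and architecture prefixes the program name, and that hash-free symbolic links populate the search path.

```python
import hashlib
import os
import tempfile


def store_name(name, version, config, arch):
    """Return '<hash>-<name>-<version>' for one build configuration.

    Everything that affects the build goes into the hash, so changing any
    configuration value yields a different store entry (hypothetical scheme).
    """
    fields = [name, version, arch]
    fields += [f"{key}={config[key]}" for key in sorted(config)]
    digest = hashlib.sha256("\n".join(fields).encode()).hexdigest()[:32]
    return f"{digest}-{name}-{version}"


def install_and_link(store, bindir, name, version, config, arch, programs):
    """Put placeholder programs in the store and link them into bindir.

    The store itself is never added to the search path; only the hash-free
    symbolic links in bindir are visible to the pipeline.
    """
    entry = os.path.join(store, store_name(name, version, config, arch), "bin")
    os.makedirs(entry, exist_ok=True)
    os.makedirs(bindir, exist_ok=True)
    for prog in programs:
        target = os.path.join(entry, prog)
        # Placeholder for the real download/configure/make/install steps.
        with open(target, "w") as handle:
            handle.write("#!/bin/sh\n")
        link = os.path.join(bindir, prog)
        if os.path.lexists(link):
            os.remove(link)  # Re-point the link after a rebuild.
        os.symlink(target, link)
    return entry


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as top:
        entry = install_and_link(
            os.path.join(top, "store"), os.path.join(top, "bin"),
            "hello", "1.0", {"prefix": "/opt"}, "x86_64", ["hello"],
        )
        print(entry)
        print(os.readlink(os.path.join(top, "bin", "hello")))
```

Because the hash covers the whole configuration, a rule generator could use the store entry as the Make target: a changed option gives a new target name, which is one way to get the "change triggers a rebuild" property mentioned in the README text.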