Age Commit message Author Lines
2020-12-01Imported recent work in Maneage, minor conflicts fixedMohammad Akhlaghi-182/+144
Some minor conflicts that came up during the merge were fixed.
2020-12-01Default paper: macros available for date of commits citedMohammad Akhlaghi-5/+12
Until now, Maneage only provided the commit hashes (of the project and Maneage) as LaTeX macros to use in your paper. However, they are too cryptic and not really human friendly (unless you have access to the Git history on a computer). With this commit, to make things easier for the readers, the dates of both commits are also available as LaTeX macros for use in the paper. The date of the Maneage commit is also included in the acknowledgements. Also, the paragraph above the acknowledgements has been updated with a better explanation of why adding this acknowledgement in science papers is good/necessary.
2020-12-01IMPORTANT: organizational improvements in Maneage TeX sourcesMohammad Akhlaghi-334/+298
This only concerns the TeX sources in the default branch. In case you don't use them, there should only be a clean conflict in 'paper.tex' (that is obvious and easy to fix). Conflicts may only happen in some of the 'tex/src/preamble-*.tex' files if you have actually changed them for your project. But generally any conflict that does arise by this commit with your project branch should be very clear and easy to fix and test.
In short, from now on things will be even easier: any LaTeX configuration that you want to do for your project can be done in 'tex/src/preamble-project.tex', so you don't have to worry about any other LaTeX preamble file. They are either templates (like the ones for PGFPlots and BibLaTeX) or low-level things directly related to Maneage. Until now, this distinction wasn't too clear.
Here is a summary of the improvements:
- Two new options to './project make': with '--highlight-new' and '--highlight-notes' it is now possible to activate highlighting on the command-line. Until now, there was a LaTeX macro for this at the start of 'paper.tex' (\highlightchanges). But changing that line would change the Git commit hash, making it hard for the readers to trust that this is the same PDF. With these two new run-time options, the printed commit hash will not change.
- paper.tex: the sentences are formatted as one sentence per line (and one line per sentence). This helps in version controlling narrative and following the changes per sentence. A description of this format (and its advantages) is also included in the default text.
- The internal Maneage preambles have been modified:
  - 'tex/src/preamble-header.tex' and 'tex/src/preamble-style.tex' have been merged into one preamble file called 'tex/src/preamble-maneage-default-style.tex'. This helps a lot in simply removing it when you use a journal style file for example.
  - Things like the options to highlight parts of the text are now put in a special 'tex/src/preamble-maneage.tex'. This helps highlight that these are Maneage-specific features that are independent of the style used in the paper.
  - There is a new 'tex/src/preamble-project.tex' that is the place you can add your project-specific customizations.
2020-11-30Comments to help clarify the roles of input files in paper.texMohammad Akhlaghi-1/+4
These can help a first-time reader of 'paper.tex'.
2020-11-30New tex/src/preamble-maneage.tex for Maneage-only TeX customizationMohammad Akhlaghi-23/+55
Until now, the Maneage-only features of LaTeX were mixed with 'tex/src/preamble-project.tex' (which is reserved for project-specific things). But we want to move the highlighting features (that have started here) into the core Maneage branch, so it's best for these Maneage-specific features to be in a Maneage-specific preamble file. With this commit, a new 'tex/src/preamble-maneage.tex' has been created for this purpose and the highlighting modes have been put in there. In the process, I noticed that 'tex/src/preamble-project.tex' doesn't have a copyright! This has been corrected.
2020-11-30Summarized Roberto's CV, further summarized Raul's and Mohammad'sMohammad Akhlaghi-9/+6
Roberto sent me his summarized CV which is now being included and I also removed the extra statements about non-degree things from Raul and my own biography (like mentioning Gnuastro, and scientific interests). To be short, we are only mentioning degrees and positions. For Raul, I added his M.Sc institute.
2020-11-30Imported improved definition that is made better after discussionMohammad Akhlaghi-5/+5
After Mohammad-reza sent me his commit on an improved definition for longevity, we had an in-depth discussion (through a video-conference) to avoid complexities in the terminology, while staying on point and within the word-count. In this commit/merge, I am including the improved version of the definition of longevity, and the newly added term "functionality" (instead of "usability", which Mohammad-reza was originally complaining about).
2020-11-30Minor edit in paragraph on execution timeMohammad Akhlaghi-6/+6
The paragraph was slightly shortened, while keeping the main points.
2020-11-30Rephrased longetivity definitionMohammadreza Khellat-3/+4
Before this commit, longevity was defined on the basis of the term usability. Although the scope and context of the term were mentioned right after its use, this could have caused confusion with the keyword "usability" in the field of software engineering. With this commit, the definition of longevity has been rephrased so that it does not require "usability". Furthermore, since longevity would logically require the availability of the machines and platforms during the time of re-use, this has been explicitly mentioned in the definition.
2020-11-28README-hacking.md: updated paper to cite for using ManeageMohammad Akhlaghi-4/+3
Until now, we were asking the users of Maneage to cite the first paper that used its primordial version (arXiv:1505.01664). But there is now a paper that fully describes the concept (arXiv:2006.03018). With this commit, in the 'citation' section of 'README-hacking.md' we now ask to cite the new paper.
2020-11-28Shorter biography for RaulRaul Infante-Sainz-3/+2
Following Boud's great suggestion, I also summarized my CV to be less than 40 words.
2020-11-27Shorter biography for MohammadMohammad Akhlaghi-8/+7
Following Boud's great suggestion, I also summarized my CV to be less than 40 words.
2020-11-27Shorter CVs for boud+davidBoud Roukema-10/+4
This commit provides shorter CVs for me (Boud) + David in order to get closer to the 6500 word limit. Our CVs are the least significant part of the paper.
2020-11-27Clickable links in appendicesBoud Roukema-6/+7
This commit makes the numbered links to references such as [13] [14] [15] in the appendices clickable in the PDF. The solution was to call the "\newcites" command from the "multibib" package *after* loading "hyperref". To test, first do "rm -fv .build/tex/build/*.bbl .build/tex/build/*.aux" and then run "./project make" a few times.
2020-11-27Fix paper.bbl flaw; reduce long author lists: save 70 wordsBoud Roukema-10/+21
This commit fixes the error of trying to run bibtex on appendix.tex when the --no-appendix option is selected. A hardwired hack, appropriate only for this specific paper, replaces the more-than-three-author parts of two long author lists by "et al." To test this without having to redownload the menke file, first do "rm -fv .build/tex/build/*.aux .build/tex/build/*.bbl" and then "./project make --no-appendix" a few times. This commit should reduce the word length by about 70 words.
2020-11-27Merged with Boud's corrected answers (generally very similar)Mohammad Akhlaghi-114/+163
The only issue that still remains is how to address statistical reproducibility, and I am in touch with Boud to do this in the best way possible (it has been highlighted with '#####'s in the answers).
2020-11-26All the referee points have been answeredMohammad Akhlaghi-52/+88
There is an answer for all the referee points now. I also did some minor edits in the paper. But we are still over the limit by around 250 words. The only remaining point that is not yet addressed (and has '####' around it) is the discussion on parallelization and its effect on reproducibility.
2020-11-26All questions have now been responded toBoud Roukema-112/+179
This commit is intended to be of submittable quality. Point 56 was removed, and the later points renumbered, because in that point Reviewer 5 only described what we have done; it was not a criticism to respond to. :) The current word count (without abstract and references) is 6091.
2020-11-25Points 33-35 handled in answer to reviewersBoud Roukema-14/+27
This commit only modifies "peer-review/1-answer.txt", giving answers to Reviewer 4; these mostly take into account David's email list of proposed answers. No changes are done to "paper.tex".
2020-11-25Reviewer points 16 to 32Boud Roukema-37/+49
Copyediting of points 16 to 32 (paper.tex + peer-review/1-answer.txt) is done in this commit. TODO list:
2. paper lacking focus
9. tidy up README-hacking.md for appearance on website
App B.G. similar to Figure ?? - ref missing
29. website: README-hacking.md and tutorial "on same page"
2020-11-25Reviewer points 1-15; appendix clickable linksBoud Roukema-113/+142
This commit updates "paper.tex" and "peer-review/1-answer.txt" for the first 15 (out of 59!) reviewer points, excluding points 2 (not yet done) and 9 (README-hacking.md needs tidying). A fix to "reproduce/analysis/make/paper.mk" for the links in the appendices is also done in this commit (the same algorithm as for paper.tex is added). The links in the appendices are not (yet) clickable.
2020-11-25Copyedit; no-abstract word count 6084Boud Roukema-31/+37
This commit tidies up minor aspects of the language in the text marked by "\new", e.g. a "wokflow" would be fine for Chinese cooking, but is a little off-topic for Maneage. :) The word count is reduced by about 7 words. I haven't yet got to the serious part: checking that we've responded to the referees' points, and completing the responses which we haven't yet done.
2020-11-24List of answers - minor copyeditBoud Roukema-18/+21
This commit does a minor copyedit of "peer-review/1-answer.txt", mostly just at the top, plus some hashes to highlight an unanswered concern; and removes the @ symbols (and full stops) from email addresses in the peer review email in order to reduce our feeding of email harvesters (spiders that collect email addresses for spammers).
2020-11-23Minor edits and correctionsMohammad Akhlaghi-12/+15
Raul's added point on the answer to the referee was very good, so I edited it a little to be more clear (and removed his name). Also, after looking in a few parts of the text, I fixed a few typos.
2020-11-23Minor corrections to the final paper documentRaul Infante-Sainz-20/+17
With this commit, I make several minor changes to the text of the final paper. They are not important, but minor modifications like avoiding contractions (don't -> do not, and so on).
2020-11-23Minor corrections to referees answerRaul Infante-Sainz-6/+13
With this commit, I am just adding several minor corrections to the answer to the referees. They are very minor typos. I would only emphasize the fact that in Maneage there is the "Minimal complexity" criterion, and because of that, even if the project cannot be executed in the future, the interested reader could still have a look at the analysis steps (because they are in plain text). Note that I put "Raul" at the beginning of the line, so my name will have to be removed in the final document to be sent to the referees.
2020-11-23First draft of all the points addressed by the refereesMohammad Akhlaghi-94/+2309
A new directory has been added at the top of the project's source called 'peer-review'. The raw reviews of the paper by the editors and referees have been added there as '1-review.txt'. All the main points raised by the referees have been listed in a numbered list and addressed (mostly) in '1-answers.txt'. The text of the paper now also includes all the implemented answers to the various points.
2020-11-20Highlighting changes can now be toggled at run-timeMohammad Akhlaghi-16/+34
Until now, the core Maneage 'paper.tex' had a '\highlightchanges' macro that defines two LaTeX macros: '\new' and '\tonote'. When '\highlightchanges' was defined, anything that was written within '\new' became dark green (highlighting new things that have been added). Also, anything that was written in '\tonote' was put within a '[]' and became dark red (to show that there is a note here that should be addressed later). When '\highlightchanges' wasn't defined, anything within the '\new' element would be black (like the rest of the text), and the things in '\tonote' would not be shown at all.
Commenting the '\newcommand{\highlightchanges}{}' line within 'paper.tex' (to toggle the modes above) would create a different Git hash and has to be committed. But this different commit hash could create a false sense in the reader that other things have also been changed and the only way they could confirm was to actually go and look into the project history (which they will not usually have time to do, and thus won't be able to trust the two modes of the text). Also, the added highlights and the note highlights were bundled together into one macro, so you couldn't only have one of them.
With this commit, the choice of highlighting either one of the two is now done as two new run-time options to the './project' script (which are passed to the Makefiles, and written into the 'project.tex' file which is loaded into 'paper.tex'). In this way, we can generate two PDFs with the same Git commit (project's state): one with the selected highlights and another one without them.
This issue actually came up for me while implementing the changes here: we need to submit one PDF to the journal/referees with highlights on the added features. But we also need to submit another PDF to arXiv and Zenodo without any highlights. If the PDFs have different commit hashes, the referees may associate it with other changes in any part of the work. For example, https://oadoi.org/10.22541/au.159724632.29528907 mentions "Another version of the manuscript was published on arXiv: 2006.03018", while the only difference was a few words in the abstract after the journal complained about the abstract word-count of our first submission (where the commit hashes matched with arXiv/Zenodo).
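As a rough illustration of the idea in this commit (not Maneage's actual implementation), the sketch below shows how run-time flags could be turned into LaTeX macro definitions without touching any version-controlled file. The function name 'highlight_macros' and the macro names '\highlightnew' and '\highlightnotes' are assumptions made for this example; only the two option names come from the commit history.

```shell
# Hypothetical sketch: translate run-time flags into LaTeX macro
# definitions. The macro names below are assumptions for illustration.
highlight_macros() {
    for opt in "$@"; do
        case "$opt" in
            --highlight-new)   printf '%s\n' '\newcommand{\highlightnew}{}' ;;
            --highlight-notes) printf '%s\n' '\newcommand{\highlightnotes}{}' ;;
        esac
    done
}

# The printed lines would go into a generated (not committed) TeX file,
# so the Git commit hash stays the same for both PDFs.
highlight_macros --highlight-new --highlight-notes
```

Because the definitions are generated at build time, the same commit can produce both a highlighted and a plain PDF.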
2020-11-15First edits on the newly added appendices in new formMohammad Akhlaghi-320/+350
With the optional appendices added recently to the paper, it was important to go through them and make them more fitting into the paper.
2020-11-15Building final PDF: pdf-build-final has to be given an explicit yesMohammad Akhlaghi-15/+15
Until now, when the 'pdf-build-final' configuration variable (defined in 'reproduce/analysis/config/pdf-build.conf') was given any string a PDF would be built. This was very confusing, because people could put a 'no' and the PDF would still be built! With this commit, only when this variable has a value of 'yes' will the PDF be built. If given any other string (or no string at all), it will not produce a PDF. This issue was reported by Zahra Sharbaf.
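The stricter check described here can be sketched in shell as follows. This is only an illustration: in Maneage the variable is read by the Makefiles, and the helper name 'should_build_pdf' is made up for this example.

```shell
# Hypothetical helper: succeed only when the value is exactly 'yes'.
should_build_pdf() {
    [ "$1" = "yes" ]
}

# Any other string (including 'no', 'YES' or an empty value) is rejected.
for value in yes no "" YES; do
    if should_build_pdf "$value"; then
        echo "'$value': PDF is built"
    else
        echo "'$value': PDF is not built"
    fi
done
```

Comparing against one exact string, instead of testing for a non-empty value, is what removes the surprise of a 'no' still producing a PDF.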
2020-11-13README.md: added commands to delete all Docker imagesMohammad Akhlaghi-0/+12
Until now we had described the basic commands on how to create and use Docker images, but we hadn't mentioned how you can delete them. With this commit the commands necessary for deleting Docker images have also been added at the bottom of the section on Docker.
2020-11-04Appendix of long paper added, optionally we can disable itMohammad Akhlaghi-314/+1044
Given the referee reports, after discussing with the editors of CiSE, we decided that it is important to include the complete appendix we had before that included a thorough review of existing tools and methods. However, the appendix will not be published in the paper (due to the strict word-count limit). It will only be used in the arXiv/Zenodo versions of the paper.
This actually created a technical problem: we want the commit hash of the project source to remain the same when the paper is built with an appendix or without it. To fix this problem the choice of including an appendix has gone into the 'project' script as a run-time option called '--no-appendix'. So by default (when someone just runs './project make'), the PDF will have an appendix, but when we want to submit to the journal, or when the appendix isn't needed for a certain reason, we can use this new option. The appendix also has its own separate bibliography.
Some other corrections made in this commit:
1. Some new references were added that had an '_' in their source, they were corrected in 'references.tex'.
2. I noticed that 'preamble-style.tex' is not actually used in this paper, so it has been deleted.
2020-10-18Recipes for final initialize and verify targets not on stdoutMohammad Akhlaghi-4/+11
The LaTeX macro files for these two subMakefiles are created on every run of './project make'. So their commands are also printed every time and hardly ever will a normal user want to modify or change these. So to avoid populating the standard output of a Maneaged project with all these extra lines every time (possibly getting mixed with the important analysis or LaTeX outputs), an '@' has been placed at the start of the recipes. With an '@' at the start of the recipe, Make is instructed to not print the commands it wants to run in the standard output.
2020-10-09Update README-hacking.md with elaphrocentre ArXiv:2010.03742Boud Roukema-2/+11
This commit updates README-hacking.md with the URIs for the 'elaphrocentre' galaxy formation pipeline paper arXiv:2010.03742. This makes three papers currently in the peer review pipeline: arXiv:2006.03018, arXiv:2007.11779, and arXiv:2010.03742, each chronologically corresponding to various stages of the review process.
2020-10-02TexLive's xstring package is now necessaryMohammad Akhlaghi-4/+4
After a fresh build of Maneage with a newly downloaded TeXLive, I noticed that it is complaining about not finding 'xstring.sty'. Apparently some package that depended on it is no longer including it itself! It is thus now added to the packages that are built by Maneage's TeXLive.
2020-09-24Gnuastro's analysis configuration files removedMohammad Akhlaghi-141/+7
Until now, the core Maneage branch included some configuration files for Gnuastro's programs. This was actually a remnant of the distant past when Maneage didn't actually build its own software and we had to rely on the host's software versions. They contained the Gnuastro-specific configuration for this project and also had a feature to avoid checking the host's own configuration files. However, we now build all our software ourselves with fixed configuration files (for the version that is being installed, which is also stored). So those extra configuration files were redundant and caused confusion and problems in some scenarios. With this commit, those extra files are now removed.
Two small issues are also addressed in parallel with this commit:
- When running './project make clean', the 'hardware-parameters.tex' macro file (which is created by './project configure') is not deleted.
- The project title is now written into the default output's PDF's properties (through 'hypersetup' in 'tex/src/preamble-header.tex') through the LaTeX macro.
All these issues were found and fixed with the help of Samane Raji.
2020-09-15Checking Xcode installation for macOS systemsRaul Infante-Sainz-1/+60
Until now, during the configure step we checked whether the host operating system was GNU/Linux, and if not, we assumed it was macOS. However, it can be any other OS! With this commit, we now explicitly check whether the system is GNU/Linux or Darwin (macOS). If it is neither, a warning message tells the user that the host system is different from those we have checked so far (and invites them to contact us if there is any problem). In addition, if the system is macOS, the configure step now checks whether Xcode is already installed on the host system. If it is not installed, a warning message advises the user to install it in case a problem/crash occurs in the configure step. We have found that having Xcode installed avoids some problems.
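A minimal sketch of such an explicit check is given below, assuming the kernel name is taken from the standard 'uname -s' command (which reports 'Linux' and 'Darwin' on these two systems). The helper name 'classify_os' and the warning text are made up for this example; this is not the code in Maneage's configure script, and the Xcode check is left out.

```shell
# Hypothetical helper: classify a kernel name as reported by 'uname -s'.
classify_os() {
    case "$1" in
        Linux)  echo "linux" ;;
        Darwin) echo "darwin" ;;
        *)      echo "unknown" ;;
    esac
}

# Warn (but do not abort) when the host is neither of the tested systems.
host_os=$(classify_os "$(uname -s)")
if [ "$host_os" = "unknown" ]; then
    echo "Warning: this host is neither GNU/Linux nor macOS; it has not been tested." >&2
fi
```

Matching both names explicitly, with a separate fall-through case, avoids treating every non-Linux system as macOS.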
2020-09-14Add machine class related argument and fix small typosMohammadreza Khellat-14/+12
Before this commit, there were no arguments regarding machine-related specifications in the manuscript. This was needed as Mohammad Akhlaghi came across a review of the article by Dylan Aïssi in which Dylan mentioned the need for discussing CPU architecture dependence in pursuing a long-term archivable workflow. With this commit, the required argument has been added in Sec. IV (POC: Maneage), in the paragraph that explains how 'macro files build the core skeleton of Maneage'. Furthermore, a few typos in different places have been fixed and 'pre-make-build.sh' has been updated with the latest fix in the Maneage core project.
2020-09-09Imported recent important fixes in Maneage, no conflictsMohammad Akhlaghi-15/+34
There weren't any conflicts in this merge.
2020-09-09R is built without tcl/tk (for GUI) dependenceBoud Roukema-0/+19
Tcl/Tk are a set of tools to provide Graphic User Interface (GUI) support in some software. But they are not yet natively built within Maneage, primarily because we have higher-priority work right now. GUI tools in general aren't high on our priority list right now because GUI tools are generally good for human interaction (which is contrary to the reproducible philosophy), not automatic analysis (a core concept in reproducibility). So even later, when we do include Tcl/Tk in Maneage, their direct usage will be discouraged. Until this commit, because we don't yet build Tcl/Tk, the default Maneage install of the statistical package R failed on a Debian Stretch, with 6227 repeats of the line: '/usr/lib//tcl8.5/tclConfig.sh: line 2: dpkg-architecture: command not found'. To fix this problem (at least until Tcl/Tk is installed within Maneage), R is now configured with the '--without-tcltk' option. Please see the description above the R installation instructions in 'reproduce/software/make/high-level.mk' for more.
2020-09-08Removed all occurances of IFS in low-level scriptsMohammad Akhlaghi-11/+11
Following the previous commit, we recognized that the 'IFS' terms are not necessary and can even cause problems. So all their occurrences in the scripts of Maneage have been removed with this commit.
2020-09-07Software installation: removed IFS statements in pre-make-build.shMohammad Akhlaghi-4/+4
Since a recent commit, IFS='"' had been set at the start of the variables in this shell script and as a result, the SPACE character wasn't being used as a delimiter. This caused a major problem when downloading the tarballs (all the backup servers were considered as the top link). With this commit we removed these 'IFS' statements. Because we now check for the existence of meta-characters in the build directory name, there is no more problem, and generally, both in the calling command and internally, we have double quotes around the variable names. So removal of IFS will not affect the result in this scenario. This bug was found by Mohammadreza Khellat.
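The effect described in this commit can be reproduced with a few lines of POSIX shell. The server URLs below are made up for the example, and 'count_fields' is a hypothetical helper; the point is only to show how IFS controls the splitting of an unquoted variable.

```shell
# Example (made-up) list of backup servers, separated by spaces.
servers="https://one.example.org/src https://two.example.org/src"

# Print how many words the shell produced after field splitting.
count_fields() { echo "$#"; }

# With IFS set to a double quote, SPACE no longer splits the list, so
# the whole string is treated as one server. The command substitution
# runs in a subshell, which keeps the IFS change local.
with_quote_ifs=$(IFS='"'; count_fields $servers)

# With the default IFS (space, tab, newline), each server is one word.
with_default_ifs=$(unset IFS; count_fields $servers)

echo "IFS set to a double quote: $with_quote_ifs field(s)"
echo "default IFS: $with_default_ifs field(s)"
```

This matches the symptom in the commit message: with the modified IFS, the full list of backup servers is handed over as a single link.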
2020-09-04Minor correction in first sentence of abstractMohammad Akhlaghi-2/+1
This paper is generally about data analysis pipelines, so the abstract now starts with "Analysis pipelines" instead of "Reproducible workflows". I also noticed that the sentence was mistakenly broken into multiple lines.
2020-09-03Imported recent work in Maneage, minor conflicts fixedMohammad Akhlaghi-179/+362
Only two small conflicts came up: * The addition of the hardware architecture macro in 'paper.tex' (which was removed for now, but will be added as the referee has requested within the text). * The usage of "" around directory variables in 'paper.mk'.
2020-09-03Added example of DockerHub deleting unused Docker imagesMohammad Akhlaghi-1/+3
I saw this link today in the news (to be implemented from November 1st, 2020), and because it is directly related to this work, I added it. Many people assume that simply pushing a Docker image to DockerHub is enough to preserve it, but ignore how much it costs to maintain the storage and network capacity.
2020-08-28Edited README.md to remove installation of a text editorMohammad Akhlaghi-10/+7
With the previous commit, we now build Nano by default within Maneage, and project authors can ask to install Emacs and Vim within 'TARGETS.conf'. So the instructions to install a text editor when building within a Docker image have been removed.
2020-08-28Plain text editors: nano in basic, emacs and vim in high-levelMohammad Akhlaghi-18/+87
While a project is under development, the raw analysis software are not the only necessary software in a project. We also need tools to edit all the plain-text files within the Maneaged project. Usually people use their operating system's plain-text editor. However, when working on the project on a new computer, or in a container, the plain-text editors will have different versions, or may not be present at all! This can be very annoying and frustrating!
With this commit, Maneage now installs GNU Nano as part of the basic tools. GNU Nano is a very simple and small plain text editor (the installed size is only ~3.5MB, and it is friendly to new users). Therefore, any Maneaged project can assume at least Nano will be present (in particular when no editor is available on the running system!). GNU Emacs and VIM (both without extra dependencies, in particular without GUI support) are also optionally available in 'high-level.mk' (by adding them to 'TARGETS.conf'). The basic idea for the more advanced editors (Emacs and VIM) is that project authors can add their favorite editor while they are working on the project, but upon publication they can remove them from 'TARGETS.conf'.
A few other minor things came up during this work and are now also fixed:
- The 'file' program and its libraries like 'libmagic' were linking to the system's 'libseccomp'! This dependency then leaked into Nano (which depends on 'libmagic'). But this is just an extra feature of 'file', only for the Linux kernel. Also, we have no dependency on it so far. So 'file' is now configured to not build with 'libseccomp'.
- A typo was fixed in the line where the physical core information is being read on macOS.
- The top-level directories when running './project shell' are now quoted (in case they have special characters).
2020-08-27Machine architecture and byte-order available as LaTeX macroMohammadreza Khellat-161/+265
Until now, no machine-related specifications were being documented in the workflow. This information can become helpful when observing differences in the outcome of both software and analysis segments of the workflow by others (some software may behave differently based on host machine). With this commit, the host machine's 'hardware class' and 'byte-order' are collected and now available as LaTeX macros for the authors to use in the paper. Currently it is placed in the acknowledgments, right after mentioning the Maneage commit. Furthermore, the project and configuration scripts are now capable of dealing with input directory names that have SPACE (and other special characters) by putting them inside double-quotes. However, having spaces and metacharacters in the address of the build directory could cause build/install failure for some software source files which are beyond the control of Maneage. So we now check the user's given build directory string, and if the string has any '@', '#', '$', '%', '^', '&', '*', '(', ')', '+', ';', and ' ' (SPACE), it will ask the user to provide a different directory.
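The build-directory check described above could look roughly like the following sketch. The helper name 'has_unsafe_char', the example path and the message are assumptions for illustration; only the list of rejected characters is taken from the commit message.

```shell
# Hypothetical helper: succeed if the directory name contains any of
# the characters listed in the commit message above.
has_unsafe_char() {
    for c in '@' '#' '$' '%' '^' '&' '*' '(' ')' '+' ';' ' '; do
        case "$1" in
            # The quoted "$c" is matched literally, not as a pattern.
            *"$c"*) return 0 ;;
        esac
    done
    return 1
}

bdir="/home/user/my build"     # made-up example path
if has_unsafe_char "$bdir"; then
    echo "'$bdir' contains a character that may break some builds; please choose another directory."
fi
```

Testing one literal character at a time keeps the check readable and avoids the quoting pitfalls of a single bracket expression.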
2020-08-25README.md: added explanation on copying files from Docker imageMohammad Akhlaghi-2/+17
When building Maneage inside a Docker container, in the end the users want to extract the final outputs from the container into their host operating system to inspect them more comfortably. So with this commit, a short explanation has been added on how to do this. We also noticed that it is much better if the 'Dockerfile' is stored and run in an empty directory; otherwise, it will start parsing the full directory and its subdirectories as the Docker image's environment.
2020-08-20Data lineage and replicated plot in one rowMohammad Akhlaghi-40/+276
Until now, the replicated plot had the width of the full page and the data lineage graph was under it. Together they were covering more than half of the height of the page! But the plot showing the number of papers with tools really doesn't have too much detail, and all the space was being wasted. With this commit, the plot is now much much thinner and the data lineage graph has been fitted to the right of it.