paper-concept.git - Paper (Towards Long-term and Archivable Reproducibility)

Age	Commit message	Author	Lines
2020-06-10	Corrected bug in using local copy of input dataset	Mohammad Akhlaghi	-13/+47
	As described in Maneage's commit 2bd2e2f18 (which I found while testing this project), the existing download recipe had problems when using a local copy of the input dataset. It was first fixed here, then implemented there. Also, to clarify things for a new user, some long comments were added at the top of 'INPUTS.conf' to describe each of the variables; those comments have also been put here (and are also in commit 2bd2e2f18 of Maneage).
2020-06-10	Updated text of default paper.tex, putting more recent examples	Mohammad Akhlaghi	-100/+165
	The text of the default paper hadn't been changed for a very long time! In this time, three papers using Maneage have been published (which can serve as very good examples), and Maneage also now has a webpage! With this commit, these examples and the webpage have been added, and the text was generally polished a little to hopefully be more useful.
2020-06-10	IMPORTANT: bug fix in default data download script of download.mk	Mohammad Akhlaghi	-14/+54
	Summary of possible semantic conflicts 1. The recipe to download input datasets has been modified. You have to re-set the old 'origname' variable to 'localname' (to avoid confusion) and the default dataset URL should now be complete (including the actual filename). See the newly added descriptions in 'INPUTS.conf' for more on this. Until now, when the dataset was already present on the host system, a link couldn't be made to it, causing the project to crash in the checksum phase. This has been fixed with properly naming the main variable as 'localname' to avoid the confusion that caused it. Some other problems have been fixed in this recipe in the meantime: - When the checksum is different, the expected and calculated checksums are printed. - In the default paper, we now print the full URL of the dataset, not just the server, so the checksum of the 'download.tex' step has been updated.
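The checksum behaviour described above (printing both the expected and the calculated checksum on a mismatch) can be sketched as follows. This is only an illustrative Python sketch, not the project's Make recipe: the function name `verify_checksum` is hypothetical, and SHA-256 is an assumption because the commit message does not name the hash algorithm.

```python
import hashlib


def verify_checksum(data: bytes, expected: str) -> None:
    """Raise ValueError naming both checksums when they differ.

    SHA-256 is an assumption made for this sketch; the commit
    message does not say which algorithm the project uses.
    """
    calculated = hashlib.sha256(data).hexdigest()
    if calculated != expected:
        # Report both values so the user can see what went wrong.
        raise ValueError(
            f"checksum mismatch: expected {expected}, calculated {calculated}"
        )
```

Reporting both values lets a user tell a corrupted download apart from an outdated checksum in the configuration.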
2020-06-09	Two minor typos corrected	Mohammad Akhlaghi	-2/+2
	Two words were corrected in the text that made the sentences grammatically wrong (they were actually typos! historically they were correct, but we later changed the later part of the sentence without fixing the first part).
2020-06-09	Minor edit printing arXiv URL in plain text metadata	Mohammad Akhlaghi	-1/+1
	Until now, in the 'print-copyright' function of 'initialize.mk' (that prints a fixed set of common metadata necessary in plain-text files), we were simply printing this line: # Pre-print server: arXiv:1234.56789 But given that all the other elements are click-able URLs, it now prints: # Pre-print server: https://arxiv.org/abs/1234.56789
2020-06-09	Two minor corrections to avoid warnings in make and make clean	Mohammad Akhlaghi	-15/+12
	There were two small warnings that are removed with this commit: - In the end, when we print the number of words in the PDF, we hadn't accounted for the fact that 'paper.pdf' doesn't always exist (for example when './project make clean' is run). So a check was added to only print the number of words when a PDF exists. - I noticed that the '$(texdir)/to-publish' directory was being built both in 'initialize.mk' and in 'demo-plot.mk'. So the one in 'demo-plot.mk' has been removed.
2020-06-09	Imported Maneage, minor conflicts fixed, a bug found and fixed	Mohammad Akhlaghi	-78/+507
	Some minor conflicts came up in 'initialize.mk' and 'verify.mk'. For the former, I chose the version on Maneage; for the latter, I kept the 'master' version of the checksums of this project, but kept the Maneage version for the rest of the improvements there (like printing the verified files as LaTeX comments in 'verify.tex'). While testing the conflicts, I noticed a bug (in the LaTeX macro for the number of years in the Menke+20 paper) in the previous build, thanks to the verification step :-)! Fortunately it wasn't actually printed in the PDF, so a normal reader wouldn't notice it. The bug was caused by the recently added meta-data/commented lines in the 'tools-per-year.txt' file: when calculating the number of years studied in that paper, we were simply counting all the lines and we had forgotten to correct this after adding comments. As a result, the un-used LaTeX macro file was saying that they have studied 47 years instead of the real 31 years! This element was actually used in the very first (+40 page!) draft of the paper that was summarized to fit into the journal limits.
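The line-counting bug above comes from counting every line of a table that also carries commented metadata. A minimal Python sketch of the corrected logic is below; the function name `count_data_rows` and the '#' comment prefix are assumptions for illustration, and the project's actual fix lives in its own Make/shell code.

```python
def count_data_rows(text: str, comment_prefix: str = "#") -> int:
    """Count lines that hold data, skipping blank and commented lines."""
    return sum(
        1
        for line in text.splitlines()
        # Ignore empty lines and lines that start with the comment prefix.
        if line.strip() and not line.lstrip().startswith(comment_prefix)
    )
```

Counting all lines of such a file overstates the number of rows by exactly the number of comment lines, which is the kind of mismatch the verification step caught.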
2020-06-07	Added SoftwareHeritage link, minor typo corrections and clarifications	Mohammad Akhlaghi	-24/+27
	The git history of the project is now archived on SoftwareHeritage and a link to it was added in the "Reproducible supplement" tag just under the abstract. Some corrections were also made in the text. In particular, the part explaining the separation of software and data reproducibility was slightly clarified.
2020-06-06	IMPORTANT: Added publication checklist, improved relevant infrastructure	Mohammad Akhlaghi	-172/+727
	Possible semantic conflicts (that may not show up as Git conflicts but may cause a crash in your project after the merge): 1) The project title (and other basic metadata) should be set in 'reproduce/analysis/conf/metadata.conf'. Please include this file in your merge (if it is ignored because of '.gitattributes'!). 2) Consider importing the changes in 'initialize.mk' and 'verify.mk' (if you have added all analysis Makefiles to the '.gitattributes' file, thus not merging any change in them with your branch). For example with this command: git diff master...maneage -- reproduce/analysis/make/initialize.mk 3) The old 'verify-txt-no-comments-leading-space' function has been replaced by 'verify-txt-no-comments-no-space'. The new function will also remove all white-space characters between the columns (not just white space characters at the start of the line). Thus the resulting check won't involve spacing between columns. A common set of steps is always necessary to prepare a project for publication. Until now, we would simply look at previous submissions and try to follow them, but that was prone to errors and could cause confusion. The internal infrastructure also lacked some useful features to make good publication possible. Now that the submission of a paper fully devoted to the founding criteria of Maneage is complete (arXiv:2006.03018), it was time to formalize the necessary steps for easier submission of a project using Maneage and implement some low-level features that can make things easier. With this commit a first draft of the publication checklist has been added to 'README-hacking.md'; it was tested in the submission of arXiv:2006.03018 and zenodo.3872248. To help guide users on implementing the good practices for output datasets, the outputs of the default project shown in the paper now use the new features. After reading the checklist, please inspect these. 
Some other relevant changes in this commit: - The publication involves a copy of the necessary software tarballs. Hence a new target ('dist-software') was also added to package all the project's software tarballs in one tarball for easy distribution. - A new 'dist-lzip' target has been defined for those who want to distribute an Lzip-compressed tarball. - The '\includetikz' LaTeX macro now has a second argument to allow configuring the '\includegraphics' call when the plot should not be built, but just imported.
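The idea behind 'verify-txt-no-comments-no-space' in this commit (ignore comments and all white space before checksumming a plain-text table) can be sketched in Python as below. This is not the project's Make function: the name `normalized_checksum`, the '#' comment prefix and the use of SHA-256 are all assumptions made for illustration.

```python
import hashlib


def normalized_checksum(text: str) -> str:
    """Checksum a plain-text table, ignoring comments and all white space."""
    kept = []
    for line in text.splitlines():
        if line.lstrip().startswith("#"):
            continue  # commented (metadata) lines do not affect the check
        squeezed = "".join(line.split())  # drop every white-space character
        if squeezed:
            kept.append(squeezed)
    return hashlib.sha256("\n".join(kept).encode()).hexdigest()
```

One consequence of removing all white space is that rows such as '1 23' and '12 3' become indistinguishable, so a check like this trades a little strictness for robustness against column-alignment changes.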
2020-06-06	Summarized abstract to be less than 150 words	Mohammad Akhlaghi	-16/+15
	Upon submission to CiSE we were informed that the abstract has to be less than 150 words to be processed. So with this commit, I am shrinking the abstract slightly, trying to remove some points that are less important and trying to shrink some of the sentences. Also, to avoid confusion and be more clear, the term "temporal provenance" has been replaced by "Recorded history".
2020-06-04	Scale element in includegraphics for roughly similar-sized figures	Mohammad Akhlaghi	-9/+11
	Until now, when the figures were built directly from EPS ('\newcommand{\makepdf}{}' was commented), they would take the full line-width becoming a little too large! I noticed this after letting arXiv build the PDF. With this commit, the 'includetikz' tool takes a second argument to be a parameter given to 'includegraphics' (which is scale in this case).
2020-06-04	Final full reading, and minor edits to submit to Zenodo and arXiv	Mohammad Akhlaghi	-58/+57
	Everything else regarding the submission to arXiv and Zenodo has been completed, so I did a final read, making some minor edits to hopefully make the text easier to read.
2020-06-04	README.md, separated scenarios of building from tarball	Mohammad Akhlaghi	-23/+60
	The previous explanation was not too clear and simply following it was confusing. The issue was that with the tarball you have three scenarios: 1) only build the PDF using existing figures. 2) only build the PDF, but build the figures yourself, 3) build the full Maneaged project. Hopefully this distinction is now more clear from the README.md file.
2020-06-04	README.md: improved points on building from tarball	Mohammad Akhlaghi	-14/+11
	Some extra explanation can help the user understand the difference between a Git-based project and a distributed tarball.
2020-06-04	tex/build and tex/tikz treated properly in tarball	Mohammad Akhlaghi	-1/+14
	When the project is being re-built from the tarball (not the Git repository), the 'tex/build' and 'tex/tikz' addresses are actual directories, not symbolic links. In this case, when someone runs './project configure', it will complain about not being able to delete them (it assumes they are symbolic links!). So with this commit, we first check if they are deletable without '-r'. If not, then they are full directories and we rename them to a backup directory to allow the rest of the project to continue building a link there.
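The distinction above between a symbolic link and a real directory can be sketched in Python as follows. It is only an illustration under stated assumptions: the function name `clear_link_target` and the '-from-tarball' backup suffix are hypothetical, and the project itself does this in its shell configure script rather than in Python.

```python
import os


def clear_link_target(path: str, backup_suffix: str = "-from-tarball") -> str:
    """Free 'path' so that a new symbolic link can be created there."""
    if os.path.islink(path):
        os.unlink(path)  # a link can simply be removed
        return "removed-link"
    if os.path.isdir(path):
        # A real directory (as in the tarball) is kept under a backup name.
        os.rename(path, path + backup_suffix)
        return "moved-directory"
    return "nothing-to-do"
```

Renaming instead of deleting keeps the figures shipped in the tarball available, at the cost of failing if a non-empty backup with the same name already exists.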
2020-06-04	Minor improvements in the make dist command for this paper	Mohammad Akhlaghi	-7/+12
	This paper doesn't use pdflatex or biblatex, so it was necessary to make some small corrections in the make-dist rule of initialize.mk. Also, while testing the upload on arXiv, I noticed that it complains about an empty 'verify.tex' file, so that is also corrected.
2020-06-04	Verification activated, README added, Proper metadata in plot data	Mohammad Akhlaghi	-44/+117
	All the steps following the to-be-added (in 'README-hacking.md') publication checklist prior to the final check from new clone have been added: - 'README.md' file has been set. - "Reproducible supplement" was added just above the keywords, pointing to Zenodo. - A link to the to-be-uploaded data underlying the plot was added in the caption of the tools-per-year plot. - A new meta-data configuration file was added to store basic project metadata to be used throughout the project. This will later be taken into Maneage. For example the project title is now stored here and written into the paper's LaTeX source and output datasets automatically. - Verification was activated and plot's data and LaTeX macro files are now automatically verified. - Complete metadata was added for the data underlying the plot. - A generic function was added in 'initialize.mk' that will automatically write project info and copyright in all plain-text outputs.
2020-06-04	README-hacking.md: minor edits in description of merging with Maneage	Mohammad Akhlaghi	-7/+15
	The recently added description for this step in the last commit needed some edits to be more clear and encourage re-building the project from scratch anytime authors merge with Maneage.
2020-06-03	Imported recent updates in Maneage, minor conflict fixed	Mohammad Akhlaghi	-915/+1386
	The minor conflict was with 'reproduce/software/make/high-level.mk', and in particular because we implemented the fix to Maneage's Task #15664 in this project first. After it was moved to the main Maneage branch some minor stylistic corrections were done to it, thus causing the conflict. To resolve the conflict, I simply imported the full Maneage version of the file with this command: git checkout maneage -- reproduce/software/make/high-level.mk The other conflicts were due to the deleted files (that were resolved as described in 'README-hacking.md') and the LaTeX files that I had told '.gitattributes' to ignore from the Maneage branch.
2020-06-03	README-hacking.md: Improved section on ignoring some files in Maneage	Mohammad Akhlaghi	-24/+55
	When some files should not be merged, until now we were suggesting to also add deleted files to the '.gitattributes' file. However, this feature of Git doesn't work for deleted files and they would still show up in the 'master' branch after a merge. So with this commit, we have added a simple AWK command to run after a merge that will automatically detect and delete such files (using the output of 'git status --porcelain'). Also, two minor typos were corrected in the newly added 'servers-backup.conf' file: the copyright year was wrong and there was no new-line at the end of the file (a good convention!).
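The detection step described above (finding, in the output of 'git status --porcelain', the files that were deleted in the project but still came back through the merge) can be sketched in Python. The project uses an AWK command for this; the sketch below is not that command. It assumes that the relevant two-letter status is 'DU' (unmerged, deleted by us) and that paths contain no characters that Git would quote; the function name `deleted_by_us` is hypothetical.

```python
def deleted_by_us(porcelain: str) -> list:
    """Return paths whose porcelain (v1) status is 'DU'.

    Each porcelain v1 line is 'XY PATH': two status letters, one
    space, then the path. 'DU' as the status to look for is an
    assumption of this sketch.
    """
    paths = []
    for line in porcelain.splitlines():
        if line.startswith("DU "):
            paths.append(line[3:])
    return paths
```

The returned paths could then be passed to 'git rm' to complete the merge without the unwanted files.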
2020-06-03	Updated .gitattributes to include all files to not merge	Mohammad Akhlaghi	-4/+3
	Following a test merge, I noticed that the '.gitattributes' file is not doing anything about the deleted files and also that all the files in 'tex/src/*.txt' should be added (they are too project-specific). So now it only includes the files that aren't deleted. For the files that are deleted, in the Maneage 'README-hacking.md' file, I added an AWK command to easily remove them.
2020-06-03	Adding point on small-ness of final product, some summarization	Mohammad Akhlaghi	-83/+65
	I noticed that we hadn't included the publication of the workflow and the advantage that Maneage provides in this regard. So it was added at the end of the proof-of-concept section. However, it was necessary to summarize some other parts to not increase the wordcount.
2020-06-02	Core software build before using Make to build other software	Mohammad Akhlaghi	-364/+635
	Until now, Maneage would only build Flock before building everything else using Make (calling 'basic.mk') in parallel. Flock was necessary to avoid parallel downloads during the building of software (which could cause network problems). But after recently trying Maneage on FreeBSD (which is not yet complete, see bug #58465), we noticed that the BSD implementation of Make couldn't parse 'basic.mk' (in particular, complaining with the 'ifeq' parts) and its shell also had some peculiarities. It was thus decided to also install our own minimalist shell, Make and compressor program before calling 'basic.mk'. In this way, 'basic.mk' can now assume the same GNU Make features that high-level.mk and python.mk assume. The pre-make building of software is now organized in 'reproduce/software/shell/pre-make-build.sh'. Another nice feature of this commit is for macOS users: until now the default macOS Make had problems for parallel building of software, so 'basic.mk' was built in one thread. But now that we can build the core tools with GNU Make on macOS too, it uses all threads. Furthermore, since we now run 'basic.mk' with GNU Make, we can use '.ONESHELL' and don't have to finish every line of a long rule with a backslash to keep variables and such. Generally, the pre-make software is now organized like this: first we build Lzip before anything else: it is downloaded as a simple '.tar' file that is not compressed (only ~400kb). Once Lzip is built, the pre-make phase continues with building GNU Make, Dash (a minimalist shell) and Flock. All of their tarballs are in '.tar.lz'. Maneage then enters 'basic.mk' and the first program it builds is GNU Gzip (itself packaged as '.tar.lz'). Once Gzip is built, we build all the other compression software (all downloaded as '.tar.gz'). Afterwards, any compression standard for other software is fine because we have it. 
In the process, a bug related to using backup servers was found and fixed in 'reproduce/analysis/bash/download-multi-try' (when it is called outside of 'basic.mk'), and its Bash-specific features were removed. As a result of that bug-fix, because we now have multiple servers for software tarballs, the backup servers now have their own configuration file in 'reproduce/software/config/servers-backup.conf'. This makes it much easier to maintain the backup server list across the multiple places that we need it. Some other minor fixes: - In building Bzip2, we need to specify 'CC' so it doesn't use 'gcc'. - In building Zip, the 'generic_gcc' Make option caused a crash on FreeBSD (which doesn't have GCC). - We are now using 'uname -s' to specify if we are on a Linux kernel or not, if not, we are still using the old 'on_mac_os' variable. - While I was trying to build on FreeBSD, I noticed some further corrections that could help. For example the 'makelink' Make-function now takes a third argument which can be a different name compared to the actual program (used for example to make a link to '/usr/bin/cc' from 'gcc'). - Until now we didn't know if the host's Make implementation supports placing a '@' at the start of the recipe (to avoid printing the actual commands to standard output). Especially in the tarball download phase, there are many lines that are printed for each download which was really annoying. We already used '@' in 'high-level.mk' and 'python.mk' before, but now that we also know that 'basic.mk' is called with our custom GNU Make, we can use it at the start for a cleaner stdout. - Until now, WCSLIB assumed a Fortran compiler, but when the user is on a system where we can't install GCC (or has activated the '--host-cc' option), it may not be present and the project shouldn't break because of this. So with this commit, when a Fortran compiler isn't present, WCSLIB will be built with the '--disable-fortran' configuration option. 
This commit (task #15667) was completed with help/checks by Raul Infante-Sainz and Boud Roukema.
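The bootstrapping order in this commit follows one rule: a tarball can only be unpacked once the program for its compression format has been built. The Python sketch below checks a build order against that rule. The names `NEEDS`, `BUILD_ORDER` and `first_unbuildable` are hypothetical, and listing Bzip2 as a '.tar.gz' download is an assumption based on the statement that the other compression software is downloaded in that format.

```python
# Which program must already exist to unpack each tarball format.
NEEDS = {".tar": None, ".tar.lz": "lzip", ".tar.gz": "gzip"}

# Order described in the commit message (Bzip2's format is assumed).
BUILD_ORDER = [
    ("lzip", ".tar"),
    ("make", ".tar.lz"),
    ("dash", ".tar.lz"),
    ("flock", ".tar.lz"),
    ("gzip", ".tar.lz"),
    ("bzip2", ".tar.gz"),
]


def first_unbuildable(order):
    """Return the first program whose tarball cannot be unpacked yet."""
    built = set()
    for name, suffix in order:
        tool = NEEDS[suffix]
        if tool is not None and tool not in built:
            return name
        built.add(name)
    return None
```

With the order above the function returns None, while swapping Lzip and GNU Make would flag 'make', which is why the uncompressed Lzip tarball has to come first.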
2020-06-01	Edits by David	David Valls-Gabaud	-100/+110
	These are some corrections that David sent to me by email and I am committing here.
2020-06-01	Implemented Antonio's suggestion and thanked him	Mohammad Akhlaghi	-1/+2
	Antonio Diaz Diaz (author of the Lzip program/library), has had a very supportive role in what became Maneage in the last 4 years. For example I really started to appreciate the value of simplicity and archivability while reading Lzip's documentation. Fortunately he also read a recent version of the paper and was again very supportive. Some of the minor points he raised had already been fixed, but using 'supplier' instead of 'server' (in the Free Software criterion) was new so I implemented it here with this commit. With this, I am also thanking him for all his wonderful support and encouragement in the last 4 years.
2020-06-01	Minor edits to clarify some of the previous corrections	Mohammad Akhlaghi	-3/+3
	Boud's point about a "random reader" not being a good example case was correct. But "user" also gives it a software perspective that is of course not wrong, it can just be confusing. So I thought of changing it to "interested reader". In the part about the C-library dependency of high-level software, from Boud's correction, I found out that it is very hard to convey what I wanted to say (that separating errors due to C-library implementation and measurement errors will be easy, because they should be on much different scales). But I then corrected it to give it a slightly better tone while mentioning the same thing: that with Maneage we can now accurately measure the effect of the C library.
2020-05-31	Mostly minor edits of nearly final version	Boud Roukema	-17/+18
	Changes with this commit are mostly minor and obvious. Some worth commenting on include: * `technologies develop very fast` - As a general statement, this is too jargony, since technology is much wider than just `software`; `some technologies` makes it clear that we're referring to the specific case of the previous sentence * `in a functional-like paradigm, enabling exact provenance` - While `make` is not an imperative programming language, I don't see how `make` is `like` a functional programming language. Classifying it as a declarative and a dataflow programming language and as a metaprogramming language would seem to go in the right direction [1-3]. I also couldn't see how the language type relates to tracking exact provenance. But since we don't want to lengthen the text, my proposal is to put `and efficient in managing exact provenance` without trying to explain this in terms of a taxonomy of programming languages. [1] https://en.wikipedia.org/wiki/Functional_programming [2] https://en.wikipedia.org/wiki/Comparison_of_multi-paradigm_programming_languages [3] https://en.wikipedia.org/wiki/Dataflow_programming * `A random reader` - In the scientific programming context, `random` has quite specific meanings which we are not using here; a `reader` has not necessarily tried to reproduce the project. So I've proposed `A user` here - with the idea that a `user` is more likely to be someone who has done `./project configure && ./project make`. * `studying this is another research project` - the present tense `is` doesn't sound so good; I've put what seems to be about the shortest natural equivalent. Pdf word count: 5856
2020-05-30	Corrected a few words for more clarity	Mohammad Akhlaghi	-2/+2
	An "internally" was added to the part about core GNU tools accounting for the differences between POSIX-compatible systems. One extra word was also removed in the next sentence.
2020-05-30	Corrected a few words to make POSIX-fuzzyness paragraph more clear	Mohammad Akhlaghi	-2/+2
	Hopefully, it is more to the point with these few word-corrections.
2020-05-30	Discussion on issues with POSIX and minor edits to shorten paper	Mohammad Akhlaghi	-39/+47
	Konrad raised some very interesting points in particular about the limitations of POSIX as a fuzzy standard that does not guarantee reproducibility. A relatively long paragraph was thus added in the discussion to address this important point. In order to fit it in, the paragraph on "unwanted competition" was removed since the POSIX issue was much more relevant for a curious reader. Throughout the text, some other parts were edited to decrease the length of the paper while making it easier to read.
2020-05-30	Minor edits removing redundant sentences	Mohammad Akhlaghi	-4/+3
	Some of the redundant sentences have been removed and some minor edits made.
2020-05-29	Minor tidying of about half a dozen words	Boud Roukema	-11/+11
	The changes in this commit are best shown with `git diff --word-diff` or `git patch --word-diff`. There are about half a dozen changes of 1-2 words or a comma, the reasons should be obvious. The sentence with "can not just" seems to be correct formally, but "can not only" seems to me better to warn the reader that this is a phrase of the form "can not only do X but can also do Y"; "can not just" sounds a bit like "You cannot just enter the room without knocking" - it doesn't require a second part.
2020-05-29	Edits to the text, making it slightly shorter and more clear	Mohammad Akhlaghi	-62/+54
	One major point was that following Konrad's suggestion the issue of not being familiar with the Lisp/Scheme framework of GWL is now removed. We actually mention the main problem we have had with Guix, but also highlight that their solution was one of the main inspirations for this work.
2020-05-29	TeX installation crash because of different tarball versions fixed	Raul Infante-Sainz	-2/+58
	Until this commit, when the user had a previous TeX tarball already present, the project crashed when trying to re-configure if there was a newer version of TeX. This is because TeX is updated yearly. With this commit, this bug has been fixed. Now, during the installation of TeX, it checks if this problem happens. If this is the case, then it moves the old tarball, downloads the new one and installs it. If not, it will just install the already present tarball or crash because of any other reason. This problem was recurrent, and each time TeX was updated, the previous tarball had to be removed manually. But now, with this commit, it is done automatically. The detection and fix of this bug has been possible with the help of Mohammad Akhlaghi, thanks!
2020-05-29	Adding small paragraph for Raul's biography	Raul Infante-Sainz	-1/+4
	Until this commit, there was only a small description of me. With this commit, I have added a small paragraph with my biography. I know we are very restricted because of the word limit so I tried to be very short!
2020-05-29	Minor typos corrected	Raul Infante-Sainz	-5/+6
	With this commit, I have corrected several minor typos.
2020-05-29	Minor corrections in abstract and introduction	Raul Infante-Sainz	-9/+9
	With this commit, I did some minor changes in these Sections. Main changes are: define the contraction `OS' from Operating System and use only `OS' later on, and not use contractions like `isn't'
2020-05-29	Cut down biography and inclusion of a mention to reproducibility	Roberto Baena-Gallé	-5/+5
	Before this commit: Roberto's bio was about 120 words. With this commit: it is now less than 100 words. A comment about reproducibility has been added.
2020-05-29	Reproducible research based on open-access papers	Boud Roukema	-3/+2509
	Publishing a paper on reproducible research without making it easy for readers to read the references would defeat the point. Of course we have to make some compromises with some journals' reluctance to shift towards the free world, but to satisfy scientific ethics, we should at least provide clickable URLs to the references, preferably to the ArXiv version if available [1], and also to the DOI, again, preferably to an open-access version of the URL if available. I was not able to fully get this done in the .bst file, so there's an sed/tr hack done to the .bbl file in `reproduce/analysis/make/paper.mk` to tidy up commas and spaces. This commit also reverts some of the hacks in the Akhlaghi IAU Symposium `tex/src/references.tex` entry, to match the improved .bst file, `tex/src/IEEEtran_openaccess.bst`, provided here with a different name to the original, in order to satisfy the LaTeX licence. [1] https://cosmo.torun.pl/blog/arXiv_refs
2020-05-29	pdftotext only called if present in system, minor edit	Mohammad Akhlaghi	-6/+8
	David and Raul had both reported that because 'pdftotext' wasn't available on their system, the project failed (even though the PDF was built!). So with this commit, we first check if the system has 'pdftotext' and call it only if it is available. Some minor edits were made, building upon Boud's previous commit.
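The guard described above (only call 'pdftotext' when it is installed) can be sketched in Python as below. This is an illustration, not the project's Make rule: the function name `pdf_word_count` is hypothetical, and the extra check that the PDF exists is an assumption added so the sketch never fails on a missing file.

```python
import os
import shutil
import subprocess


def pdf_word_count(pdf_path, tool="pdftotext"):
    """Return the number of words in a PDF, or None if it cannot be counted."""
    exe = shutil.which(tool)  # None when the program is not in PATH
    if exe is None or not os.path.isfile(pdf_path):
        return None  # skip quietly instead of failing the whole build
    # 'pdftotext FILE -' writes the extracted text to standard output.
    result = subprocess.run(
        [exe, pdf_path, "-"], capture_output=True, text=True, check=True
    )
    return len(result.stdout.split())
```

Returning None instead of raising keeps an optional reporting step from breaking a build whose main product (the PDF) was created successfully.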
2020-05-29	Section V - small changes	Boud Roukema	-25/+26
	This commit provides mostly small changes. There didn't seem much point in repeating the `lessons learned` jargon and claiming that we draw good conclusions - insights - from our experience. Better just state what hypotheses we have generated from the experience rather than give the misleading impression that our hypotheses are well-established facts. In the comments, I put a suggested translation of what the `lessons learned` jargon means. I seem to have first heard this term in the mainstream media a few years after the US 2003 attack on Iraq, when a US military representative stated that the US forces had "learned lessons" after having started a war of aggression against Iraq.
2020-05-29	Sentence with the clerk who can do it, software as uncountable noun	Boud Roukema	-2/+2
	This commit changes two lines. (1) Keeping the exact quote with the clerk while having a sentence that makes sense in plain English cannot be done, it seems to me, without making the sentence a bit longer. Here's one option that seems about the best we can do, even though it still sounds a bit funny, because it's hard to write a future conditional with the present "can". Since it's a quote, it will probably survive the proofreaders. (2) Software is an uncountable noun [1], so we say "software is", like "water is"; "used software" sounds odd; I added "is itself" to emphasise that we're especially talking about the full chain of software for running the project. This commit modifies the "When the ..." sentence and hopefully sounds better. [1] https://en.wiktionary.org/wiki/software#Noun
2020-05-29	Added top-make.mk as a listing for demonstration, minor edits	Mohammad Akhlaghi	-31/+55
	To help show the simplicity of 'top-make.mk', it was included as a listing. I also went over some of Boud's corrections and made small edits. In particular: - The '\label' and '\ref' to a section were removed. I did this after inspecting some of their recent papers and noticing that they generally have a simple flow, without such redirections. - In the part about the RDA adoption grant, I moved the "from the researcher perspective" to the end. Because Austin+2017 is mainly focused on data-center management, not the researcher's. They do touch upon researcher solutions that can help data-base managers, but not directly the researchers. In effect with this grant, they acknowledged that our researcher-focused solution conforms with their criteria for data-base management.
2020-05-29	Many small changes to Section IV - proof of concept: maneage	Boud Roukema	-60/+63
	Possibly the least trivial edit in this commit is that the previous text appeared to state that it's normal to find that a project prepared with `maneage` may be ... unbuildable. Which would defeat our whole claim of reproducibility! Obviously, `maneage` is still in a rapid development stage and might still have significant, not-yet-detected bugs. But the wording has to explain that this would constitute a bug in `Maneage` (in a particular version of it), not an expected regular event. :) This commit aims to fix that and other minor wording issues in IV. Pdf word count 5855.
2020-05-28	Cherry-pick 7bf5fcd to make merging easier	Boud Roukema	-6/+3
	This series of commits aims to edit sections II+III, but first implements the changes from 7bf5fcd, apart from one that conflicts in the abstract: this commit has ``Maneage'' without `(managing+lineage)` in the abstract. From Mohammad: this commit has been rebased after several other parallel branches, so some things may differ from the message.
2020-05-28	Fixed TeXLive crash because of differing local and server versions	Raul Infante-Sainz	-3/+80
	Until this commit, when the user had a previous TeXLive tarball already present (in their software-tarball directory) compared to the CTAN server, the project crashed in the configure phase. This was because TeXLive is updated yearly and we don't yet install TeXLive from source (currently we use its own package manager, but we plan to fix this in task #15267). With this commit, we fix the problem by checking the cause of the crash during the installation of TeX. If the crash is due to this particular error, we ignore the old tarball and download the new one and install it (the old one is still kept in '.build/software/tarballs', but will get a '-OLD' in its name). This problem was recurrent, and every year that TeXLive is updated, the previous tarball had to be removed manually! But with this commit, this is done automatically. The detection and fix of this bug has been possible with the help of Mohammad Akhlaghi, thanks!
2020-05-25	Unified reference to GNU/Linux and free software	Mohammad Akhlaghi	-10/+10
	One of the main reasons for building Maneage is to properly acknowledge/attribute the authors of software in research. So we have adopted a standard of never referring to the GNU-based operating systems running the Linux kernel simply as "Linux", and we avoid terms like "Open Source" and use Free Software instead (in the same spirit). With this commit, a few instances of the cases above have been corrected; they had slipped through our fingers when we initially imported them into the project. In the special case of the "Journal for Open Source Software", we simply replaced it with its abbreviation (JOSS). This was done because in effect we were generally using journal name abbreviations in almost all the citations already. To avoid any inconsistencies, the names of the three other journals that weren't abbreviated are also abbreviated.
2020-05-23	Some minor edits on Boud's recent corrections	Mohammad Akhlaghi	-12/+12
	Generally they were great, but after looking through them I thought a handful of them slightly changed my original idea so I am correcting them here. Boud, if you feel the changes aren't good, let's talk about it and find the best way forward ;-). They are mostly clear from a '--word-diff', just some notes on the ones that have changed the meaning: * On the "a clerk can do it" quotation, since it's so short, I think it's better to keep its original form, otherwise a reader may think there were paragraphs instead of the "to" and we have changed their intention. * In the part where we are saying that the workflow can get "separated" from the paper, I mostly meant to highlight that the data-centers and journals (hosts) may diverge in decades, or one of them may go bankrupt, etc. Hence losing the connection. The issue of it evolving can in theory be addressed through version control, so I think this is a more fundamental problem. * In the part about free software, in the list, the original point was the free software that is used by the project, not the project itself (after all, the project itself falls under the "Open Science" title that is very fashionable these days, but my point here is about those people who claim to do "Open Science" with closed software, like Microsoft Excel!).
2020-05-23	Section III edits - 5901 words	Boud Roukema	-43/+43
	This commit makes several small changes to Section III, some of which are quite significant in terms of meaning. It was difficult to improve the clarity without extending the word length. Now we're at 5901 words.
2020-05-23	Section II edits + definition of solutions	Boud Roukema	-23/+23
	This commit implements quite a few minor changes in section II. The aim of most is to clarify the meaning and remove ambiguity. A few changes are that the reader will normally assume that successive sentences in a paragraph are closely related in terms of logical flow. It is superfluous - and considered excessive - to put too many "Therefore"'s and "Hence"'s in (at least) modern astronomy style. These are supposed to be used when there is a strong chain of reasoning. One change is done in the Introduction, because if we're going to use "solution(s)" throughout to mean "reproducible workflow solution(s)", then we have to clearly define this as jargon for this particular paper. It's probably preferable to RWS - reproducible workflow solution - or RWI - reproducible workflow implementation. But we can't just keep saying "solution" because that has many different meanings in a scientific context. Pdf word count = 5880