From f14f30c34d78d7e740eb166678815ae5f610d89d Mon Sep 17 00:00:00 2001 From: Pedram Ashofteh Ardakani Date: Wed, 29 Apr 2020 16:34:13 +0430 Subject: Fix indentations --- about.html | 770 +++++++++++++++++++++++++++++++------------------------------ 1 file changed, 391 insertions(+), 379 deletions(-) diff --git a/about.html b/about.html index 8a4a069..7835ccd 100644 --- a/about.html +++ b/about.html @@ -47,7 +47,19 @@
- +
+ + +

Maneage: managing data lineage

Copyright (C) 2018-2020 Mohammad Akhlaghi mohammad@akhlaghi.org
@@ -533,145 +545,145 @@ cd my-project # Go into git remote rename origin origin-maneage # Rename current/only remote to "origin-maneage". git checkout -b master # Create and enter your own "master" branch. pwd # Just to confirm where you are. - -

  • Prepare to build project: The ./project configure command of the - next step will build the different software packages within the - "build" directory (that you will specify). Nothing else on your system - will be touched. However, since it takes long, it is useful to see - what it is being built at every instant (its almost impossible to tell - from the torrent of commands that are produced!). So open another - terminal on your desktop and navigate to the same project directory - that you cloned (output of last command above). Then run the following - command. Once every second, this command will just print the date - (possibly followed by a non-existent directory notice). But as soon as - the next step starts building software, you'll see the names of - software get printed as they are being built. Once any software is - installed in the project build directory it will be removed. Again, - don't worry, nothing will be installed outside the build directory.

    +
  • +
  • Prepare to build project: The ./project configure command of the + next step will build the different software packages within the + "build" directory (that you will specify). Nothing else on your system + will be touched. However, since it takes a long time, it is useful to see + what is being built at every instant (it's almost impossible to tell + from the torrent of commands that are produced!). So open another + terminal on your desktop and navigate to the same project directory + that you cloned (output of last command above). Then run the following + command. Once every second, this command will just print the date + (possibly followed by a non-existent directory notice). But as soon as + the next step starts building software, you'll see the names of + software get printed as they are being built. Once any software is + installed in the project build directory it will be removed. Again, + don't worry, nothing will be installed outside the build directory.

    -
    
    +                        
    
     # On another terminal (go to top project source directory, last command above)
     ./project --check-config
    -                            
  • -
  • Test Maneage: Before making any changes, it is important to test it - and see if everything works properly with the commands below. If there - is any problem in the ./project configure or ./project make steps, - please contact us to fix the problem before continuing. Since the - building of dependencies in configuration can take long, you can take - the next few steps (editing the files) while its working (they don't - affect the configuration). After ./project make is finished, open - paper.pdf. If it looks fine, you are ready to start customizing the - Maneage for your project. But before that, clean all the extra Maneage - outputs with make clean as shown below.

    +
  • +
  • Test Maneage: Before making any changes, it is important to test it + and see if everything works properly with the commands below. If there + is any problem in the ./project configure or ./project make steps, + please contact us to fix the problem before continuing. Since the + building of dependencies in configuration can take a long time, you can take + the next few steps (editing the files) while it's working (they don't + affect the configuration). After ./project make is finished, open + paper.pdf. If it looks fine, you are ready to start customizing + Maneage for your project. But before that, clean all the extra Maneage + outputs with make clean as shown below.

    -
    
    +                        
    
     ./project configure     # Build the project's software environment (can take an hour or so).
     ./project make          # Do the processing and build paper (just a simple demo).
    -                        # Open 'paper.pdf' and see if everything is ok.
    -                            
  • -
  • Setup the remote: You can use any hosting - facility - that supports Git to keep an online copy of your project's version - controlled history. We recommend GitLab because - it is more ethical (although not - perfect), - and later you can also host GitLab on your own server. Anyway, create - an account in your favorite hosting facility (if you don't already - have one), and define a new project there. Please make sure the newly - created project is empty (some services ask to include a README in - a new project which is bad in this scenario, and will not allow you to - push to it). It will give you a URL (usually starting with git@ and - ending in .git), put this URL in place of XXXXXXXXXX in the first - command below. With the second command, "push" your master branch to - your origin remote, and (with the --set-upstream option) set them - to track/follow each other. However, the maneage branch is currently - tracking/following your origin-maneage remote (automatically set - when you cloned Maneage). So when pushing the maneage branch to your - origin remote, you shouldn't use --set-upstream. With the last - command, you can actually check this (which local and remote branches - are tracking each other).

    +# Open 'paper.pdf' and see if everything is ok. +
  • +
  • Setup the remote: You can use any hosting + facility + that supports Git to keep an online copy of your project's version + controlled history. We recommend GitLab because + it is more ethical (although not + perfect), + and later you can also host GitLab on your own server. Anyway, create + an account in your favorite hosting facility (if you don't already + have one), and define a new project there. Please make sure the newly + created project is empty (some services ask to include a README in + a new project which is bad in this scenario, and will not allow you to + push to it). It will give you a URL (usually starting with git@ and + ending in .git), put this URL in place of XXXXXXXXXX in the first + command below. With the second command, "push" your master branch to + your origin remote, and (with the --set-upstream option) set them + to track/follow each other. However, the maneage branch is currently + tracking/following your origin-maneage remote (automatically set + when you cloned Maneage). So when pushing the maneage branch to your + origin remote, you shouldn't use --set-upstream. With the last + command, you can actually check this (which local and remote branches + are tracking each other).

    -
    
    +                        
    
     git remote add origin XXXXXXXXXX        # Newly created repo is now called 'origin'.
     git push --set-upstream origin master   # Push 'master' branch to 'origin' (with tracking).
     git push origin maneage                 # Push 'maneage' branch to 'origin' (no tracking).
    -                                
  • -
  • Title, short description and author: The title and basic - information of your project's output PDF paper should be added in - paper.tex. You should see the relevant place in the preamble (prior - to \begin{document}. After you are done, run the ./project make - command again to see your changes in the final PDF, and make sure that - your changes don't cause a crash in LaTeX. Of course, if you use a - different LaTeX package/style for managing the title and authors (in - particular a specific journal's style), please feel free to use it - your own methods after finishing this checklist and doing your first - commit.

  • -
  • Delete dummy parts: Maneage contains some parts that are only for - the initial/test run, mainly as a demonstration of important steps, - which you can use as a reference to use in your own project. But they - not for any real analysis, so you should remove these parts as - described below:

    +
  • +
  • Title, short description and author: The title and basic + information of your project's output PDF paper should be added in + paper.tex. You should see the relevant place in the preamble (prior + to \begin{document}). After you are done, run the ./project make + command again to see your changes in the final PDF, and make sure that + your changes don't cause a crash in LaTeX. Of course, if you use a + different LaTeX package/style for managing the title and authors (in + particular a specific journal's style), please feel free to use + your own methods after finishing this checklist and doing your first + commit.

  • +
  • Delete dummy parts: Maneage contains some parts that are only for + the initial/test run, mainly as a demonstration of important steps, + which you can use as a reference in your own project. But they + are not for any real analysis, so you should remove these parts as + described below:

    -
      -
    • paper.tex: 1) Delete the text of the abstract (from - \includeabstract{ to \vspace{0.25cm}) and write your own (a - single sentence can be enough now, you can complete it later). 2) - Add some keywords under it in the keywords part. 3) Delete - everything between %% Start of main body. and %% End of main - body.. 4) Remove the notice in the "Acknowledgments" section (in - \new{}) and Acknowledge your funding sources (this can also be - done later). Just don't delete the existing acknowledgment - statement: Maneage is possible thanks to funding from several - grants. Since Maneage is being used in your work, it is necessary to - acknowledge them in your work also.

    • -
    • reproduce/analysis/make/top-make.mk: Delete the delete-me line - in the makesrc definition. Just make sure there is no empty line - between the download \ and verify \ lines (they should be - directly under each other).

    • -
    • reproduce/analysis/make/verify.mk: In the final recipe, under the - commented line Verify TeX macros, remove the full line that - contains delete-me, and set the value of s in the line for - download to XXXXX (any temporary string, you'll fix it in the - end of your project, when its complete).

    • -
    • Delete all delete-me* files in the following directories:

      -
      
      +                        
        +
      • paper.tex: 1) Delete the text of the abstract (from + \includeabstract{ to \vspace{0.25cm}) and write your own (a + single sentence can be enough now, you can complete it later). 2) + Add some keywords under it in the keywords part. 3) Delete + everything between %% Start of main body. and %% End of main + body.. 4) Remove the notice in the "Acknowledgments" section (in + \new{}) and Acknowledge your funding sources (this can also be + done later). Just don't delete the existing acknowledgment + statement: Maneage is possible thanks to funding from several + grants. Since Maneage is being used in your work, it is necessary to + acknowledge them in your work also.

      • +
      • reproduce/analysis/make/top-make.mk: Delete the delete-me line + in the makesrc definition. Just make sure there is no empty line + between the download \ and verify \ lines (they should be + directly under each other).

      • +
      • reproduce/analysis/make/verify.mk: In the final recipe, under the + commented line Verify TeX macros, remove the full line that + contains delete-me, and set the value of s in the line for + download to XXXXX (any temporary string; you'll fix it at the + end of your project, when it's complete).

      • +
      • Delete all delete-me* files in the following directories:

        +
        
         rm tex/src/delete-me*
         rm reproduce/analysis/make/delete-me*
         rm reproduce/analysis/config/delete-me*
        -                                            
      • -
      • Disable verification of outputs by removing the yes from - reproduce/analysis/config/verify-outputs.conf. Later, when you are - ready to submit your paper, or publish the dataset, activate - verification and make the proper corrections in this file (described - under the "Other basic customizations" section below). This is a - critical step and only takes a few minutes when your project is - finished. So DON'T FORGET to activate it in the end.

      • -
      • Re-make the project (after a cleaning) to see if you haven't - introduced any errors.

        +
    • +
    • Disable verification of outputs by removing the yes from + reproduce/analysis/config/verify-outputs.conf. Later, when you are + ready to submit your paper, or publish the dataset, activate + verification and make the proper corrections in this file (described + under the "Other basic customizations" section below). This is a + critical step and only takes a few minutes when your project is + finished. So DON'T FORGET to activate it in the end.

    • +
    • Re-make the project (after a cleaning) to make sure you haven't + introduced any errors.

      -
      
      +                                
      
       ./project make clean
       ./project make
      -                                                
    • -
  • -
  • Don't merge some files in future updates: As described below, you - can later update your infra-structure (for example to fix bugs) by - merging your master branch with maneage. For files that you have - created in your own branch, there will be no problem. However if you - modify an existing Maneage file for your project, next time its - updated on maneage you'll have an annoying conflict. The commands - below show how to fix this future problem. With them, you can - configure Git to ignore the changes in maneage for some of the files - you have already edited and deleted above (and will edit below). Note - that only the first echo command has a > (to write over the file), - the rest are >> (to append to it). If you want to avoid any other - set of files to be imported from Maneage into your project's branch, - you can follow a similar strategy. We recommend only doing it when you - encounter the same conflict in more than one merge and there is no - other change in that file. Also, don't add core Maneage Makefiles, - otherwise Maneage can break on the next run.

    +
  • + +
  • Don't merge some files in future updates: As described below, you + can later update your infrastructure (for example to fix bugs) by + merging your master branch with maneage. For files that you have + created in your own branch, there will be no problem. However, if you + modify an existing Maneage file for your project, next time it's + updated on maneage you'll have an annoying conflict. The commands + below show how to fix this future problem. With them, you can + configure Git to ignore the changes in maneage for some of the files + you have already edited and deleted above (and will edit below). Note + that only the first echo command has a > (to write over the file), + the rest are >> (to append to it). If you want to avoid any other + set of files to be imported from Maneage into your project's branch, + you can follow a similar strategy. We recommend only doing it when you + encounter the same conflict in more than one merge and there is no + other change in that file. Also, don't add core Maneage Makefiles, + otherwise Maneage can break on the next run.

    -
    
    +                            
    
     echo "paper.tex merge=ours" > .gitattributes
     echo "tex/src/delete-me.mk merge=ours" >> .gitattributes
     echo "tex/src/delete-me-demo.mk merge=ours" >> .gitattributes
    @@ -679,50 +691,50 @@ echo "reproduce/analysis/make/delete-me.mk merge=ours" >> .gitattributes
     echo "reproduce/software/config/TARGETS.conf merge=ours" >> .gitattributes
     echo "reproduce/analysis/config/delete-me-num.conf merge=ours" >> .gitattributes
     git add .gitattributes
    -                                        
  • -
  • Copyright and License notice: It is necessary that all the - "copyright-able" files in your project (those larger than 10 lines) - have a copyright and license notice. Please take a moment to look at - several existing files to see a few examples. The copyright notice is - usually close to the start of the file, it is the line starting with - Copyright (C) and containing a year and the author's name (like the - examples below). The License notice is a short description of the - copyright license, usually one or two paragraphs with a URL to the - full license. Don't forget to add these two notices to any new - file you add in your project (you can just copy-and-paste). When you - modify an existing Maneage file (which already has the notices), just - add a copyright notice in your name under the existing one(s), like - the line with capital letters below. To start with, add this line with - your name and email address to paper.tex, - tex/src/preamble-header.tex, reproduce/analysis/make/top-make.mk, - and generally, all the files you modified in the previous step.

    - -
    
    +                        
  • +
  • Copyright and License notice: It is necessary that all the + "copyright-able" files in your project (those larger than 10 lines) + have a copyright and license notice. Please take a moment to look at + several existing files to see a few examples. The copyright notice is + usually close to the start of the file, it is the line starting with + Copyright (C) and containing a year and the author's name (like the + examples below). The License notice is a short description of the + copyright license, usually one or two paragraphs with a URL to the + full license. Don't forget to add these two notices to any new + file you add in your project (you can just copy-and-paste). When you + modify an existing Maneage file (which already has the notices), just + add a copyright notice in your name under the existing one(s), like + the line with capital letters below. To start with, add this line with + your name and email address to paper.tex, + tex/src/preamble-header.tex, reproduce/analysis/make/top-make.mk, + and generally, all the files you modified in the previous step.

    + +
    
     Copyright (C) 2018-2020 Existing Name <existing@email.address>
     Copyright (C) 2020 YOUR NAME <YOUR@EMAIL.ADDRESS>
    -                                            
  • -
  • Configure Git for fist time: If this is the first time you are - running Git on this system, then you have to configure it with some - basic information in order to have essential information in the commit - messages (ignore this step if you have already done it). Git will - include your name and e-mail address information in each commit. You - can also specify your favorite text editor for making the commit - (emacs, vim, nano, and etc.).

    +
  • +
  • Configure Git for first time: If this is the first time you are + running Git on this system, then you have to configure it with some + basic information in order to have essential information in the commit + messages (ignore this step if you have already done it). Git will + include your name and e-mail address information in each commit. You + can also specify your favorite text editor for making the commit + (emacs, vim, nano, etc.).

    -
    
    +                            
    
     git config --global user.name "YourName YourSurname"
     git config --global user.email your-email@example.com
     git config --global core.editor nano
    -                                                
  • -
  • Your first commit: You have already made some small and basic - changes in the steps above and you are in your project's master - branch. So, you can officially make your first commit in your - project's history and push it. But before that, you need to make sure - that there are no problems in the project. This is a good habit to - always re-build the system before a commit to be sure it works as - expected.

    - -
    
    +                        
  • +
  • Your first commit: You have already made some small and basic + changes in the steps above and you are in your project's master + branch. So, you can officially make your first commit in your + project's history and push it. But before that, you need to make sure + that there are no problems in the project. It is a good habit to + always re-build the system before a commit to be sure it works as + expected.

    + +
    
     git status                 # See which files you have changed.
     git diff                   # Check the lines you have added/changed.
     ./project make             # Make sure everything builds successfully.
    @@ -731,13 +743,13 @@ git status                 # Make sure everything is fine.
     git diff --cached          # Confirm all the changes that will be committed.
     git commit                 # Your first commit: put a good description!
     git push                   # Push your commit to your remote.
    -                                                    
  • -
  • Start your exciting research: You are now ready to add flesh and - blood to this raw skeleton by further modifying and adding your - exciting research steps. You can use the "published works" section in - the introduction (above) as some fully working models to learn - from. Also, don't hesitate to contact us if you have any - questions.

  • +
    +
  • Start your exciting research: You are now ready to add flesh and + blood to this raw skeleton by further modifying and adding your + exciting research steps. You can use the "published works" section in + the introduction (above) as some fully working models to learn + from. Also, don't hesitate to contact us if you have any + questions.

  • Other basic customizations

    @@ -777,76 +789,76 @@ git push # Push your commit to your remo
    
     grep -ir wfpc2 ./*
    -                        
    -
  • README.md: Correct all the XXXXX place holders (name of your - project, your own name, address of your project's online/remote - repository, link to download dependencies and etc). Generally, read - over the text and update it where necessary to fit your project. Don't - forget that this is the first file that is displayed on your online - repository and also your colleagues will first be drawn to read this - file. Therefore, make it as easy as possible for them to start - with. Also check and update this file one last time when you are ready - to publish your project's paper/source.

  • -
  • Verify outputs: During the initial customization checklist, you - disabled verification. This is natural because during the project you - need to make changes all the time and its a waste of time to enable - verification every time. But at significant moments of the project - (for example before submission to a journal, or publication) it is - necessary. When you activate verification, before building the paper, - all the specified datasets will be compared with their respective - checksum and if any file's checksum is different from the one recorded - in the project, it will stop and print the problematic file and its - expected and calculated checksums. First set the value of - verify-outputs variable in - reproduce/analysis/config/verify-outputs.conf to yes. Then go to - reproduce/analysis/make/verify.mk. The verification of all the files - is only done in one recipe. First the files that go into the - plots/figures are checked, then the LaTeX macros. Validation of the - former (inputs to plots/figures) should be done manually. If its the - first time you are doing this, you can see two examples of the dummy - steps (with delete-me, you can use them if you like). These two - examples should be removed before you can run the project. For the - latter, you just have to update the checksums. The important thing to - consider is that a simple checksum can be problematic because some - file generators print their run-time date in the file (for example as - commented lines in a text table). When checking text files, this - Makefile already has this function: - verify-txt-no-comments-leading-space. As the name suggests, it will - remove comment lines and empty lines before calculating the MD5 - checksum. For FITS formats (common in astronomy, fortunately there is - a DATASUM definition which will return the checksum independent of - the headers. You can use the provided function(s), or define one for - your special formats.

  • -
  • Feedback: As you use Maneage you will notice many things that if - implemented from the start would have been very useful for your - work. This can be in the actual scripting and architecture of Maneage, - or useful implementation and usage tips, like those below. In any - case, please share your thoughts and suggestions with us, so we can - add them here for everyone's benefit.

  • -
  • Re-preparation: Automatic preparation is only run in the first run - of the project on a system, to re-do the preparation you have to use - the option below. Here is the reason for this: when its necessary, the - preparation process can be slow and will unnecessarily slow down the - whole project while the project is under development (focus is on the - analysis that is done after preparation). Because of this, preparation - will be done automatically for the first time that the project is run - (when .build/software/preparation-done.mk doesn't exist). After the - preparation process completes once, future runs of ./project make - will not do the preparation process anymore (will not call - top-prepare.mk). They will only call top-make.mk for the - analysis. To manually invoke the preparation process after the first - attempt, the ./project make script should be run with the - --prepare-redo option, or you can delete the special file above.

    +
  • +
  • README.md: Correct all the XXXXX placeholders (name of your + project, your own name, address of your project's online/remote + repository, link to download dependencies, etc.). Generally, read + over the text and update it where necessary to fit your project. Don't + forget that this is the first file that is displayed on your online + repository and also your colleagues will first be drawn to read this + file. Therefore, make it as easy as possible for them to start + with. Also check and update this file one last time when you are ready + to publish your project's paper/source.

  • +
  • Verify outputs: During the initial customization checklist, you + disabled verification. This is natural because during the project you + need to make changes all the time and it's a waste of time to enable + verification every time. But at significant moments of the project + (for example before submission to a journal, or publication) it is + necessary. When you activate verification, before building the paper, + all the specified datasets will be compared with their respective + checksum and if any file's checksum is different from the one recorded + in the project, it will stop and print the problematic file and its + expected and calculated checksums. First set the value of + verify-outputs variable in + reproduce/analysis/config/verify-outputs.conf to yes. Then go to + reproduce/analysis/make/verify.mk. The verification of all the files + is only done in one recipe. First the files that go into the + plots/figures are checked, then the LaTeX macros. Validation of the + former (inputs to plots/figures) should be done manually. If it's the + first time you are doing this, you can see two examples of the dummy + steps (with delete-me, you can use them if you like). These two + examples should be removed before you can run the project. For the + latter, you just have to update the checksums. The important thing to + consider is that a simple checksum can be problematic because some + file generators print their run-time date in the file (for example as + commented lines in a text table). When checking text files, this + Makefile already has this function: + verify-txt-no-comments-leading-space. As the name suggests, it will + remove comment lines and empty lines before calculating the MD5 + checksum. For FITS formats (common in astronomy), fortunately there is + a DATASUM definition which will return the checksum independent of + the headers. You can use the provided function(s), or define one for + your special formats.

  • +
  • Feedback: As you use Maneage you will notice many things that if + implemented from the start would have been very useful for your + work. This can be in the actual scripting and architecture of Maneage, + or useful implementation and usage tips, like those below. In any + case, please share your thoughts and suggestions with us, so we can + add them here for everyone's benefit.

  • +
  • Re-preparation: Automatic preparation is only run in the first run + of the project on a system; to re-do the preparation you have to use + the option below. Here is the reason for this: when it's necessary, the + preparation process can be slow and will unnecessarily slow down the + whole project while the project is under development (focus is on the + analysis that is done after preparation). Because of this, preparation + will be done automatically for the first time that the project is run + (when .build/software/preparation-done.mk doesn't exist). After the + preparation process completes once, future runs of ./project make + will not do the preparation process anymore (will not call + top-prepare.mk). They will only call top-make.mk for the + analysis. To manually invoke the preparation process after the first + attempt, the ./project make script should be run with the + --prepare-redo option, or you can delete the special file above.

    -
    
    +                        
    
     ./project make --prepare-redo
    -                            
  • -
  • Pre-publication: add notice on reproducibility**: Add a notice - somewhere prominent in the first page within your paper, informing the - reader that your research is fully reproducible. For example in the - end of the abstract, or under the keywords with a title like - "reproducible paper". This will encourage them to publish their own - works in this manner also and also will help spread the word.

  • +
    +
  • Pre-publication: add notice on reproducibility: Add a notice + somewhere prominent in the first page within your paper, informing the + reader that your research is fully reproducible. For example in the + end of the abstract, or under the keywords with a title like + "reproducible paper". This will encourage them to publish their own + works in this manner and will also help spread the word.

  • Tips for designing your project

    @@ -960,28 +972,28 @@ grep -ir wfpc2 ./*
    
     info make "automatic variables"
    -                                        
    -
  • Debug: Since Make doesn't follow the common top-down paradigm, it - can be a little hard to get accustomed to why you get an error or - un-expected behavior. In such cases, run Make with the -d - option. With this option, Make prints a full list of exactly which - prerequisites are being checked for which targets. Looking - (patiently) through this output and searching for the faulty - file/step will clearly show you any mistake you might have made in - defining the targets or prerequisites.

  • -
  • Large files: If you are dealing with very large files (thus having - multiple copies of them for intermediate steps is not possible), one - solution is the following strategy (Also see the next item on "Fast - access to temporary files"). Set a small plain text file as the - actual target and delete the large file when it is no longer needed - by the project (in the last rule that needs it). Below is a simple - demonstration of doing this. In it, we use Gnuastro's Arithmetic - program to add all pixels of the input image with 2 and create - large1.fits. We then subtract 2 from large1.fits to create - large2.fits and delete large1.fits in the same rule (when its no - longer needed). We can later do the same with large2.fits when it - is no longer needed and so on. -

    
    +                                    
  • +
  • Debug: Since Make doesn't follow the common top-down paradigm, it + can be a little hard to understand why you get an error or + unexpected behavior. In such cases, run Make with the -d + option. With this option, Make prints a full list of exactly which + prerequisites are being checked for which targets. Looking + (patiently) through this output and searching for the faulty + file/step will clearly show you any mistake you might have made in + defining the targets or prerequisites.

  • +
  • Large files: If you are dealing with very large files (so that keeping + multiple copies of them for intermediate steps is not possible), one + solution is the following strategy (also see the next item on "Fast + access to temporary files"). Set a small plain text file as the + actual target and delete the large file when it is no longer needed + by the project (in the last rule that needs it). Below is a simple + demonstration of doing this. In it, we use Gnuastro's Arithmetic + program to add 2 to all pixels of the input image and create + large1.fits. We then subtract 2 from large1.fits to create + large2.fits and delete large1.fits in the same rule (when it is no + longer needed). We can later do the same with large2.fits when it + is no longer needed and so on. +

    
     large1.fits.txt: input.fits
     astarithmetic $< 2 + --output=$(subst .txt,,$@)
     echo "done" > $@
    @@ -989,26 +1001,26 @@ large2.fits.txt: large1.fits.txt
     astarithmetic $(subst .txt,,$<) 2 - --output=$(subst .txt,,$@)
     rm $(subst .txt,,$<)
     echo "done" > $@
    -                                            
    - A more advanced Make programmer will use Make's call function - to define a wrapper in reproduce/analysis/make/initialize.mk. This - wrapper will replace $(subst .txt,,XXXXX). Therefore, it will be - possible to greatly simplify this repetitive statement and make the - code even more readable throughout the whole project.

  • -
  • Fast access to temporary files: Most Unix-like operating systems - will give you a special shared-memory device (directory): on systems - using the GNU C Library (all GNU/Linux system), it is /dev/shm. The - contents of this directory are actually in your RAM, not in your - persistence storage like the HDD or SSD. Reading and writing from/to - the RAM is much faster than persistent storage, so if you have enough - RAM available, it can be very beneficial for large temporary files to - be put there. You can use the mktemp program to give the temporary - files a randomly-set name, and use text files as targets to keep that - name (as described in the item above under "Large files") for later - deletion. For example, see the minimal working example Makefile below - (which you can actually put in a Makefile and run if you have an - input.fits in the same directory, and Gnuastro is installed). -

    
    +                                        
    + A more advanced Make programmer will use Make's call function + to define a wrapper in reproduce/analysis/make/initialize.mk. This + wrapper will replace $(subst .txt,,XXXXX). Therefore, it will be + possible to greatly simplify this repetitive statement and make the + code even more readable throughout the whole project.

  • +
  • Fast access to temporary files: Most Unix-like operating systems + will give you a special shared-memory device (directory): on systems + using the GNU C Library (all GNU/Linux systems), it is /dev/shm. The + contents of this directory are actually in your RAM, not in your + persistent storage like the HDD or SSD. Reading from and writing to + RAM is much faster than persistent storage, so if you have enough + RAM available, it can be very beneficial for large temporary files to + be put there. You can use the mktemp program to give the temporary + files a randomly-set name, and use text files as targets to keep that + name (as described in the item above under "Large files") for later + deletion. For example, see the minimal working example Makefile below + (which you can actually put in a Makefile and run if you have an + input.fits in the same directory, and Gnuastro is installed). +

    
     .ONESHELL:
     .SHELLFLAGS = -ec
     all: mean-std.txt
    @@ -1027,30 +1039,30 @@ mean-std.txt: large2.txt
     input=$$(cat $<)
     aststatistics $$input.fits --mean --std > $@
     rm $$input.fits $$input
    -                                            
    - The important point here is that the temporary name template - (shm-maneage) has no suffix. So you can add the suffix - corresponding to your desired format afterwards (for example - $$out.fits, or $$out.txt). But more importantly, when mktemp - sets the random name, it also checks if no file exists with that name - and creates a file with that exact name at that moment. So at the end - of each recipe above, you'll have two files in your /dev/shm, one - empty file with no suffix one with a suffix. The role of the file - without a suffix is just to ensure that the randomly set name will - not be used by other calls to mktemp (when running in parallel) and - it should be deleted with the file containing a suffix. This is the - reason behind the rm $$input.fits $$input command above: to make - sure that first the file with a suffix is deleted, then the core - random file (note that when working in parallel on powerful systems, - in the time between deleting two files of a single rm command, many - things can happen!). When using Maneage, you can put the definition - of shm-maneage in reproduce/analysis/make/initialize.mk to be - usable in all the different Makefiles of your analysis, and you won't - need the three lines above it. Finally, BE RESPONSIBLE: after you - are finished, be sure to clean up any possibly remaining files (due - to crashes in the processing while you are working), otherwise your - RAM may fill up very fast. You can do it easily with a command like - this on your command-line: rm -f /dev/shm/$(whoami)-*.

  • + + The important point here is that the temporary name template + (shm-maneage) has no suffix. So you can add the suffix + corresponding to your desired format afterwards (for example + $$out.fits, or $$out.txt). But more importantly, when mktemp + sets the random name, it also checks that no file exists with that name + and creates a file with that exact name at that moment. So at the end + of each recipe above, you'll have two files in your /dev/shm: one + empty file with no suffix and one with a suffix. The role of the file + without a suffix is just to ensure that the randomly set name will + not be used by other calls to mktemp (when running in parallel) and + it should be deleted with the file containing a suffix. This is the + reason behind the rm $$input.fits $$input command above: to make + sure that first the file with a suffix is deleted, then the core + random file (note that when working in parallel on powerful systems, + in the time between deleting two files of a single rm command, many + things can happen!). When using Maneage, you can put the definition + of shm-maneage in reproduce/analysis/make/initialize.mk to be + usable in all the different Makefiles of your analysis, and you won't + need the three lines above it. Finally, BE RESPONSIBLE: after you + are finished, be sure to clean up any possibly remaining files (due + to crashes in the processing while you are working), otherwise your + RAM may fill up very fast. You can do it easily with a command like + this on your command-line: rm -f /dev/shm/$(whoami)-*.
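
The naming scheme described above can be tried in isolation with the short shell sketch below. It is only an illustration, not part of Maneage: the `demo` prefix, the `tmpdir` fallback to /tmp and the `.txt` suffix are assumptions made for the example, and it presumes a mktemp that accepts a full path template ending in X characters (as the GNU and BSD versions do).

```shell
# Use /dev/shm when it exists and is writable, otherwise fall back to /tmp
# (assumption: the fallback is only here so the demonstration runs anywhere).
if [ -d /dev/shm ] && [ -w /dev/shm ]; then tmpdir=/dev/shm; else tmpdir=/tmp; fi

# mktemp picks a random name AND creates an empty file with that name,
# so a parallel call to mktemp cannot be given the same name.
base=$(mktemp "$tmpdir/$(whoami)-demo-XXXXXXXXXX")

# Add the suffix afterwards: this second file holds the actual data.
echo "some data" > "$base.txt"

# Two files exist now: the empty placeholder and the file with a suffix.
ls -l "$base" "$base.txt"

# Delete the file with a suffix first, then the placeholder.
rm "$base.txt" "$base"
```

Keeping the empty placeholder until the very end is what reserves the random name for the whole lifetime of the suffixed file.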

  • Software tarballs and raw inputs: It is critically important to document the raw inputs to your project (software tarballs and raw @@ -1101,91 +1113,91 @@ git log XXXXXX..XXXXXX --reverse # Inspect new work (re git log --oneline --graph --decorate --all # General view of branches. git checkout master # Go to your top working branch. git merge maneage # Import all the work into master. -

  • -
  • Adding Maneage to a fork of your project: As you and your colleagues - continue your project, it will be necessary to have separate - forks/clones of it. But when you clone your own project on a - different system, or a colleague clones it to collaborate with you, - the clone won't have the origin-maneage remote that you started the - project with. As shown in the previous item above, you need this - remote to be able to pull recent updates from Maneage. The steps - below will setup the origin-maneage remote, and a local maneage - branch to track it, on the new clone.

    - -
    
    +                                            
  • +
  • Adding Maneage to a fork of your project: As you and your colleagues + continue your project, it will be necessary to have separate + forks/clones of it. But when you clone your own project on a + different system, or a colleague clones it to collaborate with you, + the clone won't have the origin-maneage remote that you started the + project with. As shown in the previous item above, you need this + remote to be able to pull recent updates from Maneage. The steps + below will set up the origin-maneage remote, and a local maneage + branch to track it, on the new clone.

    + +
    
     git remote add origin-maneage https://git.maneage.org/project.git
     git fetch origin-maneage
     git checkout -b maneage --track origin-maneage/maneage
    -                                                    
  • -
  • Commit message: The commit message is a very important and useful - aspect of version control. To make the commit message useful for - others (or yourself, one year later), it is good to follow a - consistent style. Maneage already has a consistent formatting - (described below), which you can also follow in your project if you - like. You can see many examples by running git log in the maneage - branch. If you intend to push commits to Maneage, for the consistency - of Maneage, it is necessary to follow these guidelines. 1) No line - should be more than 75 characters (to enable easy reading of the - message when you run git log on the standard 80-character - terminal). 2) The first line is the title of the commit and should - summarize it (so git log --oneline can be useful). The title should - also not end with a point (., because its a short single sentence, - so a point is not necessary and only wastes space). 3) After the - title, leave an empty line and start the body of your message - (possibly containing many paragraphs). 4) Describe the context of - your commit (the problem it is trying to solve) as much as possible, - then go onto how you solved it. One suggestion is to start the main - body of your commit with "Until now ...", and continue describing the - problem in the first paragraph(s). Afterwards, start the next - paragraph with "With this commit ...".

  • -
  • Project outputs: During your research, it is possible to checkout a - specific commit and reproduce its results. However, the processing - can be time consuming. Therefore, it is useful to also keep track of - the final outputs of your project (at minimum, the paper's PDF) in - important points of history. However, keeping a snapshot of these - (most probably large volume) outputs in the main history of the - project can unreasonably bloat it. It is thus recommended to make a - separate Git repo to keep those files and keep your project's source - as small as possible. For example if your project is called - my-exciting-project, the name of the outputs repository can be - my-exciting-project-output. This enables easy sharing of the output - files with your co-authors (with necessary permissions) and not - having to bloat your email archive with extra attachments also (you - can just share the link to the online repo in your - communications). After the research is published, you can also - release the outputs repository, or you can just delete it if it is - too large or un-necessary (it was just for convenience, and fully - reproducible after all). For example Maneage's output is available - for demonstration in a - separate repository.

  • -
  • Full Git history in one file: When you are publishing your project - (for example to Zenodo for long term preservation), it is more - convenient to have the whole project's Git history into one file to - save with your datasets. After all, you can't be sure that your - current Git server (for example GitLab, Github, or Bitbucket) will be - active forever. While they are good for the immediate future, you - can't rely on them for archival purposes. Fortunately keeping your - whole history in one file is easy with Git using the following - commands. To learn more about it, run git help bundle.

    - -
      -
    • "bundle" your project's history into one file (just don't forget to - change my-project-git.bundle to a descriptive name of your - project):
    • -
    - -
    
    +                                            
  • +
  • Commit message: The commit message is a very important and useful + aspect of version control. To make the commit message useful for + others (or yourself, one year later), it is good to follow a + consistent style. Maneage already has a consistent format + (described below), which you can also follow in your project if you + like. You can see many examples by running git log in the maneage + branch. If you intend to push commits to Maneage, for the consistency + of Maneage, it is necessary to follow these guidelines. 1) No line + should be more than 75 characters (to enable easy reading of the + message when you run git log on the standard 80-character + terminal). 2) The first line is the title of the commit and should + summarize it (so git log --oneline can be useful). The title should + also not end with a point (., because it is a short single sentence, + so a point is not necessary and only wastes space). 3) After the + title, leave an empty line and start the body of your message + (possibly containing many paragraphs). 4) Describe the context of + your commit (the problem it is trying to solve) as much as possible, + then go on to how you solved it. One suggestion is to start the main + body of your commit with "Until now ...", and continue describing the + problem in the first paragraph(s). Afterwards, start the next + paragraph with "With this commit ...".
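
The first three guidelines above are mechanical, so they can be checked automatically. The shell function below is a minimal sketch of such a check; the name `check_commit_msg` is hypothetical and is not something Maneage provides. It ignores lines starting with `#`, on the assumption that those are Git's own comment lines in the message template.

```shell
# check_commit_msg FILE: print every guideline that the commit message in
# FILE violates; the exit status is non-zero if at least one was found.
check_commit_msg () {
    awk '
        /^#/                 { next }   # skip comment lines
        { n++ }                         # n counts the remaining lines
        length($0) > 75      { printf "line %d is longer than 75 characters\n", n; bad = 1 }
        n == 1 && /\.$/      { print "title ends with a point"; bad = 1 }
        n == 2 && $0 != ""   { print "no empty line after the title"; bad = 1 }
        END                  { exit bad + 0 }
    ' "$1"
}

# Example use on a throw-away message file.
msg=$(mktemp)
printf '%s\n' 'Fix indentations' '' 'Until now ...' > "$msg"
check_commit_msg "$msg" && echo "message follows the guidelines"
rm "$msg"
```

One possible use is to call such a function from a Git `commit-msg` hook, which receives the path of the message file as its first argument.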

  • +
  • Project outputs: During your research, it is possible to check out a + specific commit and reproduce its results. However, the processing + can be time consuming. Therefore, it is useful to also keep track of + the final outputs of your project (at minimum, the paper's PDF) at + important points of history. However, keeping a snapshot of these + (most probably large volume) outputs in the main history of the + project can unreasonably bloat it. It is thus recommended to make a + separate Git repo to keep those files and keep your project's source + as small as possible. For example if your project is called + my-exciting-project, the name of the outputs repository can be + my-exciting-project-output. This enables easy sharing of the output + files with your co-authors (with necessary permissions) without + having to bloat your email archive with extra attachments (you + can just share the link to the online repo in your + communications). After the research is published, you can also + release the outputs repository, or you can just delete it if it is + too large or unnecessary (it was just for convenience, and fully + reproducible after all). For example Maneage's output is available + for demonstration in a + separate repository.

  • +
  • Full Git history in one file: When you are publishing your project + (for example to Zenodo for long term preservation), it is more + convenient to have the whole project's Git history in one file to + save with your datasets. After all, you can't be sure that your + current Git server (for example GitLab, GitHub, or Bitbucket) will be + active forever. While they are good for the immediate future, you + can't rely on them for archival purposes. Fortunately keeping your + whole history in one file is easy with Git using the following + commands. To learn more about it, run git help bundle.

    + +
      +
    • "bundle" your project's history into one file (just don't forget to + change my-project-git.bundle to a descriptive name of your + project):
    • +
    + +
    
     git bundle create my-project-git.bundle --all
    -                                                        
    + -
      -
    • You can easily upload my-project-git.bundle anywhere. Later, if - you need to un-bundle it, you can use the following command.
    • -
    +
      +
    • You can easily upload my-project-git.bundle anywhere. Later, if + you need to un-bundle it, you can use the following command.
    • +
    -

    
    +                                                

    
     git clone my-project-git.bundle
    -                                                        
  • +
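
Before uploading a bundle for long-term preservation, it can be reassuring to check that it is valid and that it restores the same history. The sketch below does this round trip on a throw-away repository; the `README` file, the `Demo` identity and the `restored` directory are placeholders for illustration. It assumes a Git recent enough to have the `-C` option.

```shell
# Work in a throw-away directory so nothing outside it is touched.
work=$(mktemp -d)
cd "$work"

# Hypothetical small project with one commit, just to have some history.
git init -q my-project
cd my-project
echo "hello" > README
git add README
git -c user.name="Demo" -c user.email="demo@example.org" \
    commit -q -m "First commit"

# Bundle the history, then ask Git to check that the bundle is valid.
git bundle create ../my-project-git.bundle --all
git bundle verify ../my-project-git.bundle

# Un-bundle into a new directory and compare the latest commit hashes.
cd "$work"
git clone -q my-project-git.bundle restored
orig=$(git -C my-project rev-parse HEAD)
new=$(git -C restored rev-parse HEAD)
[ "$orig" = "$new" ] && echo "bundle round trip succeeded"
```

The temporary directory stored in `work` can be deleted afterwards. For a real project, only the `git bundle verify` step and the comparison of commit hashes are needed.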

    -- cgit v1.2.1