Effortless provenance tracking in Python
Imagine the situation: You’ve written some wonderful Python code which produces a beautiful graph as an output. You save that graph, naturally enough, as graph.png. You run the code a couple of times, each time making minor modifications. You come back to it the next week/month/year. Do you remember how you created that graph? What input data? What version of your code? Frustratingly, the answer will often be 'no'. Of course, you then waste lots of time trying to work out how you created it, or even give up and never use it in that journal paper that will win you a Nobel Prize…
ReciPy (from recipe and python) is a Python module that will save you from this situation! (Although it can’t guarantee that your paper will win a Nobel Prize!) With the addition of a single line of code to the top of your Python files, ReciPy will log each run of your code to a database, keeping track of the input files, output files and the version of your code, and then let you query this database to find out how you actually did create graph.png.