Effortless provenance tracking in Python


What recipy can do for you

  • Keep track of what code you ran to generate results (e.g., graphs or data)
  • Add a single statement to enable provenance tracking in your Python script
  • Search your runs using a command line interface or GUI
  • Customize provenance tracking for each project

Imagine the situation: You’ve written some wonderful Python code which produces a beautiful graph as an output. You save that graph, naturally enough, as graph.png. You run the code a couple of times, each time making minor modifications. You come back to it the next week/month/year. Do you remember how you created that graph? What input data? What version of your code? Frustratingly, the answer will often be 'no'. Of course, you then waste lots of time trying to work out how you created it, or even give up and never use it in that journal paper that will win you a Nobel Prize…

ReciPy (from recipe and python) is a Python module that will save you from this situation! (Although it can’t guarantee that your paper will win a Nobel Prize!) With the addition of a single line of code to the top of your Python files, ReciPy will log each run of your code to a database, keeping track of the input files, output files and the version of your code, and then let you query this database to find out how you actually did create graph.png.
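With recipy itself, the single line in question is `import recipy` at the top of the script. To make the idea of "log each run, then query the log" concrete, here is a minimal, self-contained sketch that does not use recipy at all. It is only an illustration of the concept, not recipy's implementation: the function names (`open_log`, `log_run`, `find_runs`), the SQLite table layout, and the use of a SHA-1 hash of the script as a stand-in for "the version of your code" are all assumptions made for this example.

```python
import hashlib
import sqlite3
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def file_hash(path):
    """Return the SHA-1 hex digest of a file's contents."""
    return hashlib.sha1(Path(path).read_bytes()).hexdigest()


def open_log(db_path=":memory:"):
    """Open (or create) a hypothetical provenance database."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS runs ("
        "id INTEGER PRIMARY KEY, started TEXT, script TEXT, script_hash TEXT)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS files ("
        "run_id INTEGER, role TEXT, path TEXT, hash TEXT)"
    )
    return conn


def log_run(conn, script, inputs, outputs):
    """Record one run: the script, its content hash, and every input/output file."""
    cur = conn.execute(
        "INSERT INTO runs (started, script, script_hash) VALUES (?, ?, ?)",
        (datetime.now(timezone.utc).isoformat(), str(script), file_hash(script)),
    )
    run_id = cur.lastrowid
    rows = [(run_id, "input", str(p), file_hash(p)) for p in inputs]
    rows += [(run_id, "output", str(p), file_hash(p)) for p in outputs]
    conn.executemany("INSERT INTO files VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return run_id


def find_runs(conn, output_name):
    """Return every logged run that wrote a file whose path ends in output_name."""
    runs = conn.execute(
        "SELECT r.id, r.script, r.script_hash FROM runs r "
        "JOIN files f ON f.run_id = r.id "
        "WHERE f.role = 'output' AND f.path LIKE ? ORDER BY r.id",
        ("%" + output_name,),
    ).fetchall()
    results = []
    for run_id, script, script_hash in runs:
        inputs = [
            row[0]
            for row in conn.execute(
                "SELECT path FROM files WHERE run_id = ? AND role = 'input'",
                (run_id,),
            )
        ]
        results.append(
            {"run": run_id, "script": script,
             "script_hash": script_hash, "inputs": inputs}
        )
    return results


def demo():
    """Log one fake run in a temporary directory, then ask how graph.png was made."""
    with tempfile.TemporaryDirectory() as tmp:
        tmp = Path(tmp)
        script = tmp / "plot.py"
        script.write_text("print('plot')\n")
        data = tmp / "data.csv"
        data.write_text("x,y\n1,2\n")
        graph = tmp / "graph.png"
        graph.write_bytes(b"placeholder, not a real image")

        conn = open_log()
        log_run(conn, script, [data], [graph])
        found = find_runs(conn, "graph.png")
        conn.close()
        return found


if __name__ == "__main__":
    for run in demo():
        print(run["script"], "read", run["inputs"])
```

The difference in practice is that this sketch requires an explicit `log_run` call listing the files, whereas the point of recipy is that the logging happens without such bookkeeping in your script.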

Programming languages
  • Python 91%
  • HTML 8%
  • Jupyter Notebook 1%

Participating organisations

Netherlands eScience Center


Contact person

Janneke van der Zwaan

Netherlands eScience Center
Janneke van der Zwaan
Netherlands eScience Center
Robin Wilson
University of Southampton