Facilitating the "Great Bake Off" of Bioinformatics Workflows

Workflomics: A platform for automated workflow generation and benchmarking in bioinformatics

Photo credit – Shutterstock

Workflomics

Life science researchers across all disciplines work with ever larger and increasingly complex datasets. They use increasingly sophisticated data analysis pipelines and workflows, constructed from numerous individual software tools. Creating optimal workflows for specific data analysis problems is a challenge. It requires an interplay of exploring the latest relevant tool combinations and benchmarking selected workflow candidates with reference data to determine the best-performing ones. Because adequate tooling is lacking, this is rarely done systematically, and as a result many workflows compromise on scientific quality.

To tackle this problem, we developed Workflomics, a software system facilitating a “Great Bake Off” of computational workflows in bioinformatics. Its key contribution is a new and unique integration of bioinformatics tools and metadata with technologies for automated workflow exploration and benchmarking. Workflomics provides a much-needed platform for systematic workflow generation and evaluation that complements and can be interfaced with existing state-of-the-art workflow systems. It leverages parallel technological developments at the European level, in particular services and resources provided by the ELIXIR Tools Platform.

During the initial development of Workflomics, we selected use cases from the thriving bioinformatics discipline of proteomics. These are representative of many modern workflow applications as they deal with highly complex data, are composed of large collections of individual software tools, and typically require high-performance computing resources. Workflomics enables a new, systematic and rigorous approach to the development of cutting-edge proteomics workflows, increasing their scientific quality and robustness and improving their reproducibility, FAIRness and maintainability.

After the initial development of the Workflomics platform, which was assisted by the eScience Center, the project continues to be maintained by the groups of Prof. Anna-Lena Lamprecht at the University of Potsdam and Assoc. Prof. Magnus Palmblad at the Leiden University Medical Center.

RESTful APE - Software Sustainability project

RESTful APE (a RESTful API for the APE library) was initially developed for the Great Bake Off project to enable streamlined interaction with APE's automated pipeline exploration through HTTP requests. Originally focused on bioinformatics, the project received a software sustainability budget and has since broadened its applicability to other domains such as geosciences. This expansion has been supported by improvements to documentation, functionality, and demonstrations.

Participating organisations

Leiden University Medical Center
Netherlands eScience Center
University of Potsdam
Utrecht University
Life Sciences

Output

Team

Peter Kok
Research Software Engineer
Netherlands eScience Center
Nauman Ahmed
Research Software Engineer
Netherlands eScience Center
Magnus Palmblad
Principal investigator
Leiden University Medical Center
Anna-Lena Lamprecht
Principal investigator
University of Potsdam
Rob Marissen
Scientific Software Developer
Leiden University Medical Center
Pablo Lopez-Tarifa
eScience Coordinator
Netherlands eScience Center
Mario Frank
Research Associate
University of Potsdam

Related projects

Common Workflow Language

The reference CWL runner and other software from the Common Workflow Language open standards community.

Updated 30 months ago

FAIR is as FAIR does

Integrating data publishing principles in scientific workflows

Updated 30 months ago
Finished

Related software

APE

A CLI, Java API and RESTful API for the automated generation of computational pipelines (scientific workflows) from large collections of computational tools.

Updated 15 months ago

RESTful APE

RESTful API for the APE (Automated Pipeline Explorer) library.

Updated 18 months ago

Workflomics

Web platform for workflow exploration and benchmarking in bioinformatics

Updated 9 months ago