Scaling up pangenomics for plant breeding

Delivering a pangenome approach that drastically improves analytical power for plant data

To capture and capitalize on genetic variation, the field of comparative genomics is switching from a reference-based approach to pangenomic approaches. The main aim of this project was to improve the scalability of our pangenomics software, PanTools, for large-scale applications in plant sciences and biotechnology. We envisioned improvements in the data representation, in the construction and annotation algorithms, and in the use of novel technologies like Apache Spark. In addition, we aimed to improve our development efforts and make the code base more sustainable by reorganizing and refactoring the code, writing proper unit tests, and improving the documentation.

Participating organisations

Life Sciences

Output

Dataset
Author(s): Eef Jonkheer, Sandra Smit
Published by 4TU.ResearchData in 2022
DOI: 10.4121/19874485.v1

Thesis
Author(s): Vlugter, H.
Published in 2024

Team

Contact person

Thijs van Lankveld

Lead RSE

Netherlands eScience Center

0009-0001-1147-4813

Sandra Smit

Principal investigator

Wageningen University and Research

0000-0001-5239-5321

Pablo Lopez-Tarifa

Programme Manager

Netherlands eScience Center

0000-0002-4136-1860

Thijs van Lankveld

Lead RSE

Netherlands eScience Center

0009-0001-1147-4813

Matthijs Moed

Advisor

SURF

0000-0003-3372-987X

Nauman Ahmed

RSE

Netherlands eScience Center

0000-0003-3559-9941

Related projects

PADRE - The PetaFLOP AARTFAAC Data-Reduction Engine

Improving the AARTFAAC processing pipeline

Updated 13 months ago

Finished

DarkGenerators

Interpretable large scale deep generative models for Dark Matter searches

Updated 15 months ago

Finished

A new perspective on global vegetation water dynamics from radar satellite data

Global vegetation water dynamics using radar satellite data

Updated 21 months ago

Finished

RETURN - Monitoring tropical forest recovery capacity using RADAR Sentinel satellite data

Demonstrating the potential of European Sentinel satellite data

Updated 39 months ago

Finished

eEcoLiDAR

eScience infrastructure for ecological applications of LiDAR point clouds

Updated 16 months ago

Finished

Blue-Action

Arctic impact on weather and climate

Updated 13 months ago

Finished

MAGIC

Metrics and Access to Global Indices for Climate Projections

Updated 39 months ago

Finished

Towards a species-by-species approach to global biodiversity modelling

The current decline of global biodiversity

Updated 35 months ago

Finished

PRIMAVERA

Process-based climate simulation: advances in high-resolution modelling and European climate risk...

Updated 35 months ago

Finished

Improving Open-Source Photogrammetric Workflows for Processing Big Datasets

Processing large datasets on consumer-grade computers

Updated 35 months ago

Finished

ERA-URBAN

Environmental re-analysis of urban areas: quantifying high-resolution energy and water budgets of...

Updated 36 months ago

Finished

Related software

PanTools

PanTools is a pangenomic toolkit for comparative analysis of large numbers of genomes. It is developed in the Bioinformatics Group of Wageningen University, the Netherlands. Please cite the relevant publication(s) from the list of publications if you use PanTools in your research.

Updated 23 months ago

PanTools-pipeline-v4

General purpose Snakemake pipeline for PanTools v4.

Updated 23 months ago

PanVA

Variant Analysis within Pangenomes.

Updated 23 months ago
