Kernel Tuner
Kernel Tuner greatly simplifies the development of highly optimized and auto-tuned CUDA, OpenCL, and C code, supporting many advanced use cases and optimization strategies that speed up the auto-tuning process.
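To make the idea of auto-tuning concrete, here is a minimal sketch of a brute-force tuner in plain Python. It is not Kernel Tuner's API: `run_kernel`, `tune`, and the parameter names `block_size` and `use_builtin` are hypothetical stand-ins, and the "kernel" is an ordinary CPU function rather than GPU code. The sketch times every combination of tunable parameters and keeps the fastest one.

```python
import itertools
import time


def run_kernel(data, block_size, use_builtin):
    """Hypothetical stand-in for a kernel: sums `data` block by block."""
    total = 0.0
    for start in range(0, len(data), block_size):
        block = data[start:start + block_size]
        if use_builtin:
            total += sum(block)
        else:
            for x in block:
                total += x
    return total


def tune(kernel, data, tune_params, repeats=3):
    """Time every parameter combination; return (best_result, all_results)."""
    names = list(tune_params)
    results = []
    for values in itertools.product(*(tune_params[n] for n in names)):
        config = dict(zip(names, values))
        timings = []
        for _ in range(repeats):
            start = time.perf_counter()
            kernel(data, **config)
            timings.append(time.perf_counter() - start)
        # Keep the fastest of the repeated runs to reduce timing noise.
        results.append({**config, "time": min(timings)})
    best = min(results, key=lambda r: r["time"])
    return best, results


# Assumed search space, chosen only for illustration.
search_space = {"block_size": [32, 64, 128, 256], "use_builtin": [False, True]}
best, results = tune(run_kernel, [1.0] * 50_000, search_space)
print("fastest configuration:", best)
```

Exhaustive search like this grows quickly with the number of parameters, which is why optimization strategies that evaluate only part of the search space can speed up tuning.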
Dynamically compile GPU kernels and launch them easily and safely through a type-safe C++ interface. Tight integration with Kernel Tuner results in blazing fast CUDA code that is maintainable and performance portable.

Kernel Launcher is a C++ library that dynamically compiles CUDA kernels at run time (using NVRTC) and lets you call them from C++ in a type-safe way. Additionally, Kernel Launcher supports exporting kernel specifications from your application so that Kernel Tuner can tune them, and importing the tuning results, known as wisdom files, back into your application. The result: blazing fast GPU code that is maintainable and performance portable, all with minimal effort.
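The round trip described above (tune, store the results, load the best configuration at run time) can be sketched in plain Python. This is only an illustration of the idea: the JSON layout, the function names `save_wisdom` and `load_best_config`, the device labels `gpu-a` and `gpu-b`, and the nearest-problem-size selection rule are all assumptions made for this sketch, not Kernel Launcher's actual wisdom file format or API.

```python
import json
import os
import tempfile


def save_wisdom(path, kernel_name, records):
    """Write tuning results to a JSON file (hypothetical layout)."""
    with open(path, "w") as f:
        json.dump({"kernel": kernel_name, "records": records}, f, indent=2)


def load_best_config(path, kernel_name, device, problem_size):
    """Return the stored config for `device` whose problem size is closest."""
    with open(path) as f:
        wisdom = json.load(f)
    if wisdom["kernel"] != kernel_name:
        return None
    candidates = [r for r in wisdom["records"] if r["device"] == device]
    if not candidates:
        return None  # no tuning results for this device
    nearest = min(candidates, key=lambda r: abs(r["problem_size"] - problem_size))
    return nearest["config"]


# Made-up tuning results, standing in for what a tuning run would produce.
records = [
    {"device": "gpu-a", "problem_size": 1_000_000, "config": {"block_size": 128}},
    {"device": "gpu-a", "problem_size": 100_000_000, "config": {"block_size": 512}},
    {"device": "gpu-b", "problem_size": 1_000_000, "config": {"block_size": 256}},
]

path = os.path.join(tempfile.mkdtemp(), "vector_add.wisdom.json")
save_wisdom(path, "vector_add", records)

# At run time the application looks up the closest tuned configuration.
config = load_best_config(path, "vector_add", "gpu-a", 2_000_000)
print(config)  # {'block_size': 128}
```

Keeping the tuned parameters in a file separate from the source is what makes the code performance portable in this scheme: the same application can pick different configurations on different devices and problem sizes without being recompiled by hand.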
Centre of Excellence in Simulation of Weather and Climate in Europe
Self-learning machines hunt for explosions in the universe and speed up innovations in industry and...
For future exascale climate and weather predictions