Kernel Tuner
Kernel Tuner simplifies the development of highly optimized, auto-tuned CUDA, OpenCL, and C code. It supports many advanced use cases and provides optimization strategies that speed up the auto-tuning process.
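To make the idea of auto-tuning concrete, here is a minimal sketch of an exhaustive search over tunable parameters. It does not use Kernel Tuner's API: the names `tune`, `time_blocked_sum`, `block_size_x`, and the CPU workload are all hypothetical stand-ins, and a real tuner would compile and benchmark GPU kernels and could replace the brute-force loop with a smarter optimization strategy.

```python
import itertools
import time


def tune(benchmark, tune_params, restrictions=None):
    """Benchmark every valid parameter combination and return the cheapest.

    benchmark:    callable taking a config dict and returning a cost (lower is better)
    tune_params:  dict mapping parameter name -> list of candidate values
    restrictions: optional callable returning False for configs to skip
    """
    names = list(tune_params)
    results = []
    # Cartesian product of all candidate values = the full search space.
    for values in itertools.product(*(tune_params[name] for name in names)):
        config = dict(zip(names, values))
        if restrictions is not None and not restrictions(config):
            continue
        results.append((benchmark(config), config))
    if not results:
        raise ValueError("no valid configuration in the search space")
    best_cost, best_config = min(results, key=lambda item: item[0])
    return best_config, results


def time_blocked_sum(config, data=tuple(range(20_000))):
    """Hypothetical CPU stand-in for a kernel launch: sum `data` in chunks."""
    block = config["block_size_x"]
    start = time.perf_counter()
    total = 0
    for i in range(0, len(data), block):
        total += sum(data[i:i + block])
    return time.perf_counter() - start


if __name__ == "__main__":
    search_space = {"block_size_x": [16, 32, 64, 128, 256, 512]}
    best, results = tune(time_blocked_sum, search_space)
    print("best configuration:", best)
```

Exhaustive search is the simplest strategy and only practical for small search spaces. The cost of benchmarking every combination is what makes faster search strategies worthwhile.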
Kernel Launcher
Dynamically compile GPU kernels and launch them easily and safely from C++. Tight integration with Kernel Tuner results in fast CUDA code that is maintainable and performance portable.

Kernel Launcher is a C++ library that dynamically compiles CUDA kernels at run time (using NVRTC) and lets you call them in a type-safe way. It also supports exporting kernel specifications from your application so that Kernel Tuner can tune them, and importing the tuning results, known as wisdom files, back into your application. The result is fast GPU code that is maintainable and performance portable, with little extra effort.
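The export, tune, and import round trip can be sketched as follows. This is an illustration of the workflow only, written in Python rather than C++. The JSON-lines layout, the function names `export_wisdom` and `import_wisdom`, and the fallback `DEFAULT_CONFIG` are assumptions made for this example and do not describe Kernel Launcher's actual wisdom file format or API.

```python
import json
from pathlib import Path

# Hypothetical fallback used when no tuning result is available.
DEFAULT_CONFIG = {"block_size_x": 256}


def export_wisdom(path, kernel_name, device, config):
    """Append one tuning result to the file as a single JSON line."""
    record = {"kernel": kernel_name, "device": device, "config": config}
    with Path(path).open("a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")


def import_wisdom(path, kernel_name, device):
    """Return the most recently stored config for this kernel and device.

    Falls back to a copy of DEFAULT_CONFIG when the file is missing or
    holds no matching record, so the application still runs untuned.
    """
    chosen = dict(DEFAULT_CONFIG)
    p = Path(path)
    if not p.exists():
        return chosen
    with p.open(encoding="utf-8") as f:
        for line in f:
            if not line.strip():
                continue
            record = json.loads(line)
            if record["kernel"] == kernel_name and record["device"] == device:
                chosen = record["config"]  # later records override earlier ones
    return chosen
```

Keying each record on both kernel and device reflects the performance-portability goal: the same application can pick a different configuration on each GPU it runs on, and still falls back to a default on hardware that has not been tuned.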
Self-learning machines hunt for explosions in the universe and speed up innovations in industry and...
Centre of Excellence in Simulation of Weather and Climate in Europe
For future exascale climate and weather predictions