Kernel Tuner
Kernel Tuner greatly simplifies the development of highly optimized and auto-tuned CUDA, OpenCL, and C code. It supports many advanced use cases and offers optimization strategies that speed up the auto-tuning process.
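To make the idea of auto-tuning concrete, here is a minimal sketch in plain Python of the simplest possible strategy: try every combination of tunable parameters and keep the fastest one. It does not use Kernel Tuner's API. The function names `exhaustive_tune` and `toy_cost`, the parameter names `block_size_x` and `block_size_y`, and the cost model are all hypothetical; a real tuner would compile and time the kernel on the device instead of calling a stand-in cost function.

```python
import itertools


def exhaustive_tune(tune_params, cost, restriction=None):
    """Try every combination of tunable parameters and return the cheapest one.

    tune_params: dict mapping a parameter name to its list of candidate values.
    cost:        callable taking a config dict and returning a number (lower is better).
    restriction: optional callable that returns False for configs to skip.
    """
    names = list(tune_params)
    best_config, best_cost = None, float("inf")
    # The search space is the Cartesian product of all candidate values.
    for values in itertools.product(*(tune_params[name] for name in names)):
        config = dict(zip(names, values))
        if restriction is not None and not restriction(config):
            continue
        current = cost(config)
        if current < best_cost:
            best_config, best_cost = config, current
    return best_config, best_cost


def toy_cost(config):
    # Hypothetical stand-in for a measured kernel time (assumption, not real data):
    # it favours 256 threads per block and penalises a large block_size_y.
    threads = config["block_size_x"] * config["block_size_y"]
    return abs(threads - 256) + config["block_size_y"]


tune_params = {
    "block_size_x": [16, 32, 64, 128],
    "block_size_y": [1, 2, 4, 8],
}

best_config, best_cost = exhaustive_tune(
    tune_params,
    toy_cost,
    # Skip thread-block shapes that exceed 1024 threads per block.
    restriction=lambda c: c["block_size_x"] * c["block_size_y"] <= 1024,
)
print(best_config, best_cost)
```

Exhaustive search grows with the product of all parameter ranges, which is why smarter optimization strategies that sample only part of the search space can shorten tuning time considerably.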
Kernel Launcher
Dynamically compile CUDA kernels and launch them in a type-safe way from C++. Tight integration with Kernel Tuner results in fast GPU code.
Kernel Launcher is a C++ library that compiles CUDA kernels dynamically at run time (using NVRTC) and lets you call them through a type-safe C++ interface. Additionally, Kernel Launcher can export kernel specifications from your application so that Kernel Tuner can tune them, and it can import the tuning results, known as wisdom files, back into your application. The result is fast GPU code that is maintainable and performance portable, with minimal effort.
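The round trip described above (tune offline, store the results, look them up at run time) can be illustrated with a small Python sketch. This is not Kernel Launcher's actual wisdom file format or API: the JSON record layout, the field names `device`, `problem_size`, and `config`, and the function `select_config` are all hypothetical, chosen only to show how stored tuning results might be matched to the current device and problem size.

```python
import json


def select_config(wisdom_json, device, problem_size, default):
    """Pick the stored configuration that best matches the current run.

    wisdom_json:  JSON text holding a list of tuning records (hypothetical layout).
    device:       name of the device the application is running on.
    problem_size: size of the problem about to be launched.
    default:      configuration to fall back on when nothing was tuned for this device.
    """
    records = [r for r in json.loads(wisdom_json) if r["device"] == device]
    if not records:
        return default
    # Use the record whose tuned problem size is closest to the requested one.
    nearest = min(records, key=lambda r: abs(r["problem_size"] - problem_size))
    return nearest["config"]


# Hypothetical tuning results, as an offline tuning step might have written them.
wisdom = json.dumps([
    {"device": "device-a", "problem_size": 1000, "config": {"block_size": 128}},
    {"device": "device-a", "problem_size": 1000000, "config": {"block_size": 256}},
])

fallback = {"block_size": 64}
tuned = select_config(wisdom, "device-a", 900000, fallback)
untuned = select_config(wisdom, "device-b", 900000, fallback)
print(tuned, untuned)
```

Keeping a fallback configuration means the application still runs on a device that has never been tuned, only without the tuned performance.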
For future exascale climate and weather predictions