AMBER
A real-time pipeline to search for Fast Radio Bursts and other transient radio sources.
Kernel Tuner greatly simplifies the development of highly optimized, auto-tuned CUDA, OpenCL, and C code, and supports many advanced use cases and optimization strategies that speed up the auto-tuning process.
Kernel Tuner simplifies the development of efficient GPU programs, or kernels. It does so by making kernels written in C/C++, OpenCL, or CUDA accessible from Python, while taking care of the required synchronization between data kept in host memory and data kept in device memory.
This has a number of advantages. First, it simplifies auto-tuning of the kernel parameters. In fact, Kernel Tuner comes standard with a variety of strategies for efficiently searching the parameter space, leading to greatly improved performance of tuned kernels. Second, it allows for unit testing of GPU code from within Python.
Kernel Tuner adds no dependencies to the kernel code and does not require extensive code changes. Kernels tuned by Kernel Tuner also need no changes after tuning to be production ready: tuned kernels can be used as-is from any host programming language.
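The idea of auto-tuning and testing a kernel from Python can be illustrated with a minimal sketch. It does not use Kernel Tuner's API or a GPU: a pure-Python function stands in for the kernel, and the names `blocked_vector_add`, `tune`, `block_size`, and `variant` are hypothetical, chosen only for illustration. The sketch tries every combination of the tunable parameters, checks each result against a reference answer, and keeps the fastest configuration.

```python
import itertools
import time


def blocked_vector_add(a, b, block_size, variant):
    """Hypothetical stand-in for a GPU kernel: adds two lists block by block."""
    n = len(a)
    c = [0.0] * n
    for start in range(0, n, block_size):
        end = min(start + block_size, n)
        if variant == "zip":
            c[start:end] = [x + y for x, y in zip(a[start:end], b[start:end])]
        else:  # "index" variant: explicit element-wise loop
            for i in range(start, end):
                c[i] = a[i] + b[i]
    return c


def tune(kernel, args, tune_params, reference, repeats=3):
    """Brute-force search: time every parameter combination, verify each result."""
    names = list(tune_params)
    results = []
    for values in itertools.product(*(tune_params[name] for name in names)):
        config = dict(zip(names, values))
        # Correctness check first: a fast but wrong configuration is useless.
        assert kernel(*args, **config) == reference, f"wrong output for {config}"
        timings = []
        for _ in range(repeats):
            t0 = time.perf_counter()
            kernel(*args, **config)
            timings.append(time.perf_counter() - t0)
        results.append({**config, "time": min(timings)})
    best = min(results, key=lambda r: r["time"])
    return results, best


# Assumed example inputs and search space (not taken from the source).
a = [float(i) for i in range(10_000)]
b = [float(2 * i) for i in range(10_000)]
reference = [x + y for x, y in zip(a, b)]
tune_params = {"block_size": [32, 64, 128, 256], "variant": ["zip", "index"]}

results, best = tune(blocked_vector_add, (a, b), tune_params, reference)
print("best configuration:", best)
```

This sketch searches exhaustively, which only scales to small parameter spaces. The search strategies mentioned above exist to avoid evaluating every combination when the space is large.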
With Kernel Tuner, we were able to accelerate our CUDA kernels by a factor of 10 in just a few weeks.
A Computational Answer to the Soaring MRI demand
Centre of Excellence in Simulation of Weather and Climate in Europe
Reducing Energy Consumption in Radio-astronomical and Ultrasound Imaging Tools
Self-learning machines hunt for explosions in the universe and speed up innovations in industry and...
Consolidating and Future-proofing Kernel Tuner by developing Software Engineering Best Practices
GPU implementation of full vectorial Point Spread Function (PSF) fitting
Real Time National Policy Adjustment and Evaluation on the Basis of a Computational Model for COVID19
Verified construction of correct and optimised parallel software
For future exascale climate and weather predictions
Boosting the performance of current and future programs
Distributed radio astronomical computing
Accelerating astronomical applications 2
Studying subcellular structures and functions
The country below sea level
Observing processes that are inaccessible to optical telescopes
Programming tools that simplify application development and deployment
A framework and predictor based on support vector machine and random walk graph kernel for scoring protein-protein interfaces.
Lightning: Fast data processing using GPUs on distributed platforms
PowerSensor is a low-cost, custom-built device that measures the instantaneous power consumption of GPUs and other devices at a high time resolution.