KMM

KMM is a lightweight C++ middleware for accelerated computing.

2 contributors · 731 commits · last commit ≈ 8 months ago · 1 star · 1 fork

Description

KMM: Kernel Memory Manager

The Kernel Memory Manager (KMM) is a lightweight, high-performance framework designed for parallel dataflow execution and efficient memory management on multi-GPU platforms.

KMM automatically manages GPU memory, partitions workloads across multiple GPUs, and schedules tasks efficiently. Unlike frameworks that impose a specific programming model, KMM integrates existing GPU kernels and functions without requiring a full rewrite of your code.

Features

  • Efficient Memory Management: automatically allocates memory and transfers data between GPU and host only when necessary.
  • Scalable Computing: seamlessly spills data from GPU to host memory, enabling datasets that exceed GPU memory.
  • Optimized Scheduling: the DAG scheduler automatically tracks dependencies and executes kernels in a sequentially consistent order.
  • Flexible Work Partitioning: splits workloads and data according to user-defined distributions, making use of all available resources.
  • Portable Execution: supports existing CUDA, HIP, and CPU-based functions, which integrate with minimal changes.
  • Multi-Dimensional Arrays: handles ND-arrays of any shape, dimensionality, and data type.
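
To make the "Optimized Scheduling" item more concrete, here is a minimal sketch of how a scheduler could derive a dependency graph from the buffers each task reads and writes. It is plain C++ and does not use or reproduce KMM's internals: the `Task` struct, the integer buffer ids, and `build_dependencies` are all hypothetical names chosen for illustration.

```cpp
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical task description: the buffers a task reads and the buffers it writes.
struct Task {
    std::vector<std::size_t> reads;
    std::vector<std::size_t> writes;
};

// For every task (in submission order), collect the earlier tasks it must wait for:
//   - a read waits for the last writer of that buffer (read-after-write);
//   - a write waits for the last writer (write-after-write) and for every
//     reader since that write (write-after-read).
std::vector<std::set<std::size_t>> build_dependencies(
    const std::vector<Task>& tasks,
    std::size_t num_buffers
) {
    const std::size_t none = static_cast<std::size_t>(-1);
    std::vector<std::size_t> last_writer(num_buffers, none);
    std::vector<std::vector<std::size_t>> readers_since_write(num_buffers);
    std::vector<std::set<std::size_t>> deps(tasks.size());

    for (std::size_t t = 0; t < tasks.size(); ++t) {
        for (std::size_t b : tasks[t].reads) {
            if (last_writer[b] != none) deps[t].insert(last_writer[b]);
        }
        for (std::size_t b : tasks[t].writes) {
            if (last_writer[b] != none) deps[t].insert(last_writer[b]);
            for (std::size_t r : readers_since_write[b]) deps[t].insert(r);
        }

        // Record this task's accesses only after its own dependencies are known.
        for (std::size_t b : tasks[t].reads) readers_since_write[b].push_back(t);
        for (std::size_t b : tasks[t].writes) {
            last_writer[b] = t;
            readers_since_write[b].clear();
        }
    }
    return deps;
}
```

Tasks with no path between them in this graph may run concurrently. Because every edge points to an earlier task, any schedule that respects the edges produces the same results as running the tasks one by one in submission order.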

Example

A simple vector-add kernel:

#include "kmm/kmm.hpp"

__global__ void vector_add(
    kmm::Range<int64_t> range,
    kmm::GPUSubviewMut<float> output,
    kmm::GPUSubview<float> left,
    kmm::GPUSubview<float> right
) {
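    // Global element index: this thread's offset within the chunk plus the chunk's start.
    int64_t i = blockIdx.x * blockDim.x + threadIdx.x + range.begin;
    // Threads that fall past the end of the chunk do nothing.
    if (i >= range.end) return;

    output[i] = left[i] + right[i];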
}

int main() {
    // 2 billion items, split into 10 chunks of 200 million; 256 threads per block
    long n = 2'000'000'000;
    long chunk_size = n / 10;
    dim3 block_size = 256;

    // Initialize runtime
    auto rt = kmm::make_runtime();

    // Create arrays
    auto A = kmm::Array<float> {n};
    auto B = kmm::Array<float> {n};
    auto C = kmm::Array<float> {n};

    // Initialize input arrays (initialize_inputs is a user-supplied helper, not shown here)
    initialize_inputs(A, B);

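    // Launch the kernel once per chunk of the index space [0, n).
    // Note: `_x` and `write` are KMM names that must be in scope;
    // this snippet does not show how they are brought in.
    rt.parallel_submit(
        n, chunk_size,                           // total size and size of each chunk
        kmm::GPUKernel(vector_add, block_size),  // kernel and its thread-block size
        _x,                                      // index range of the chunk (`range`)
        write(C[_x]),                            // written by the kernel (`output`)
        A[_x],                                   // read-only input (`left`)
        B[_x]                                    // read-only input (`right`)
    );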

    // Wait for completion
    rt.synchronize();

    return 0;
}
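
The chunk arithmetic in the example above can be checked with a small host-only sketch. It is plain C++ rather than KMM code: `IndexRange`, `split_into_chunks`, `blocks_for_chunk`, and `global_index` are hypothetical helpers, and the assumption that one kernel launch covers one chunk with a ceiling-divided grid is an illustration, not a statement about how KMM sizes its launches.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Half-open index range [begin, end), mirroring the `begin` and `end`
// fields that the kernel reads from its range argument.
struct IndexRange {
    std::int64_t begin;
    std::int64_t end;
};

// Split [0, n) into consecutive chunks of at most `chunk_size` items.
// The last chunk is shorter when `chunk_size` does not divide `n`.
std::vector<IndexRange> split_into_chunks(std::int64_t n, std::int64_t chunk_size) {
    assert(n >= 0 && chunk_size > 0);
    std::vector<IndexRange> chunks;
    for (std::int64_t begin = 0; begin < n; begin += chunk_size) {
        std::int64_t end = (begin + chunk_size < n) ? begin + chunk_size : n;
        chunks.push_back({begin, end});
    }
    return chunks;
}

// Number of thread blocks needed to cover one chunk (ceiling division).
std::int64_t blocks_for_chunk(IndexRange chunk, std::int64_t threads_per_block) {
    assert(threads_per_block > 0);
    std::int64_t size = chunk.end - chunk.begin;
    return (size + threads_per_block - 1) / threads_per_block;
}

// Global element index for a given block and thread, using the same
// formula as the `vector_add` kernel.
std::int64_t global_index(
    IndexRange chunk,
    std::int64_t block,
    std::int64_t threads_per_block,
    std::int64_t thread
) {
    return block * threads_per_block + thread + chunk.begin;
}
```

With n = 2,000,000,000 and a chunk size of n / 10, `split_into_chunks` yields ten ranges of 200,000,000 elements, and under the stated assumption each chunk needs 781,250 blocks of 256 threads. When the chunk size does not divide n, the final range is shorter, which is why the kernel guards against `i >= range.end`.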

License

KMM is made available under the terms of the Apache License, version 2.0; see the file LICENSE for details.

Programming languages
  • C++ 95%
  • Cuda 4%
  • CMake 1%

Participating organisations

Netherlands eScience Center

Contributors

Alessio Sclocco
Research Software Engineer
Netherlands eScience Center
Stijn Heldens
Research Software Engineer
Netherlands eScience Center

Related projects

SUNBEAM

Scalable UNified Beam-tracing for Earth–Atmosphere Models

Updated 2 weeks ago
In progress

COMPAS

A Computational Answer to the Soaring MRI demand

Updated 9 months ago
Finished