whisper.cpp/ggml
Aman Gupta 7d60b431a5 CUDA: add expert reduce kernel (llama/16857)
* CUDA: add expert reduce kernel

* contiguous checks, better formatting, use std::vector instead of array

* use vector empty instead of size

Co-authored-by: Johannes Gäßler <johannesg@5d6.de>

2025-11-09 23:38:03 +02:00
Name            Last commit                                                                    Date
..
cmake           ggml: Skip backend library linking code when GGML_BACKEND_DL=ON (llama/15094)  2025-08-18 20:30:45 +03:00
include         model: add support for qwen3vl series (llama/16780)                            2025-11-09 23:38:03 +02:00
src             CUDA: add expert reduce kernel (llama/16857)                                   2025-11-09 23:38:03 +02:00
.gitignore
CMakeLists.txt  Add experimental ggml-hexagon backend for the Hexagon NPU (llama/16547)        2025-11-09 23:38:03 +02:00