whisper.cpp/ggml
Johannes Gäßler a65976fc3c CUDA: fix quantized KV cache + multiple sequences (llama/14822)
* CUDA: fix quantized KV cache + multiple sequences

* Update ggml/src/ggml-cuda/fattn-common.cuh

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2025-07-28 13:02:32 +03:00
cmake           cmake : fix usage issues (ggml/1257)                             2025-07-28 13:02:32 +03:00
include         ggml: Add initial WebGPU backend (llama/14521)                   2025-07-20 00:23:50 +03:00
src             CUDA: fix quantized KV cache + multiple sequences (llama/14822)  2025-07-28 13:02:32 +03:00
.gitignore      whisper : reorganize source code + improve CMake (#2256)         2024-06-26 19:34:09 +03:00
CMakeLists.txt  ggml: Add initial WebGPU backend (llama/14521)                   2025-07-20 00:23:50 +03:00