whisper.cpp/ggml
Georgi Gerganov 1c21a850be
metal : optimize FA vec for large sequences and BS <= 8 (llama/15566)
* metal : optimize FA vec for large heads and sequences

* metal : adjust small-batch mul mv kernels

ggml-ci

* batched-bench : fix total speed computation

ggml-ci

* cont : add comments

ggml-ci
2025-09-20 13:42:42 +03:00
cmake ggml: Skip backend library linking code when GGML_BACKEND_DL=ON (llama/15094) 2025-08-18 20:30:45 +03:00
include ggml: add `conv3d` op (llama/15182) 2025-09-20 13:42:39 +03:00
src metal : optimize FA vec for large sequences and BS <= 8 (llama/15566) 2025-09-20 13:42:42 +03:00
.gitignore whisper : reorganize source code + improve CMake (#2256) 2024-06-26 19:34:09 +03:00
CMakeLists.txt CUDA: replace GGML_CUDA_F16 with CUDA arch checks (llama/15433) 2025-09-20 13:42:38 +03:00