whisper.cpp

Aman Gupta 1642a4fb60 ggml-cpu: Use tiled FA for prompt-processing (llama/19012)		2026-01-30 15:56:40 +02:00

* ggml-cpu: Use tiled FA for prompt-processing

  FA performance on CPU is poor at long contexts because it essentially uses a vector kernel. This PR adds a tiled FA for PP. Tile sizes were tuned on an AMD EPYC single-socket 64-core machine.
* fix out of bounds for mask
* skip rows where there are all masks
* skip tile if mask is inf
* store mask in worksize
* check inf tile earlier
cmake	ggml: Skip backend library linking code when GGML_BACKEND_DL=ON (llama/15094)	2025-08-18 20:30:45 +03:00
include	ggml : add ggml_build_forward_select (llama/18550)	2026-01-30 15:56:40 +02:00
src	ggml-cpu: Use tiled FA for prompt-processing (llama/19012)	2026-01-30 15:56:40 +02:00
.gitignore	whisper : reorganize source code + improve CMake (#2256)	2024-06-26 19:34:09 +03:00
CMakeLists.txt	ggml : bump version to 0.9.5 (ggml/1410)	2025-12-31 18:27:20 +02:00