whisper.cpp

History

Jeff Bolz fbf720dc9f vulkan: Use cm2 decode_vector for mul_mat_id B matrix loads (llama/23991)	2026-06-15 10:33:53 +03:00

This allows vec4 loads of the B elements. Also increase BK to 64 when this is enabled. Neither of these alone is consistently faster, but together these give a nice speedup. In ggml-vulkan.cpp, we need to make sure the B matrix alignment and stride are multiples of 4.
cmake	ggml : Parallelize quant LUT init (llama/23595)	2026-05-25 12:26:07 +03:00
include	TP: quantized KV cache support (llama/23792)	2026-06-08 14:36:36 +03:00
src	vulkan: Use cm2 decode_vector for mul_mat_id B matrix loads (llama/23991)	2026-06-15 10:33:53 +03:00
.gitignore	whisper : reorganize source code + improve CMake (#2256)	2024-06-26 19:34:09 +03:00
CMakeLists.txt	ggml : bump version to 0.14.0 (ggml/1533)	2026-06-08 14:36:36 +03:00