whisper.cpp

Jeff Bolz 7db8f278f0 vulkan: enable coopmat2 FA gqa and split_k optimizations more often (llama/12931) The grouped query attention optimization doesn't require a power of two ratio; the only thing relying on it was the modulo operation written as bitwise &. split_k need not depend on gqa_ratio - enable it any time there's only one workgroup in the X dimension. The shader gets the split index from the x coord, and multiple workgroups in the X dimension (pre-split) indicate a larger FA operation that wouldn't need splitting.		2025-04-24 20:39:16 +03:00
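
The commit message above notes that the only power-of-two dependency was a modulo written as bitwise &. The sketch below illustrates that point in isolation. It is not the shader code from the commit, and the helper names (`wrap_mask`, `wrap_mod`, `agree`) are hypothetical, introduced only for this example.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: "modulo" via bitmask. Only correct when n is a power of two.
static uint32_t wrap_mask(uint32_t i, uint32_t n) { return i & (n - 1); }

// Hypothetical helper: true modulo. Correct for any n > 0.
static uint32_t wrap_mod(uint32_t i, uint32_t n) { return i % n; }

// Returns true if both forms give the same result for every i in [0, limit).
static bool agree(uint32_t n, uint32_t limit) {
    for (uint32_t i = 0; i < limit; ++i) {
        if (wrap_mask(i, n) != wrap_mod(i, n)) {
            return false;
        }
    }
    return true;
}
```

Under these assumptions, `agree(4, 64)` holds while `agree(3, 64)` and `agree(6, 64)` do not, which is why replacing the mask with a true modulo would lift the power-of-two restriction on the ratio.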
cmake	ggml : sync/merge cmake,riscv,powerpc, add common.cmake (ggml/0)	2025-03-27 11:06:03 +02:00
include	ggml : Depthwise 2D convolution (ggml/1152)	2025-04-24 20:39:16 +03:00
src	vulkan: enable coopmat2 FA gqa and split_k optimizations more often (llama/12931)	2025-04-24 20:39:16 +03:00
.gitignore	whisper : reorganize source code + improve CMake (#2256)	2024-06-26 19:34:09 +03:00
CMakeLists.txt	CUDA/HIP: Share the same unified memory allocation logic. (llama/12934)	2025-04-24 20:39:16 +03:00