whisper.cpp

Jeff Bolz 162bbe8220 vulkan: KHR_coopmat flash attention (llama/13506)	2025-05-19 14:58:39 +03:00
This shader uses coopmat1 to do the QK^T multiply. The PV multiply is more difficult for various reasons so I haven't done it. Performance for this shader is around 2.5x better than for the scalar shader when doing prompt processing. Some of the benefit may be from other optimizations like staging through shared memory, or splitting by rows.
cmake	ggml : sync/merge cmake,riscv,powerpc, add common.cmake (ggml/0)	2025-03-27 11:06:03 +02:00
include	mnist: fix segmentation fault (ggml/1227)	2025-05-19 14:58:39 +03:00
src	vulkan: KHR_coopmat flash attention (llama/13506)	2025-05-19 14:58:39 +03:00
.gitignore	whisper : reorganize source code + improve CMake (#2256)	2024-06-26 19:34:09 +03:00
CMakeLists.txt	whisper: remove MSVC warnings pragmas (#3090)	2025-05-05 13:09:35 +02:00