whisper.cpp

Jeff Bolz c66c71e9f4 vulkan: Use one row per workgroup for f32 mmv (llama/17711)	2025-12-12 17:53:20 +02:00

The MoE models have a mul_mat_vec with very small m (32, 64, 128) right before the topk_moe selection. Running multiple rows per wg doesn't utilize the SMs well. I think even for larger m, f32 is so bandwidth-limited that running multiple rows doesn't help.

cmake	ggml: Skip backend library linking code when GGML_BACKEND_DL=ON (llama/15094)	2025-08-18 20:30:45 +03:00
include	rpc : fix alloc size logic (llama/17116)	2025-12-12 17:53:18 +02:00
src	vulkan: Use one row per workgroup for f32 mmv (llama/17711)	2025-12-12 17:53:20 +02:00
.gitignore	whisper : reorganize source code + improve CMake (#2256)	2024-06-26 19:34:09 +03:00
CMakeLists.txt	build : move _WIN32_WINNT definition to headers (llama/17736)	2025-12-12 17:53:16 +02:00