whisper.cpp

Zheyuan Chen 13133ab299 ggml-webgpu: makes the flash attn vec path subgroup-aware (llama/23040) * ggml-webgpu: makes the flash attn vec path compile and size its split/reduce work from the device’s reported subgroup range instead of assuming a subgroup size of 32. * ggml-webgpu: remove the extra max_wg_size >= max_subgroup_size guard. Remove the hardcoded 32 when determining the values of reduce_wg_size and vec_nwg_cap.	2026-05-25 12:26:07 +03:00
cmake	cmake : add FindNCCL.cmake (ggml/0)	2026-05-02 15:02:42 +03:00
include	CUDA: lower-case PCI bus id, standardize for ggml (llama/22820)	2026-05-14 21:26:48 +03:00
src	ggml-webgpu: makes the flash attn vec path subgroup-aware (llama/23040)	2026-05-25 12:26:07 +03:00
.gitignore	whisper : reorganize source code + improve CMake (#2256)	2024-06-26 19:34:09 +03:00
CMakeLists.txt	SYCL: fix multi-GPU system RAM exhaustion by using Level Zero allocations (llama/21597)	2026-05-25 12:26:07 +03:00