whisper.cpp

Jeff Bolz 46e9e5b9a7 vulkan: optimizations for direct convolution (llama/14933)		2025-08-18 20:30:45 +03:00

* vulkan: optimizations for direct convolution
  - Empirically choose a better tile size. Reducing BS_K/BS_NPQ helps fill the GPU. The new size should be amenable to using coopmat, too.
  - Fix shmem bank conflicts. 16B padding should work with coopmat.
  - Some explicit loop unrolling.
  - Skip math/stores work for parts of the tile that are OOB.
  - Apply fastdiv opt.
  - Disable shuffles for NV.
* Three tiles sizes for CONV_2D, and a heuristic to choose
* reallow collectives for pre-Turing
* make SHMEM_PAD a spec constant
* fixes for intel perf - no shmem padding, placeholder shader core count
* shader variants with/without unrolling
* 0cc4m's fixes for AMD perf

Co-authored-by: 0cc4m <picard12@live.de>

---------

Co-authored-by: 0cc4m <picard12@live.de>
..
cmake	cmake : Fix BLAS link interface (ggml/1316)	2025-08-18 20:30:45 +03:00
include	ggml : remove old kompute, cann (skip) (#3349)	2025-07-30 16:08:57 +03:00
src	vulkan: optimizations for direct convolution (llama/14933)	2025-08-18 20:30:45 +03:00
.gitignore	whisper : reorganize source code + improve CMake (#2256)	2024-06-26 19:34:09 +03:00
CMakeLists.txt	HIP: add GGML_HIP_MMQ_MFMA option to allow disableing the MFMA path. (llama/14930)	2025-08-18 20:30:45 +03:00