whisper.cpp

Jeff Bolz 0484147ab2 vulkan: fix top_k bug when there are ties in the input (llama/17659)		2025-12-12 17:53:19 +02:00

* vulkan: Reduce temporary memory usage for TOP_K

- Compute row size for the temp buffer based on the output of the first pass.
- Update shader addressing math to use the output row size
- Pass the output row size as "ncols_output", what used to be "ncols_output" is now "k"

For the common case of K=40 and src0=(200000,1,1,1), this reduces the temporary buffer from about 3.2MB to 500KB.

* vulkan: fix top_k bug when there are ties in the input

I noticed by inspection a bug in the vulkan top_k shader where if the least value in the top_k appears multiple times we could end up writing those extra copies out rather than some larger values (if the larger values are on higher numbered threads).

I rewrote the test verification to handle this case, where the final index set is not necessarily the same.

* Update tests/test-backend-ops.cpp

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
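
The tie case described in the commit message above means a correct top_k result can name different indices than a reference run, as long as the selected values match. The following is a minimal sketch of a tie-tolerant check under that assumption. The helper name is_valid_top_k and its signature are hypothetical and are not taken from tests/test-backend-ops.cpp; it compares the selected values as a sorted list instead of comparing index sets.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical helper for illustration only (not the actual test code).
// Returns true if `idx` is a valid top-k selection over `row`, tolerating ties.
bool is_valid_top_k(const std::vector<float> & row, const std::vector<int> & idx, std::size_t k) {
    if (idx.size() != k || k > row.size()) {
        return false;
    }

    // Every index must be in range and used at most once.
    std::vector<bool>  seen(row.size(), false);
    std::vector<float> got;
    got.reserve(k);
    for (int i : idx) {
        if (i < 0 || static_cast<std::size_t>(i) >= row.size() || seen[i]) {
            return false;
        }
        seen[i] = true;
        got.push_back(row[i]);
    }

    // Reference: the k largest values of the row, in descending order.
    std::vector<float> expected(row);
    std::partial_sort(expected.begin(), expected.begin() + k, expected.end(), std::greater<float>());
    expected.resize(k);

    // Compare the selected values, not the indices, so a tie may resolve to any index.
    std::sort(got.begin(), got.end(), std::greater<float>());
    return got == expected;
}
```

With a row of {5, 1, 3, 3, 3, 9} and k = 3, this check accepts any selection that contains the 9, the 5, and one of the three 3s, and rejects a selection that contains two 3s in place of the 5, which is the failure mode the commit message describes.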
cmake	ggml: Skip backend library linking code when GGML_BACKEND_DL=ON (llama/15094)	2025-08-18 20:30:45 +03:00
include	rpc : fix alloc size logic (llama/17116)	2025-12-12 17:53:18 +02:00
src	vulkan: fix top_k bug when there are ties in the input (llama/17659)	2025-12-12 17:53:19 +02:00
.gitignore	whisper : reorganize source code + improve CMake (#2256)	2024-06-26 19:34:09 +03:00
CMakeLists.txt	build : move _WIN32_WINNT definition to headers (llama/17736)	2025-12-12 17:53:16 +02:00