whisper.cpp

Oliver Simons ef85b26d9f CUDA: Fix ssm_scan_f32 data-races (llama/24360)	2026-06-15 10:33:53 +03:00

* Add missing syncthreads before reusing cub_temp_storage
  __syncthreads() is required before being allowed to reuse TempStorage smem: https://nvidia.github.io/cccl/unstable/cub/api/classcub_1_1BlockLoad.html#_CPPv4I0EN3cub9BlockLoad4LoadEv20RandomAccessIteratorRA14ItemsPerThread_1Ti
* Add one more missing __syncthreads
  Could also double-buffer, but the alternative is to simply ensure all threads have read smem* before writing to it again in the next loop iteration
* Remove unused smem from ssm_scan_f32
cmake	ggml : Parallelize quant LUT init (llama/23595)	2026-05-25 12:26:07 +03:00
include	ggml : add GGML_OP_COL2IM_1D (llama/24206)	2026-06-15 10:33:53 +03:00
src	CUDA: Fix ssm_scan_f32 data-races (llama/24360)	2026-06-15 10:33:53 +03:00
.gitignore	whisper : reorganize source code + improve CMake (#2256)	2024-06-26 19:34:09 +03:00
CMakeLists.txt	ggml : bump version to 0.14.0 (ggml/1533)	2026-06-08 14:36:36 +03:00