whisper.cpp


Alan Gray d1d847f184 Simplify and improve CUDA graphs through use of indirect copy pointers (llama/9017)	2025-04-24 20:39:16 +03:00

* CUDA: Simplify and improve CUDA graphs through use of indirect copy pointers

Previously there was complexity in the CUDA graphs implementation due to frequently changing parameters to copy kernels associated with K and V cache pointers. This patch simplifies by using indirection to avoid such parameters frequently changing, avoiding the need for frequent graph updates.

Fixes #12152

* Addressed comments
* fix HIP builds
* properly sync to stream
* removed ggml_cuda_cpy_fn_ptrs
* move stream sync before free
* guard to only use indirection with graphs
* style fixes
* check for errors

---------

Co-authored-by: slaren <slarengh@gmail.com>
cmake	ggml : sync/merge cmake,riscv,powerpc, add common.cmake (ggml/0)	2025-03-27 11:06:03 +02:00
include	metal : improve FA + improve MoE (llama/12612)	2025-03-28 21:47:42 +02:00
src	Simplify and improve CUDA graphs through use of indirect copy pointers (llama/9017)	2025-04-24 20:39:16 +03:00
.gitignore	whisper : reorganize source code + improve CMake (#2256)	2024-06-26 19:34:09 +03:00
CMakeLists.txt	ggml : sync/merge cmake,riscv,powerpc, add common.cmake (ggml/0)	2025-03-27 11:06:03 +02:00