whisper.cpp

Chenguang Li 9b773acac0 CANN: implement LRU cache for ACL graphs (llama/15814)	2025-09-20 13:42:53 +03:00

* CANN: implement LRU cache for ACL graphs in CANN backend

- Introduce ggml_cann_graph_lru_cache to store multiple ggml_cann_graph objects.
- Graphs are loaded on demand and evicted using LRU policy when capacity is exceeded.
- Updated push, move_to_front, and clear methods to manage cached graphs efficiently.
- Ensures reuse of graphs, reducing graph reconstruction overhead in CANN backend.

* fix typo

* The LRU cache capacity can be configured via an env variable

Signed-off-by: noemotiovon <757486878@qq.com>

* refactory acl graph

* refactory && fix review comments

Signed-off-by: noemotiovon <757486878@qq.com>

---------

Signed-off-by: noemotiovon <757486878@qq.com>

cmake	ggml: Skip backend library linking code when GGML_BACKEND_DL=ON (llama/15094)	2025-08-18 20:30:45 +03:00
include	cuda : fix supports_op condition for get_rows when number of blocks is too large (llama/15868)	2025-09-20 13:42:52 +03:00
src	CANN: implement LRU cache for ACL graphs (llama/15814)	2025-09-20 13:42:53 +03:00
.gitignore	whisper : reorganize source code + improve CMake (#2256)	2024-06-26 19:34:09 +03:00
CMakeLists.txt	ggml-cpu: drop support for nnpa intrinsics (llama/15821)	2025-09-20 13:42:50 +03:00