whisper.cpp/ggml
hipudding 314ce5981e CANN: Add support for async operator submission (llama/12864)
Submit operators using asynchronous threads to improve performance.

Use the environment variable GGML_CANN_ASYNC_MODE to control whether
asynchronous submission is enabled. It is disabled by default.
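
As a rough illustration of the idea (not the code from this commit), the sketch below gates a single worker thread on the environment variable and falls back to inline execution when it is unset. The names `async_mode_enabled` and `op_submitter` are hypothetical, and the set of values treated as "enabled" is an assumption; the actual CANN backend may parse the variable differently and uses its own task queue.

```cpp
#include <condition_variable>
#include <cstdlib>
#include <functional>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>

// Hypothetical helper. ASSUMPTION: "1", "on", "true" and "yes" enable the
// mode; the values accepted by the real backend are not stated in the source.
inline bool async_mode_enabled() {
    const char * v = std::getenv("GGML_CANN_ASYNC_MODE");
    if (v == nullptr) {
        return false;  // disabled by default, as the commit message states
    }
    const std::string s(v);
    return s == "1" || s == "on" || s == "true" || s == "yes";
}

// Hypothetical submitter: runs operators inline in sync mode, or hands them
// to one worker thread in async mode so the caller can keep preparing work.
class op_submitter {
public:
    explicit op_submitter(bool async) : async_(async) {
        if (async_) {
            worker_ = std::thread([this] { run(); });
        }
    }

    ~op_submitter() {
        if (async_) {
            {
                std::lock_guard<std::mutex> lock(mu_);
                stop_ = true;
            }
            cv_.notify_all();
            worker_.join();  // the worker drains the queue before exiting
        }
    }

    void submit(std::function<void()> op) {
        if (!async_) {
            op();  // synchronous path: execute immediately on the caller
            return;
        }
        {
            std::lock_guard<std::mutex> lock(mu_);
            queue_.push(std::move(op));
        }
        cv_.notify_all();
    }

    // Block until every submitted operator has finished.
    void wait() {
        if (!async_) {
            return;
        }
        std::unique_lock<std::mutex> lock(mu_);
        cv_.wait(lock, [this] { return queue_.empty() && !busy_; });
    }

private:
    void run() {
        std::unique_lock<std::mutex> lock(mu_);
        for (;;) {
            cv_.wait(lock, [this] { return stop_ || !queue_.empty(); });
            if (queue_.empty()) {
                return;  // stop requested and nothing left to run
            }
            std::function<void()> op = std::move(queue_.front());
            queue_.pop();
            busy_ = true;
            lock.unlock();
            op();  // run the operator without holding the lock
            lock.lock();
            busy_ = false;
            cv_.notify_all();  // wake any thread blocked in wait()
        }
    }

    bool async_;
    bool stop_ = false;
    bool busy_ = false;
    std::mutex mu_;
    std::condition_variable cv_;
    std::queue<std::function<void()>> queue_;
    std::thread worker_;
};
```

A caller would construct `op_submitter submitter(async_mode_enabled());`, call `submit` once per operator, and call `wait` before reading results. Keeping a single worker preserves submission order, which matters when later operators depend on earlier ones.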

Testing shows a 10%–20% performance improvement in scenarios with
small parameter sizes, especially in quantized models.
2025-04-24 20:39:16 +03:00
cmake ggml : sync/merge cmake,riscv,powerpc, add common.cmake (ggml/0) 2025-03-27 11:06:03 +02:00
include ggml : Depthwise 2D convolution (ggml/1152) 2025-04-24 20:39:16 +03:00
src CANN: Add support for async operator submission (llama/12864) 2025-04-24 20:39:16 +03:00
.gitignore whisper : reorganize source code + improve CMake (#2256) 2024-06-26 19:34:09 +03:00
CMakeLists.txt CUDA/HIP: Share the same unified memory allocation logic. (llama/12934) 2025-04-24 20:39:16 +03:00