whisper.cpp/ggml
Doctor Shotgun b9965c89a1 ggml: add env var GGML_OP_OFFLOAD_MIN_BATCH (llama/18535)
* ggml: add env var GGML_OP_OFFLOAD_MIN_BATCH
* make the min_batch_size threshold that triggers op offload configurable via an env var; the default remains the previously hardcoded value of 32

* ggml: read GGML_OP_OFFLOAD_MIN_BATCH once and store to dev ctx

* cann: forward declaration of device context struct

* cann: move offload op check after device context declaration

* cuda: fix whitespace

---------

Co-authored-by: Aman Gupta <amangupta052@gmail.com>
2026-01-14 09:11:59 +02:00
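
The commit message above describes reading GGML_OP_OFFLOAD_MIN_BATCH once, storing the value in the device context, and falling back to 32 when the variable is unset. A minimal sketch of that pattern follows. It is not the code from the commit: the names parse_min_batch, device_context_sketch, make_device_context and should_offload are hypothetical, and both the fallback on malformed input and the `>=` comparison are assumptions.

```cpp
#include <climits>
#include <cstdlib>

// Hypothetical helper: parse a non-negative integer from `val`,
// returning `fallback` when the string is missing, empty, or malformed.
static int parse_min_batch(const char * val, int fallback) {
    if (val == nullptr || *val == '\0') {
        return fallback;
    }
    char * end = nullptr;
    const long parsed = std::strtol(val, &end, 10);
    if (end == val || *end != '\0' || parsed < 0 || parsed > INT_MAX) {
        return fallback; // assumption: invalid input keeps the default
    }
    return static_cast<int>(parsed);
}

// Hypothetical stand-in for a backend device context.
struct device_context_sketch {
    int op_offload_min_batch_size;
};

// Read the env var once, at context creation, so later offload checks
// do not call getenv on every op.
static device_context_sketch make_device_context() {
    device_context_sketch ctx;
    ctx.op_offload_min_batch_size =
        parse_min_batch(std::getenv("GGML_OP_OFFLOAD_MIN_BATCH"), 32);
    return ctx;
}

// Assumed offload rule: offload when the op's batch size reaches the threshold.
static bool should_offload(const device_context_sketch & ctx, int batch_size) {
    return batch_size >= ctx.op_offload_min_batch_size;
}
```

Under these assumptions, launching a process with GGML_OP_OFFLOAD_MIN_BATCH=8 in its environment would lower the threshold from 32 to 8 for every context created by that process.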
cmake ggml: Skip backend library linking code when GGML_BACKEND_DL=ON (llama/15094) 2025-08-18 20:30:45 +03:00
include vulkan: extend topk_moe to handle sigmoid w/exp_probs_b for nemotron (llama/18295) 2026-01-14 09:11:59 +02:00
src ggml: add env var GGML_OP_OFFLOAD_MIN_BATCH (llama/18535) 2026-01-14 09:11:59 +02:00
.gitignore whisper : reorganize source code + improve CMake (#2256) 2024-06-26 19:34:09 +03:00
CMakeLists.txt ggml : bump version to 0.9.5 (ggml/1410) 2025-12-31 18:27:20 +02:00