whisper.cpp/ggml
Johannes Gäßler c426829771
CUDA: fix crash with partial offloading of MoE (llama/13439)
2025-05-13 13:05:33 +03:00
cmake | ggml : sync/merge cmake,riscv,powerpc, add common.cmake (ggml/0) | 2025-03-27 11:06:03 +02:00
include | Add `--no-op-offload` to improve `-ot` pp perf in MoE models like llama4 400B (llama/13386) | 2025-05-13 13:05:33 +03:00
src | CUDA: fix crash with partial offloading of MoE (llama/13439) | 2025-05-13 13:05:33 +03:00
.gitignore
CMakeLists.txt | whisper: remove MSVC warnings pragmas (#3090) | 2025-05-05 13:09:35 +02:00