whisper.cpp

Jeff Bolz	b1f65a4a7e	2026-01-14 09:11:59 +02:00
vulkan: extend topk_moe to handle sigmoid w/exp_probs_b for nemotron (llama/18295)

* vulkan: extend topk_moe to handle sigmoid w/exp_probs_b for nemotron
  Also handle GGML_OP_SCALE at the end (nemotron, deepseek2).
  Fewer pipeline variants and spec constants, just use push constants.
  In test_topk_moe, change exp_probs_b to be 1D, matching real networks.
  Update test-backend-ops and ggml-backend to allow verifying multiple outputs in a fusion test (topk_moe has two outputs). Previously only the final node was verified.
* change test_topk_moe to allow results in arbitrary order
* disable sigmoid fusion for moltenvk
ggml-alloc.h	llama: automatically set parameters not set by the user in such a way that maximizes GPU utilization (llama/16653)	2025-12-18 08:20:56 +02:00
ggml-backend.h	vulkan: extend topk_moe to handle sigmoid w/exp_probs_b for nemotron (llama/18295)	2026-01-14 09:11:59 +02:00
ggml-blas.h	…
ggml-cann.h	…
ggml-cpp.h	…
ggml-cpu.h	ggml-cpu : fix RISC-V Q4_0 repack select and RVV feature reporting (llama/17951)	2025-12-18 08:20:56 +02:00
ggml-cuda.h	…
ggml-hexagon.h	Add experimental ggml-hexagon backend for the Hexagon NPU (llama/16547)	2025-11-09 23:38:03 +02:00
ggml-metal.h	…
ggml-opencl.h	…
ggml-opt.h	…
ggml-rpc.h	rpc : fix alloc size logic (llama/17116)	2025-12-12 17:53:18 +02:00
ggml-sycl.h	…
ggml-vulkan.h	…
ggml-webgpu.h	…
ggml-zdnn.h	zdnn: refactor codebase + add docs (llama/16178)	2025-09-29 15:18:09 +03:00
ggml-zendnn.h	ggml-zendnn : add ZenDNN backend for AMD CPUs (llama/17690)	2025-12-12 17:53:21 +02:00
ggml.h	llama: automatically set parameters not set by the user in such a way that maximizes GPU utilization (llama/16653)	2025-12-18 08:20:56 +02:00
gguf.h	…