whisper.cpp

Gaurav Garg 1a1900f90c Remove padding and multiple D2D copies for MTP (llama/24086)	2026-06-15 10:33:53 +03:00
* Make ggml_gated_delta_net take only the initial recurrent state (D, 1, n_seqs) and pass the snapshot count K as an op parameter instead of inferring it from state->ne[1]. Remove the padding hack and copy all emitted snapshots into the recurrent cache with a single strided ggml_cpy.
* Make GDN changes in all backends. Address review comments.
* Fix CI build errors.
ggml-alloc.h	TP: fix entirely zero-sized slices per device (llama/23525)	2026-05-25 12:26:07 +03:00
ggml-backend.h	TP: quantized KV cache support (llama/23792)	2026-06-08 14:36:36 +03:00
ggml-blas.h	…
ggml-cann.h	docs : Minor cleanups (llama/19252)	2026-02-08 09:29:10 +02:00
ggml-cpp.h	ggml : fix ggml_gallocr_ptr type (ggml/1205)	2025-05-01 13:29:02 +03:00
ggml-cpu.h	ggml-cpu: FA split across kv for faster TG (llama/19209)	2026-02-08 09:29:10 +02:00
ggml-cuda.h	ggml: backend-agnostic tensor parallelism (experimental) (llama/19378)	2026-04-30 11:29:05 +03:00
ggml-hexagon.h	Add experimental ggml-hexagon backend for the Hexagon NPU (llama/16547)	2025-11-09 23:38:03 +02:00
ggml-metal.h	metal : refactor + optimize v2 (llama/15995)	2025-09-20 13:46:10 +03:00
ggml-opencl.h	Introducing experimental OpenCL backend with support for Qualcomm Adreno GPUs (llama/10693)	2024-12-18 12:52:16 +02:00
ggml-openvino.h	ggml : add OpenVINO backend (llama/15307)	2026-03-16 13:10:15 +02:00
ggml-opt.h	chore : correct typos [no ci] (llama/20041)	2026-03-16 13:10:15 +02:00
ggml-rpc.h	ggml : add GGML_OP_COL2IM_1D (llama/24206)	2026-06-15 10:33:53 +03:00
ggml-sycl.h	…
ggml-virtgpu.h	ggml-virtgpu: make the code thread safe (llama/19204)	2026-02-08 09:29:10 +02:00
ggml-vulkan.h	vulkan: Make Vulkan optional at runtime (ggml/11493). (llama/11494)	2025-02-27 08:55:36 +02:00
ggml-webgpu.h	ggml: Add initial WebGPU backend (llama/14521)	2025-07-20 00:23:50 +03:00
ggml-zdnn.h	zdnn: refactor codebase + add docs (llama/16178)	2025-09-29 15:18:09 +03:00
ggml-zendnn.h	ggml-zendnn : add ZenDNN backend for AMD CPUs (llama/17690)	2025-12-12 17:53:21 +02:00
ggml.h	Remove padding and multiple D2D copies for MTP (llama/24086)	2026-06-15 10:33:53 +03:00
gguf.h	ggml: `gguf_init_from_callback` and `gguf_init_from_buffer` (llama/22341)	2026-05-25 12:44:04 +03:00