whisper.cpp/ggml/include
Gaurav Garg 1a1900f90c Remove padding and multiple D2D copies for MTP (llama/24086)
* Make ggml_gated_delta_net take only the initial recurrent state (D, 1, n_seqs) and pass the snapshot count K as an op parameter instead of inferring it from state->ne[1].

Remove the padding hack and copy all emitted snapshots into the recurrent cache with a single strided ggml_cpy.

* Make GDN changes in all backends. Address review comments.

* Fix CI build errors
2026-06-15 10:33:53 +03:00
ggml-alloc.h TP: fix entirely zero-sized slices per device (llama/23525) 2026-05-25 12:26:07 +03:00
ggml-backend.h TP: quantized KV cache support (llama/23792) 2026-06-08 14:36:36 +03:00
ggml-blas.h
ggml-cann.h docs : Minor cleanups (llama/19252) 2026-02-08 09:29:10 +02:00
ggml-cpp.h ggml : fix ggml_gallocr_ptr type (ggml/1205) 2025-05-01 13:29:02 +03:00
ggml-cpu.h ggml-cpu: FA split across kv for faster TG (llama/19209) 2026-02-08 09:29:10 +02:00
ggml-cuda.h ggml: backend-agnostic tensor parallelism (experimental) (llama/19378) 2026-04-30 11:29:05 +03:00
ggml-hexagon.h Add experimental ggml-hexagon backend for the Hexagon NPU (llama/16547) 2025-11-09 23:38:03 +02:00
ggml-metal.h metal : refactor + optimize v2 (llama/15995) 2025-09-20 13:46:10 +03:00
ggml-opencl.h
ggml-openvino.h ggml : add OpenVINO backend (llama/15307) 2026-03-16 13:10:15 +02:00
ggml-opt.h chore : correct typos [no ci] (llama/20041) 2026-03-16 13:10:15 +02:00
ggml-rpc.h ggml : add GGML_OP_COL2IM_1D (llama/24206) 2026-06-15 10:33:53 +03:00
ggml-sycl.h
ggml-virtgpu.h ggml-virtgpu: make the code thread safe (llama/19204) 2026-02-08 09:29:10 +02:00
ggml-vulkan.h
ggml-webgpu.h ggml: Add initial WebGPU backend (llama/14521) 2025-07-20 00:23:50 +03:00
ggml-zdnn.h zdnn: refactor codebase + add docs (llama/16178) 2025-09-29 15:18:09 +03:00
ggml-zendnn.h ggml-zendnn : add ZenDNN backend for AMD CPUs (llama/17690) 2025-12-12 17:53:21 +02:00
ggml.h Remove padding and multiple D2D copies for MTP (llama/24086) 2026-06-15 10:33:53 +03:00
gguf.h ggml: `gguf_init_from_callback` and `gguf_init_from_buffer` (llama/22341) 2026-05-25 12:44:04 +03:00