whisper.cpp/ggml
Gaurav Garg 1a1900f90c Remove padding and multiple D2D copies for MTP (llama/24086)
* Make ggml_gated_delta_net take only the initial recurrent state (D, 1, n_seqs) and pass the snapshot count K as an op parameter instead of inferring it from state->ne[1].

Remove the padding hack and copy all emitted snapshots into the recurrent cache with a single strided ggml_cpy.

* Make GDN changes in all backends. Address review comments.

* Fix CI build errors
2026-06-15 10:33:53 +03:00
..
cmake ggml : Parallelize quant LUT init (llama/23595) 2026-05-25 12:26:07 +03:00
include Remove padding and multiple D2D copies for MTP (llama/24086) 2026-06-15 10:33:53 +03:00
src Remove padding and multiple D2D copies for MTP (llama/24086) 2026-06-15 10:33:53 +03:00
.gitignore
CMakeLists.txt ggml : bump version to 0.14.0 (ggml/1533) 2026-06-08 14:36:36 +03:00