whisper.cpp/ggml/src
Max Krasnyansky 53b571a47e hexagon refactor all Ops to use local context struct (llama/19819)
* hexagon: refactor set/get/sum-rows ops to use local context

* hexagon: refactor ROPE and Softmax Ops to use local context

Improves performance a bit by precomputing things and saving them in the context.

* hexagon: refactor activation ops to use local context struct

* hexagon: refactor unary ops to use local context struct and DMA/VTCM

* hexagon: use aligned hvx_scale function

* hexagon: remove unused fields from op_context

* hexagon: rewrite ROPE to use DMA and VTCM scratchpad

* hex-rope: keep N rows in scratchpad (instead of just two)

* hex-rope: introduce rowidx cache

* hex-rope: remove unused fields

* hex-rope: rewrite dma prefetch logic to allow for multi-row fetch/compute

This also removes the need for fastdiv.

* hex-rope: minor formatting

* hex-rope: use indices and unroll the loops

* hex-rope: more updates to cleanup rope-block handling

* hexagon: cleanup supported type/dims checks

* hexagon: all reduce funcs replicated across lanes

There is no need to explicitly replicate the first value.

* snapdragon: update adb and windows scripts to use ubatch-size 256

The updated Ops support handles larger ubatches.
2026-02-27 20:57:58 +02:00
ggml-blas
ggml-cann
ggml-cpu ggml-cpu: arm64: q5_K repack gemm and gemv (and generic) implementations (dotprod) (llama/19356) 2026-02-27 20:57:58 +02:00
ggml-cuda Improve CUDA graph capture (llama/19754) 2026-02-27 20:57:58 +02:00
ggml-hexagon hexagon refactor all Ops to use local context struct (llama/19819) 2026-02-27 20:57:58 +02:00
ggml-hip
ggml-metal
ggml-musa
ggml-opencl opencl: refactor expm1 and softplus (llama/19404) 2026-02-27 20:57:58 +02:00
ggml-rpc
ggml-sycl
ggml-virtgpu
ggml-vulkan vulkan: fix MMQ shader push constants and multi-dispatch (llama/19732) 2026-02-27 20:57:58 +02:00
ggml-webgpu ggml-webgpu: Add unary op (SQR, SQRT, SIN, COS) support. (llama/19700) 2026-02-27 20:57:58 +02:00
ggml-zdnn
ggml-zendnn
CMakeLists.txt
ggml-alloc.c
ggml-backend-dl.cpp
ggml-backend-dl.h
ggml-backend-impl.h
ggml-backend-reg.cpp
ggml-backend.cpp
ggml-common.h
ggml-impl.h
ggml-opt.cpp
ggml-quants.c
ggml-quants.h
ggml-threading.cpp
ggml-threading.h
ggml.c
ggml.cpp
gguf.cpp