whisper.cpp/ggml
Max Krasnyansky 55f8cfdaed hexagon: Q4_0 and MXFP4 repack fixes (llama/20527)
* hexagon: fix tail corruption with row sizes that are not a multiple of 256

* hexagon: use different stride for repacking partial blocks

* hex-mm: update repack and kernels to avoid shuffles for full 256-element blocks

The previous commit changed the repacking to use even:odd (0:1,2:3,...) packing
instead of the original (0:128,1:129,...) packing in order to fix tail corruption.
Since the mm kernels already deal with partial tails, we can use even:odd
packing only for the last block.
This avoids the performance penalty of having to shuffle to zip the elements
in the common case.

* hex-mm: update rmpy x8 for better optimizations

* hex-mm: tighten supported MUL_MAT checks to avoid spurious failures

* hex-mm: use vzero to init accumulators

* hex-mm: properly call partial rmpy_x8
2026-03-16 13:10:15 +02:00
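
The two nibble-pairing orders named in the commit message above can be illustrated with a small scalar sketch. This is not the Hexagon backend code: the function names (`pack_split`, `pack_even_odd`, `repack_row`), the choice of low versus high nibble, and the flat one-quant-per-byte input are all assumptions made for illustration, and block scales are ignored. It only shows why a split (0:128,1:129,...) layout suits full 256-element blocks while an even:odd (0:1,2:3,...) layout keeps a partial last block contiguous.

```c
#include <stddef.h>
#include <stdint.h>

enum { BLOCK = 256 }; /* elements per full block, as in the commit message */

/* Split packing (0:128, 1:129, ...), hypothetical helper:
 * byte i holds element i in the low nibble and element i+128 in the high
 * nibble. It always consumes a full 256-element block. */
static void pack_split(const uint8_t *q, uint8_t *out) {
    for (size_t i = 0; i < BLOCK / 2; i++) {
        out[i] = (uint8_t)((q[i] & 0x0F) | ((q[i + BLOCK / 2] & 0x0F) << 4));
    }
}

/* Even:odd packing (0:1, 2:3, ...), hypothetical helper:
 * byte i holds element 2i in the low nibble and element 2i+1 in the high
 * nibble. Only ceil(n/2) bytes are written, so a short tail stays
 * contiguous. Returns the number of bytes written. */
static size_t pack_even_odd(const uint8_t *q, size_t n, uint8_t *out) {
    size_t nbytes = (n + 1) / 2;
    for (size_t i = 0; i < nbytes; i++) {
        uint8_t lo = q[2 * i] & 0x0F;
        /* Pad with zero when n is odd and there is no partner element. */
        uint8_t hi = (2 * i + 1 < n) ? (uint8_t)(q[2 * i + 1] & 0x0F) : 0;
        out[i] = (uint8_t)(lo | (hi << 4));
    }
    return nbytes;
}

/* Repack one row of n 4-bit quants: full blocks use split packing,
 * only the final partial block uses even:odd packing.
 * Returns the total number of bytes written. */
static size_t repack_row(const uint8_t *q, size_t n, uint8_t *out) {
    size_t written = 0;
    size_t full = n / BLOCK;
    for (size_t b = 0; b < full; b++) {
        pack_split(q + b * BLOCK, out + written);
        written += BLOCK / 2;
    }
    size_t tail = n % BLOCK;
    if (tail != 0) {
        written += pack_even_odd(q + full * BLOCK, tail, out + written);
    }
    return written;
}
```

With a 300-element row, for example, the first 256 elements occupy 128 bytes in the split layout and the remaining 44 occupy 22 bytes in the even:odd layout, so a consumer needs the shuffle-free path for full blocks and a separate path for the tail.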
..
cmake cmake : remove unused file (ggml/1419) 2026-02-08 09:29:10 +02:00
include ggml : add OpenVINO backend (llama/15307) 2026-03-16 13:10:15 +02:00
src hexagon: Q4_0 and MXFP4 repack fixes (llama/20527) 2026-03-16 13:10:15 +02:00
.gitignore
CMakeLists.txt ggml : add OpenVINO backend (llama/15307) 2026-03-16 13:10:15 +02:00