whisper.cpp

Yiwei Shao 52699f6d19 hexagon: support for IQ4_NL and MXFP4 (llama/21018)	2026-03-29 15:04:36 +03:00

* ggml-hexagon: add IQ4_NL and MXFP4 HMX matmul support
  - Add IQ4_NL quantization type support to Hexagon backend (buffer set/get tensor repack, mul_mat, mul_mat_id dispatch)
  - Implement HVX IQ4_NL vec_dot kernels (1x1, 2x1, 2x2) with LUT-based 4-bit index to int8 kvalue dequantization
  - Add MXFP4 HMX dequantization path with E8M0 scale conversion, including batch-4 fast path and single-tile fallback
  - Unify quantized row size / scale offset logic to handle Q4_0, Q8_0, IQ4_NL, and MXFP4 in the DMA fetch path
* ggml-hexagon: fix SKIP_QUANTIZE src1 address mismatch in mixed-quant models
* Fix the pragma indent
cmake	cmake : remove unused file (ggml/1419)	2026-02-08 09:29:10 +02:00
include	llama: fix llama-model-saver (llama/20503)	2026-03-29 15:04:36 +03:00
src	hexagon: support for IQ4_NL and MXFP4 (llama/21018)	2026-03-29 15:04:36 +03:00
.gitignore	…
CMakeLists.txt	ggml : bump version to 0.9.8 (ggml/1442)	2026-03-18 15:18:24 +02:00