whisper.cpp/ggml
Ruben Ortlam 2fcc0a3a9f
Vulkan: MMVQ Integer Dot K-Quant and MUL_MAT_ID support (llama/16900)
* vulkan: split mul_mmq_funcs for mul_mat_vecq use

* add mxfp4 mmvq

* add q2_k mmvq

* add q3_k mmvq

* add q4_k and q5_k mmvq

* add q6_k mmvq

* handle 4x4 quants per mmvq thread

* enable MUL_MAT_ID mmvq support

* enable subgroup optimizations for mul_mat_vec_id shaders

* device tuning

* request prealloc_y sync after quantization

* fix indentation

* fix llvmpipe test failures

* fix mul_mat_id mmvq condition

* fix unused variable warning
2025-12-12 17:53:12 +02:00
Name | Last commit | Date
cmake | ggml: Skip backend library linking code when GGML_BACKEND_DL=ON (llama/15094) | 2025-08-18 20:30:45 +03:00
include | rpc : cache and reuse compute graphs (llama/15405) | 2025-12-12 17:53:11 +02:00
src | Vulkan: MMVQ Integer Dot K-Quant and MUL_MAT_ID support (llama/16900) | 2025-12-12 17:53:12 +02:00
.gitignore | whisper : reorganize source code + improve CMake (#2256) | 2024-06-26 19:34:09 +03:00
CMakeLists.txt | ggml : add GGML_SCHED_NO_REALLOC option to disable reallocations in ggml_backend_sched (llama/17276) | 2025-12-12 17:53:12 +02:00