whisper.cpp/ggml
Atharva Dubey ef2a79d2b8 sycl: quantize and reorder the input to q8_1 when reorder is enabled (llama/13826)
* [WIP]: fuse q8 quantization and reorder

* wip2: fuse q8 quantization and reorder

* working q8 reorder commit

* restored common.hpp

* remove debug prints

* remove unnecessary headers and remove trailing whitespace

* Update ggml/src/ggml-sycl/ggml-sycl.cpp

---------

Co-authored-by: Alberto Cabrera Pérez <alberto.cabrera@intel.com>
2025-06-10 12:40:33 +03:00
cmake | cmake: Factor out CPU architecture detection (llama/13883) | 2025-06-01 15:14:44 +03:00
include | threading: support for GGML_SCHED_PRIO_LOW, update thread info on Windows to avoid throttling (llama/12995) | 2025-06-01 15:14:44 +03:00
src | sycl: quantize and reorder the input to q8_1 when reorder is enabled (llama/13826) | 2025-06-10 12:40:33 +03:00
.gitignore
CMakeLists.txt | vulkan: use timestamp queries for GGML_VULKAN_PERF (llama/13817) | 2025-06-01 15:14:44 +03:00