whisper.cpp/ggml
Daniele 13eeebb1b2 vulkan: subgroup size tuning (llama/12087)
* vulkan: subgroup size test

* Vulkan: Add device architecture enum and logic to recognize AMD generations

* vulkan: use new architecture logic to specify subgroup size

* Initial vulkan subgroup size tuning for RDNA3

* vulkan: commonize RDNA subgroup tuning

* vulkan: override subgroup size if required_subgroup_size = 0

* vulkan: disable warp 32 for RDNA3

* vulkan: fine tuned RDNA1 subgroup sizes

* vulkan: adjusted subgroup size map

* vulkan: fixed RDNA2 subgroup map

---------

Co-authored-by: 0cc4m <picard12@live.de>
2025-03-27 11:06:03 +02:00
..
cmake cmake: Comment out GGML_BIN_DIR for now (ggml/1139) 2025-03-27 11:06:03 +02:00
include ggml-cpu: Faster IQ1 mul_mat_vec on AVX2 using BMI2 instructions (llama/12154) 2025-03-08 15:13:01 +02:00
src vulkan: subgroup size tuning (llama/12087) 2025-03-27 11:06:03 +02:00
.gitignore whisper : reorganize source code + improve CMake (#2256) 2024-06-26 19:34:09 +03:00
CMakeLists.txt opencl: use OpenCL C standard supported by the device (llama/12221) 2025-03-27 11:06:03 +02:00