whisper.cpp/ggml/src/ggml-webgpu
Reese Levine e9dbd0c18a ggml-webgpu: FlashAttention refactor + standardize quantization support (llama/23834)
* Start work on flash_attn refactor

* Refactor

* Split k/v quantization

* Refactor and abstract quantization logic for flash_attn and mul_mat

* Add quantization support to tile path

* formatting

* Move to functions, add a check
2026-06-08 14:36:36 +03:00
wgsl-shaders ggml-webgpu: FlashAttention refactor + standardize quantization support (llama/23834) 2026-06-08 14:36:36 +03:00
CMakeLists.txt ggml-webgpu: FlashAttention refactor + standardize quantization support (llama/23834) 2026-06-08 14:36:36 +03:00
ggml-webgpu-shader-lib.hpp ggml-webgpu: FlashAttention refactor + standardize quantization support (llama/23834) 2026-06-08 14:36:36 +03:00
ggml-webgpu.cpp ggml-webgpu: FlashAttention refactor + standardize quantization support (llama/23834) 2026-06-08 14:36:36 +03:00
pre_wgsl.hpp ggml-webgpu: FlashAttention refactor + standardize quantization support (llama/23834) 2026-06-08 14:36:36 +03:00