whisper.cpp/ggml/src/ggml-webgpu
Reese Levine 092330b474
ggml-webgpu: compute pass batching and removing profiling overhead (llama/21873)
* Update register tiling matmul to use f32 accumulation

* fix profiling code

* Fix register tiling matmul for Chrome, I'm blaming Dawn

* Update batch tuning value for iOS

* compile fix

* Fix use of new load function

* Move to a single query set for GPU profiling

* Move to batching compute passes when not profiling

* Refactor build_multi

* remove iOS throttling now that we're batching compute passes
2026-04-30 11:29:10 +03:00
wgsl-shaders ggml-webgpu: Fix dequantization helpers to not pass in pointers (llama/21872) 2026-04-30 11:29:10 +03:00
CMakeLists.txt ggml webgpu: add support for emscripten builds (llama/17184) 2025-12-12 17:53:16 +02:00
ggml-webgpu-shader-lib.hpp ggml-webgpu: address quantization precision and backend lifecycle management (llama/21521) 2026-04-30 11:29:05 +03:00
ggml-webgpu.cpp ggml-webgpu: compute pass batching and removing profiling overhead (llama/21873) 2026-04-30 11:29:10 +03:00
pre_wgsl.hpp ggml webgpu: initial flashattention implementation (llama/18610) 2026-01-14 09:11:59 +02:00