whisper.cpp/ggml
Reese Levine 092330b474
ggml-webgpu: compute pass batching and removing profiling overhead (llama/21873)
* Update register tiling matmul to use f32 accumulation

* fix profiling code

* Fix register tiling matmul for Chrome, I'm blaming Dawn

* Update batch tuning value for iOS

* compile fix

* Fix use of new load function

* Move to a single query set for GPU profiling

* Move to batching compute passes when not profiling

* Refactor build_multi

* remove iOS throttling now that we're batching compute passes
2026-04-30 11:29:10 +03:00
cmake           cmake : remove unused file (ggml/1419)                                            2026-02-08 09:29:10 +02:00
include         CUDA: manage NCCL communicators in context (llama/21891)                          2026-04-30 11:29:09 +03:00
src             ggml-webgpu: compute pass batching and removing profiling overhead (llama/21873)  2026-04-30 11:29:10 +03:00
.gitignore      whisper : reorganize source code + improve CMake (#2256)                          2024-06-26 19:34:09 +03:00
CMakeLists.txt  Fix Q8_0 reorder: garbage on 2nd prompt + crash on full VRAM (llama/21638)        2026-04-30 11:29:10 +03:00