whisper.cpp

Reese Levine 092330b474 ggml-webgpu: compute pass batching and removing profiling overhead (llama/21873)	2026-04-30 11:29:10 +03:00
* Update register tiling matmul to use f32 accumulation
* fix profiling code
* Fix register tiling matmul for chrome, i'm blaming dawn
* Update batch tuning value for iOS
* compile fix
* Fix use of new load function
* Move to a single query set for GPU profiling
* Move to batching compute passes when not profiling
* Refactor build_multi
* remove iOS throttling now that we're batching compute passes
..
wgsl-shaders	ggml-webgpu: Fix dequantization helpers to not pass in pointers (llama/21872)	2026-04-30 11:29:10 +03:00
CMakeLists.txt	ggml webgpu: add support for emscripten builds (llama/17184)	2025-12-12 17:53:16 +02:00
ggml-webgpu-shader-lib.hpp	ggml-webgpu: address quantization precision and backend lifecycle management (llama/21521)	2026-04-30 11:29:05 +03:00
ggml-webgpu.cpp	ggml-webgpu: compute pass batching and removing profiling overhead (llama/21873)	2026-04-30 11:29:10 +03:00
pre_wgsl.hpp	ggml webgpu: initial flashattention implementation (llama/18610)	2026-01-14 09:11:59 +02:00