whisper.cpp/ggml
Jeff Bolz 49b505bcc5 vulkan: change gated_delta_net to shard a column across a subgroup (llama/20662)
* vulkan: change gated_delta_net to shard a column across a subgroup

This is based on https://github.com/ggml-org/llama.cpp/pull/20391, I used an
LLM to port the CUDA code to Vulkan, and guided to it to make various fixes to
work with Vulkan (e.g. handling different subgroup sizes, unknown mapping of
subgroup to invocation id, using subgroupAdd optionally, etc.).

This fixes a perf regression from the transposing of the values in memory
(!20443).

* vulkan: Spread columns across fewer lanes to reduce the number of workgroups
2026-03-29 15:04:36 +03:00
..
cmake cmake : remove unused file (ggml/1419) 2026-02-08 09:29:10 +02:00
include ggml : restore ggml_type_sizef() to aboid major version bump (ggml/1441) 2026-03-18 15:18:24 +02:00
src vulkan: change gated_delta_net to shard a column across a subgroup (llama/20662) 2026-03-29 15:04:36 +03:00
.gitignore whisper : reorganize source code + improve CMake (#2256) 2024-06-26 19:34:09 +03:00
CMakeLists.txt ggml : bump version to 0.9.8 (ggml/1442) 2026-03-18 15:18:24 +02:00