whisper.cpp/ggml/src/vulkan-shaders
Jeff Bolz 21b01a21b6 vulkan: Optimize contiguous copies (llama/10254)
* tests: Fix memory bandwidth calculation for perf tests

Add a flops calculation for flash attention.

Add one GGML_OP_CPY perf test.

* vulkan: Optimize contiguous copies

Add a variant of the copy shader for when the tensors are contiguous. Avoid
the complex addressing calculations, and do four elements per invocation
to hide some other overhead.

Apply similar changes to the scale shader, since scale is always contiguous.

Add a "progress bar" for shader compiles.
2024-11-15 15:21:04 +02:00
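
The contiguous-copy idea in the commit message above can be illustrated on the CPU. This is a minimal sketch, not the contents of contig_copy.comp: the function names `contig_copy_invocation` and `contig_copy` are hypothetical, and the loop over `id` stands in for GPU invocations. It assumes that "contiguous" means source and destination share a flat layout, so one linear index addresses both without per-dimension stride arithmetic.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one shader invocation: it handles up to four
// consecutive elements starting at flat index 4 * invocation_id.
// Because both tensors are assumed contiguous, the same flat index is
// valid for source and destination; no stride calculations are needed.
void contig_copy_invocation(const float* src, float* dst,
                            std::size_t n, std::size_t invocation_id) {
    const std::size_t base = invocation_id * 4;
    if (base + 3 < n) {
        // Fast path: all four elements are in range.
        for (std::size_t i = 0; i < 4; ++i) {
            dst[base + i] = src[base + i];
        }
    } else {
        // Tail: fewer than four elements remain, so check each index.
        for (std::size_t i = 0; i < 4; ++i) {
            if (base + i < n) {
                dst[base + i] = src[base + i];
            }
        }
    }
}

// Emulates the dispatch: one "invocation" per group of four elements.
std::vector<float> contig_copy(const std::vector<float>& src) {
    std::vector<float> dst(src.size(), 0.0f);
    const std::size_t invocations = (src.size() + 3) / 4;  // round up
    for (std::size_t id = 0; id < invocations; ++id) {
        contig_copy_invocation(src.data(), dst.data(), src.size(), id);
    }
    return dst;
}
```

Handling four elements per invocation cuts the number of invocations to roughly a quarter, which is one plausible reading of "hide some other overhead"; the actual shader may organize the work differently.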
CMakeLists.txt sync : ggml vulkan (ggml/0) 2024-08-21 11:07:13 +03:00
acc.comp sync : vulkan (skip) (llama/0) 2024-08-28 13:22:20 +03:00
add.comp sync : ggml vulkan (ggml/0) 2024-08-21 11:07:13 +03:00
argsort.comp vulkan : argsort barriers must be under uniform control flow (ggml/951) 2024-10-03 12:22:17 +03:00
clamp.comp vulkan: Optimize contiguous copies (llama/10254) 2024-11-15 15:21:04 +02:00
concat.comp sync : vulkan (skip) (llama/0) 2024-08-28 13:22:20 +03:00
contig_copy.comp vulkan: Optimize contiguous copies (llama/10254) 2024-11-15 15:21:04 +02:00
copy.comp vulkan: Optimize contiguous copies (llama/10254) 2024-11-15 15:21:04 +02:00
cos.comp vulkan: Optimize contiguous copies (llama/10254) 2024-11-15 15:21:04 +02:00
dequant_f32.comp whisper : reorganize source code + improve CMake (#2256) 2024-06-26 19:34:09 +03:00
dequant_funcs.comp sync : ggml vulkan (ggml/0) 2024-08-21 11:07:13 +03:00
dequant_head.comp whisper : reorganize source code + improve CMake (#2256) 2024-06-26 19:34:09 +03:00
dequant_iq4_nl.comp sync : ggml vulkan (ggml/0) 2024-08-21 11:07:13 +03:00
dequant_q2_k.comp whisper : reorganize source code + improve CMake (#2256) 2024-06-26 19:34:09 +03:00
dequant_q3_k.comp whisper : reorganize source code + improve CMake (#2256) 2024-06-26 19:34:09 +03:00
dequant_q4_0.comp sync : ggml vulkan (ggml/0) 2024-08-21 11:07:13 +03:00
dequant_q4_1.comp whisper : reorganize source code + improve CMake (#2256) 2024-06-26 19:34:09 +03:00
dequant_q4_k.comp whisper : reorganize source code + improve CMake (#2256) 2024-06-26 19:34:09 +03:00
dequant_q5_0.comp whisper : reorganize source code + improve CMake (#2256) 2024-06-26 19:34:09 +03:00
dequant_q5_1.comp whisper : reorganize source code + improve CMake (#2256) 2024-06-26 19:34:09 +03:00
dequant_q5_k.comp whisper : reorganize source code + improve CMake (#2256) 2024-06-26 19:34:09 +03:00
dequant_q6_k.comp whisper : reorganize source code + improve CMake (#2256) 2024-06-26 19:34:09 +03:00
dequant_q8_0.comp whisper : reorganize source code + improve CMake (#2256) 2024-06-26 19:34:09 +03:00
diag_mask_inf.comp whisper : reorganize source code + improve CMake (#2256) 2024-06-26 19:34:09 +03:00
div.comp sync : ggml vulkan (ggml/0) 2024-08-21 11:07:13 +03:00
gelu.comp sync : ggml vulkan (ggml/0) 2024-08-21 11:07:13 +03:00
gelu_quick.comp sync : ggml vulkan (ggml/0) 2024-08-21 11:07:13 +03:00
generic_binary_head.comp sync : ggml vulkan (ggml/0) 2024-08-21 11:07:13 +03:00
generic_head.comp whisper : reorganize source code + improve CMake (#2256) 2024-06-26 19:34:09 +03:00
generic_unary_head.comp vulkan: Optimize contiguous copies (llama/10254) 2024-11-15 15:21:04 +02:00
get_rows.comp whisper : reorganize source code + improve CMake (#2256) 2024-06-26 19:34:09 +03:00
get_rows_quant.comp whisper : reorganize source code + improve CMake (#2256) 2024-06-26 19:34:09 +03:00
group_norm.comp sync : ggml vulkan (ggml/0) 2024-08-21 11:07:13 +03:00
im2col.comp sync : ggml vulkan (ggml/0) 2024-08-21 11:07:13 +03:00
leaky_relu.comp sync : ggml vulkan (ggml/0) 2024-08-21 11:07:13 +03:00
mul.comp sync : ggml vulkan (ggml/0) 2024-08-21 11:07:13 +03:00
mul_mat_split_k_reduce.comp whisper : reorganize source code + improve CMake (#2256) 2024-06-26 19:34:09 +03:00
mul_mat_vec.comp sync : vulkan (skip) (llama/0) 2024-08-28 13:22:20 +03:00
mul_mat_vec_base.comp whisper : reorganize source code + improve CMake (#2256) 2024-06-26 19:34:09 +03:00
mul_mat_vec_nc.comp sync : vulkan (skip) (llama/0) 2024-08-28 13:22:20 +03:00
mul_mat_vec_p021.comp sync : vulkan (skip) (llama/0) 2024-08-28 13:22:20 +03:00
mul_mat_vec_q2_k.comp sync : vulkan (skip) (llama/0) 2024-08-28 13:22:20 +03:00
mul_mat_vec_q3_k.comp sync : vulkan (skip) (llama/0) 2024-08-28 13:22:20 +03:00
mul_mat_vec_q4_k.comp sync : vulkan (skip) (llama/0) 2024-08-28 13:22:20 +03:00
mul_mat_vec_q5_k.comp sync : vulkan (skip) (llama/0) 2024-08-28 13:22:20 +03:00
mul_mat_vec_q6_k.comp sync : vulkan (skip) (llama/0) 2024-08-28 13:22:20 +03:00
mul_mm.comp sync : vulkan (skip) (llama/0) 2024-08-28 13:22:20 +03:00
norm.comp sync : ggml vulkan (ggml/0) 2024-08-21 11:07:13 +03:00
pad.comp vulkan: Optimize contiguous copies (llama/10254) 2024-11-15 15:21:04 +02:00
pool2d.comp ggml: Add POOL2D OP for GPU acceleration to the Vulkan backend in the MobileVLM model. (llama/9763) 2024-11-15 15:21:04 +02:00
relu.comp sync : ggml vulkan (ggml/0) 2024-08-21 11:07:13 +03:00
repeat.comp vulkan: Optimize contiguous copies (llama/10254) 2024-11-15 15:21:04 +02:00
rms_norm.comp sync : ggml vulkan (ggml/0) 2024-08-21 11:07:13 +03:00
rope_head.comp whisper : reorganize source code + improve CMake (#2256) 2024-06-26 19:34:09 +03:00
rope_neox.comp whisper : reorganize source code + improve CMake (#2256) 2024-06-26 19:34:09 +03:00
rope_norm.comp whisper : reorganize source code + improve CMake (#2256) 2024-06-26 19:34:09 +03:00
scale.comp vulkan: Optimize contiguous copies (llama/10254) 2024-11-15 15:21:04 +02:00
silu.comp sync : ggml vulkan (ggml/0) 2024-08-21 11:07:13 +03:00
sin.comp vulkan: Optimize contiguous copies (llama/10254) 2024-11-15 15:21:04 +02:00
soft_max.comp sync : ggml vulkan (ggml/0) 2024-08-21 11:07:13 +03:00
square.comp vulkan: Optimize contiguous copies (llama/10254) 2024-11-15 15:21:04 +02:00
sum_rows.comp sync : ggml vulkan (ggml/0) 2024-08-21 11:07:13 +03:00
tanh.comp sync : ggml vulkan (ggml/0) 2024-08-21 11:07:13 +03:00
timestep_embedding.comp sync : ggml vulkan (ggml/0) 2024-08-21 11:07:13 +03:00
types.comp sync : ggml vulkan (ggml/0) 2024-08-21 11:07:13 +03:00
upscale.comp sync : ggml vulkan (ggml/0) 2024-08-21 11:07:13 +03:00
vulkan-shaders-gen.cpp vulkan: Optimize contiguous copies (llama/10254) 2024-11-15 15:21:04 +02:00