whisper.cpp/ggml/src/ggml-vulkan
leejet 2228462b19
ggml: add ops for WAN video model (cuda && cpu) (llama/15669)
* add conv3d support

* add ggml_pad_ext for cpu & cuda backend

* cuda/cpu: add im2col_3d support

* cuda: make im2col a little faster

* fix cuda pad/scale/im2col3d

* make im2col_3d faster

* gguf: support loading tensors which n_dims > GGML_MAX_DIMS

* fix cuda get_rows

* avoid ggml_conv_3d conflict

* correct GGML_OP_COUNT assertion

* avoid build failure

* avoid build failure on MacOS

* cuda: remove unnecessary MIN define

* fix cpu im2col_3d

* adjust the code style

* cuda: use simpler loop in get_rows

* add test_im2col_3d to test-backend-ops

* test-backend-ops.cpp: remove trailing whitespace

* cpu: im2col_3d support non continuous src

Co-authored-by: Jeff Bolz <jbolz@nvidia.com>

* fix test_im2col_3d

* remove unused variables

* cuda: get_rows: dfloat2 -> float2

* add test_pad_ext to test-backend-ops.cpp

* add gguf_init_from_file_ext impl

* Revert "gguf: support loading tensors which n_dims > GGML_MAX_DIMS"

This reverts commit d8377a0a37f314bd3713fe043b4333ad661610c1.

* Revert "add gguf_init_from_file_ext impl"

This reverts commit d9f1d13208c68ef83b3538201ac7f31614fb1994.

* update ggml_backend_vk_device_supports_op

* fix ggml_backend_vk_device_supports_op

* update other backend supports op for ggml_pad_ext

* metal/opencl/sycl/vulkan: fix GGML_OP_PAD check in supports_op

---------

Co-authored-by: Jeff Bolz <jbolz@nvidia.com>
2025-09-20 13:42:49 +03:00
..
cmake cmake: fix ggml-shaders-gen compiler paths containing spaces (llama/12747) 2025-04-24 20:39:16 +03:00
vulkan-shaders ggml vulkan: add hardsigmoid and hardswish operations (llama/15762) 2025-09-20 13:42:48 +03:00
CMakeLists.txt vulkan: Fix GGML_VULKAN_SHADER_DEBUG_INFO (llama/14427) 2025-07-01 17:54:53 +03:00
ggml-vulkan.cpp ggml: add ops for WAN video model (cuda && cpu) (llama/15669) 2025-09-20 13:42:49 +03:00