whisper.cpp/ggml
Jake Karnes f72ec185fb
CUDA: fix im2col_3d to respect non-contiguous inputs (views) (llama/15956)
* fix im2col_3d to respect non-contiguous inputs (views)

The CUDA 3D im2col kernel computed source addresses assuming compact layout (products of dims), ignoring nb[] strides.

This patch switches im2col_3d source indexing to use true strides derived from src1->nb[] (in elements), mirroring the approach used in the 2D CUDA im2col path. Destination indexing is unchanged.

* use ggml_element_size() for src strides

Co-authored-by: Johannes Gäßler <johannesg@5d6.de>

---------

Co-authored-by: Johannes Gäßler <johannesg@5d6.de>
2025-09-20 13:45:30 +03:00
..
cmake ggml: Skip backend library linking code when GGML_BACKEND_DL=ON (llama/15094) 2025-08-18 20:30:45 +03:00
include ggml-zdnn: fix #15414, activate FP16 and BF16 acceleration and incorrect zTensor free (llama/15839) 2025-09-20 13:45:28 +03:00
src CUDA: fix im2col_3d to respect non-contiguous inputs (views) (llama/15956) 2025-09-20 13:45:30 +03:00
.gitignore
CMakeLists.txt ggml-cpu: drop support for nnpa intrinsics (llama/15821) 2025-09-20 13:42:50 +03:00