whisper.cpp/ggml/src/ggml-sycl
Neo Zhang f2a8e65ea7 sycl : fix wrong variable check by assert (llama/20903)
* fix wrong variable check by assert

* use GGML api
2026-03-29 15:04:36 +03:00
dpct supprt Flash Attention for fp32/fp16/Q4/Q5/Q8 (llama/20190) 2026-03-16 13:10:15 +02:00
template-instances supprt Flash Attention for fp32/fp16/Q4/Q5/Q8 (llama/20190) 2026-03-16 13:10:15 +02:00
CMakeLists.txt supprt Flash Attention for fp32/fp16/Q4/Q5/Q8 (llama/20190) 2026-03-16 13:10:15 +02:00
add-id.cpp sycl : fix wrong variable check by assert (llama/20903) 2026-03-29 15:04:36 +03:00
add-id.hpp Support gpt-oss by OPs add-id, mul_mat for mxfp4, swiglu_oai (llama/17826) 2025-12-18 08:20:56 +02:00
backend.hpp ehance UPSCALE to support all UT cases (llama/20637) 2026-03-29 15:04:36 +03:00
binbcast.cpp support permuted, remove check s0/s10 (llama/19889) 2026-02-27 20:57:58 +02:00
binbcast.hpp fix UT fault cases: count-equal, argsort, pad OPs (llama/16521) 2025-10-15 09:29:17 +03:00
common.cpp
common.hpp add op gated_delta_net (llama/20455) 2026-03-16 13:10:15 +02:00
concat.cpp sycl: add CONCAT operator support (llama/16047) 2025-11-09 23:38:03 +02:00
concat.hpp
conv.cpp
conv.hpp
convert.cpp supprt Flash Attention for fp32/fp16/Q4/Q5/Q8 (llama/20190) 2026-03-16 13:10:15 +02:00
convert.hpp fix op rope, add rope_back (llama/20293) 2026-03-16 13:10:15 +02:00
count-equal.cpp supprt Flash Attention for fp32/fp16/Q4/Q5/Q8 (llama/20190) 2026-03-16 13:10:15 +02:00
count-equal.hpp fix UT fault cases: count-equal, argsort, pad OPs (llama/16521) 2025-10-15 09:29:17 +03:00
cpy.cpp sycl : support to malloc memory on device more than 4GB, update the doc and script (llama/17566) 2025-12-12 17:53:13 +02:00
cpy.hpp
dequantize.hpp Support gpt-oss by OPs add-id, mul_mat for mxfp4, swiglu_oai (llama/17826) 2025-12-18 08:20:56 +02:00
dmmv.cpp
dmmv.hpp
element_wise.cpp ehance UPSCALE to support all UT cases (llama/20637) 2026-03-29 15:04:36 +03:00
element_wise.hpp ehance UPSCALE to support all UT cases (llama/20637) 2026-03-29 15:04:36 +03:00
fattn-common.hpp supprt Flash Attention for fp32/fp16/Q4/Q5/Q8 (llama/20190) 2026-03-16 13:10:15 +02:00
fattn-tile.cpp supprt Flash Attention for fp32/fp16/Q4/Q5/Q8 (llama/20190) 2026-03-16 13:10:15 +02:00
fattn-tile.hpp supprt Flash Attention for fp32/fp16/Q4/Q5/Q8 (llama/20190) 2026-03-16 13:10:15 +02:00
fattn-vec.hpp supprt Flash Attention for fp32/fp16/Q4/Q5/Q8 (llama/20190) 2026-03-16 13:10:15 +02:00
fattn.cpp supprt Flash Attention for fp32/fp16/Q4/Q5/Q8 (llama/20190) 2026-03-16 13:10:15 +02:00
fattn.hpp supprt Flash Attention for fp32/fp16/Q4/Q5/Q8 (llama/20190) 2026-03-16 13:10:15 +02:00
gated_delta_net.cpp sycl : fix for untransposed GDA recurrent state (llama/20583) 2026-03-29 15:04:36 +03:00
gated_delta_net.hpp add op gated_delta_net (llama/20455) 2026-03-16 13:10:15 +02:00
gemm.hpp
getrows.cpp
getrows.hpp
ggml-sycl.cpp support bf16 and quantized type (llama/20803) 2026-03-29 15:04:36 +03:00
gla.cpp
gla.hpp
im2col.cpp
im2col.hpp
mmq.cpp
mmq.hpp
mmvq.cpp Support gpt-oss by OPs add-id, mul_mat for mxfp4, swiglu_oai (llama/17826) 2025-12-18 08:20:56 +02:00
mmvq.hpp
norm.cpp fix for failed UT case: ACC, L2_NORM, UPSCALE, fused_glu, unary (llama/20283) 2026-03-16 13:10:15 +02:00
norm.hpp sycl: add RMS_NORM_BACK operation support (llama/16808) 2025-11-09 23:38:03 +02:00
outprod.cpp Remove support for Nvidia & AMD GPU, because the oneAPI plugin for Nvidia & AMD GPU is unavailable: download/installation channels are out of work. (llama/19246) 2026-02-08 09:29:10 +02:00
outprod.hpp
pad.cpp Support gpt-oss by OPs add-id, mul_mat for mxfp4, swiglu_oai (llama/17826) 2025-12-18 08:20:56 +02:00
pad.hpp fix UT fault cases: count-equal, argsort, pad OPs (llama/16521) 2025-10-15 09:29:17 +03:00
pad_reflect_1d.cpp refactor pad_reflect_1d to make the UT case pass (llama/17204) 2025-12-12 17:53:10 +02:00
pad_reflect_1d.hpp refactor pad_reflect_1d to make the UT case pass (llama/17204) 2025-12-12 17:53:10 +02:00
presets.hpp supprt Flash Attention for fp32/fp16/Q4/Q5/Q8 (llama/20190) 2026-03-16 13:10:15 +02:00
quantize.hpp
quants.hpp chore : correct typos [no ci] (llama/20041) 2026-03-16 13:10:15 +02:00
repeat_back.cpp SYCL: optimized repeat_back kernel (3× fewer asm instructions, 2× faster)Feature/sycl repeat back opt (#16869) 2025-11-09 23:38:03 +02:00
repeat_back.hpp sycl: add REPEAT_BACK operation support (llama/16734) 2025-11-09 23:38:03 +02:00
roll.cpp sycl: add ROLL operation support (llama/16665) 2025-11-09 23:38:03 +02:00
roll.hpp sycl: add ROLL operation support (llama/16665) 2025-11-09 23:38:03 +02:00
rope.cpp fix op rope, add rope_back (llama/20293) 2026-03-16 13:10:15 +02:00
rope.hpp fix op rope, add rope_back (llama/20293) 2026-03-16 13:10:15 +02:00
set.cpp SYCL SET operator optimized for F32 tensors (llama/16350) 2025-10-22 12:58:11 +03:00
set.hpp SYCL SET operator optimized for F32 tensors (llama/16350) 2025-10-22 12:58:11 +03:00
set_rows.cpp
set_rows.hpp
softmax.cpp supprt Flash Attention for fp32/fp16/Q4/Q5/Q8 (llama/20190) 2026-03-16 13:10:15 +02:00
softmax.hpp
ssm_conv.cpp Support gpt-oss by OPs add-id, mul_mat for mxfp4, swiglu_oai (llama/17826) 2025-12-18 08:20:56 +02:00
ssm_conv.hpp sycl: add SSM_CONV operation support (llama/16800) 2025-11-09 23:38:03 +02:00
sycl_hw.cpp
sycl_hw.hpp
tsembd.cpp
tsembd.hpp
upscale.cpp ehance UPSCALE to support all UT cases (llama/20637) 2026-03-29 15:04:36 +03:00
upscale.hpp ehance UPSCALE to support all UT cases (llama/20637) 2026-03-29 15:04:36 +03:00
vecdotq.hpp supprt Flash Attention for fp32/fp16/Q4/Q5/Q8 (llama/20190) 2026-03-16 13:10:15 +02:00
wkv.cpp Remove support for Nvidia & AMD GPU, because the oneAPI plugin for Nvidia & AMD GPU is unavailable: download/installation channels are out of work. (llama/19246) 2026-02-08 09:29:10 +02:00
wkv.hpp