|
dpct
|
supprt Flash Attention for fp32/fp16/Q4/Q5/Q8 (llama/20190)
|
2026-03-16 13:10:15 +02:00 |
|
template-instances
|
supprt Flash Attention for fp32/fp16/Q4/Q5/Q8 (llama/20190)
|
2026-03-16 13:10:15 +02:00 |
|
CMakeLists.txt
|
supprt Flash Attention for fp32/fp16/Q4/Q5/Q8 (llama/20190)
|
2026-03-16 13:10:15 +02:00 |
|
add-id.cpp
|
sycl : fix wrong variable check by assert (llama/20903)
|
2026-03-29 15:04:36 +03:00 |
|
add-id.hpp
|
Support gpt-oss by OPs add-id, mul_mat for mxfp4, swiglu_oai (llama/17826)
|
2025-12-18 08:20:56 +02:00 |
|
backend.hpp
|
ehance UPSCALE to support all UT cases (llama/20637)
|
2026-03-29 15:04:36 +03:00 |
|
binbcast.cpp
|
support permuted, remove check s0/s10 (llama/19889)
|
2026-02-27 20:57:58 +02:00 |
|
binbcast.hpp
|
fix UT fault cases: count-equal, argsort, pad OPs (llama/16521)
|
2025-10-15 09:29:17 +03:00 |
|
common.cpp
|
…
|
|
|
common.hpp
|
add op gated_delta_net (llama/20455)
|
2026-03-16 13:10:15 +02:00 |
|
concat.cpp
|
sycl: add CONCAT operator support (llama/16047)
|
2025-11-09 23:38:03 +02:00 |
|
concat.hpp
|
…
|
|
|
conv.cpp
|
…
|
|
|
conv.hpp
|
…
|
|
|
convert.cpp
|
supprt Flash Attention for fp32/fp16/Q4/Q5/Q8 (llama/20190)
|
2026-03-16 13:10:15 +02:00 |
|
convert.hpp
|
fix op rope, add rope_back (llama/20293)
|
2026-03-16 13:10:15 +02:00 |
|
count-equal.cpp
|
supprt Flash Attention for fp32/fp16/Q4/Q5/Q8 (llama/20190)
|
2026-03-16 13:10:15 +02:00 |
|
count-equal.hpp
|
fix UT fault cases: count-equal, argsort, pad OPs (llama/16521)
|
2025-10-15 09:29:17 +03:00 |
|
cpy.cpp
|
sycl : support to malloc memory on device more than 4GB, update the doc and script (llama/17566)
|
2025-12-12 17:53:13 +02:00 |
|
cpy.hpp
|
…
|
|
|
dequantize.hpp
|
Support gpt-oss by OPs add-id, mul_mat for mxfp4, swiglu_oai (llama/17826)
|
2025-12-18 08:20:56 +02:00 |
|
dmmv.cpp
|
…
|
|
|
dmmv.hpp
|
…
|
|
|
element_wise.cpp
|
ehance UPSCALE to support all UT cases (llama/20637)
|
2026-03-29 15:04:36 +03:00 |
|
element_wise.hpp
|
ehance UPSCALE to support all UT cases (llama/20637)
|
2026-03-29 15:04:36 +03:00 |
|
fattn-common.hpp
|
supprt Flash Attention for fp32/fp16/Q4/Q5/Q8 (llama/20190)
|
2026-03-16 13:10:15 +02:00 |
|
fattn-tile.cpp
|
supprt Flash Attention for fp32/fp16/Q4/Q5/Q8 (llama/20190)
|
2026-03-16 13:10:15 +02:00 |
|
fattn-tile.hpp
|
supprt Flash Attention for fp32/fp16/Q4/Q5/Q8 (llama/20190)
|
2026-03-16 13:10:15 +02:00 |
|
fattn-vec.hpp
|
supprt Flash Attention for fp32/fp16/Q4/Q5/Q8 (llama/20190)
|
2026-03-16 13:10:15 +02:00 |
|
fattn.cpp
|
supprt Flash Attention for fp32/fp16/Q4/Q5/Q8 (llama/20190)
|
2026-03-16 13:10:15 +02:00 |
|
fattn.hpp
|
supprt Flash Attention for fp32/fp16/Q4/Q5/Q8 (llama/20190)
|
2026-03-16 13:10:15 +02:00 |
|
gated_delta_net.cpp
|
sycl : fix for untransposed GDA recurrent state (llama/20583)
|
2026-03-29 15:04:36 +03:00 |
|
gated_delta_net.hpp
|
add op gated_delta_net (llama/20455)
|
2026-03-16 13:10:15 +02:00 |
|
gemm.hpp
|
…
|
|
|
getrows.cpp
|
…
|
|
|
getrows.hpp
|
…
|
|
|
ggml-sycl.cpp
|
support bf16 and quantized type (llama/20803)
|
2026-03-29 15:04:36 +03:00 |
|
gla.cpp
|
…
|
|
|
gla.hpp
|
…
|
|
|
im2col.cpp
|
…
|
|
|
im2col.hpp
|
…
|
|
|
mmq.cpp
|
…
|
|
|
mmq.hpp
|
…
|
|
|
mmvq.cpp
|
Support gpt-oss by OPs add-id, mul_mat for mxfp4, swiglu_oai (llama/17826)
|
2025-12-18 08:20:56 +02:00 |
|
mmvq.hpp
|
…
|
|
|
norm.cpp
|
fix for failed UT case: ACC, L2_NORM, UPSCALE, fused_glu, unary (llama/20283)
|
2026-03-16 13:10:15 +02:00 |
|
norm.hpp
|
sycl: add RMS_NORM_BACK operation support (llama/16808)
|
2025-11-09 23:38:03 +02:00 |
|
outprod.cpp
|
Remove support for Nvidia & AMD GPU, because the oneAPI plugin for Nvidia & AMD GPU is unavailable: download/installation channels are out of work. (llama/19246)
|
2026-02-08 09:29:10 +02:00 |
|
outprod.hpp
|
…
|
|
|
pad.cpp
|
Support gpt-oss by OPs add-id, mul_mat for mxfp4, swiglu_oai (llama/17826)
|
2025-12-18 08:20:56 +02:00 |
|
pad.hpp
|
fix UT fault cases: count-equal, argsort, pad OPs (llama/16521)
|
2025-10-15 09:29:17 +03:00 |
|
pad_reflect_1d.cpp
|
refactor pad_reflect_1d to make the UT case pass (llama/17204)
|
2025-12-12 17:53:10 +02:00 |
|
pad_reflect_1d.hpp
|
refactor pad_reflect_1d to make the UT case pass (llama/17204)
|
2025-12-12 17:53:10 +02:00 |
|
presets.hpp
|
supprt Flash Attention for fp32/fp16/Q4/Q5/Q8 (llama/20190)
|
2026-03-16 13:10:15 +02:00 |
|
quantize.hpp
|
…
|
|
|
quants.hpp
|
chore : correct typos [no ci] (llama/20041)
|
2026-03-16 13:10:15 +02:00 |
|
repeat_back.cpp
|
SYCL: optimized repeat_back kernel (3× fewer asm instructions, 2× faster)Feature/sycl repeat back opt (#16869)
|
2025-11-09 23:38:03 +02:00 |
|
repeat_back.hpp
|
sycl: add REPEAT_BACK operation support (llama/16734)
|
2025-11-09 23:38:03 +02:00 |
|
roll.cpp
|
sycl: add ROLL operation support (llama/16665)
|
2025-11-09 23:38:03 +02:00 |
|
roll.hpp
|
sycl: add ROLL operation support (llama/16665)
|
2025-11-09 23:38:03 +02:00 |
|
rope.cpp
|
fix op rope, add rope_back (llama/20293)
|
2026-03-16 13:10:15 +02:00 |
|
rope.hpp
|
fix op rope, add rope_back (llama/20293)
|
2026-03-16 13:10:15 +02:00 |
|
set.cpp
|
SYCL SET operator optimized for F32 tensors (llama/16350)
|
2025-10-22 12:58:11 +03:00 |
|
set.hpp
|
SYCL SET operator optimized for F32 tensors (llama/16350)
|
2025-10-22 12:58:11 +03:00 |
|
set_rows.cpp
|
…
|
|
|
set_rows.hpp
|
…
|
|
|
softmax.cpp
|
supprt Flash Attention for fp32/fp16/Q4/Q5/Q8 (llama/20190)
|
2026-03-16 13:10:15 +02:00 |
|
softmax.hpp
|
…
|
|
|
ssm_conv.cpp
|
Support gpt-oss by OPs add-id, mul_mat for mxfp4, swiglu_oai (llama/17826)
|
2025-12-18 08:20:56 +02:00 |
|
ssm_conv.hpp
|
sycl: add SSM_CONV operation support (llama/16800)
|
2025-11-09 23:38:03 +02:00 |
|
sycl_hw.cpp
|
…
|
|
|
sycl_hw.hpp
|
…
|
|
|
tsembd.cpp
|
…
|
|
|
tsembd.hpp
|
…
|
|
|
upscale.cpp
|
ehance UPSCALE to support all UT cases (llama/20637)
|
2026-03-29 15:04:36 +03:00 |
|
upscale.hpp
|
ehance UPSCALE to support all UT cases (llama/20637)
|
2026-03-29 15:04:36 +03:00 |
|
vecdotq.hpp
|
supprt Flash Attention for fp32/fp16/Q4/Q5/Q8 (llama/20190)
|
2026-03-16 13:10:15 +02:00 |
|
wkv.cpp
|
Remove support for Nvidia & AMD GPU, because the oneAPI plugin for Nvidia & AMD GPU is unavailable: download/installation channels are out of work. (llama/19246)
|
2026-02-08 09:29:10 +02:00 |
|
wkv.hpp
|
…
|
|