ggml-amx
ggml : adapt AMX to tensor->grad removal (llama/0)
2024-11-20 21:00:08 +02:00
ggml-blas
ggml : build backends as libraries (llama/10256)
2024-11-20 21:00:08 +02:00
ggml-cann
ggml : build backends as libraries (llama/10256)
2024-11-20 21:00:08 +02:00
ggml-cpu
ggml : fix undefined reference to 'getcpu' (llama/10354)
2024-11-20 21:00:08 +02:00
ggml-cuda
CUDA: remove DMMV, consolidate F16 mult mat vec (llama/10318)
2024-11-20 21:00:08 +02:00
ggml-hip
CUDA: remove DMMV, consolidate F16 mult mat vec (llama/10318)
2024-11-20 21:00:08 +02:00
ggml-kompute
ggml : build backends as libraries (llama/10256)
2024-11-20 21:00:08 +02:00
ggml-metal
metal : refactor kernel args into structs (llama/10238)
2024-11-20 21:00:08 +02:00
ggml-musa/ggml
CUDA: remove DMMV, consolidate F16 mult mat vec (llama/10318)
2024-11-20 21:00:08 +02:00
ggml-rpc
ggml : build backends as libraries (llama/10256)
2024-11-20 21:00:08 +02:00
ggml-sycl
sycl: Use syclcompat::dp4a (llama/10267)
2024-11-20 21:00:08 +02:00
ggml-vulkan
vulkan: Optimize some mat-vec mul quant shaders (llama/10296)
2024-11-20 21:00:08 +02:00
kompute-shaders
whisper : reorganize source code + improve CMake (#2256)
2024-06-26 19:34:09 +03:00
vulkan-shaders
vulkan: Optimize contiguous copies (llama/10254)
2024-11-15 15:21:04 +02:00
CMakeLists.txt
ggml: new optimization interface (ggml/988)
2024-11-20 21:00:08 +02:00
ggml-aarch64.c
ggml : optimize Q4_0 into Q4_0_X_Y repack (llama/10324)
2024-11-20 21:00:08 +02:00
ggml-aarch64.h
ggml : build backends as libraries (llama/10256)
2024-11-20 21:00:08 +02:00
ggml-alloc.c
ggml: new optimization interface (ggml/988)
2024-11-20 21:00:08 +02:00
ggml-amx.cpp
llama : refactor model loader with backend registry (llama/10026)
2024-11-15 15:21:04 +02:00
ggml-backend-impl.h
llama : refactor model loader with backend registry (llama/10026)
2024-11-15 15:21:04 +02:00
ggml-backend-reg.cpp
ggml : build backends as libraries (llama/10256)
2024-11-20 21:00:08 +02:00
ggml-backend.cpp
ggml : fix possible buffer use after free in sched reserve (llama/9930)
2024-11-20 21:00:08 +02:00
ggml-blas.cpp
llama : refactor model loader with backend registry (llama/10026)
2024-11-15 15:21:04 +02:00
ggml-cann.cpp
CANN: adjust backend registry refactor. (llama/10158)
2024-11-15 15:21:04 +02:00
ggml-common.h
ggml-quants : ternary packing for TriLMs and BitNet b1.58 (llama/8151)
2024-09-24 19:45:08 +03:00
ggml-cpu-impl.h
ggml : add ggml-cpu-impl.h (skip) (#0)
2024-09-24 19:45:08 +03:00
ggml-cpu.c
fix q4_0_8_8 format for corrupted tokens issue (llama/10198)
2024-11-15 15:21:04 +02:00
ggml-cuda.cu
metal : optimize FA kernels (llama/10171)
2024-11-15 15:21:04 +02:00
ggml-impl.h
ggml: new optimization interface (ggml/988)
2024-11-20 21:00:08 +02:00
ggml-kompute.cpp
kompute: add mul_mat_q4_k shader (llama/10097)
2024-11-15 15:21:04 +02:00
ggml-metal.m
metal : fix build and some more comments (llama/10229)
2024-11-15 15:21:04 +02:00
ggml-metal.metal
metal : more precise Q*K in FA vec kernel (llama/10247)
2024-11-15 15:21:04 +02:00
ggml-opt.cpp
ggml : inttypes.h -> cinttypes (llama/0)
2024-11-20 21:00:08 +02:00
ggml-quants.c
ggml : build backends as libraries (llama/10256)
2024-11-20 21:00:08 +02:00
ggml-quants.h
ggml : build backends as libraries (llama/10256)
2024-11-20 21:00:08 +02:00
ggml-rpc.cpp
ggml : move CPU backend to a separate file (llama/10144)
2024-11-15 15:21:04 +02:00
ggml-sycl.cpp
sycl : Fixes to broken builds and test-backend-ops (llama/10257)
2024-11-15 15:21:04 +02:00
ggml-threading.cpp
ggml : build backends as libraries (llama/10256)
2024-11-20 21:00:08 +02:00
ggml-threading.h
ggml : build backends as libraries (llama/10256)
2024-11-20 21:00:08 +02:00
ggml-vulkan.cpp
vulkan: Optimize contiguous copies (llama/10254)
2024-11-15 15:21:04 +02:00
ggml.c
ggml : fix compile warnings (llama/0)
2024-11-20 21:00:08 +02:00
sgemm.cpp
whisper : reorganize source code + improve CMake (#2256)
2024-06-26 19:34:09 +03:00
sgemm.h
whisper : reorganize source code + improve CMake (#2256)
2024-06-26 19:34:09 +03:00