whisper.cpp/ggml/include
Radoslav Gerganov d26d1c8b85
rpc : cache and reuse compute graphs (llama/15405)
Store the last computed graph and reuse it when possible.
Also do not return a response from GRAPH_COMPUTE and assume it always
completes successfully. If this is not the case, the server closes
the connection. This saves us a network round trip to the server.
2025-12-12 17:53:11 +02:00
..
ggml-alloc.h
ggml-backend.h rpc : add support for multiple devices (llama/16276) 2025-10-12 11:16:23 +03:00
ggml-blas.h
ggml-cann.h
ggml-cpp.h
ggml-cpu.h
ggml-cuda.h
ggml-hexagon.h Add experimental ggml-hexagon backend for the Hexagon NPU (llama/16547) 2025-11-09 23:38:03 +02:00
ggml-metal.h
ggml-opencl.h
ggml-opt.h
ggml-rpc.h rpc : cache and reuse compute graphs (llama/15405) 2025-12-12 17:53:11 +02:00
ggml-sycl.h
ggml-vulkan.h
ggml-webgpu.h
ggml-zdnn.h
ggml.h ggml : add ggml_top_k (llama/17365) 2025-12-12 17:53:08 +02:00
gguf.h