whisper.cpp/ggml/src/ggml-musa
Johannes Gäßler b5fb9b9f58 CUDA: faster tile FA, add oob checks, more HSs (llama/16492) 2025-10-15 09:29:17 +03:00
CMakeLists.txt CUDA: faster tile FA, add oob checks, more HSs (llama/16492) 2025-10-15 09:29:17 +03:00
mudnn.cu musa: Upgrade MUSA SDK version to rc4.0.1 and use mudnn::Unary::IDENTITY op to accelerate D2D memory copy (llama/13647) 2025-05-27 18:03:00 +03:00
mudnn.cuh musa: enable fp16 mma (all) and cublas on qy2 (llama/13842) 2025-07-01 17:54:53 +03:00