Georgi Gerganov | 1e5ad50f8f | bench : add rtx 5090 [no ci] | 2025-09-30 13:58:15 +03:00 |
Georgi Gerganov | e4bf87b0e9 | bench : update [no ci] | 2025-09-30 12:51:25 +03:00 |
Georgi Gerganov | 32be14f8eb | bench : update [no ci] (#3439) | 2025-09-29 17:42:38 +03:00 |
Georgi Gerganov | 06bdaa6c0c | bench : update benches | 2025-06-25 16:45:19 +03:00 |
Georgi Gerganov | 503a786c9a | bench : update numbers [no ci] (#2993) | 2025-04-02 16:27:36 +03:00 |
Georgi Gerganov | 7094ea5e75 | whisper : use flash attention (#2152)
* whisper : use flash attention in the encoder
* whisper : add kv_pad
* whisper : remove extra backend instance (huh?)
* whisper : use FA for cross-attention
* whisper : use FA for self-attention
* whisper : simplify encoder FA
* whisper : add flash_attn runtime parameter
* scripts : add bench log
* scripts : add M1 Pro bench log
| 2024-05-15 09:38:19 +03:00 |