Georgi Gerganov | 4bbce1e5b2 | benches : update | 2026-03-18 22:34:51 +02:00 |
Georgi Gerganov | 21c1765fcb | benches : update | 2026-01-15 11:53:09 +02:00 |
Georgi Gerganov | ea174c62bc | bench : update [no ci] | 2025-10-12 11:16:23 +03:00 |
Georgi Gerganov | 8c0855fd6b | bench : update [no ci] | 2025-09-30 21:40:32 +03:00 |
Georgi Gerganov | 1e5ad50f8f | bench : add rtx 5090 [no ci] | 2025-09-30 13:58:15 +03:00 |
Georgi Gerganov | e4bf87b0e9 | bench : update [no ci] | 2025-09-30 12:51:25 +03:00 |
Georgi Gerganov | 32be14f8eb | bench : update [no ci] (#3439) | 2025-09-29 17:42:38 +03:00 |
Georgi Gerganov | 06bdaa6c0c | bench : update benches | 2025-06-25 16:45:19 +03:00 |
Georgi Gerganov | 503a786c9a | bench : update numbers [no ci] (#2993) | 2025-04-02 16:27:36 +03:00 |
Georgi Gerganov | 7094ea5e75 | whisper : use flash attention (#2152)
* whisper : use flash attention in the encoder
* whisper : add kv_pad
* whisper : remove extra backend instance (huh?)
* whisper : use FA for cross-attention
* whisper : use FA for self-attention
* whisper : simplify encoder FA
* whisper : add flash_attn runtime parameter
* scripts : add bench log
* scripts : add M1 Pro bench log
| 2024-05-15 09:38:19 +03:00 |