whisper.cpp

Georgi Gerganov b6c5f49b78 whisper : add batched decoding (#1486)	2023-11-15 16:12:52 +02:00
* whisper : add whisper_batch
* whisper : move kv_self to whisper_state
* whisper : full batched decoding support
* whisper : fix memory leak in whisper_batch
* whisper : fix mem leak again + remove obsolete function
* whisper : clear kv cache when using whisper_decode API
* whisper : speed-up sampling
* whisper : fix decoders initializer
* bench : add batch size 5 bench
* whisper : add comment about the KV cache size
* whisper : add check for max number of decoders
* whisper : avoid starting sampling threads with bs=1
* whisper : enable beam-search by default
* cuda : sync llama.cpp fixes
bench-all.sh	whisper : add batched decoding (#1486)	2023-11-15 16:12:52 +02:00
bench-wts.sh	bench-wts.sh : rename script + add execute permission	2023-03-06 21:02:24 +02:00
bench.py	extra: Add benchmark script implemented in Python (#1298)	2023-09-25 23:45:15 +08:00
convert-all.sh	whisper : add support for large v3 (#1444)	2023-11-07 15:30:18 +02:00
deploy-wasm.sh	Node.js package (#260)	2022-12-12 20:17:27 +02:00
quantize-all.sh	whisper : add full CUDA and Metal offloading (#1472)	2023-11-12 15:31:08 +02:00
sha-all.sh	extra : compute SHA of all models files	2022-11-02 18:31:55 +02:00
sync-ggml.sh	cuda : fix HIPBLAS build	2023-11-05 19:41:15 +02:00