Commit Graph

37 Commits

Author SHA1 Message Date
Karthick f897eb7670
whisper : support no_speech_thold (#2625)
* Implement no_speech_thold

The no_speech_thold functionality is on par with OpenAI's whisper.

* Addressed review comments
2024-12-17 19:15:47 +02:00
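The no_speech_thold behaviour this commit brings to parity can be sketched from OpenAI's whisper `transcribe.py`: a window is treated as silence when its no-speech probability is high, unless the decoder was confident overall. The function name and default values below are illustrative, not the whisper.cpp API.

```python
def should_skip_segment(no_speech_prob: float,
                        avg_logprob: float,
                        no_speech_thold: float = 0.6,
                        logprob_thold: float = -1.0) -> bool:
    """Return True when a 30-second window should be treated as silence.

    Mirrors the decision in openai/whisper transcribe.py: a high
    no-speech probability marks the window silent, but a confident
    decode (average log-probability above logprob_thold) overrides
    that and keeps the transcription.
    """
    should_skip = no_speech_prob > no_speech_thold
    if avg_logprob > logprob_thold:
        # the decoder was confident, so trust the text it produced
        should_skip = False
    return should_skip
```

For example, a window with `no_speech_prob=0.9` and `avg_logprob=-1.5` is skipped, while the same no-speech probability with `avg_logprob=-0.5` is kept.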
Karthick 2f2841bfce
whisper : add single-timestamp logic (#2629)
* Fix hallucinations during silence

When the predicted tokens end with a single timestamp, the entire 30-second segment should be considered done, to avoid hallucinations for the remaining part of the segment.
This behaviour is on par with OpenAI's whisper. Refer to the logic related to `single_timestamp_ending` in https://github.com/openai/whisper/blob/main/whisper/transcribe.py

* Accept review comments related to formatting.

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-12-17 19:07:08 +02:00
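The `single_timestamp_ending` check referenced above can be sketched as follows; in openai/whisper, token ids at or above `timestamp_begin` are timestamp tokens, and an ending of [text, timestamp] (rather than a consecutive timestamp pair) means the model produced nothing for the rest of the window. The function signature and the `timestamp_begin` value used in the example are illustrative.

```python
def single_timestamp_ending(tokens: list[int], timestamp_begin: int) -> bool:
    """True when the decoded tokens end with exactly one timestamp token.

    Checks that the second-to-last token is text and the last token is a
    timestamp (id >= timestamp_begin), mirroring single_timestamp_ending
    in openai/whisper transcribe.py.
    """
    if len(tokens) < 2:
        return False
    is_timestamp = [t >= timestamp_begin for t in tokens[-2:]]
    return is_timestamp == [False, True]
```

When this returns True, decoding advances the seek position past the whole 30-second window rather than to the last timestamp, so no hallucinated text is generated for the trailing silence.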
Georgi Gerganov 37c88027e1 whisper : use backend registry (#0) 2024-11-20 21:00:08 +02:00
Georgi Gerganov 7fd8d9c220 whisper : adapt to new ggml (wip) 2024-11-20 21:00:08 +02:00
Georgi Gerganov 5089ab2d6a whisper : fix build (#0) 2024-11-15 15:21:04 +02:00
Jhen-Jie Hong 5f8a086e22
whisper.swiftui : add model download list & bench methods (#2546)
* swift : fix resources & exclude build

* whisper : impl whisper_timings struct & api

* whisper.swiftui : model list & bench methods

* whisper : return ptr for whisper_get_timings

* revert unnecessary change

* whisper : avoid designated initializer

* whisper.swiftui : code style changes

* whisper.swiftui : get device name / os from UIDevice

* whisper.swiftui : fix UIDevice usage

* whisper.swiftui : add memcpy and ggml_mul_mat (commented)
2024-11-13 21:51:34 +02:00
thewh1teagle 5ccca19f0c
ggml : vulkan logs (#2547) 2024-11-13 21:47:15 +02:00
Vin Misra 31aea563a8
whisper : fix extra memory usage (#2534)
* passing samples_padded by ref to the threads.

---------

Co-authored-by: Vinith Misra <physicsdemon@gmail.com>
2024-11-06 23:02:11 +02:00
Georgi Gerganov 0377596b77 whisper : backend registry init before model load 2024-11-01 10:19:05 +02:00
Georgi Gerganov aa037a60f3
ggml : alloc ggml_contexts on the heap (#2525)
* whisper : reduce ggml_context usage

* ggml : allocate contexts on the heap (v2)

* ggml : aligned malloc -> malloc
2024-10-31 22:00:09 +02:00
Georgi Gerganov 3f020fac9d
whisper : minor compile warning 2024-10-29 19:30:26 +02:00
jettoblack 1626b73b03
whisper : move new-segment callback after DTW step (#2515) 2024-10-29 08:47:21 +02:00
Josscii 0fbaac9c89
whisper : fix index overflow in token-level timestamp logic (#2505) 2024-10-23 15:14:03 +03:00
Rotem Dan b6049060dd
whisper : add dtw preset for large-v3-turbo (#2481) 2024-10-15 21:00:21 +03:00
Sandro Hanea fdbfb460ed
whisper : add OpenVINO init with state (#2464)
* Fixed OpenVino init on state

* Removed an empty line

* Fixed typo

* Replaced tabs with spaces

---------

Co-authored-by: Sandro Hanea <sandrohanea@users.noreply.github.com>
2024-10-08 20:08:00 +03:00
Georgi Gerganov 847f94fdeb whisper : zero-out the KV cache upon clear (#2445) 2024-10-05 15:23:51 +03:00
Georgi Gerganov 396089f3cf whisper : revert mel-related changes (#0)
too much extra logic and complexity for small benefit
2024-10-05 15:23:51 +03:00
Georgi Gerganov 941912467d whisper : adapt to latest ggml (skip) (#0) 2024-10-05 15:23:51 +03:00
Georgi Gerganov f62a546e03
whisper : fix excessive memory usage (#2443)
* whisper : fix KV cache allocation

* whisper : reduce memory overhead from unused input tensors
2024-10-05 12:36:40 +03:00
Georgi Gerganov ccc2547210 talk-llama : sync llama.cpp 2024-10-03 12:22:17 +03:00
Georgi Gerganov fe18c29ab8 talk-llama : sync llama.cpp 2024-09-24 19:45:08 +03:00
Georgi Gerganov 34291099fb ggml : refactoring (llama/#0)
- d6a04f87
- 23e0d70b
2024-09-24 19:45:08 +03:00
Georgi Gerganov 9d754a56cf whisper : update FA call 2024-08-28 13:22:20 +03:00
Georgi Gerganov 6e9596f6de
whisper : fix compile warning for unused params 2024-08-28 11:40:11 +03:00
Mengqing Cao 81c999fe0a
cann : add Ascend NPU support (#2336)
* enable Ascend NPU in src/whisper.cpp
* sync test-backend-ops with llama.cpp
2024-08-09 15:21:56 +03:00
Georgi Gerganov 4b7de08bfd whisper : fix compile warning (#0) 2024-08-09 09:58:16 +03:00
Daven Sanassy fe36c90971
cmake : fix compile in xcode (#2311) 2024-08-05 09:48:26 +03:00
Georgi Gerganov 6739eb83c3
whisper : handle empty mel (#2324) 2024-07-27 20:35:04 +03:00
Matt Stephenson f68298ce06
whisper : use vulkan as gpu backend when available (#2302)
* ggml: use vulkan as gpu backend when available

Signed-off-by: Matt Stephenson <mstephenson6@users.noreply.github.com>

* whisper: enable using vk as default buffer type

Signed-off-by: Matt Stephenson <mstephenson6@users.noreply.github.com>

---------

Signed-off-by: Matt Stephenson <mstephenson6@users.noreply.github.com>
2024-07-16 10:21:09 +03:00
arizhih 7ae885c1ef
whisper : fix DTW assert (#2299) 2024-07-15 15:50:36 +03:00
Georgi Gerganov d207c68822
cmake : use WHISPER_EXTRA_FLAGS (#2294) 2024-07-09 18:54:18 +03:00
Georgi Gerganov 1c31f9d4a8
cmake : try to fix openvino build (#2281) 2024-07-08 15:36:51 +03:00
Georgi Gerganov dbf9c15e30 talk-llama : sync llama.cpp 2024-07-08 14:53:55 +03:00
Georgi Gerganov dc8cc2dd6f
whisper : disable CUDA mel + fix FFMPEG 2024-06-26 20:11:38 +03:00
Georgi Gerganov e30c679928
whisper : reorganize source code + improve CMake (#2256)
* scripts : update sync [no ci]

* files : reorganize [no ci]

* sync : llama.cpp

* cmake : link math library

* cmake : build normal ggml library

* files : move headers to include

* objc : fix path to ggml-metal.h

* ci : fix WHISPER_CUDA -> GGML_CUDA

* scripts : sync LICENSE [no ci]
2024-06-26 19:34:09 +03:00
Georgi Gerganov 820446e230 fix : remove extra files 2024-06-18 09:39:40 +03:00
slaren de29b193f6 move BLAS to a separate backend (cont) (llama/6210)
ggml-ci
2024-06-18 09:39:40 +03:00