50 Commits

Author SHA1 Message Date
Daniel Bevenius f92bd59951
whisper : remove unnecessary GGML_UNUSED macro (#2960) 2025-03-30 05:56:10 +02:00
Dan Johansson 21d890d534
whisper : add support for backends with multiple ggml_backend_buffer_type (#2863)
* whisper : add support for ggml_backend_buffer_type

Signed-off-by: Dan Johansson <dan.johansson@arm.com>

* fix compile error when building on Ubuntu

Signed-off-by: Dan Johansson <dan.johansson@arm.com>

* remove copyright header from include file

Signed-off-by: Dan Johansson <dan.johansson@arm.com>

---------

Signed-off-by: Dan Johansson <dan.johansson@arm.com>
2025-03-26 16:54:02 +02:00
Daniel Bevenius cf5ddb8c21
whisper : initialize decoder's rng with unique seed (#2932)
This change initializes each decoder's random number generator with a
unique seed.

The motivation for this is that currently all decoders are initialized
with the same seed value, 0. The result of this is that for the same
state (logits, probs, and logprobs) they will produce the same output.
2025-03-24 09:36:07 +01:00
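The effect described in this commit message can be illustrated with a minimal sketch. The names `toy_decoder`, `make_decoders`, and `sample_index` are hypothetical and are not the whisper.cpp types; seeding each generator with its decoder index is an assumption about one possible way to obtain unique seeds, not a statement of what the commit does.

```cpp
#include <random>
#include <vector>

// Hypothetical stand-in for a decoder that owns its own random number generator.
struct toy_decoder {
    std::mt19937 rng;
};

// Create n decoders. With unique_seeds == false every generator is seeded
// with 0 (the situation the commit message describes as the problem); with
// unique_seeds == true decoder j is seeded with j (an assumed seeding scheme).
std::vector<toy_decoder> make_decoders(int n, bool unique_seeds) {
    std::vector<toy_decoder> decoders(n);
    for (int j = 0; j < n; ++j) {
        decoders[j].rng = std::mt19937(unique_seeds ? j : 0);
    }
    return decoders;
}

// Draw one index from a fixed probability table using the decoder's own generator.
int sample_index(toy_decoder & decoder, const std::vector<double> & probs) {
    std::discrete_distribution<int> dist(probs.begin(), probs.end());
    return dist(decoder.rng);
}
```

With identical seeds, two decoders given the same probabilities draw the same sequence of indices, so sampling adds no diversity between them. With distinct seeds the generators produce different streams.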
Daniel Bevenius be9de81171
whisper : add check for CPU backend initialization (#2918)
This commit adds a check for the CPU backend initialization in the
whisper library. If the initialization fails, an exception is thrown.

The motivation for this change is to make the library more robust and
handle the case when the CPU backend initialization fails.

Resolves: https://github.com/ggerganov/whisper.cpp/issues/2917
2025-03-21 09:53:26 +01:00
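The check-and-throw pattern this commit message describes might look roughly like the sketch below. `toy_backend`, `toy_backend_init_fn`, and `init_cpu_backend_or_throw` are hypothetical names used only for illustration; the real code works with ggml backend handles, whose API is not reproduced here.

```cpp
#include <stdexcept>

// Hypothetical stand-in for a backend handle; not the ggml type.
struct toy_backend {
    int id;
};

// Hypothetical initializer signature: returns nullptr when initialization fails.
using toy_backend_init_fn = toy_backend * (*)();

// Call the initializer and turn a null result into an exception, so callers
// never continue with an unusable backend pointer.
toy_backend * init_cpu_backend_or_throw(toy_backend_init_fn init) {
    toy_backend * backend = init();
    if (backend == nullptr) {
        throw std::runtime_error("failed to initialize CPU backend");
    }
    return backend;
}
```

Failing at the point of initialization keeps the error close to its cause, instead of surfacing later as a null pointer dereference.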
Daniel Bevenius 215990abde
whisper : fix compiler warnings in whisper.cpp (#2895)
This commit fixes compiler warnings in whisper.cpp by changing the type
of the loop index variable from int64_t to size_t.

Currently the following warnings are generated by the compiler:
```console
/whisper.cpp/src/whisper.cpp:209:27: warning: comparison of integers of different signs: 'int64_t' (aka 'long long') and 'size_t' (aka 'unsigned long') [-Wsign-compare]
  209 |     for (int64_t i = 0; i < nels; ++i) {
      |                         ~ ^ ~~~~
/whisper.cpp/src/whisper.cpp:219:27: warning: comparison of integers of different signs: 'int64_t' (aka 'long long') and 'size_t' (aka 'unsigned long') [-Wsign-compare]
  219 |     for (int64_t i = 0; i < nels; ++i) {
      |                         ~ ^ ~~~~
```
2025-03-18 13:38:41 +01:00
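A minimal sketch of the kind of change that silences `-Wsign-compare` in a loop like the ones quoted above. The function `sum_elements` is hypothetical and only reuses the name `nels` from the warning output; it is not the code at the reported lines.

```cpp
#include <cstddef>
#include <vector>

// Sum all elements. The loop index has the same unsigned type as the bound,
// so the comparison `i < nels` no longer mixes signed and unsigned integers.
float sum_elements(const std::vector<float> & data) {
    const std::size_t nels = data.size();
    float sum = 0.0f;
    for (std::size_t i = 0; i < nels; ++i) {  // was: for (int64_t i = 0; ...)
        sum += data[i];
    }
    return sum;
}
```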
Daniel Bevenius 740bf7f6a1
whisper : enable compiler warnings for src (#2891)
* whisper : enable compiler warnings for src

This commit enables compiler warnings for the src directory. Currently
when the WHISPER_ALL_WARNINGS flag is set to ON it only enables warnings
in ggml, by setting GGML_ALL_WARNINGS to ON. This commit adds the same
compiler flags for whisper's src directory.

The motivation for this is to catch potential bugs and issues early on
in the development process.

* squash! whisper : enable compiler warnings for src

Remove GF_C_FLAGS and GF_CXX_FLAGS from add_compile_options.
2025-03-18 05:19:18 +01:00
Diego Devesa 339a1cba5d
whisper : support GGML_BACKEND_DL (#2843)
* whisper : support GGML_BACKEND_DL

* fix DTW crash

* whisper.objc : fix build - add ggml-cpp.h

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2025-02-27 13:35:07 +01:00
Thomas Fitzsimmons 47e14c0529
whisper : restore big endian support (#2816)
* whisper : fix BYTESWAP whitespace

* whisper : make byteswap useable with C++17

* cmake : define WHISPER_BIG_ENDIAN for big-endian targets

* ci : fix (again) arm64 build fails

* docker : attempt fixing arm64 build on ci

* qemu v7.0.0-28

[imported from https://github.com/ggml-org/llama.cpp/commit/818a340ea8be55b3706e1772527cb8738e90a8c7 (#11895)]

---------

Co-authored-by: Xuan-Son Nguyen <thichthat@gmail.com>
2025-02-25 11:38:13 +02:00
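For readers unfamiliar with what a byte swap helper does on big-endian targets, here is a minimal sketch that compiles as C++17. The name `byteswap_sketch` is hypothetical, and this is not the implementation from the commit; it only shows the general technique of reversing the bytes of a trivially copyable value.

```cpp
#include <algorithm>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Reverse the byte order of a value. Copying through an unsigned char buffer
// avoids aliasing problems and works for any trivially copyable type.
template <typename T>
T byteswap_sketch(T value) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "byteswap_sketch needs a trivially copyable type");
    unsigned char bytes[sizeof(T)];
    std::memcpy(bytes, &value, sizeof(T));
    std::reverse(bytes, bytes + sizeof(T));
    std::memcpy(&value, bytes, sizeof(T));
    return value;
}
```

A loader for little-endian model files would apply such a helper to each multi-byte field only when building for a big-endian target, for example behind a macro like the `WHISPER_BIG_ENDIAN` definition mentioned above.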
Georgi Gerganov 589b40810a
ci : dummy commit to trigger CI 2025-02-03 16:32:48 +02:00
Georgi Gerganov eb68324c86
whisper : fix gpu device selection (#2728) 2025-01-13 13:11:37 +02:00
Sandro Hanea 2ab2eb5110
whisper : add whisper_full_get_segment_no_speech_prob_from_state (#2716) 2025-01-09 16:21:07 +02:00
Sacha Arbonel 4183517076
server : add no-speech threshold parameter and functionality (#2654) 2024-12-21 17:00:08 +02:00
Georgi Gerganov f4668169a0
whisper : rename suppress_non_speech_tokens to suppress_nst (#2653) 2024-12-21 12:54:35 +02:00
Karthick f897eb7670
whisper : support no_speech_thold (#2625)
* Implement no_speech_thold

no_speech_thold functionality is on par with OpenAI's whisper

* Addressed review comments
2024-12-17 19:15:47 +02:00
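Since the commit message refers to parity with OpenAI's whisper, the decision rule used there can be sketched as follows. The function name `should_skip_segment` and its parameter names are hypothetical; the rule itself follows the reference Python implementation, and whether whisper.cpp applies it in exactly this form is an assumption.

```cpp
// Decide whether a decoded segment should be treated as silence and skipped.
// A segment is skipped when the no-speech probability exceeds its threshold,
// unless the average log-probability of the decoded tokens is above the
// log-probability threshold (the decoder was confident, so the text is kept).
bool should_skip_segment(float no_speech_prob,
                         float avg_logprob,
                         float no_speech_thold,
                         float logprob_thold) {
    bool skip = no_speech_prob > no_speech_thold;
    if (avg_logprob > logprob_thold) {
        skip = false;
    }
    return skip;
}
```

For example, with thresholds of 0.6 and -1.0 (the defaults in the reference implementation), a segment with a no-speech probability of 0.9 and an average log-probability of -1.5 is skipped, while the same segment with an average log-probability of -0.2 is kept.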
Karthick 2f2841bfce
whisper : add single-timestamp logic (#2629)
* Fix hallucinations during silence

When the predicted tokens end with a single timestamp, the entire 30-second segment should be considered done, to avoid hallucinations for the remaining part of the segment.
This behaviour is on par with OpenAI's whisper. Refer to the logic related to `single_timestamp_ending` in https://github.com/openai/whisper/blob/main/whisper/transcribe.py

* Accept review comments related to formatting.

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-12-17 19:07:08 +02:00
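The `single_timestamp_ending` condition mentioned above can be sketched as below. The predicate mirrors the check in the reference Python implementation (the last token is a timestamp and the one before it is not). The helper `seek_delta` and all parameter names are hypothetical simplifications, not the whisper.cpp code.

```cpp
#include <cstddef>
#include <vector>

// True when the last token is a timestamp and the one before it is not.
// Token ids >= timestamp_begin are treated as timestamp tokens.
bool single_timestamp_ending(const std::vector<int> & tokens, int timestamp_begin) {
    const std::size_t n = tokens.size();
    if (n < 2) {
        return false;
    }
    return tokens[n - 2] < timestamp_begin && tokens[n - 1] >= timestamp_begin;
}

// Hypothetical helper: how far to advance after decoding one window.
// With a single trailing timestamp the whole window counts as consumed;
// otherwise decoding resumes at the position of the last timestamp.
int seek_delta(const std::vector<int> & tokens, int timestamp_begin,
               int window_frames, int last_timestamp_frames) {
    return single_timestamp_ending(tokens, timestamp_begin) ? window_frames
                                                            : last_timestamp_frames;
}
```

Advancing by the full window means the remaining, presumably silent, audio in that window is never decoded again, which is where the hallucinated text would otherwise come from.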
Georgi Gerganov 37c88027e1 whisper : use backend registry (#0) 2024-11-20 21:00:08 +02:00
Georgi Gerganov 7fd8d9c220 whisper : adapt to new ggml (wip) 2024-11-20 21:00:08 +02:00
Georgi Gerganov 5089ab2d6a whisper : fix build (#0) 2024-11-15 15:21:04 +02:00
Jhen-Jie Hong 5f8a086e22
whisper.swiftui : add model download list & bench methods (#2546)
* swift : fix resources & exclude build

* whisper : impl whisper_timings struct & api

* whisper.swiftui : model list & bench methods

* whisper : return ptr for whisper_get_timings

* revert unnecessary change

* whisper : avoid designated initializer

* whisper.swiftui: code style changes

* whisper.swiftui : get device name / os from UIDevice

* whisper.swiftui : fix UIDevice usage

* whisper.swiftui : add memcpy and ggml_mul_mat (commented)
2024-11-13 21:51:34 +02:00
thewh1teagle 5ccca19f0c
ggml : vulkan logs (#2547) 2024-11-13 21:47:15 +02:00
Vin Misra 31aea563a8
whisper : fix extra memory usage (#2534)
* passing samples_padded by ref to the threads.

---------

Co-authored-by: Vinith Misra <physicsdemon@gmail.com>
2024-11-06 23:02:11 +02:00
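Why passing a buffer by reference to worker threads saves memory can be shown with a small sketch. `std::thread` copies its arguments by default, so a large vector handed to several threads is duplicated once per thread unless it is wrapped in `std::cref`. The functions `sum_range` and `parallel_sum` are hypothetical and unrelated to the mel spectrogram code; only the argument-passing technique is the point.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Worker: sum samples[begin, end) into out.
void sum_range(const std::vector<float> & samples, std::size_t begin,
               std::size_t end, double & out) {
    double acc = 0.0;
    for (std::size_t i = begin; i < end; ++i) {
        acc += samples[i];
    }
    out = acc;
}

// Split the buffer across n_threads workers (n_threads must be >= 1).
// std::cref makes every worker read the caller's buffer instead of a copy.
double parallel_sum(const std::vector<float> & samples, int n_threads) {
    std::vector<double> partial(n_threads, 0.0);
    std::vector<std::thread> workers;
    const std::size_t chunk = (samples.size() + n_threads - 1) / n_threads;
    for (int t = 0; t < n_threads; ++t) {
        const std::size_t begin = std::min(samples.size(), t * chunk);
        const std::size_t end   = std::min(samples.size(), begin + chunk);
        workers.emplace_back(sum_range, std::cref(samples), begin, end,
                             std::ref(partial[t]));
    }
    for (auto & worker : workers) {
        worker.join();
    }
    double total = 0.0;
    for (double p : partial) {
        total += p;
    }
    return total;
}
```

Because the workers only read the shared buffer and each writes to its own slot in `partial`, no locking is needed in this sketch.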
Georgi Gerganov 0377596b77 whisper : backend registry init before model load 2024-11-01 10:19:05 +02:00
Georgi Gerganov aa037a60f3
ggml : alloc ggml_contexts on the heap (#2525)
* whisper : reduce ggml_context usage

* ggml : allocate contexts on the heap (v2)

* ggml : aligned malloc -> malloc
2024-10-31 22:00:09 +02:00
Georgi Gerganov 3f020fac9d
whisper : minor compile warning 2024-10-29 19:30:26 +02:00
jettoblack 1626b73b03
whisper : move new-segment callback after DTW step (#2515) 2024-10-29 08:47:21 +02:00
Josscii 0fbaac9c89
whisper : fix index overflow in token-level timestamp logic (#2505) 2024-10-23 15:14:03 +03:00
Rotem Dan b6049060dd
whisper : add dtw preset for large-v3-turbo (#2481) 2024-10-15 21:00:21 +03:00
Sandro Hanea fdbfb460ed
whisper : add OpenVINO init with state (#2464)
* Fixed OpenVino init on state

* Removed an empty line

* Fixed typo

* Replaced tabs with spaces

---------

Co-authored-by: Sandro Hanea <sandrohanea@users.noreply.github.com>
2024-10-08 20:08:00 +03:00
Georgi Gerganov 847f94fdeb whisper : zero-out the KV cache upon clear (#2445) 2024-10-05 15:23:51 +03:00
Georgi Gerganov 396089f3cf whisper : revert mel-related changes (#0)
too much extra logic and complexity for small benefit
2024-10-05 15:23:51 +03:00
Georgi Gerganov 941912467d whisper : adapt to latest ggml (skip) (#0) 2024-10-05 15:23:51 +03:00
Georgi Gerganov f62a546e03
whisper : fix excessive memory usage (#2443)
* whisper : fix KV cache allocation

* whisper : reduce memory overhead from unused input tensors
2024-10-05 12:36:40 +03:00
Georgi Gerganov ccc2547210 talk-llama : sync llama.cpp 2024-10-03 12:22:17 +03:00
Georgi Gerganov fe18c29ab8 talk-llama : sync llama.cpp 2024-09-24 19:45:08 +03:00
Georgi Gerganov 34291099fb ggml : refactoring (llama/#0)
- d6a04f87
- 23e0d70b
2024-09-24 19:45:08 +03:00
Georgi Gerganov 9d754a56cf whisper : update FA call 2024-08-28 13:22:20 +03:00
Georgi Gerganov 6e9596f6de
whisper : fix compile warning for unused params 2024-08-28 11:40:11 +03:00
Mengqing Cao 81c999fe0a
cann : add Ascend NPU support (#2336)
* enable Ascend NPU in src/whisper.cpp
  * sync test-backend-ops with llama.cpp
2024-08-09 15:21:56 +03:00
Georgi Gerganov 4b7de08bfd whisper : fix compile warning (#0) 2024-08-09 09:58:16 +03:00
Daven Sanassy fe36c90971
cmake : fix compile in xcode (#2311) 2024-08-05 09:48:26 +03:00
Georgi Gerganov 6739eb83c3
whisper : handle empty mel (#2324) 2024-07-27 20:35:04 +03:00
Matt Stephenson f68298ce06
whisper : use vulkan as gpu backend when available (#2302)
* ggml: use vulkan as gpu backend when available

Signed-off-by: Matt Stephenson <mstephenson6@users.noreply.github.com>

* whisper: enable using vk as default buffer type

Signed-off-by: Matt Stephenson <mstephenson6@users.noreply.github.com>

---------

Signed-off-by: Matt Stephenson <mstephenson6@users.noreply.github.com>
2024-07-16 10:21:09 +03:00
arizhih 7ae885c1ef
whisper : fix DTW assert (#2299) 2024-07-15 15:50:36 +03:00
Georgi Gerganov d207c68822
cmake : use WHISPER_EXTRA_FLAGS (#2294) 2024-07-09 18:54:18 +03:00
Georgi Gerganov 1c31f9d4a8
cmake : try to fix openvino build (#2281) 2024-07-08 15:36:51 +03:00
Georgi Gerganov dbf9c15e30 talk-llama : sync llama.cpp 2024-07-08 14:53:55 +03:00
Georgi Gerganov dc8cc2dd6f
whisper : disable CUDA mel + fix FFMPEG 2024-06-26 20:11:38 +03:00
Georgi Gerganov e30c679928
whisper : reorganize source code + improve CMake (#2256)
* scripts : update sync [no ci]

* files : reorganize [no ci]

* sync : llama.cpp

* cmake : link math library

* cmake : build normal ggml library

* files : move headers to include

* objc : fix path to ggml-metal.h

* ci : fix WHISPER_CUDA -> GGML_CUDA

* scripts : sync LICENSE [no ci]
2024-06-26 19:34:09 +03:00
Georgi Gerganov 820446e230 fix : remove extra files 2024-06-18 09:39:40 +03:00
slaren de29b193f6 move BLAS to a separate backend (cont) (llama/6210)
ggml-ci
2024-06-18 09:39:40 +03:00