557 Commits

Author SHA1 Message Date
Georgi Gerganov 7d14005717 objc : fix build, tmp remove GPU support, use C++17 2025-03-08 15:13:01 +02:00
Ivy233 ef40950c4a
common : more general m_audio_len update logic (#2855)
Co-authored-by: Ivy233 <wangjinrun@uniontech.com>
2025-03-07 10:10:03 +02:00
Dmitry Atamanov 5b481a27a6
common : fix audio loading by miniaudio (#2862) 2025-03-04 19:05:21 +02:00
Lin Xiaodong fc7b1ee521
fix: missing include common-whisper (#2858) 2025-03-02 20:55:11 +02:00
Diego Devesa 339a1cba5d
whisper : support GGML_BACKEND_DL (#2843)
* whisper : support GGML_BACKEND_DL

* fix DTW crash

* whisper.objc : fix build - add ggml-cpp.h

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2025-02-27 13:35:07 +01:00
Georgi Gerganov c64f3e8ada
common : separate whisper sources (#2846)
* common : separate whisper sources

* examples : add chrono

* examples : add more headers
2025-02-27 12:50:32 +02:00
Georgi Gerganov 9f83f67221
common : fix build min/max (#2845)
* common : try to fix build

* cont : try another fix
2025-02-27 10:39:13 +02:00
Dmitry Atamanov 7d3da68f79
examples : use miniaudio for direct decoding flac, mp3, ogg and wav (#2759) 2025-02-27 09:06:54 +02:00
petterreinholdtsen b5d21359c1
stream : stop on ^C when no audio is received (#2822)
Add check for ctrl-c in potentially endless loop while calling audio.get()
to receive sound.

Co-authored-by: Petter Reinholdtsen <pere@debian.org>
2025-02-27 08:59:51 +02:00
masahji dfc6ca62f3
stream : add beam size parameter (#2836)
* feat: Add beam size parameter to stream.cpp for beam search configuration

* feat: Add beam size parameter to whisper full params in stream example

* fix: Remove duplicate beam search size assignment in server.cpp
2025-02-25 11:39:33 +02:00
Judd d682e15090
Fixes for Windows (#2790)
Fixes for Windows:

* MSVC defaults to UTF-8 without BOM.
* Console output code page changed to UTF-8.

---------

Co-authored-by: Judd <foldl@boxvest.com>
2025-02-06 15:37:21 +08:00
billyct cadfc50eab
node : add max_len params in node addon (#2760) 2025-02-03 22:49:06 +02:00
Georgi Gerganov 3f91832352
talk-llama : sync llama.cpp 2025-02-03 22:42:26 +02:00
Corey Earwood 7a423f1c00
whisper.objc : fix build and CI 2025-01-18 12:06:06 +02:00
Georgi Gerganov 99b011a9f5 talk-llama : sync llama.cpp 2025-01-14 10:38:01 +02:00
Georgi Gerganov e940fbf283
server : fix build (#2718) 2025-01-13 08:57:33 +02:00
Georgi Gerganov 35d0e02c72
talk-llama : sync llama.cpp (#2709) 2025-01-13 08:55:48 +02:00
NETZkultur GmbH 45d3faf961
server : generate unique tmp filenames (#2718)
# Summary

This merge request adds a mechanism to generate unique filenames for FFmpeg conversions in whisper_server.cpp. Previously, a single fixed filename was used (e.g., whisper-server-tmp.wav), which could result in unexpected file overwrites under certain circumstances. By generating a unique filename per request, any risk of overwriting temporary files is eliminated.

# Background / Motivation

* Problem: Relying on a static filename for temporary audio files may lead to overwrites if multiple operations occur simultaneously or if the same file name is reused.
* Goal: Dynamically generate unique filenames, ensuring each request or operation uses an isolated temporary file.
2025-01-13 08:55:21 +02:00
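
The per-request naming idea described in the entry above can be sketched as follows. This is a minimal illustration, not the code from the commit: the helper name `make_unique_tmp_name` and the timestamp-plus-counter scheme are assumptions chosen for the example.

```cpp
#include <atomic>
#include <chrono>
#include <string>

// Hypothetical helper (not taken from the commit): builds a temporary file name
// that differs on every call by combining a clock reading with a process-wide counter.
std::string make_unique_tmp_name(const std::string & prefix, const std::string & ext) {
    static std::atomic<unsigned long long> counter{0};

    const auto now = std::chrono::steady_clock::now().time_since_epoch();
    const auto ns  = std::chrono::duration_cast<std::chrono::nanoseconds>(now).count();

    // The counter keeps names distinct within one process even when two
    // requests observe the same clock value.
    return prefix + "-" + std::to_string(ns) + "-" + std::to_string(counter.fetch_add(1)) + ext;
}
```

Calling `make_unique_tmp_name("whisper-server-tmp", ".wav")` twice yields two different names. The scheme only separates requests inside a single process; names produced by several server processes sharing one directory could still collide, which an OS-generated temp name would avoid.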
Yusuf Redžić ece3ff88f6
cli : fix segfault on missing argument (#2700) 2025-01-04 10:47:41 +02:00
Alter c81b8b910b
objc : rename ggml-cpu-aarch64.c to .cpp (#2687) 2025-01-02 12:05:09 +02:00
Georgi Gerganov 5136fd92c2
examples : handle "main.exe" deprecation 2024-12-30 13:00:18 +02:00
Andreas Lubbe 7d55637f0b
cli : add --suppress_nst support (#2664) 2024-12-24 09:30:07 +02:00
Andreas Lubbe 0994506054
cli : add no_speech_thold (#2663) 2024-12-24 09:29:19 +02:00
Georgi Gerganov ed09075ca0
server : fix help print 2024-12-22 15:32:05 +02:00
Sacha Arbonel 4183517076
server : add no-speech threshold parameter and functionality (#2654) 2024-12-21 17:00:08 +02:00
Georgi Gerganov f4668169a0
whisper : rename suppress_non_speech_tokens to suppress_nst (#2653) 2024-12-21 12:54:35 +02:00
Sacha Arbonel 944ce49439
server : add option to suppress non-speech tokens (#2649)
* The parameter will suppress non-speech tokens like [LAUGH], [SIGH], etc. from the output when enabled.

* add to whisper_params_parse

* add missing param
2024-12-21 12:05:05 +02:00
Georgi Gerganov 2e59dced12
whisper : rename binaries + fix install (#2648)
* whisper : rename binaries + fix install

* cont : try to fix ci

* cont : fix emscripten builds
2024-12-21 09:43:49 +02:00
Georgi Gerganov ba6c2a8fd9 android : try to fix build 2024-12-18 12:52:16 +02:00
Georgi Gerganov 6576af00d7 files : remove old sources 2024-12-18 12:52:16 +02:00
Georgi Gerganov 61edb117a0 talk-llama : sync llama.cpp 2024-12-18 12:52:16 +02:00
Georgi Gerganov 60dc6d003f common : remove old types
ggml-ci
2024-12-18 12:52:16 +02:00
crummyh d34445e960
stream : improve consistency in README (#2642) 2024-12-18 08:43:48 +02:00
Georgi Gerganov 199579652e
common : add cstdio header 2024-12-16 08:57:04 +02:00
Georgi Gerganov d17e7139d8
stream : update build instructions 2024-12-15 21:55:36 +02:00
Thamster 6a52eaea74
android : fix build and ci (#2624)
* Adding missing CMakeLists.txt include for ggml-cpu needed by whisper.android

* attempt to re-enable CI for JNI android

---------

Co-authored-by: Your Name <you@example.com>
2024-12-14 17:25:53 +02:00
Georgi Gerganov 472464453d ci : disable CUDA and Android builds 2024-12-08 20:14:35 +02:00
Georgi Gerganov 11dddfbc9e ci : disable Obj-C build + fixes 2024-12-08 20:14:35 +02:00
Georgi Gerganov f2c680f893 talk-llama : sync llama.cpp 2024-12-08 20:14:35 +02:00
Georgi Gerganov 02c6fcbc2c common : fix compile warning
ggml-ci
2024-12-08 20:14:35 +02:00
Georgi Gerganov 7fd8d9c220 whisper : adapt to new ggml (wip) 2024-11-20 21:00:08 +02:00
Georgi Gerganov 06e059b8f8 talk-llama : sync llama.cpp 2024-11-20 21:00:08 +02:00
Stefan Sydow d24f981fb2
sycl: fix example build (#2570) 2024-11-18 14:57:23 +02:00
Jhen-Jie Hong c4e95fb74d
whisper.swiftui : switch Mac dest to Mac (Designed for iPad) (#2562) 2024-11-15 15:21:53 +02:00
Georgi Gerganov 6477b84eb6 build : fixes 2024-11-15 15:21:04 +02:00
Georgi Gerganov 24d706774d talk-llama : sync llama.cpp 2024-11-15 15:21:04 +02:00
Jhen-Jie Hong 5f8a086e22
whisper.swiftui : add model download list & bench methods (#2546)
* swift : fix resources & exclude build

* whisper : impl whisper_timings struct & api

* whisper.swiftui : model list & bench methods

* whisper : return ptr for whisper_get_timings

* revert unnecessary change

* whisper : avoid designated initializer

* whisper.swiftui: code style changes

* whisper.swiftui : get device name / os from UIDevice

* whisper.swiftui : fix UIDevice usage

* whisper.swiftui : add memcpy and ggml_mul_mat (commented)
2024-11-13 21:51:34 +02:00
Stefan Sydow 300c07b94d
examples : fix ffmpeg v5 build (#2543)
remove call to 'av_register_all()' which does not exist in ffmpeg v5
anymore.
2024-11-13 21:41:52 +02:00
Georgi Gerganov c65d0fd3c8 talk-llama : sync llama.cpp 2024-11-01 10:19:05 +02:00
Rotem Dan b6049060dd
whisper : add dtw preset for large-v3-turbo (#2481) 2024-10-15 21:00:21 +03:00
Georgi Gerganov 6e40108a59 objc : fix build 2024-10-05 15:23:51 +03:00
Georgi Gerganov 941912467d whisper : adapt to latest ggml (skip) (#0) 2024-10-05 15:23:51 +03:00
Rahul Vadhyar 2944cb72d9
examples : update dr_wav.h to newer version (#2449) 2024-10-04 11:04:51 +03:00
Georgi Gerganov ccc2547210 talk-llama : sync llama.cpp 2024-10-03 12:22:17 +03:00
gilbertgong ede1718f6d
server : ffmpeg overwrite leftover temp file (#2431)
* Remove possible leftover ffmpeg temp file from a previous failed conversion

* Revert "Remove possible leftover ffmpeg temp file from a previous failed conversion"

This reverts commit 00797403bd.

* Flag to force ffmpeg to overwrite output file if it exists
2024-10-02 15:06:40 +03:00
Georgi Gerganov 2ef717b293
whisper : add large-v3-turbo (#2440) 2024-10-01 15:57:06 +03:00
Georgi Gerganov 451e9ee92c make : remove "talk" target until updated 2024-09-24 19:45:08 +03:00
Georgi Gerganov fe18c29ab8 talk-llama : sync llama.cpp 2024-09-24 19:45:08 +03:00
Georgi Gerganov 54e5095765 examples : adapt to ggml.h changes (ggml/0)
ggml-ci
2024-09-24 19:45:08 +03:00
Toliver 5b1ce40fa8
server : use OS-generated temp file name for converted files (#2419) 2024-09-17 15:56:32 +03:00
UsernamesLame 9600fc3eb1
readme : remove invalid flag from Python example (#2396)
* Update README.md

Fix broken C-style API link

* Update whisper_processor.py

Update examples/python/whisper_processor.py to remove nonexistent flag "-np" from subprocess.Popen call.

* Add pywhispercpp to the Pybind11 Python wrapper list

abdeladim-s/pywhispercpp wasn't added to the list / was removed at some point (?)

It was referenced in issue #9, so I feel like it's worthy of being added as it's the first if not one of the first Python wrappers for whisper.cpp
2024-08-30 14:00:38 +03:00
Georgi Gerganov da9809f243 talk-llama : sync llama.cpp 2024-08-28 13:22:20 +03:00
Justine Tunney 7f78675008
examples : use colorblind friendly TTY color scheme (#2360)
This change updates the -pc flag, so that a new xterm256 color scheme is
used. This color scheme is believed to be better for three reasons:

1. It should be friendlier to the colorblind. The scheme was designed by
   Paul Tol (see: https://personal.sron.nl/~pault/). TensorBoard has used
   it since 2017, so it's already popular in the machine learning community

2. It should appear to be the same colors as before to people who aren't
   colorblind, i.e. it's still a red-green spectrum like before but lightly
   modified

3. It is readable in both white and black background terminals. The neon
   colors before were probably a bit too intense for white backgrounds.
2024-08-20 10:49:10 +03:00
Georgi Gerganov 58323bf8ed build : fix aarch64 (#0) 2024-08-08 22:48:46 +03:00
Georgi Gerganov 22058f2dbc talk-llama : sync llama.cpp 2024-08-08 22:48:46 +03:00
Georgi Gerganov c7ea4fd235 common : handle new quant types (ggml/0) 2024-08-08 22:48:46 +03:00
Georgi Gerganov dbf9c15e30 talk-llama : sync llama.cpp 2024-07-08 14:53:55 +03:00
Georgi Gerganov d3f6c34976 examples : fix compile warnings [no ci] (#0) 2024-07-08 14:53:55 +03:00
Emmanuel Schmidbauer bec9836849
server : add inference path to make OAI API compatible (#2270) 2024-07-08 14:24:58 +03:00
Georgi Gerganov 4a62efbb95
cmake : minor fixes 2024-06-26 21:42:39 +03:00
Georgi Gerganov dc8cc2dd6f
whisper : disable CUDA mel + fix FFMPEG 2024-06-26 20:11:38 +03:00
Georgi Gerganov e30c679928
whisper : reorganize source code + improve CMake (#2256)
* scripts : update sync [no ci]

* files : reorganize [no ci]

* sync : llama.cpp

* cmake : link math library

* cmake : build normal ggml library

* files : move headers to include

* objc : fix path to ggml-metal.h

* ci : fix WHISPER_CUDA -> GGML_CUDA

* scripts : sync LICENSE [no ci]
2024-06-26 19:34:09 +03:00
Georgi Gerganov e293f17d34
talk-llama : sync llama.cpp 2024-06-18 09:45:37 +03:00
slaren de29b193f6 move BLAS to a separate backend (cont) (llama/6210)
ggml-ci
2024-06-18 09:39:40 +03:00
Georgi Gerganov 3b1ac03828 ggml : remove OpenCL (#0) 2024-06-16 18:19:48 +03:00
Georgi Gerganov 061eeb9f61 talk-llama : sync llama.cpp 2024-06-16 18:19:48 +03:00
Borislav Stanimirov af5833e298
whisper : remove `speed_up` and `phase_vocoder*` functions (#2198)
* whisper : fix cast warning

* whisper : remove phase_vocoder functions, ref #2195

* whisper : remove speed_up from whisper_full_params, closes #2195
2024-05-31 11:37:29 +03:00
Daniel Valdivia a7dc2aab16
server : fix typo (#2181)
A simple comment typo, PR can be dismissed
2024-05-25 10:46:22 +03:00
William Tambellini 1b51fdf170
examples : add support for decoding input with ffmpeg (Linux) (#2133)
- search for ffmpeg libs/headers at cmake time
- added ffmpeg-transcode.cpp into libcommon if ffmpeg on
- hooked ffmpeg transcoding in common read_wav(...)
- passed test:
./main -m ggml-base.en.bin -f samples/jfk.mp3
2024-05-21 18:31:41 +03:00
Pedro Probst adee3f9c1f
node : add flash_attn param (#2170) 2024-05-20 09:08:48 +03:00
Georgi Gerganov 7094ea5e75
whisper : use flash attention (#2152)
* whisper : use flash attention in the encoder

* whisper : add kv_pad

* whisper : remove extra backend instance (huh?)

* whisper : use FA for cross-attention

* whisper : use FA for self-attention

* whisper : simplify encoder FA

* whisper : add flash_attn runtime parameter

* scripts : add bench log

* scripts : add M1 Pro bench log
2024-05-15 09:38:19 +03:00
petterreinholdtsen 9d5771ae43
talk-llama : reject runs without required arguments (#2153)
* Extended talk-llama example to reject runs without required arguments.

Print warning and exit if models are not specified on the command line.

* Update examples/talk-llama/talk-llama.cpp

* Update examples/talk-llama/talk-llama.cpp

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-05-14 21:32:41 +03:00
Georgi Gerganov 4ef8d9f44e
server : return utf-8 (#2138) 2024-05-13 15:33:46 +03:00
Pedro Probst 3928dbd206
node : add audio_ctx and audio buffer params (#2123)
* node : add audio_ctx param

* node : support passing audio buffer directly

* node : parse audio_ctx in index.js

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-05-13 15:22:23 +03:00
valVk 30f73109b8
node : add additional params (#2000)
* Add additional params to addon.node

* Add comma_in_time as parameter

* Fix tests
2024-05-13 15:15:43 +03:00
Mark Karpelès 17fa62d3d3
js : remove un-needed request header from fetchRemote (#2119) 2024-05-13 15:13:19 +03:00
Daniel Ziegenberg 0bb05b113d
main : don't print timings with --no-prints (#2108)
Signed-off-by: Daniel Ziegenberg <daniel@ziegenberg.at>
2024-05-13 15:00:19 +03:00
Daniel Ziegenberg f141b2b938
main : add options for temperature control (#2088)
Add two options:

```
-tp,       --temperature N     [0.00   ] The sampling temperature, between 0 and 1
-tpi,      --temperature-inc N [0.20   ] The increment of temperature, between 0 and 1
```

The sampling temperature, between 0 and 1. Higher values like 0.8 will
make the output more random, while lower values like 0.2 will make it
more focused and deterministic. If set to 0, the model will use log
probability to automatically increase the temperature until certain
thresholds are hit.

Signed-off-by: Daniel Ziegenberg <daniel@ziegenberg.at>
2024-05-13 14:59:44 +03:00
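
One plausible reading of how the two options in the entry above interact is a list of fallback temperatures: start at `--temperature` and step by `--temperature-inc` until 1.0 is reached. The sketch below illustrates that reading only; the function name `temperature_schedule` and the exact stopping rule are assumptions, not the code from the commit.

```cpp
#include <cassert>
#include <vector>

// Hypothetical sketch: the temperatures a decoder could try in order, from
// `temperature` upward in steps of `temperature_inc`, never exceeding 1.0.
// A non-positive increment is treated here as "no fallback" (an assumption).
std::vector<double> temperature_schedule(double temperature, double temperature_inc) {
    assert(temperature >= 0.0 && temperature <= 1.0); // range stated in the help text

    std::vector<double> temps{temperature};
    if (temperature_inc <= 0.0) {
        return temps;
    }

    const double eps = 1e-6; // tolerance for floating-point rounding
    for (int i = 1; temperature + i * temperature_inc < 1.0 + eps; ++i) {
        temps.push_back(temperature + i * temperature_inc);
    }
    return temps;
}
```

With the defaults shown in the help text (0.00 and 0.20) this produces six candidates, the last one at approximately 1.0. Computing each value as `temperature + i * temperature_inc` rather than accumulating avoids drift from repeated floating-point addition.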
zhangjixiong e93081f83f
whisper.android : update example, add field to print timestamp (#2072) 2024-05-13 14:30:03 +03:00
Xingchen Song(宋星辰) b6bbce4ae9
cmake : fix json INTERFACE library (#2069) 2024-05-13 14:29:39 +03:00
mashizora 7705dc52da
main : fix double quote escaping in csv output (#2090) 2024-05-13 11:55:32 +03:00
Georgi Gerganov 3fa7d29876 talk-llama : sync llama.cpp 2024-05-13 11:02:26 +03:00
Georgi Gerganov accada542a ggml : resolve merge (ggml/0)
ggml-ci
2024-05-13 11:02:26 +03:00
Pedro Probst 58210d6a76
examples : fix node compilation (#2115)
* node : fix compilation and update examples

* node : fix readme

* Update addon.node test
2024-05-02 22:52:55 +01:00
Georgi Gerganov b0c3cbf2e8
main : pass nullptr when regex is empty (#2070) 2024-04-17 12:23:47 +03:00
Emmanuel Schmidbauer 9fab28135c
server : add dtw (#2044)
* server.cpp: add dtw

* Update examples/server/server.cpp

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-04-15 22:16:58 +03:00
Pedro Probst 1b5439a6c2
node : support no timestamps (#2048)
* fix: node: do not compute timestamps if you do not need them

* feat: add no_timestamps parameter to node addon
2024-04-15 20:03:34 +03:00
Kendrick Taylor 5c554c04ff
whisper.nvim : fix missing reference to "model" variable (#2049) 2024-04-15 19:41:28 +03:00
Ikko Eltociear Ashimine c383f091a1
whisper : update grammar-parser.cpp (#2058)
preceeding -> preceding
2024-04-15 19:40:27 +03:00
ulatekh c15b4cda7d
common : fix file-handle leak in read_wav() (#2026)
Now it cleans up in case of error.
2024-04-09 18:34:34 +03:00
Rotem Dan d3cfb6ca2b
main : set stdin to binary mode on Windows (#2025) 2024-04-09 18:33:32 +03:00
ulatekh 671b4bde6c
main : allow a response-file as the sole parameter (#2019)
* The "main" example now allows a response-file as the sole parameter.

A response-file is a text file with command-line parameters, one per line.
Prefix the name of the response-file with "@" to identify it as such.
It's used under MS Windows to work around command-line length limits.
It may be useful under other platforms to simplify character-escaping.

* minor : style

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-04-09 18:31:16 +03:00
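
The response-file convention described in the entry above ("@" prefix, one parameter per line) can be sketched like this. The function names are hypothetical, and skipping empty lines and trimming a trailing carriage return are assumptions made for the example rather than documented behavior.

```cpp
#include <cassert>
#include <fstream>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// True when the argument names a response-file, i.e. "@" followed by a path.
bool is_response_file_arg(const std::string & arg) {
    return arg.size() > 1 && arg[0] == '@';
}

// Reads one command-line parameter per line.
// Assumption: empty lines are skipped and a trailing '\r' (CRLF files) is dropped.
std::vector<std::string> read_response_lines(std::istream & in) {
    std::vector<std::string> args;
    std::string line;
    while (std::getline(in, line)) {
        if (!line.empty() && line.back() == '\r') {
            line.pop_back();
        }
        if (!line.empty()) {
            args.push_back(line);
        }
    }
    return args;
}

// Expands "@path" into the parameters stored in that file.
// Returns false when the file cannot be opened.
bool expand_response_file(const std::string & arg, std::vector<std::string> & out) {
    assert(is_response_file_arg(arg)); // caller is expected to check the prefix first
    std::ifstream fin(arg.substr(1));
    if (!fin) {
        return false;
    }
    out = read_response_lines(fin);
    return true;
}
```

Because each line is taken verbatim as one parameter, values containing spaces or quotes need no shell escaping, which matches the motivation given in the commit message.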
ulatekh c8eeb93a6a
whisper : suppress tokens with a regex (#1997)
* Allow a regular expression to describe tokens to suppress.

Example: --suppress-tokens-re "[,\.]|[ ]?[0-9]+" will suppress commas, periods, and numeric tokens.

Technique inspired by https://github.com/openai/whisper/discussions/1041

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* Blind change to fix Java test.

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-04-09 18:27:28 +03:00
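
The regex-based suppression described in the entry above can be illustrated with a small sketch. It is not the implementation from the commit: the function name, the plain `vocab`/`logits` vectors, and the choice of whole-token matching via `std::regex_match` are assumptions for the example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical sketch: set the logit of every token whose text fully matches
// `pattern` to -infinity so that it can never be sampled.
// vocab[i] is the text of token i and logits[i] is its score.
void suppress_tokens_by_regex(const std::vector<std::string> & vocab,
                              std::vector<float> & logits,
                              const std::string & pattern) {
    assert(vocab.size() == logits.size());

    const std::regex re(pattern);
    for (std::size_t i = 0; i < vocab.size(); ++i) {
        if (std::regex_match(vocab[i], re)) {
            logits[i] = -INFINITY;
        }
    }
}
```

With the pattern from the commit message, `[,\.]|[ ]?[0-9]+`, tokens such as `,`, `.`, ` 42` and `7` are suppressed, while ` hello` is left untouched. An empty pattern should be handled by the caller (for example by skipping the call), since an empty regex matches only the empty string.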
ulatekh 319fe5146e
cmake : create solution folders (#2004)
* Create solution folders in the CMake build.

* Fixed non-SDL2 build.

* Fixed emscripten build.
2024-04-09 18:23:33 +03:00
Georgi Gerganov 81a3c41aa0
talk-llama : sync llama.cpp 2024-04-07 16:21:08 +03:00
ulatekh fc366b807a
main : add command-style grammar (#1998)
* Implemented command-style grammar in the main example.

Mostly just copied the relevant parts from the command example.

* main : code style

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-03-28 12:02:10 +02:00
Georgi Gerganov 9fb308d90f
make : add grammar parser to common objects 2024-03-28 11:59:48 +02:00
Georgi Gerganov 2948c740a2
sync : ggml (#2001)
* sync : update scripts

* sync : ggml

* talk-llama : sync llama.cpp

* make : WHISPER_CUBLAS -> WHISPER_CUDA

* ci : try to fix sycl build

* talk-llama : fix make build
2024-03-27 18:55:10 +02:00
Georgi Gerganov 1558ec5a16
whisper : improve handling of prompts (#1981)
* whisper : improve handling of prompts

* whisper : add whisper_token_count helper
2024-03-25 14:48:19 +02:00
Mohammadreza Hendiani 04e48094e4
readme : add Fedora dependencies (#1970)
* README.md

fix documentation and add Fedora Linux dependencies for stream build

* fix documentation and add Fedora Linux dependencies for command build

* fix documentation and add Fedora Linux dependencies for talk build

* fix documentation and add Fedora Linux dependencies for talk-llama build

* restore mistakenly removed macOS documentation
2024-03-20 18:42:11 +02:00
denersc 741abb162c
whisper : token-level timestamps with DTW (#1485)
* whisper.cpp: impl dtw algo

* WIP: producing and placing DTW timestamps on tokens

* Fix compile and assertion errors. Attempt to DTW timestamp with single_segment=false.

* Fix mistake causing incorrect alignment of dtw timestamps

* implement N_TOP_MOST and CUSTOM alignment heads setting

* whisper: fix typo on alignment heads enum

* Fix issues related to changes in whisper.cpp

* Fixed excessive memory use when using DTW timestamps. Other minor fixes to DTW timestamping function

* decoder: save cross QKs only if requested

* Calling median filter with ggml_map_custom1

* Reimpl aheads n_top_most and custom. Sanity checks on chosen aheads

* Copying cross QKs from decoder backend correctly

* dtw: cleanup

* Fix incorrect n_frames passed to dtw when near end of audio

* Fix aheads_masks_init for backend != CPU

* whisper : minor style

* main : add dtw (wip)

* whisper: fix invalid memory access in aheads_masks_init

* main : add dtw (cont)

* whisper : minor

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-03-20 18:25:26 +02:00
Jo Liss e7794a868f
examples : rename --audio-context to --audio-ctx per help text (#1953) 2024-03-18 17:53:33 +02:00
Georgi Gerganov de4d067f1e
talk-llama : sync llama.cpp 2024-03-15 14:21:59 +02:00
slaren f60ccfd83b
update examples and tests 2024-03-15 14:01:14 +02:00
Georgi Gerganov 2f5a5a66dd
talk-llama : use llama_decode instead of llama_eval 2024-03-08 12:04:43 +02:00
Georgi Gerganov 8e409d1113
talk-llama : sync llama.cpp 2024-03-08 11:55:50 +02:00
Georgi Gerganov 05d1b61af4
talk-llama : sync llama.cpp 2024-03-08 11:52:47 +02:00
F1L1P 2e2626b167
examples : Auto lowercase language parameter in main.cpp (#1928)
* Auto lowercase language parameter

* Update examples/main/main.cpp

Co-authored-by: bobqianic <129547291+bobqianic@users.noreply.github.com>

---------

Co-authored-by: bobqianic <129547291+bobqianic@users.noreply.github.com>
2024-03-06 22:25:10 +00:00
zhouwg c0c0ae2dea
examples : fix typo in bench.cpp (#1933) 2024-03-06 22:21:44 +00:00
zhouwg f22d27a385
whisper.android.java : fix returns in JNI (#1929) 2024-03-05 15:59:26 +02:00
Georgi Gerganov 25d313b38b
talk-llama : sync llama.cpp 2024-02-28 13:04:05 +02:00
Georgi Gerganov 1711bb3881
sync : llama.cpp (ggml/0) 2024-02-28 13:00:30 +02:00
Andrew S 0d8fd8483a
stream.wasm : fix invalid memory access when no segments (#1902)
No segments may be returned when a smaller sample buffer (e.g. 2048 samples) is sent to the worker.
2024-02-26 10:12:35 +02:00
Georgi Gerganov 3170841ed9
talk-llama : sync llama.cpp 2024-02-25 20:00:10 +02:00
Georgi Gerganov 578e47e70c
sync : llama.cpp (ggml/0) 2024-02-25 19:58:46 +02:00
Tamotsu Takahashi f18738f247
talk, talk-llama : pass text_to_speak as a file (#1865)
* talk-llama: pass file instead of arg

it is too hard to quote text in a portable way

* talk-llama: pass heard_ok as a file

* talk-llama: let eleven-labs.py accept options

Options: -v voice, -s savefile, -p (--play)

* talk-llama: check installed commands in "speak"

Pass "-q" to eleven-labs.py to skip checking whether elevenlabs is installed

* talk-llama: pass voice_id again

in order to sync talk with talk-llama

* talk: sync with talk-llama

Passing text_to_speak as a file is safer and more portable
cf. https://stackoverflow.com/a/59036879/45375

* talk and talk-llama: get all installed voices in speak.ps1

* talk and talk-llama: get voices from api

* talk and talk-llama: add more options to eleven-labs.py

and remove DEFAULT_VOICE because it is deprecated (https://www.reddit.com/r/ElevenLabs/comments/1830abt/what_happened_to_bella/)

```
usage: eleven-labs.py [-q] [-l] [-h] [-n NAME | -v NUMBER] [-f KEY=VAL] [-s FILE | -p] [TEXTFILE]

options:
  -q, --quick           skip checking the required library

action:
  TEXTFILE              read the text file (default: stdin)
  -l, --list            show the list of voices and exit
  -h, --help            show this help and exit

voice selection:
  -n NAME, --name NAME  get a voice object by name (default: Arnold)
  -v NUMBER, --voice NUMBER
                        get a voice object by number (see --list)
  -f KEY=VAL, --filter KEY=VAL
                        filter voices by labels (default: "use case=narration")
                        this option can be used multiple times
                        filtering will be disabled if the first -f has no "=" (e.g. -f "any")

output:
  -s FILE, --save FILE  save the TTS to a file (default: audio.mp3)
  -p, --play            play the TTS with ffplay
```

* examples: add speak_with_file()

as suggested in the review

* talk and talk-llama: ignore to_speak.txt
2024-02-24 09:24:47 +02:00
Abhilash Majumder a0ddd8392c
whisper : add SYCL support (#1863)
* add changes from llama upstream

* add sycl abstraction

* add sycl build

* update cmake

* add sycl build config

* fix bug

* fix bug

* refactor build

* fix bug

* update build

* call build

* use sycl header

* add examples

* add target

* fix typecast in quant.c

* readd fp16 and readme

* fix quant typecast

* add sample

* add readme

* remove cxx file check
2024-02-23 09:22:24 +02:00
Georgi Gerganov a2506909b1
talk-llama : sync llama.cpp 2024-02-22 23:30:53 +02:00
Georgi Gerganov 5fdb27ff80
ggml : 32-bit arm compat (#1891)
* ggml : 32-bit arm compat

* ggml : add ggml_vqtbl1q_s8 impl

* ggml : cont
2024-02-22 18:31:40 +02:00
Georgi Gerganov ce411498f6
sync : llama.cpp (ggml/0)
ggml-ci
2024-02-22 15:12:36 +02:00
Davidson Francis c56344b509
main : fix file existence check in main.cpp (#1889)
In commit dda4b0e of PR #1872, I've introduced a check for the
existence of files before loading the model. However, I haven't
considered the case where whisper.cpp might read from stdin as well,
and in such cases, the checks should ignore the "-" argument as it
does not represent a regular file.

Additionally, this commit removes the usage of 'stat()' in favor of
the recently introduced function 'is_file_exist()' in common.cpp from
PR #1871.

Apologies for the bug introduced in the previous PR and any
inconvenience it may have caused.
2024-02-22 15:01:08 +02:00
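
The check described in the entry above (verify input files before loading the model, but skip "-" because it stands for stdin) can be sketched as follows. The names `file_exists` and `find_missing_inputs` are hypothetical; `file_exists` is only a stand-in for the `is_file_exist()` helper the commit message mentions, whose actual implementation is not shown here.

```cpp
#include <fstream>
#include <string>
#include <vector>

// Hypothetical stand-in for the is_file_exist() helper named in the commit message:
// a file counts as existing if it can be opened for reading.
bool file_exists(const std::string & path) {
    std::ifstream f(path);
    return f.good();
}

// Returns the inputs that cannot be found.
// "-" denotes stdin, is not a regular file, and is therefore never checked.
std::vector<std::string> find_missing_inputs(const std::vector<std::string> & inputs) {
    std::vector<std::string> missing;
    for (const auto & path : inputs) {
        if (path == "-") {
            continue;
        }
        if (!file_exists(path)) {
            missing.push_back(path);
        }
    }
    return missing;
}
```

Running this before the model is loaded lets the program report a missing input immediately instead of after a potentially long model load.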
Georgi Gerganov 59119f4f20
talk-llama : sync llama.cpp 2024-02-20 12:09:57 +02:00
Georgi Gerganov 83afebe872
common : add IQ1_S (ggml/0)
ggml-ci
2024-02-19 15:53:25 +02:00
Davidson Francis dda4b0ed06
main : check if input files exist before proceeding (#1872)
Until the most recent commit (3d42463), the main.cpp sample file does
not check whether the input files exist or not. Consequently, the
model is loaded first before reporting whether there was a failure or
not when processing a file. In environments with HDD, this can take
about 50 seconds or more, depending on the loaded model.

This commit addresses this issue by checking in advance whether the
input files exist or not.
2024-02-19 10:51:26 +02:00
Felix 07d04280be
examples : clean up common code (#1871)
move some utility functions into common.h
2024-02-19 10:50:15 +02:00
Georgi Gerganov 551529290d
talk-llama : sync llama.cpp 2024-02-12 10:39:58 +02:00
dscripka a6fb6ab597
examples : added audio_ctx argument to main and server (#1857)
* added audio_ctx argument to main and server examples

* Better default value

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* better default value (again)

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-02-12 09:19:07 +02:00
Georgi Gerganov f273e66dc6
examples : initialize context params properly (#1852) 2024-02-11 16:39:12 +02:00
Georgi Gerganov 02b4c52c12
talk-llama : sync llama.cpp 2024-02-10 10:10:59 +02:00
Valentin Gosu 80e8a2ea39
server : allow CORS request with authorization headers (#1850)
Whisper plugin in Obsidian requires an API key which is
then sent as an authorization header.
However, the presence of an authorization header requires
a CORS Preflight, so both the OPTIONS method and
the Access-Control-Allow-Headers: authorization must be
handled.
2024-02-09 17:42:41 +02:00
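
The preflight handling described in the entry above can be sketched without tying it to any HTTP library. The function below is hypothetical and only computes the response headers; the wildcard origin and the method list are assumptions for the example, while the `Access-Control-Allow-Headers: authorization` value comes from the commit message.

```cpp
#include <map>
#include <string>

// Hypothetical sketch: CORS headers a server could attach to a response.
// For an OPTIONS preflight it additionally announces that the "authorization"
// request header is accepted, so the browser sends the real request afterwards.
std::map<std::string, std::string> cors_headers_for(const std::string & method) {
    std::map<std::string, std::string> headers;
    headers["Access-Control-Allow-Origin"] = "*"; // assumed: any origin is allowed

    if (method == "OPTIONS") {
        headers["Access-Control-Allow-Methods"] = "POST, OPTIONS"; // assumed method list
        headers["Access-Control-Allow-Headers"] = "authorization";
    }
    return headers;
}
```

The header has to be named explicitly because a wildcard in `Access-Control-Allow-Headers` does not cover `Authorization`.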
Neuman Vong 19f8048139
whisper.android : how to build with CLBlast (#1809)
* FetchContent

* OpenCL

* Documentation and make optional

* Specify GGML build options in build.gradle

* Use gradle properties

* @ggerganov

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* @gpokat

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-02-09 17:39:05 +02:00
Georgi Gerganov 434b8f3b96
talk-llama : stream response (#1121) 2024-02-06 19:56:12 +02:00
Georgi Gerganov 7a74e929c8
sync : ggml (#0) 2024-01-30 21:30:26 +02:00
JacobLinCool ae5c4f7340
common : fix wav buffer detection (#1819) 2024-01-30 19:35:08 +02:00
JacobLinCool baa30bacdb
server : add fields to `verbose_json` response (#1802)
* server: include additional fields in the verbose_json response as OpenAI does

* server: show request examples on home page

* server: todo note for compression_ratio and no_speech_prob

* server: add simple demo form to the homepage
2024-01-30 14:15:55 +02:00
Georgi Gerganov e72e4158de
talk-llama : sync llama.cpp 2024-01-28 19:44:10 +02:00
Georgi Gerganov 52cce82493
common : fix input buffer check (#1812) 2024-01-27 17:33:09 +02:00
Georgi Gerganov ef3c9ed9eb
talk-llama : sync llama.cpp 2024-01-27 17:24:53 +02:00
Michael Rienstra 4bbb60efce
docs : make model options / model install methods clearer (#1806)
* Make models more "discoverable"

* Clean up code block language identifiers

* make 3 options clearer

* undo Prettier formatter change

* docs: `$` shell prompt, consistently

* docs: minor changes
2024-01-26 17:39:54 +02:00
Neuman Vong d6b9be21d7
whisper.android : return output from benchmarks (#1785)
Benchmarks are failing because JNI expects a jstring and the benchmarks
are missing a return statement (i.e., returning null). The functions
actually build a jstring but don't return it, so this seems to have been
an oversight.

This patch returns the jstring and now the benchmarks run successfully.

Fixes #1783.
2024-01-19 16:17:38 +02:00