594 Commits

Author SHA1 Message Date
petterreinholdtsen 47b9eb37a3
examples : fix memory leak in read_audio_data (#3810)
This commit addresses a memory leak in the `read_audio_data` function
where it is currently possible that a call to `ma_decoder_init_file`
succeeds and the function returns early without calling
`ma_decoder_uninit`. A similar situation can occur with
`ma_decoder_init_memory`.

Refs: https://bugs.debian.org/1124796

Co-authored-by: Daniel Bevenius <daniel.bevenius@gmail.com>
2026-05-18 12:16:39 +02:00
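The leak pattern described in the `read_audio_data` commit above can be illustrated with a minimal sketch. Everything here is hypothetical: `fake_decoder`, `fake_decoder_init`, and `fake_decoder_uninit` are stand-ins for a C-style init/uninit pair such as `ma_decoder_init_file` / `ma_decoder_uninit`, and the RAII guard is one possible way to cover every exit path. The actual patch may simply add the missing `ma_decoder_uninit` calls.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a C-style decoder handle (not the miniaudio API).
struct fake_decoder { bool open = false; };

// Counts decoders that were initialized but not yet released.
static int g_live_decoders = 0;

static bool fake_decoder_init(fake_decoder & d, const std::string & path) {
    if (path.empty()) return false; // init failed: nothing to release
    d.open = true;
    ++g_live_decoders;
    return true;
}

static void fake_decoder_uninit(fake_decoder & d) {
    if (d.open) {
        d.open = false;
        --g_live_decoders;
        assert(g_live_decoders >= 0);
    }
}

// RAII guard: releases the decoder on every exit path of the enclosing scope.
struct decoder_guard {
    fake_decoder & d;
    explicit decoder_guard(fake_decoder & dec) : d(dec) {}
    ~decoder_guard() { fake_decoder_uninit(d); }
    decoder_guard(const decoder_guard &) = delete;
    decoder_guard & operator=(const decoder_guard &) = delete;
};

// Leaky variant: init succeeds, then an early return skips uninit.
static bool read_leaky(const std::string & path, bool fail_midway) {
    fake_decoder d;
    if (!fake_decoder_init(d, path)) return false;
    if (fail_midway) return false; // BUG: decoder is never released
    fake_decoder_uninit(d);
    return true;
}

// Fixed variant: the guard runs uninit even on the early return.
static bool read_fixed(const std::string & path, bool fail_midway) {
    fake_decoder d;
    if (!fake_decoder_init(d, path)) return false;
    decoder_guard guard(d);
    if (fail_midway) return false; // guard destructor releases the decoder
    return true;
}
```

Calling `read_leaky("a.wav", true)` leaves the live-decoder counter at 1, while `read_fixed("a.wav", true)` leaves it at 0.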
Andreas Lubbe 6227a0ef73
server : Return speaker information in JSON (#3782) 2026-05-18 09:18:04 +02:00
Andreas Lubbe 968eebe772
server: add support for carry_initial_prompt (#3781)
* Add support for carry_initial_prompt on the server

* Update README
2026-05-15 14:03:17 +02:00
Georgi Gerganov 46ca43d639 talk-llama : sync llama.cpp 2026-05-14 21:26:48 +03:00
Georgi Gerganov 54ecc9dba4 talk-llama : sync llama.cpp 2026-05-14 21:26:48 +03:00
Andreas Lubbe 3e9b7d0fef
server : fix no_speech_thold not being read (#3783) 2026-05-13 10:37:28 +02:00
Andreas Lubbe a604a9b5b0
server: fix params leak between requests (#3784) 2026-05-13 08:54:56 +02:00
Andreas Lubbe 338cce1e58
server: Add support for controlling token_timestamps directly (#3785) 2026-05-12 07:36:00 +02:00
Georgi Gerganov 4bf733672b talk-llama : sync llama.cpp 2026-05-02 15:02:42 +03:00
Georgi Gerganov c59a773605
examples : update to Q1_0 2026-05-01 13:07:33 +03:00
jinweihan fc674574ca
bench : sync submit-results URL to ggml-org (#3769)
The project moved from ggerganov/ to ggml-org/ and the README already
references the new URL in both places it mentions issue #89 (README.md
and examples/bench/README.md). This syncs the two remaining hardcoded URLs
in examples/bench/bench.cpp and examples/bench.wasm/emscripten.cpp.

The old URL still redirects, so this is cosmetic.
2026-04-20 07:12:57 +02:00
Georgi Gerganov 4bbce1e5b2
benches : update 2026-03-18 22:34:51 +02:00
Gaël James 21665eab4c
examples : Allow max_len to be used for any output format (#3679) 2026-03-16 13:33:56 +02:00
Igor Loskutov 136dc2eb12
server: return proper HTTP status codes for error responses (#3707)
Several error paths in the /inference and /load endpoints returned
HTTP 200 with a JSON error body, making it impossible for clients
to distinguish errors from successful responses by status code.

Set 400 for client errors (missing file field, unreadable audio,
missing/invalid model) and 500 for server errors (ffmpeg conversion
failure). The two existing status-code sites (499 for client
disconnect, 500 for processing failure) are unchanged.
2026-03-16 13:33:06 +02:00
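The status-code mapping described in the commit above can be sketched as a small lookup. This is an illustration only: the `request_error` enum and `http_status_for` function are hypothetical names, not identifiers from the server source. Only the numeric codes and the error categories come from the commit message.

```cpp
// Hypothetical error categories, named after the cases listed in the commit message.
enum class request_error {
    none,
    missing_file_field,       // client error
    unreadable_audio,         // client error
    invalid_model,            // client error (missing or invalid model)
    ffmpeg_conversion_failed, // server error
    client_disconnected,      // pre-existing status-code site
    processing_failed         // pre-existing status-code site
};

// Maps an error category to the HTTP status the commit message describes.
static int http_status_for(request_error e) {
    switch (e) {
        case request_error::none:                     return 200;
        case request_error::missing_file_field:       return 400;
        case request_error::unreadable_audio:         return 400;
        case request_error::invalid_model:            return 400;
        case request_error::ffmpeg_conversion_failed: return 500;
        case request_error::client_disconnected:      return 499;
        case request_error::processing_failed:        return 500;
    }
    return 500; // unreachable for valid enum values; treated as a server error
}
```

With a mapping like this, a client can branch on the status code alone instead of parsing the JSON body to detect failures.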
Georgi Gerganov 2bc630f197 talk-llama : sync llama.cpp 2026-03-16 13:10:15 +02:00
Georgi Gerganov 81ea958719 common : add nvfp4 (ggml/0) 2026-03-16 13:10:15 +02:00
Georgi Gerganov 84f8db71d8 talk-llama : sync llama.cpp 2026-02-27 20:57:58 +02:00
Dmitry Atamanov cec1dd9d12
examples : update miniaudio library to 0.11.24 (#3672) 2026-02-27 11:15:15 +01:00
Georgi Gerganov 364c77f4ca talk-llama : sync llama.cpp 2026-02-15 21:44:37 +02:00
Sid Mohan eb27fa2252
server : fix hardcoded /inference path in default HTML page (#3639)
Closes #3596
2026-02-09 10:10:13 +02:00
Georgi Gerganov 4b23ff249e talk-llama : sync llama.cpp 2026-02-08 09:29:10 +02:00
Georgi Gerganov 953e503fd9 talk-llama : sync llama.cpp 2026-01-30 15:56:40 +02:00
Bráulio Oliveira 7aa8818647
examples : use -dev/--device and WHISPER_ARG_DEVICE (#3557)
Align device selection naming with llama.cpp.
2026-01-21 08:40:30 +01:00
Georgi Gerganov ecfcc65fbf talk-llama : sync llama.cpp 2026-01-14 09:11:59 +02:00
Peter A. a96310871a
examples : fix executable example targets (#3600)
* cmake:
    - added `whisper-` prefix to unprefixed targets: `quantize`, `lsp`,
      `vad-speech-segments`
    - added `install(TARGETS ${TARGET} RUNTIME)` where it was missing

Signed-off-by: Peter A. <ink.splatters@pm.me>

* .github/workflows/build.yml: quantize -> whisper-quantize

Signed-off-by: Peter A. <ink.splatters@pm.me>

---------

Signed-off-by: Peter A. <ink.splatters@pm.me>
2026-01-13 08:08:18 +01:00
Georgi Gerganov 7359ac94d5 talk-llama : sync llama.cpp 2025-12-31 17:52:09 +02:00
Georgi Gerganov 6c22e792cb talk-llama : sync llama.cpp 2025-12-18 08:20:56 +02:00
Marcos Del Sol Vives 2551e4ce98
server: allow custom temp directory for ffmpeg (#3564) 2025-12-13 09:37:44 +02:00
Georgi Gerganov 179d8b1c9c
talk-llama : sync llama.cpp 2025-12-12 18:15:27 +02:00
Daniel Bevenius 19ceec8eac
examples : fix typo in vad-speech-segments command [no ci] (#3535)
This commit corrects a typo in the command-line argument for specifying the
VAD model in the vad-speech-segments example.
2025-11-20 13:35:11 +01:00
Georgi Gerganov b12abefa9b sync : llama.cpp 2025-11-17 21:05:46 +02:00
KITAITI Makoto 27f485a14c
vad : Silero VAD v6.2.0 (#3524)
* Add ggml-silero-v6.2.0 to download candidates

* Make default VAD model ggml-silero-v6.2.0

* Make VAD model in documentation ggml-silero-v6.2.0
2025-11-17 22:26:17 +09:00
Georgi Gerganov a1867e0dad sync : llama.cpp 2025-11-09 23:38:03 +02:00
Orel-A f16c12f3f5
wasm : fix Hebrew ID (#3487)
whisper_lang_id: unknown language 'iw'
2025-10-27 08:49:32 +02:00
Georgi Gerganov 322c2adb75 talk-llama : sync llama.cpp 2025-10-22 12:58:11 +03:00
Georgi Gerganov 23c19308d8
server : set no_context == true (#3482) 2025-10-20 15:39:48 +03:00
Georgi Gerganov 8ba3c13b0c talk-llama : sync llama.cpp 2025-10-15 09:29:17 +03:00
Georgi Gerganov ff4c1a5a53 talk-llama : sync llama.cpp 2025-10-12 11:16:23 +03:00
Andreas Lubbe 85871a9469
whisper : add support for --carry-initial-prompt (#3395)
* Add support for --carry-initial-prompt

* PR fixes for ruby and go

* Refactoring for readability

* WIP 1

* WIP 2

* PR fixes

* More PR fixes

* PR fix

* Further simplification

* d'oh

* One more logic fix

* Update src/whisper.cpp

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* Truncate prompt_past0 upon initialization

* Slight simplification

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2025-10-10 19:51:15 +03:00
Andreas Lubbe a0ca50f3b9
cli: Fix assignment for vad_min_silence_duration_ms (#3467)
* cli: Fix assignment for vad_min_silence_duration_ms

Found and fixed this simple copy/paste error

* server : fix vad_min_silence_duration_ms assignment

---------

Co-authored-by: Daniel Bevenius <daniel.bevenius@gmail.com>
2025-10-10 15:21:03 +02:00
Georgi Gerganov 8a67c55c8a
wchess : fix link [no ci] 2025-09-30 21:28:03 +03:00
Daniel Bevenius 5904d00dbb
examples : add wchess.wasm to wasm examples build (#3443)
* examples : add wchess.wasm to wasm examples build

This commit adds the wchess.wasm example to the wasm examples that are
deployed to https://ggml.ai/whisper.cpp.

Refs: https://github.com/ggml-org/whisper.cpp/issues/3434#issuecomment-3346980420
2025-09-30 16:23:01 +02:00
Georgi Gerganov 0b3587acdd
whisper : enable flash attention by default (#3441) 2025-09-30 15:47:20 +03:00
Georgi Gerganov a77d11d91e
bench : warm-up all kernels (#3438) 2025-09-29 17:27:53 +03:00
Georgi Gerganov fcf0181ee2
talk-llama : sync llama.cpp 2025-09-29 15:18:41 +03:00
Georgi Gerganov 36778bd8b8
talk-llama : sync llama.cpp 2025-09-20 13:58:28 +03:00
Georgi Gerganov fc45bb8625 talk-llama : sync llama.cpp
ggml-ci
2025-08-18 20:30:45 +03:00
Georgi Gerganov 7fd2fbde45 common : handle mxfp4 enum
ggml-ci
2025-08-18 20:30:45 +03:00
Daniel Bevenius 040510a132
node : add win platform check for require path (#3363)
This commit adds a check for the platform in use and adjusts the path to
the addon.node shared library.

The motivation for this change is that on Windows the addon.node library is
built into build\bin\Release and on Linux into build/Release.

Resolves: https://github.com/ggml-org/whisper.cpp/issues/3360
2025-08-15 14:54:23 +02:00
Georgi Gerganov b02242d0ad
wasm : change ggml model host to HF (#3369) 2025-08-10 13:00:17 +03:00