Commit Graph

601 Commits

Author SHA1 Message Date
Noah Lyons e5d4412578
server : merge split utf-8 token text in verbose json (#3850) 2026-06-02 13:10:27 +02:00
Georgi Gerganov 6c343e7a4e
common : pass sample rate to `ffmpeg_decode_audio()` 2026-05-31 15:49:13 +03:00
Georgi Gerganov f39cc71282
common : re-implement `ffmpeg-transcode.cpp` + clarify ffmpeg usage (#3846)
* examples : remove ffmpeg-transcode.cpp

* examples : implement ffmpeg-transcode.cpp

Assisted-by: llama.cpp:local pi

* common : switch from WHISPER_FFMPEG -> WHISPER_COMMON_FFMPEG
2026-05-31 15:44:07 +03:00
Georgi Gerganov 5828fba79f talk-llama : sync llama.cpp 2026-05-29 09:47:30 +03:00
texasich 27101c01dc
cli : merge tokens split across UTF-8 boundaries in JSON output (#3751)
* cli : merge tokens split across UTF-8 boundaries in JSON output

When a multi-byte UTF-8 codepoint (most commonly a CJK character, 3 bytes)
is split across multiple whisper tokens, the -ojf/--output-json-full
writer emitted each token's partial bytes as its own JSON string, producing
invalid UTF-8 that chokes downstream parsers.

Merge adjacent tokens in output_json whenever the accumulated text still
ends on an incomplete UTF-8 sequence. The merged entry keeps the first
token's id/p/t_dtw and extends t1 to the last absorbed token, which
matches how segment text is assembled elsewhere.

Refs #1798

* fix: address review — add braces for consistency, use full issue URL

- Add braces to if/else chain for codebase consistency
- Use full URL for issue #1798 reference

Review: @danbev

---------

Co-authored-by: texasich <texasich@users.noreply.github.com>
Co-authored-by: texasich <texasich@gmail.com>
2026-05-26 06:23:41 +02:00
Georgi Gerganov 865ec171aa talk-llama : sync llama.cpp 2026-05-25 12:26:07 +03:00
Pascal 0ccd896f5b
common : fix server /inference fails to decode in-memory audio (regression) (#3818)
* common: add memory buffer overload of read_audio_data

whisper-server /inference without --convert passed the uploaded file
bytes to read_audio_data as a filename, so ma_decoder_init_file tried
to open a path starting with "RIFF" and failed. every request returned
HTTP 400 "Invalid request" on builds without WHISPER_FFMPEG, which is
the default.

factor the PCM extraction into a shared helper and add an overload that
decodes straight from a memory buffer via ma_decoder_init_memory, which
the function already used for the stdin path. server now calls it with
the upload content. the filename overload behavior is unchanged.
2026-05-22 08:27:35 +02:00
petterreinholdtsen 47b9eb37a3
examples : fix memory leak in read_audio_data (#3810)
This commit addresses a memory leak in the `read_audio_data` function
where it is currently possible that a call to `ma_decoder_init_file`
succeeds and the function returns early without calling
`ma_decoder_uninit`. A similar situation can occur with
`ma_decoder_init_memory`.

Refs: https://bugs.debian.org/1124796

Co-authored-by: Daniel Bevenius <daniel.bevenius@gmail.com>
2026-05-18 12:16:39 +02:00
Andreas Lubbe 6227a0ef73
server : Return speaker information in JSON (#3782) 2026-05-18 09:18:04 +02:00
Andreas Lubbe 968eebe772
server: add support for carry_initial_prompt (#3781)
* Add support for carry_initial_prompt on the server

* Update README
2026-05-15 14:03:17 +02:00
Georgi Gerganov 46ca43d639 talk-llama : sync llama.cpp 2026-05-14 21:26:48 +03:00
Georgi Gerganov 54ecc9dba4 talk-llama : sync llama.cpp 2026-05-14 21:26:48 +03:00
Andreas Lubbe 3e9b7d0fef
server : fix no_speech_thold not being read (#3783) 2026-05-13 10:37:28 +02:00
Andreas Lubbe a604a9b5b0
server: fix params leak between requests (#3784) 2026-05-13 08:54:56 +02:00
Andreas Lubbe 338cce1e58
server: Add support for controlling token_timestamps directly (#3785) 2026-05-12 07:36:00 +02:00
Georgi Gerganov 4bf733672b talk-llama : sync llama.cpp 2026-05-02 15:02:42 +03:00
Georgi Gerganov c59a773605
examples : update to Q1_0 2026-05-01 13:07:33 +03:00
jinweihan fc674574ca
bench : sync submit-results URL to ggml-org (#3769)
The project moved from ggerganov/ to ggml-org/ and the README already
references the new URL in both places it mentions issue #89 (README.md
and examples/bench/README.md). Syncing the two remaining hardcoded URLs
in examples/bench/bench.cpp and examples/bench.wasm/emscripten.cpp.

The old URL still redirects, so this is cosmetic.
2026-04-20 07:12:57 +02:00
Georgi Gerganov 4bbce1e5b2
benches : update 2026-03-18 22:34:51 +02:00
Gaël James 21665eab4c
examples : Allow max_len to be used for any output format (#3679) 2026-03-16 13:33:56 +02:00
Igor Loskutov 136dc2eb12
server: return proper HTTP status codes for error responses (#3707)
Several error paths in the /inference and /load endpoints returned
HTTP 200 with a JSON error body, making it impossible for clients
to distinguish errors from successful responses by status code.

Set 400 for client errors (missing file field, unreadable audio,
missing/invalid model) and 500 for server errors (ffmpeg conversion
failure). The two existing status-code sites (499 for client
disconnect, 500 for processing failure) are unchanged.
2026-03-16 13:33:06 +02:00
Georgi Gerganov 2bc630f197 talk-llama : sync llama.cpp 2026-03-16 13:10:15 +02:00
Georgi Gerganov 81ea958719 common : add nvfp4 (ggml/0) 2026-03-16 13:10:15 +02:00
Georgi Gerganov 84f8db71d8 talk-llama : sync llama.cpp 2026-02-27 20:57:58 +02:00
Dmitry Atamanov cec1dd9d12
examples : update miniaudio library to 0.11.24 (#3672) 2026-02-27 11:15:15 +01:00
Georgi Gerganov 364c77f4ca talk-llama : sync llama.cpp 2026-02-15 21:44:37 +02:00
Sid Mohan eb27fa2252
server : fix hardcoded /inference path in default HTML page (#3639)
Closes #3596
2026-02-09 10:10:13 +02:00
Georgi Gerganov 4b23ff249e talk-llama : sync llama.cpp 2026-02-08 09:29:10 +02:00
Georgi Gerganov 953e503fd9 talk-llama : sync llama.cpp 2026-01-30 15:56:40 +02:00
Bráulio Oliveira 7aa8818647
examples : use -dev/--device and WHISPER_ARG_DEVICE (#3557)
Align device selection naming with llama.cpp.
2026-01-21 08:40:30 +01:00
Georgi Gerganov ecfcc65fbf talk-llama : sync llama.cpp 2026-01-14 09:11:59 +02:00
Peter A. a96310871a
examples : fix executable example targets (#3600)
* cmake:
    - added `whisper-` prefix to unprefixed targets: `quantize`, `lsp`,
      `vad-speech-segments`
    - added `install(TARGETS ${TARGET} RUNTIME)` where it was missing

Signed-off-by: Peter A. <ink.splatters@pm.me>

* .github/workflows/build.yml: quantize -> whisper-quantize

Signed-off-by: Peter A. <ink.splatters@pm.me>

---------

Signed-off-by: Peter A. <ink.splatters@pm.me>
2026-01-13 08:08:18 +01:00
Georgi Gerganov 7359ac94d5 talk-llama : sync llama.cpp 2025-12-31 17:52:09 +02:00
Georgi Gerganov 6c22e792cb talk-llama : sync llama.cpp 2025-12-18 08:20:56 +02:00
Marcos Del Sol Vives 2551e4ce98
server: allow custom temp directory for ffmpeg (#3564) 2025-12-13 09:37:44 +02:00
Georgi Gerganov 179d8b1c9c
talk-llama : sync llama.cpp 2025-12-12 18:15:27 +02:00
Daniel Bevenius 19ceec8eac
examples : fix typo in vad-speech-segments command [no ci] (#3535)
This commit corrects a typo the command-line argument for specifying the
VAD model in the vad-speech-segments example.
2025-11-20 13:35:11 +01:00
Georgi Gerganov b12abefa9b sync : llama.cpp 2025-11-17 21:05:46 +02:00
KITAITI Makoto 27f485a14c
vad : Silero VAD v6.2.0 (#3524)
* Add ggml-silero-v6.2.0 to download candidates

* Make default VAD model ggml-silero-v6.2.0

* Make VAD model in documentations ggml-silero-v6.2.0
2025-11-17 22:26:17 +09:00
Georgi Gerganov a1867e0dad sync : llama.cpp 2025-11-09 23:38:03 +02:00
Orel-A f16c12f3f5
wasm : fix Hebrew ID (#3487)
whisper_lang_id: unknown language 'iw'
2025-10-27 08:49:32 +02:00
Georgi Gerganov 322c2adb75 talk-llama : sync llama.cpp 2025-10-22 12:58:11 +03:00
Georgi Gerganov 23c19308d8
server : set no_context == true (#3482) 2025-10-20 15:39:48 +03:00
Georgi Gerganov 8ba3c13b0c talk-llama : sync llama.cpp 2025-10-15 09:29:17 +03:00
Georgi Gerganov ff4c1a5a53 talk-llama : sync llama.cpp 2025-10-12 11:16:23 +03:00
Andreas Lubbe 85871a9469
whisper : add support for --carry-initial-prompt (#3395)
* Add support for --carry-initial-prompt

* PR fixes for ruby and go

* Refactoring for readability

* WIP 1

* WIP 2

* PR fixes

* More PR fixes

* PR fix

* Further simplification

* d'oh

* One more logic fix

* Update src/whisper.cpp

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* Truncate prompt_past0 upon initialization

* Slight simplification

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2025-10-10 19:51:15 +03:00
Andreas Lubbe a0ca50f3b9
cli: Fix assignment for vad_min_silence_duration_ms (#3467)
* cli: Fix assignment for vad_min_silence_duration_ms

Found and fixed this simple copy/paste error

* server : fix vad_min_silence_duration_ms assignment

---------

Co-authored-by: Daniel Bevenius <daniel.bevenius@gmail.com>
2025-10-10 15:21:03 +02:00
Georgi Gerganov 8a67c55c8a
wchess : fix link [no ci] 2025-09-30 21:28:03 +03:00
Daniel Bevenius 5904d00dbb
examples : add wchess.wasm to wasm examples build (#3443)
* examples : add wchess.wasm to wasm examples build

This commit add the wchess.wasm example to the wasm examples that are
deployed to https://ggml.ai/whisper.cpp.

Refs: https://github.com/ggml-org/whisper.cpp/issues/3434#issuecomment-3346980420
2025-09-30 16:23:01 +02:00
Georgi Gerganov 0b3587acdd
whisper : enable flash attention by default (#3441) 2025-09-30 15:47:20 +03:00