The initial --seg-len-hint commit wired the flag into whisper-cli but not
whisper-server. This change mirrors the existing best_of / beam_size
pattern at server.cpp:221-222 (CLI) and :505-511 (POST form field) and
assigns the value to wparams.seg_len_hint during inference setup.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When processing long audio, whisper tends to produce progressively
shorter segments because timestamp tokens in the decoder prompt context
condition the model to insert more frequent segment breaks.
Add a seg_len_hint parameter (in ms) that thins timestamp tokens in
the rolling prompt context, keeping at most one per seg_len_hint
interval. This breaks the feedback loop while preserving text tokens
for continuity. The model can still break on natural boundaries
(speaker turns, pauses) — the hint only affects context conditioning,
not the actual segment creation.
Usage: --seg-len-hint 2000 (for ~2 second target segments)
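The thinning step described above can be sketched roughly as follows. This
is an illustration, not the code from this commit: the helper name
`thin_timestamp_tokens`, its signature, the assumption that every token id
at or above `token_beg` is a timestamp token advancing in 20 ms steps, and
the handling of timestamps that restart in a new audio window are all
assumptions made for the example.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper (not the actual whisper.cpp implementation).
// Keeps every text token and drops timestamp tokens that fall less than
// seg_len_hint_ms after the last timestamp token that was kept.
std::vector<int32_t> thin_timestamp_tokens(
        const std::vector<int32_t> & prompt,
        int32_t token_beg,          // assumed: first timestamp token id
        int64_t seg_len_hint_ms) {  // <= 0 disables thinning
    if (seg_len_hint_ms <= 0) {
        return prompt;
    }

    const int64_t ms_per_step = 20; // assumed timestamp resolution

    std::vector<int32_t> out;
    out.reserve(prompt.size());

    bool    have_last = false;
    int64_t last_ms   = 0;

    for (const int32_t tok : prompt) {
        if (tok < token_beg) {
            out.push_back(tok);     // text token: always preserved
            continue;
        }

        const int64_t t_ms = static_cast<int64_t>(tok - token_beg) * ms_per_step;

        // Keep the first timestamp, any timestamp that goes backwards
        // (assumed to mark a new audio window), and any timestamp at
        // least one hint interval after the last kept one.
        if (!have_last || t_ms < last_ms || t_ms - last_ms >= seg_len_hint_ms) {
            out.push_back(tok);
            have_last = true;
            last_ms   = t_ms;
        }
    }

    return out;
}
```

With a 2000 ms hint, timestamps at 0, 500, 2000, 3000 and 4000 ms would be
reduced to 0, 2000 and 4000 ms, while the text tokens between them stay in
place.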
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Several error paths in the /inference and /load endpoints returned
HTTP 200 with a JSON error body, making it impossible for clients
to distinguish errors from successful responses by status code.
Set 400 for client errors (missing file field, unreadable audio,
missing/invalid model) and 500 for server errors (ffmpeg conversion
failure). The two existing status-code sites (499 for client
disconnect, 500 for processing failure) are unchanged.
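The resulting mapping can be summarised in a small sketch. The enum and
function names below are hypothetical and do not appear in server.cpp,
where the status is presumably set directly on the response object in each
error path. Only the status values are taken from the description above.

```cpp
// Hypothetical error categories for illustration only.
enum class server_error {
    none,
    missing_file_field,
    unreadable_audio,
    invalid_model,
    ffmpeg_conversion_failed,
    client_disconnected,
    processing_failed,
};

// Maps an error category to the HTTP status code described in this commit.
int http_status_for(server_error err) {
    switch (err) {
        case server_error::none:
            return 200;
        case server_error::missing_file_field:
        case server_error::unreadable_audio:
        case server_error::invalid_model:
            return 400; // client errors
        case server_error::ffmpeg_conversion_failed:
            return 500; // server error
        case server_error::client_disconnected:
            return 499; // pre-existing, unchanged
        case server_error::processing_failed:
            return 500; // pre-existing, unchanged
    }
    return 500; // defensive default for unexpected values
}
```

Clients can then branch on the status code alone and only parse the JSON
error body when the status is not 200.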
* cmake:
- added `whisper-` prefix to unprefixed targets: `quantize`, `lsp`,
`vad-speech-segments`
- added `install(TARGETS ${TARGET} RUNTIME)` where it was missing
Signed-off-by: Peter A. <ink.splatters@pm.me>
* .github/workflows/build.yml: quantize -> whisper-quantize
Signed-off-by: Peter A. <ink.splatters@pm.me>
---------
Signed-off-by: Peter A. <ink.splatters@pm.me>
* Add support for --carry-initial-prompt
* PR fixes for ruby and go
* Refactoring for readability
* WIP 1
* WIP 2
* PR fixes
* More PR fixes
* PR fix
* Further simplification
* d'oh
* One more logic fix
* Update src/whisper.cpp
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* Truncate prompt_past0 upon initialization
* Slight simplification
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* cli: Fix assignment for vad_min_silence_duration_ms
Found and fixed this simple copy/paste error
* server : fix vad_min_silence_duration_ms assignment
---------
Co-authored-by: Daniel Bevenius <daniel.bevenius@gmail.com>
This commit adds a check for the platform in use and adjusts the path to
the addon.node shared library.
The motivation for this change is that on Windows the addon.node library
is built into build\bin\Release, whereas on Linux it is built into
build/Release.
Resolves: https://github.com/ggml-org/whisper.cpp/issues/3360
* stream.wasm : add language selection support
This commit adds support for selecting the language in the stream.wasm
example. This includes adding the model `base`, which supports
multilingual transcription, and allowing the user to select a language
from a dropdown menu in the HTML interface.
The motivation for this is that it allows users to transcribe audio in
various languages.
Refs: https://github.com/ggml-org/whisper.cpp/issues/3347
* squash! stream.wasm : add language selection support
Remove strdup() for language in stream.wasm and update button text for
base (should not be "base.en" but just "base").
This commit adds a note to the README files of the WASM examples
about the `WHISPER_WASM_SINGLE_FILE` option.
The motivation for this is that currently this option is not documented
and might be surprising to users who expect a separate .wasm file to be
generated.
Refs: https://github.com/ggml-org/whisper.cpp/issues/3290
* fix 404 link
* update link in whisper.wasm example
* update example in command.wasm
* update link in bench.wasm example
* update link in stream.wasm example
* Add DTW model large-v3-turbo parameters to server.cpp example
DTW support is available in whisper.cpp and the large-v3-turbo model has already been added to the sources, but it hasn't been added to server.cpp, so the server cannot make use of it. This commit hopefully corrects that issue.
* match the line breaks of the original server.cpp file after adding large.v3.turbo dtw