whisper.cpp

Commit Graph

Author	SHA1	Message	Date
Liz Fong-Jones	4f2b6ff9ea	whisper-server : expose --seg-len-hint as CLI flag and POST form field The initial --seg-len-hint commit wired the flag into whisper-cli but not whisper-server. Mirrors the existing best_of / beam_size pattern at server.cpp:221-222 (CLI) and :505-511 (POST form field) and assigns the value to wparams.seg_len_hint during inference setup. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-10 14:53:49 -07:00
Liz Fong-Jones	24a436d350	whisper : add --seg-len-hint to discourage progressively shorter segments When processing long audio, whisper tends to produce progressively shorter segments because timestamp tokens in the decoder prompt context condition the model to insert more frequent segment breaks. Add a seg_len_hint parameter (in ms) that thins timestamp tokens in the rolling prompt context, keeping at most one per seg_len_hint interval. This breaks the feedback loop while preserving text tokens for continuity. The model can still break on natural boundaries (speaker turns, pauses) — the hint only affects context conditioning, not the actual segment creation. Usage: --seg-len-hint 2000 (for ~2 second target segments) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-03 21:29:08 -07:00
Georgi Gerganov	4bbce1e5b2	benches : update	2026-03-18 22:34:51 +02:00
Gaël James	21665eab4c	examples : Allow max_len to be used for any output format (#3679)	2026-03-16 13:33:56 +02:00
Igor Loskutov	136dc2eb12	server: return proper HTTP status codes for error responses (#3707) Several error paths in the /inference and /load endpoints returned HTTP 200 with a JSON error body, making it impossible for clients to distinguish errors from successful responses by status code. Set 400 for client errors (missing file field, unreadable audio, missing/invalid model) and 500 for server errors (ffmpeg conversion failure). The two existing status-code sites (499 for client disconnect, 500 for processing failure) are unchanged.	2026-03-16 13:33:06 +02:00
Georgi Gerganov	2bc630f197	talk-llama : sync llama.cpp	2026-03-16 13:10:15 +02:00
Georgi Gerganov	81ea958719	common : add nvfp4 (ggml/0)	2026-03-16 13:10:15 +02:00
Georgi Gerganov	84f8db71d8	talk-llama : sync llama.cpp	2026-02-27 20:57:58 +02:00
Dmitry Atamanov	cec1dd9d12	examples : update miniaudio library to 0.11.24 (#3672)	2026-02-27 11:15:15 +01:00
Georgi Gerganov	364c77f4ca	talk-llama : sync llama.cpp	2026-02-15 21:44:37 +02:00
Sid Mohan	eb27fa2252	server : fix hardcoded /inference path in default HTML page (#3639) Closes #3596	2026-02-09 10:10:13 +02:00
Georgi Gerganov	4b23ff249e	talk-llama : sync llama.cpp	2026-02-08 09:29:10 +02:00
Georgi Gerganov	953e503fd9	talk-llama : sync llama.cpp	2026-01-30 15:56:40 +02:00
Bráulio Oliveira	7aa8818647	examples : use -dev/--device and WHISPER_ARG_DEVICE (#3557) Align device selection naming with llama.cpp.	2026-01-21 08:40:30 +01:00
Georgi Gerganov	ecfcc65fbf	talk-llama : sync llama.cpp	2026-01-14 09:11:59 +02:00
Peter A.	a96310871a	examples : fix executable example targets (#3600 ) * cmake: - added `whisper-` prefix to unprefixed targets: `quantize`, `lsp`, `vad-speech-segments` - added `install(TARGETS ${TARGET} RUNTIME)` where it was missing Signed-off-by: Peter A. <ink.splatters@pm.me> * .github/workflows/build.yml: quantize -> whisper-quantize Signed-off-by: Peter A. <ink.splatters@pm.me> --------- Signed-off-by: Peter A. <ink.splatters@pm.me>	2026-01-13 08:08:18 +01:00
Georgi Gerganov	7359ac94d5	talk-llama : sync llama.cpp	2025-12-31 17:52:09 +02:00
Georgi Gerganov	6c22e792cb	talk-llama : sync llama.cpp	2025-12-18 08:20:56 +02:00
Marcos Del Sol Vives	2551e4ce98	server: allow custom temp directory for ffmpeg (#3564)	2025-12-13 09:37:44 +02:00
Georgi Gerganov	179d8b1c9c	talk-llama : sync llama.cpp	2025-12-12 18:15:27 +02:00
Daniel Bevenius	19ceec8eac	examples : fix typo in vad-speech-segments command [no ci] (#3535 ) This commit corrects a typo the command-line argument for specifying the VAD model in the vad-speech-segments example.	2025-11-20 13:35:11 +01:00
Georgi Gerganov	b12abefa9b	sync : llama.cpp	2025-11-17 21:05:46 +02:00
KITAITI Makoto	27f485a14c	vad : Silero VAD v6.2.0 (#3524 ) * Add ggml-silero-v6.2.0 to download candidates * Make default VAD model ggml-silero-v6.2.0 * Make VAD model in documentations ggml-silero-v6.2.0	2025-11-17 22:26:17 +09:00
Georgi Gerganov	a1867e0dad	sync : llama.cpp	2025-11-09 23:38:03 +02:00
Orel-A	f16c12f3f5	wasm : fix Hebrew ID (#3487) whisper_lang_id: unknown language 'iw'	2025-10-27 08:49:32 +02:00
Georgi Gerganov	322c2adb75	talk-llama : sync llama.cpp	2025-10-22 12:58:11 +03:00
Georgi Gerganov	23c19308d8	server : set no_context == true (#3482)	2025-10-20 15:39:48 +03:00
Georgi Gerganov	8ba3c13b0c	talk-llama : sync llama.cpp	2025-10-15 09:29:17 +03:00
Georgi Gerganov	ff4c1a5a53	talk-llama : sync llama.cpp	2025-10-12 11:16:23 +03:00
Andreas Lubbe	85871a9469	whisper : add support for --carry-initial-prompt (#3395 ) * Add support for --carry-initial-prompt * PR fixes for ruby and go * Refactoring for readability * WIP 1 * WIP 2 * PR fixes * More PR fixes * PR fix * Further simplification * d'oh * One more logic fix * Update src/whisper.cpp Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> * Truncate prompt_past0 upon initialization * Slight simplification --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2025-10-10 19:51:15 +03:00
Andreas Lubbe	a0ca50f3b9	cli: Fix assignment for vad_min_silence_duration_ms (#3467 ) * cli: Fix assignment for vad_min_silence_duration_ms Found and fixed this simple copy/paste error * server : fix vad_min_silence_duration_ms assignment --------- Co-authored-by: Daniel Bevenius <daniel.bevenius@gmail.com>	2025-10-10 15:21:03 +02:00
Georgi Gerganov	8a67c55c8a	wchess : fix link [no ci]	2025-09-30 21:28:03 +03:00
Daniel Bevenius	5904d00dbb	examples : add wchess.wasm to wasm examples build (#3443 ) * examples : add wchess.wasm to wasm examples build This commit add the wchess.wasm example to the wasm examples that are deployed to https://ggml.ai/whisper.cpp. Refs: https://github.com/ggml-org/whisper.cpp/issues/3434#issuecomment-3346980420	2025-09-30 16:23:01 +02:00
Georgi Gerganov	0b3587acdd	whisper : enable flash attention by default (#3441)	2025-09-30 15:47:20 +03:00
Georgi Gerganov	a77d11d91e	bench : warm-up all kernels (#3438)	2025-09-29 17:27:53 +03:00
Georgi Gerganov	fcf0181ee2	talk-llama : sync llama.cpp	2025-09-29 15:18:41 +03:00
Georgi Gerganov	36778bd8b8	talk-llama : sync llama.cpp	2025-09-20 13:58:28 +03:00
Georgi Gerganov	fc45bb8625	talk-llama : sync llama.cpp ggml-ci	2025-08-18 20:30:45 +03:00
Georgi Gerganov	7fd2fbde45	common : handle mxfp4 enum ggml-ci	2025-08-18 20:30:45 +03:00
Daniel Bevenius	040510a132	node : add win platform check for require path (#3363 ) This commit adds a check to the platform in use and adjust the path to the addon.node shared library. The motivation for this change is that on windows addon.node library is built into build\bin\Release and on linux into build/Release. Resolves: https://github.com/ggml-org/whisper.cpp/issues/3360	2025-08-15 14:54:23 +02:00
Georgi Gerganov	b02242d0ad	wasm : change ggml model host to HF (#3369)	2025-08-10 13:00:17 +03:00
Daniel Bevenius	0becabc8d6	stream.wasm : add language selection support (#3354 ) * stream.wasm : add language selection support This commit adds support for selecting the language in the stream.wasm example. This is includes adding the model `base` which supports multilingual transcription, and allowing the user to select a language from a dropdown menu in the HTML interface. The motivation for this is that it allows users to transcribe audio in various languages. Refs: https://github.com/ggml-org/whisper.cpp/issues/3347 * squash! stream.wasm : add language selection support Remove strdup() for language in stream.wasm and update butten text for base (should not be "base.en" but just "base").	2025-08-02 07:03:04 +02:00
Georgi Gerganov	d0a9d8c7f8	talk-llama : sync llama.cpp	2025-07-28 13:02:32 +03:00
Daniel Bevenius	7de8dd783f	examples : add note about WHISPER_WASM_SINGLE_FILE [no ci] (#3332 ) This commit adds a note to the README files of the WASM examples about the `WHISPER_WASM_SINGLE_FILE` option. The motivation for this is that currently this option is not documented and might be surprising to users who expect a separate .wasm file to be generated. Refs: https://github.com/ggml-org/whisper.cpp/issues/3290	2025-07-24 16:06:48 +02:00
Sacha Arbonel	1f5cf0b288	server : hide language probabilities option behind flag (#3328) * examples/server: hide language probabilities option behind flag * code review * fix	2025-07-21 13:03:54 +02:00
Greg Sadetsky	a16da91365	examples : update links in wasm examples (#3318) * fix 404 link * update link in whisper.wasm example * update example in command.wasm * update link in bench.wasm example * update link in stream.wasm example	2025-07-12 23:22:35 +02:00
Georgi Gerganov	6ddff4d96a	talk-llama : sync llama.cpp ggml-ci	2025-07-12 19:23:56 +03:00
accessiblepixel	869335f2d5	server : add dtw.params for v3-large-turbo (#3307 ) * Add DTW model large-v3-turbo parameters to server.cpp example DTW support is available in whispercpp and the large-v3-turbo model has already been added to the sources, but the large-v3-turbo model hasn't been added to the server.cpp file to make use of it. This commit hopefully corrects that issue. * match original linebreak of original server.cpp file after adding large.v3.turbo dtw	2025-07-07 12:51:15 +03:00
Lin Xiaodong	d9999d54c8	feat: support vad for addon.node (#3301) Co-authored-by: linxiaodong <calm.lin@wukongsch.com>	2025-07-02 13:14:29 +03:00
Georgi Gerganov	1f816de7da	talk-llama : sync llama.cpp	2025-07-01 17:54:53 +03:00
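
The --seg-len-hint commit (24a436d350) describes thinning timestamp tokens in the rolling prompt context so that at most one survives per seg_len_hint interval, while text tokens are kept. A minimal sketch of that idea follows. It is not the code from the commit: the helper name thin_timestamp_tokens, the token_beg parameter, the 20 ms-per-step timestamp resolution, and the handling of timestamps that restart at a lower value are all assumptions made for illustration.

```cpp
#include <vector>

// Hypothetical helper, NOT the actual whisper.cpp implementation.
// Assumptions: ids >= token_beg are timestamp tokens, and each step above
// token_beg represents 20 ms of audio.
std::vector<int> thin_timestamp_tokens(const std::vector<int> & prompt,
                                       int token_beg,
                                       int seg_len_hint_ms) {
    // A non-positive hint disables thinning.
    if (seg_len_hint_ms <= 0) {
        return prompt;
    }

    std::vector<int> out;
    out.reserve(prompt.size());

    int last_kept_ms = -1; // time of the last timestamp token that was kept

    for (int id : prompt) {
        if (id < token_beg) {
            out.push_back(id); // text tokens are always preserved
            continue;
        }

        const int  t_ms    = (id - token_beg) * 20;
        const bool first   = last_kept_ms < 0;
        const bool wrapped = t_ms < last_kept_ms; // assumed: timestamps restarted
        if (first || wrapped || t_ms - last_kept_ms >= seg_len_hint_ms) {
            out.push_back(id);
            last_kept_ms = t_ms;
        }
        // Otherwise the timestamp token is dropped from the prompt context.
    }

    return out;
}
```

With token_beg = 1000 and a 2000 ms hint, a prompt holding timestamps at 0, 500, 1000, 2000 and 4000 ms keeps only those at 0, 2000 and 4000 ms. Only the context fed back to the decoder changes in this sketch; as the commit message notes, the hint does not affect actual segment creation.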

585 Commits