whisper.cpp

Commit Graph

Author	SHA1	Message	Date
Noah Lyons	e5d4412578	server : merge split utf-8 token text in verbose json (#3850)	2026-06-02 13:10:27 +02:00
Georgi Gerganov	6c343e7a4e	common : pass sample rate to `ffmpeg_decode_audio()`	2026-05-31 15:49:13 +03:00
Georgi Gerganov	f39cc71282	common : re-implement `ffmpeg-transcode.cpp` + clarify ffmpeg usage (#3846) * examples : remove ffmpeg-transcode.cpp * examples : implement ffmpeg-transcode.cpp Assisted-by: llama.cpp:local pi * common : switch from WHISPER_FFMPEG -> WHISPER_COMMON_FFMPEG	2026-05-31 15:44:07 +03:00
Georgi Gerganov	5828fba79f	talk-llama : sync llama.cpp	2026-05-29 09:47:30 +03:00
texasich	27101c01dc	cli : merge tokens split across UTF-8 boundaries in JSON output (#3751 ) * cli : merge tokens split across UTF-8 boundaries in JSON output When a multi-byte UTF-8 codepoint (most commonly a CJK character, 3 bytes) is split across multiple whisper tokens, the -ojf/--output-json-full writer emitted each token's partial bytes as its own JSON string, producing invalid UTF-8 that chokes downstream parsers. Merge adjacent tokens in output_json whenever the accumulated text still ends on an incomplete UTF-8 sequence. The merged entry keeps the first token's id/p/t_dtw and extends t1 to the last absorbed token, which matches how segment text is assembled elsewhere. Refs #1798 * fix: address review — add braces for consistency, use full issue URL - Add braces to if/else chain for codebase consistency - Use full URL for issue #1798 reference Review: @danbev --------- Co-authored-by: texasich <texasich@users.noreply.github.com> Co-authored-by: texasich <texasich@gmail.com>	2026-05-26 06:23:41 +02:00
Georgi Gerganov	865ec171aa	talk-llama : sync llama.cpp	2026-05-25 12:26:07 +03:00
Pascal	0ccd896f5b	common : fix server /inference fails to decode in-memory audio (regression) (#3818 ) * common: add memory buffer overload of read_audio_data whisper-server /inference without --convert passed the uploaded file bytes to read_audio_data as a filename, so ma_decoder_init_file tried to open a path starting with "RIFF" and failed. every request returned HTTP 400 "Invalid request" on builds without WHISPER_FFMPEG, which is the default. factor the PCM extraction into a shared helper and add an overload that decodes straight from a memory buffer via ma_decoder_init_memory, which the function already used for the stdin path. server now calls it with the upload content. the filename overload behavior is unchanged.	2026-05-22 08:27:35 +02:00
petterreinholdtsen	47b9eb37a3	examples : fix memory leak in read_audio_data (#3810 ) This commit addresses a memory leak in the `read_audio_data` function where it is currently possible that a call to `ma_decoder_init_file` succeeds and the function returns early without calling `ma_decoder_uninit`. A similar situation can occur with `ma_decoder_init_memory`. Refs: https://bugs.debian.org/1124796 Co-authored-by: Daniel Bevenius <daniel.bevenius@gmail.com>	2026-05-18 12:16:39 +02:00
Andreas Lubbe	6227a0ef73	server : Return speaker information in JSON (#3782)	2026-05-18 09:18:04 +02:00
Andreas Lubbe	968eebe772	server: add support for carry_initial_prompt (#3781) * Add support for carry_initial_prompt on the server * Update README	2026-05-15 14:03:17 +02:00
Georgi Gerganov	46ca43d639	talk-llama : sync llama.cpp	2026-05-14 21:26:48 +03:00
Georgi Gerganov	54ecc9dba4	talk-llama : sync llama.cpp	2026-05-14 21:26:48 +03:00
Andreas Lubbe	3e9b7d0fef	server : fix no_speech_thold not being read (#3783)	2026-05-13 10:37:28 +02:00
Andreas Lubbe	a604a9b5b0	server: fix params leak between requests (#3784)	2026-05-13 08:54:56 +02:00
Andreas Lubbe	338cce1e58	server: Add support for controlling token_timestamps directly (#3785)	2026-05-12 07:36:00 +02:00
Georgi Gerganov	4bf733672b	talk-llama : sync llama.cpp	2026-05-02 15:02:42 +03:00
Georgi Gerganov	c59a773605	examples : update to Q1_0	2026-05-01 13:07:33 +03:00
jinweihan	fc674574ca	bench : sync submit-results URL to ggml-org (#3769 ) The project moved from ggerganov/ to ggml-org/ and the README already references the new URL in both places it mentions issue #89 (README.md and examples/bench/README.md). Syncing the two remaining hardcoded URLs in examples/bench/bench.cpp and examples/bench.wasm/emscripten.cpp. The old URL still redirects, so this is cosmetic.	2026-04-20 07:12:57 +02:00
Georgi Gerganov	4bbce1e5b2	benches : update	2026-03-18 22:34:51 +02:00
Gaël James	21665eab4c	examples : Allow max_len to be used for any output format (#3679)	2026-03-16 13:33:56 +02:00
Igor Loskutov	136dc2eb12	server: return proper HTTP status codes for error responses (#3707 ) Several error paths in the /inference and /load endpoints returned HTTP 200 with a JSON error body, making it impossible for clients to distinguish errors from successful responses by status code. Set 400 for client errors (missing file field, unreadable audio, missing/invalid model) and 500 for server errors (ffmpeg conversion failure). The two existing status-code sites (499 for client disconnect, 500 for processing failure) are unchanged.	2026-03-16 13:33:06 +02:00
Georgi Gerganov	2bc630f197	talk-llama : sync llama.cpp	2026-03-16 13:10:15 +02:00
Georgi Gerganov	81ea958719	common : add nvfp4 (ggml/0)	2026-03-16 13:10:15 +02:00
Georgi Gerganov	84f8db71d8	talk-llama : sync llama.cpp	2026-02-27 20:57:58 +02:00
Dmitry Atamanov	cec1dd9d12	examples : update miniaudio library to 0.11.24 (#3672)	2026-02-27 11:15:15 +01:00
Georgi Gerganov	364c77f4ca	talk-llama : sync llama.cpp	2026-02-15 21:44:37 +02:00
Sid Mohan	eb27fa2252	server : fix hardcoded /inference path in default HTML page (#3639) Closes #3596	2026-02-09 10:10:13 +02:00
Georgi Gerganov	4b23ff249e	talk-llama : sync llama.cpp	2026-02-08 09:29:10 +02:00
Georgi Gerganov	953e503fd9	talk-llama : sync llama.cpp	2026-01-30 15:56:40 +02:00
Bráulio Oliveira	7aa8818647	examples : use -dev/--device and WHISPER_ARG_DEVICE (#3557) Align device selection naming with llama.cpp.	2026-01-21 08:40:30 +01:00
Georgi Gerganov	ecfcc65fbf	talk-llama : sync llama.cpp	2026-01-14 09:11:59 +02:00
Peter A.	a96310871a	examples : fix executable example targets (#3600 ) * cmake: - added `whisper-` prefix to unprefixed targets: `quantize`, `lsp`, `vad-speech-segments` - added `install(TARGETS ${TARGET} RUNTIME)` where it was missing Signed-off-by: Peter A. <ink.splatters@pm.me> * .github/workflows/build.yml: quantize -> whisper-quantize Signed-off-by: Peter A. <ink.splatters@pm.me> --------- Signed-off-by: Peter A. <ink.splatters@pm.me>	2026-01-13 08:08:18 +01:00
Georgi Gerganov	7359ac94d5	talk-llama : sync llama.cpp	2025-12-31 17:52:09 +02:00
Georgi Gerganov	6c22e792cb	talk-llama : sync llama.cpp	2025-12-18 08:20:56 +02:00
Marcos Del Sol Vives	2551e4ce98	server: allow custom temp directory for ffmpeg (#3564)	2025-12-13 09:37:44 +02:00
Georgi Gerganov	179d8b1c9c	talk-llama : sync llama.cpp	2025-12-12 18:15:27 +02:00
Daniel Bevenius	19ceec8eac	examples : fix typo in vad-speech-segments command [no ci] (#3535) This commit corrects a typo in the command-line argument for specifying the VAD model in the vad-speech-segments example.	2025-11-20 13:35:11 +01:00
Georgi Gerganov	b12abefa9b	sync : llama.cpp	2025-11-17 21:05:46 +02:00
KITAITI Makoto	27f485a14c	vad : Silero VAD v6.2.0 (#3524) * Add ggml-silero-v6.2.0 to download candidates * Make default VAD model ggml-silero-v6.2.0 * Make VAD model in documentations ggml-silero-v6.2.0	2025-11-17 22:26:17 +09:00
Georgi Gerganov	a1867e0dad	sync : llama.cpp	2025-11-09 23:38:03 +02:00
Orel-A	f16c12f3f5	wasm : fix Hebrew ID (#3487) whisper_lang_id: unknown language 'iw'	2025-10-27 08:49:32 +02:00
Georgi Gerganov	322c2adb75	talk-llama : sync llama.cpp	2025-10-22 12:58:11 +03:00
Georgi Gerganov	23c19308d8	server : set no_context == true (#3482)	2025-10-20 15:39:48 +03:00
Georgi Gerganov	8ba3c13b0c	talk-llama : sync llama.cpp	2025-10-15 09:29:17 +03:00
Georgi Gerganov	ff4c1a5a53	talk-llama : sync llama.cpp	2025-10-12 11:16:23 +03:00
Andreas Lubbe	85871a9469	whisper : add support for --carry-initial-prompt (#3395 ) * Add support for --carry-initial-prompt * PR fixes for ruby and go * Refactoring for readability * WIP 1 * WIP 2 * PR fixes * More PR fixes * PR fix * Further simplification * d'oh * One more logic fix * Update src/whisper.cpp Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> * Truncate prompt_past0 upon initialization * Slight simplification --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2025-10-10 19:51:15 +03:00
Andreas Lubbe	a0ca50f3b9	cli: Fix assignment for vad_min_silence_duration_ms (#3467 ) * cli: Fix assignment for vad_min_silence_duration_ms Found and fixed this simple copy/paste error * server : fix vad_min_silence_duration_ms assignment --------- Co-authored-by: Daniel Bevenius <daniel.bevenius@gmail.com>	2025-10-10 15:21:03 +02:00
Georgi Gerganov	8a67c55c8a	wchess : fix link [no ci]	2025-09-30 21:28:03 +03:00
Daniel Bevenius	5904d00dbb	examples : add wchess.wasm to wasm examples build (#3443) * examples : add wchess.wasm to wasm examples build This commit adds the wchess.wasm example to the wasm examples that are deployed to https://ggml.ai/whisper.cpp. Refs: https://github.com/ggml-org/whisper.cpp/issues/3434#issuecomment-3346980420	2025-09-30 16:23:01 +02:00
Georgi Gerganov	0b3587acdd	whisper : enable flash attention by default (#3441)	2025-09-30 15:47:20 +03:00
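
Commit 27101c01dc above describes merging adjacent tokens while the accumulated text still ends on an incomplete UTF-8 sequence, keeping the first token's id and extending t1 to the last absorbed token. The following is a minimal sketch of that idea, not the code from the commit: the `token_piece` struct and both function names are hypothetical, and the real `output_json` writer may differ in detail.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical token record for illustration; not the whisper.cpp struct.
struct token_piece {
    int         id;
    std::string text;
    long        t0;
    long        t1;
};

// Returns true when `s` ends in the middle of a multi-byte UTF-8 sequence.
static bool ends_with_incomplete_utf8(const std::string & s) {
    size_t i    = s.size();
    size_t cont = 0;
    // Step back over up to 3 trailing continuation bytes (10xxxxxx).
    while (i > 0 && cont < 3 && (static_cast<unsigned char>(s[i - 1]) & 0xC0) == 0x80) {
        --i;
        ++cont;
    }
    if (i == 0) {
        return false; // empty, or no lead byte: nothing to wait for
    }
    const unsigned char lead = static_cast<unsigned char>(s[i - 1]);
    size_t need = 0; // continuation bytes the lead byte announces
    if      ((lead & 0x80) == 0x00) { need = 0; } // ASCII
    else if ((lead & 0xE0) == 0xC0) { need = 1; }
    else if ((lead & 0xF0) == 0xE0) { need = 2; }
    else if ((lead & 0xF8) == 0xF0) { need = 3; }
    else                            { return false; } // invalid lead byte
    return cont < need;
}

// Merges each token into the previous entry while that entry's text is
// still an incomplete UTF-8 sequence. The merged entry keeps the first
// token's id and t0 and takes t1 from the last absorbed token.
static std::vector<token_piece> merge_split_tokens(const std::vector<token_piece> & in) {
    std::vector<token_piece> out;
    bool pending = false;
    for (const auto & tok : in) {
        if (pending) {
            out.back().text += tok.text;
            out.back().t1    = tok.t1;
        } else {
            out.push_back(tok);
        }
        pending = ends_with_incomplete_utf8(out.back().text);
    }
    return out;
}
```

With a 3-byte CJK character split as `"\xE6\x97"` and `"\xA5"`, the two pieces collapse into one entry whose text is valid UTF-8. The sketch only checks the tail of the accumulated string and does not validate the rest of the text.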

601 Commits