* ci : remove base-devel and git from msys2 job
This commit removes the above packages as they might not be required,
and removing them could help reduce the GitHub cache size.
* ci : try reducing the installs to only the compilers
This commit updates the setup emscripten sdk jobs to use emscripten-core
instead of mymindstorm and also pins the commit sha for the version
instead of using a version tag.
This commit updates the Install cache step to use ggml-org/ccache-action
and switches from sccache to ccache.
The motivation for switching to ccache is that this is what llama.cpp
does, and there is also an issue with later versions of sccache:
```console
sccache C:\PROGRA~1\NVIDIA~1\CUDA\v\bin\nvcc.exe -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_SHARED -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_SCHED_MAX_COPIES=4 -DGGML_SHARED -D_CRT_SECURE_NO_WARNINGS -D_XOPEN_SOURCE=600 -Dggml_cuda_EXPORTS -DCMAKE_INTDIR=\"Release\" -ID:\a\whisper.cpp\whisper.cpp\ggml\src\ggml-cuda\.. -ID:\a\whisper.cpp\whisper.cpp\ggml\src\..\include -isystem "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v\include" -Xcompiler="-MD -O2 -Ob2" -DNDEBUG -std=c++17 -arch=native -use_fast_math -extended-lambda -Xcompiler /Zc:preprocessor -MD -MT ggml\src\ggml-cuda\CMakeFiles\ggml-cuda.dir\Release\allreduce.cu.obj -MF ggml\src\ggml-cuda\CMakeFiles\ggml-cuda.dir\Release\allreduce.cu.obj.d -x cu -c D:\a\whisper.cpp\whisper.cpp\ggml\src\ggml-cuda\allreduce.cu -o ggml\src\ggml-cuda\CMakeFiles\ggml-cuda.dir\Release\allreduce.cu.obj -Xcompiler=-Fdggml\src\ggml-cuda\CMakeFiles\ggml-cuda.dir\Release\,-FS
sccache: encountered fatal error
sccache: error: Could not parse shell line
sccache: caused by: Could not parse shell line
```
* ci : add ccache clear action
* ci : split self-hosted GPU jobs into build-self-hosted.yml
Extract self-hosted runner jobs from build.yml into a dedicated
build-self-hosted.yml following the llama.cpp pattern:
- gpu-cuda (NVIDIA Linux)
- gpu-vulkan-nvidia-cm (NVIDIA Linux)
- gpu-vulkan-nvidia-cm2 (NVIDIA Linux + COOPMAT2)
- gpu-metal (macOS ARM64)
- gpu-vulkan (macOS ARM64)
GitHub-hosted CPU jobs remain in build.yml.
Assisted-by: llama.cpp:local pi
* ci : split release jobs into release.yml
Extract release-related jobs from build.yml into a dedicated
release.yml following the llama.cpp pattern:
- determine-tag
- windows (Win32/x64, SDL2)
- windows-blas (Win32/x64, OpenBLAS)
- windows-cublas (x64, CUDA 11.8/12.4)
- ios-xcode-build
- bindings-java (depends on windows)
- release (artifact aggregation + GitHub release)
CoreML job stays in build.yml with its own local tag calculation.
Assisted-by: llama.cpp:local pi
* ci : remove bindings-java job from release.yml
Assisted-by: llama.cpp:local pi
* cont : add manual trigger for build.yml
* cont : remove obsolete ifs
* ci : extract sanitizer job to build-sanitize.yml
* ci : extract linux jobs into build-linux.yml
* ci : extract macos jobs to build-macos.yml
* ci : extract gcc jobs to build-gcc.yml
* ci : extract clang jobs to build-clang.yml
* ci : extract sycl jobs to build-sycl.yml
* ci : extract windows jobs to build-windows.yml
* ci : extract emscripten job to build-wasm.yml
* ci : extract android jobs into build-android.yml
* ci : extract quantize job to quantize.yml
* ci : extract coreml job into coreml.yml
* ci : extract vad job to vad.yml
* ci : extract cpu jobs to build-cpu.yml
* ci : make naming of yml files consistent
* ci : add --fail to curl download and propagate
This commit adds the --fail option to the model download scripts so that
a server error returned by the model download is picked up. The failure
is then detected in run.sh, which displays an error message, stops, and
returns an error.
The motivation for this is that currently the model download can fail
while the script proceeds, and instead of a model file the downloaded
file will contain an HTML page, probably describing the error. The model
then fails to load due to a missing magic number. I'm not sure we can do
much about the download failing, perhaps a retry, but at least this will
give a clearer error message.
Refs: https://github.com/danbev/whisper.cpp/actions/runs/26866349389/job/79230794512
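As a minimal sketch of the pattern described above (the function name
`fetch_model` and its arguments are hypothetical and not taken from the
actual scripts), a download helper can let curl's exit status decide
whether the caller continues:

```shell
# Sketch only: fetch_model is a hypothetical helper, not the function
# used by the real download script or by run.sh.

# Download $1 (URL) to $2 (destination file). With --fail, curl exits
# non-zero (22) on HTTP errors >= 400 instead of saving the server's
# error page to the destination file.
fetch_model() {
    if ! curl --fail --location --output "$2" "$1"; then
        echo "error: failed to download model from $1" >&2
        rm -f "$2"   # do not leave a partial or bogus file behind
        return 1
    fi
}

# Usage (commented out so that sourcing this file has no side effects):
# fetch_model "$url" "$dest" || exit 1
```

The caller only has to check the return status, which is what makes it
possible to stop early with a clear message instead of failing later on
a missing magic number.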
* ci : enable command traces to see download command in use
* ci : add retry functionality to download model script
This commit adds curl retry options to the model download script.
The motivation is that currently, when CI jobs run, Hugging Face rate
limits the requests and returns:
```console
curl: (22) The requested URL returned error: 429
```
This is an attempt to work around this, and if it does not work then we
can add an authorization token.
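For illustration, a sketch of what such retry options can look like.
The helper name `fetch_with_retry` and the retry count and delay are
assumptions chosen for the example, not the values used by the actual
script:

```shell
# Sketch only: fetch_with_retry and the values 5 and 10 are illustrative
# assumptions, not taken from the real download script.

# Download $1 (URL) to $2 (destination file), retrying transient errors.
fetch_with_retry() {
    # --fail          : exit non-zero on HTTP errors so failures are visible
    # --retry N       : retry up to N times on transient errors, which for
    #                   HTTP include 408, 429, 500, 502, 503 and 504
    # --retry-delay S : wait S seconds between attempts
    curl --fail --location \
         --retry 5 --retry-delay 10 \
         --output "$2" "$1"
}

# Usage (commented out so that sourcing this file has no side effects):
# fetch_with_retry "$url" "$dest" || exit 1
```

Since 429 is among the responses curl treats as transient, the rate
limit error shown above would trigger a retry rather than an immediate
failure.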
* ci : extract freebsd job to build-freebsd.yml
This job has been commented out as it has been flaky in the past. I'll
monitor this and if it continues to be unreliable we can disable it in
the github actions GUI instead of commenting it out like we did before.
* ci : add ccache to jobs (non-docker builds)
The ccache will only be saved on pushes to master.
* ci : bump ccache-action version to v1.2.21
The motivation for this is that the save parameter does not seem to work
with the current version.
* ci : add ccache to docker jobs in build-linux.yml
* ci : add debug statements to linux docker build
* ci : set CCACHE_DIR for build-linux.yml
* ci : add ccache to the remaining docker jobs
* ci : remove build-linux.yml
This commit removes build-linux.yml as the same jobs are also run by
build-gcc.yml (with the exception that build-gcc.yml also runs ctest).
So we keep build-gcc.yml and remove the redundant build-linux.yml.
* ci : add linux build artifacts to release
* ci : revert to hendrikmuhs/ccache-action for win job
This is currently causing the following failure:
```console
sccache C:\PROGRA~1\NVIDIA~1\CUDA\v\bin\nvcc.exe -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_SHARED -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_SCHED_MAX_COPIES=4 -DGGML_SHARED -D_CRT_SECURE_NO_WARNINGS -D_XOPEN_SOURCE=600 -Dggml_cuda_EXPORTS -DCMAKE_INTDIR=\"Release\" -ID:\a\whisper.cpp\whisper.cpp\ggml\src\ggml-cuda\.. -ID:\a\whisper.cpp\whisper.cpp\ggml\src\..\include -isystem "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v\include" -Xcompiler="-MD -O2 -Ob2" -DNDEBUG -std=c++17 -arch=native -use_fast_math -extended-lambda -Xcompiler /Zc:preprocessor -MD -MT ggml\src\ggml-cuda\CMakeFiles\ggml-cuda.dir\Release\allreduce.cu.obj -MF ggml\src\ggml-cuda\CMakeFiles\ggml-cuda.dir\Release\allreduce.cu.obj.d -x cu -c D:\a\whisper.cpp\whisper.cpp\ggml\src\ggml-cuda\allreduce.cu -o ggml\src\ggml-cuda\CMakeFiles\ggml-cuda.dir\Release\allreduce.cu.obj -Xcompiler=-Fdggml\src\ggml-cuda\CMakeFiles\ggml-cuda.dir\Release\,-FS
sccache: encountered fatal error
sccache: error: Could not parse shell line
sccache: caused by: Could not parse shell line
```
Refs: https://github.com/danbev/whisper.cpp/actions/runs/26883673904/job/79290017353
* ci : make static linux artifacts
* ci : make linux release artifact names consistent
This commit removes the tag from the Linux release artifact names to be
consistent with the existing artifacts.
If we want to include the tag then we can do that in a follow-up PR.
* ci : fix linux zip files to have a directory
* ci : add HF_TOKEN secret for HF download authorization
This is to avoid the HF rate limiting when downloading models.
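A minimal sketch of how a download step might use such a token. The
helper name `fetch_hf_model` is hypothetical, and it is an assumption
that the workflow exports the secret to the script as the environment
variable `HF_TOKEN`:

```shell
# Sketch only: fetch_hf_model is a hypothetical helper. HF_TOKEN is
# assumed to be exported to the step from the repository secret.

# Download $1 (URL) to $2 (destination file), authenticating if possible.
fetch_hf_model() {
    url="$1"
    dest="$2"
    set -- --fail --location --output "$dest"
    # Only send the header when a token is available, so that runs
    # without a token still work (subject to rate limits).
    if [ -n "${HF_TOKEN:-}" ]; then
        set -- "$@" --header "Authorization: Bearer ${HF_TOKEN}"
    fi
    curl "$@" "$url"
}

# Usage (commented out so that sourcing this file has no side effects):
# fetch_hf_model "$url" "$dest" || exit 1
```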
---------
Co-authored-by: Daniel Bevenius <daniel.bevenius@gmail.com>
This commit adds an ignore for bindings-ruby and bindings-go in
build.yml as these are handled by separate .yml files (separate jobs)
and don't need to trigger a full CI build.
* ci : add on push/pull_request paths ruby job
This commit adds paths to bindings-ruby to only build if changes were
made to bindings/ruby or to include/whisper.h.
* ci : add additional paths [no ci]
This commit re-enables the arm64 docker image builds which were removed
in commit 9366544991 ("ci : fix arm builds"). It also uses
ubuntu-24.04-arm as the runner, which enables us to avoid QEMU.
Resolves: https://github.com/ggml-org/whisper.cpp/issues/2859
* ci : set GGML_NATIVE=OFF for bindings-java
This commit attempts to address an issue with the bindings-java job
which is currently failing.
I've not been able to reproduce this locally on my Windows machine and I
suspect that what might be happening is that the Windows job compiles on
a runner that has certain CPU features, for example AVX512, and when
the resulting dll is used on a different runner that does not have that
feature it will crash.
Refs: https://github.com/ggml-org/whisper.cpp/actions/runs/26496174929/job/78059073255?pr=3829
* ci : also disable BMI2
* ci : use github ubuntu-22.04-arm runner instead of qemu
This commit updates the ubuntu-22-gcc-arm64 job to use an ARM GitHub
runner instead of QEMU.
The motivation for this is that we get intermittent failures
specifically related to QEMU. For example:
```console
Segmentation fault (core dumped)
qemu: uncaught target signal 11 (Segmentation fault) - core dumped
Segmentation fault (core dumped)
dpkg: error processing package libc-bin (--configure):
installed libc-bin package post-installation script subprocess returned error exit status 139
Processing triggers for ca-certificates (20240203~22.04.1) ...
Updating certificates in /etc/ssl/certs...
0 added, 0 removed; done.
Running hooks in /etc/ca-certificates/update.d...
done.
Errors were encountered while processing:
libc-bin
E: Sub-process /usr/bin/dpkg returned an error code (1)
```
This is an attempt to avoid QEMU and hence avoid this issue.
* ci : remove QEMU where possible
* cmake:
- added `whisper-` prefix to unprefixed targets: `quantize`, `lsp`,
`vad-speech-segments`
- added `install(TARGETS ${TARGET} RUNTIME)` where it was missing
Signed-off-by: Peter A. <ink.splatters@pm.me>
* .github/workflows/build.yml: quantize -> whisper-quantize
Signed-off-by: Peter A. <ink.splatters@pm.me>
---------
Signed-off-by: Peter A. <ink.splatters@pm.me>
This commit removes the brew install of cmake for macos-latest
as this now seems to be pre-installed on the runner.
The motivation for this is that this job is failing with the following
error:
```console
Error: cmake was installed from the local/pinned tap
but you are trying to install it from the homebrew/core tap.
Formulae with the same name from different taps cannot be installed at the same time.
```
This commit adds specific paths to the GitHub Actions workflow file
`.github/workflows/build.yml`.
The motivation for this is to avoid unnecessary builds when unrelated
files are changed, which can save resources and time during the CI
process.
Refs: https://github.com/ggml-org/whisper.cpp/issues/3285
This commit modifies the musa Dockerfile to selectively copy the
directories needed for the container image.
This commit also adds a step to the docker workflow to free up disk
space in an attempt to make enough room for the large musa build
containers.
The motivation for this change is to reduce the size of the container
image and try to avoid disk usage issues in CI.
* ci: set fail-fast to false in docker.yml
This commit modifies the GitHub Actions workflow for Docker builds to
disable the fail-fast behavior.
The motivation for this is that currently if one of the strategy jobs
fails any other job that is in progress will be cancelled. There is no
need for this as the jobs are independent.
* ci : update docker.yml to use a single build
This commit updates the docker job to only build the image once instead
of twice (only happens when pushing to the master branch). Instead this
will tag the image with the commit SHA when pushing to master.
The motivation for this change is to reduce the time it takes to run
this job and also it might help with the disk space issues we are
experiencing for this job when it runs on pushes to master.
* ci : add should_release variable
This commit adds a `should_release` variable to the GitHub Actions
workflow to determine if a release should be created based on the tag or
branch conditions.
The motivation for this is that it simplifies the logic for deciding
whether to upload artifacts or not, making it easier to maintain if we
need to change the conditions in the future.
* ci : set release draft to true
This commit modifies the GitHub Actions workflow to support
tag-based releases. When a tag is pushed that starts with 'v', the
workflow will use that tag name for the release process.
I think this was once the behavior, but it was lost in updates that
I've made to the workflow. This commit restores that functionality.
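The tag rule described above can be sketched as a small shell helper.
The name `release_tag_from_ref` is hypothetical, and how the workflow
picks a tag for non-tag refs is not shown here:

```shell
# Sketch only: release_tag_from_ref is a hypothetical helper and is not
# taken from the actual workflow.

# Print the tag name if $1 is a git ref for a tag starting with 'v'.
# Otherwise print nothing and return 1 so the caller can fall back.
release_tag_from_ref() {
    case "$1" in
        refs/tags/v*) printf '%s\n' "${1#refs/tags/}" ;;
        *) return 1 ;;
    esac
}

# Usage in a workflow step, where GITHUB_REF is set by GitHub Actions
# (commented out so that sourcing this file has no side effects):
# if tag="$(release_tag_from_ref "$GITHUB_REF")"; then
#     echo "releasing $tag"
# fi
```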
This commit re-enables the main-cuda Docker build in the CI workflow.
The main-cuda Dockerfile has been updated to remove build artifacts
and also print the size of the /app directory after the build. A similar
change was recently made to the musa Dockerfile, and perhaps this job
was also having similar disk space issues.
The motivation for this change is that this configuration has been
disabled for a while due to persistent build failures. However, the
actual logs are no longer available.
Resolves: https://github.com/ggml-org/whisper.cpp/issues/3040
* ci : update windows runner to windows-2022
This commit changes the windows-2019 runner to windows-2022.
The motivation for this is that the windows-2019 runner is scheduled for
deprecation and will be removed 2025-06-30. There are currently
"brownout" periods that started 2025-06-01, and during these times jobs
using windows-2019 will fail, which has happened lately on our CI.
Refs: https://github.com/actions/runner-images/issues/12045
This commit updates the build workflow to replace `ports.ubuntu.com`
with `mirror.kumi.systems` in the apt sources list for ARM64 builds.
This change is intended to improve package download reliability and
speed by using a more stable mirror for ARM64 packages.
This commit modifies windows-blas, which was updated previously to use
the zip functionality provided by `actions/upload-artifact`. This turned
out to be incorrect and I should not have done that. The reason for
zipping the archives first is that otherwise the artifacts, when
downloaded, will be unzipped and just be simple directories. In our case
the release task depends on the artifacts having a .zip extension so
that those archives are included in the release.
* ci : use dynamic libopenblas.dll for windows-blas
This commit updates the windows-blas job to use the dynamic (can load
different kernels depending on the CPU arch) libopenblas.dll instead of
the "static" openblas.dll that gets installed by vcpkg.
The motivation for this change is that there have been reports of
performance drops in later versions, specifically related to blas.
Please see the links below for more details.
Resolves: https://github.com/ggml-org/whisper.cpp/issues/3166
Refs: https://github.com/ggml-org/whisper.cpp/issues/2666#issuecomment-2885978811
* vad : add initial Voice Activity Detection (VAD) support
This commit adds support for Voice Activity Detection (VAD). When enabled
this feature will process the audio input and detect speech segments.
This information is then used to reduce the number of samples that need
to be processed by whisper_full.
Resolves: https://github.com/ggml-org/whisper.cpp/issues/3003
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
This commit adds steps to the windows jobs to zip and upload
artifacts produced.
The motivation for this is that currently the artifacts are not zipped,
which means that they will not be picked up by the release job and hence
not be included in GitHub releases.
Resolves: https://github.com/ggml-org/whisper.cpp/issues/3119
This commit adds the .zip extension to the xcframework artifact name in
the GitHub Actions workflow.
The motivation for this is that the release job will look for .zip files
and will not find the xcframework artifact without the extension, and
hence will not upload it to the release.
This commit disables the publishing of the Java binding to the Maven
repository.
The motivation for this is that this job was disabled for some time and
recently it was re-enabled, but the publishing of the Java binding
caused the build to fail and needs to be investigated further.
Refs: https://github.com/ggml-org/whisper.cpp/issues/3079
* Update PATH for main/main-cuda container
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
* Add Dockerfile for musa, .dockerignore and update CI
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
* Add Moore Threads GPU Support in README.md and replace ./main with whisper-cli
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
* Forward GGML_CUDA/GGML_MUSA to cmake in Makefile
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
* Minor updates for PATH ENV in Dockerfiles
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
* Address comments
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
---------
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
This commit disables the FreeBSD job in build.yml of the GitHub Actions
workflow.
The motivation for this is that this job seems to stall and time out
from time to time, taking up to 6 hours to complete/cancel.