From afa2ea544fb4b0448916b4a31ecd33c8685bd482 Mon Sep 17 00:00:00 2001
From: Daniel Bevenius <daniel.bevenius@gmail.com>
Date: Tue, 19 May 2026 08:58:43 +0200
Subject: [PATCH] whisper : set bench data for each iteration (#3812)

* whisper : set bench data for each iteration

This commit updates whisper_bench_ggml_mul_mat_str to initialize the
tensor data for each iteration.

The motivation for this is that it is currently possible for a previous
run's results, F32 values, to leak into the next run. When the F16
iteration runs, those leftover F32 bytes can show up as NaN values in
the tensor data, causing the F16 iteration to fail.
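
To illustrate the failure mode, here is a minimal standalone sketch (not
part of this patch; the helper names are hypothetical and a
little-endian layout is assumed). It counts F16 bit patterns that are
Inf or NaN in a byte buffer. Masking every byte with 0x3F clears the top
two bits of each byte, so the five F16 exponent bits can never all be
set, while stale bytes from an earlier run carry no such guarantee.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// True if the 16-bit pattern is an IEEE 754 half-precision Inf or NaN,
// i.e. all five exponent bits (mask 0x7C00) are set.
static bool f16_bits_non_finite(uint16_t h) {
    return (h & 0x7C00u) == 0x7C00u;
}

// Count non-finite F16 values in a byte buffer (little-endian assumed).
static size_t count_non_finite_f16(const std::vector<uint8_t> & bytes) {
    size_t n = 0;
    for (size_t i = 0; i + 1 < bytes.size(); i += 2) {
        const uint16_t h = (uint16_t) (bytes[i] | (bytes[i + 1] << 8));
        if (f16_bits_non_finite(h)) {
            n++;
        }
    }
    return n;
}

// Hypothetical helper: fill a buffer with the same byte pattern the
// patch writes into the tensors (ii & 0x3F).
static std::vector<uint8_t> masked_fill(size_t n_bytes) {
    std::vector<uint8_t> v(n_bytes);
    for (size_t ii = 0; ii < n_bytes; ii++) {
        v[ii] = ii & 0x3F;
    }
    return v;
}
```

Calling count_non_finite_f16(masked_fill(n)) returns 0 for any n,
whereas a buffer holding the bytes 0x99 0x76 0x96 0x7E (a finite F32
bit pattern) contains one F16 NaN in its upper half.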

Refs: https://github.com/ggml-org/whisper.cpp/actions/runs/25901678402/job/76152894644?pr=3735

* ci : set GGML_NATIVE=OFF if x86_64

This commit sets GGML_NATIVE=OFF for x86_64 architectures.

The motivation for this is to try to get CI to pass. The theory is that
the libggml-cpu.so library in the ccache might have been built by a
runner that supports a different instruction set. When another runner
that does not support that instruction set tries to use it, it will fail
with a segmentation fault.
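
One way to test this theory on a runner is to print which SIMD
extensions its CPU reports and compare the output across runners. The
following is a minimal sketch and not part of this patch: it assumes a
GCC or Clang build on x86, uses the compiler builtin
__builtin_cpu_supports, and the function name cpu_feature_report is
hypothetical.

```cpp
#include <string>

// Report a few x86 SIMD features of the CPU this binary is running on.
// On other compilers or architectures a placeholder line is returned.
static std::string cpu_feature_report() {
#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
    __builtin_cpu_init(); // harmless if the feature data is already set up
    std::string out;
    out += std::string("sse4.2=")  + (__builtin_cpu_supports("sse4.2")  ? "1" : "0") + "\n";
    out += std::string("avx=")     + (__builtin_cpu_supports("avx")     ? "1" : "0") + "\n";
    out += std::string("avx2=")    + (__builtin_cpu_supports("avx2")    ? "1" : "0") + "\n";
    out += std::string("fma=")     + (__builtin_cpu_supports("fma")     ? "1" : "0") + "\n";
    out += std::string("avx512f=") + (__builtin_cpu_supports("avx512f") ? "1" : "0") + "\n";
    return out;
#else
    return "not an x86 GCC/Clang build\n";
#endif
}
```

If two runners print different feature lines, a cached library built
with GGML_NATIVE=ON on one of them may not be usable on the other.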

I'm not sure about this yet, but I'm going to try it out, and if it
does not work I'll ssh into the runner to debug further.
---
 ci/run.sh       |  4 ++++
 src/whisper.cpp | 12 +++++++++---
 2 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/ci/run.sh b/ci/run.sh
index cbe28442e..b03fdf1c6 100644
--- a/ci/run.sh
+++ b/ci/run.sh
@@ -50,6 +50,10 @@ fi
 
 CMAKE_EXTRA="-DWHISPER_FATAL_WARNINGS=ON"
 
+if [[ "$(uname -m)" == "x86_64" ]]; then
+    CMAKE_EXTRA="${CMAKE_EXTRA} -DGGML_NATIVE=OFF"
+fi
+
 if [ ! -z ${GG_BUILD_METAL} ]; then
     CMAKE_EXTRA="${CMAKE_EXTRA} -DGGML_METAL=ON"
 fi
diff --git a/src/whisper.cpp b/src/whisper.cpp
index 210ca597f..0fe29a454 100644
--- a/src/whisper.cpp
+++ b/src/whisper.cpp
@@ -8258,9 +8258,6 @@ WHISPER_API const char * whisper_bench_ggml_mul_mat_str(int n_threads) {
     // when F16 is used, there is an extra work buffer of size N*N*sizeof(float)
     std::vector<uint8_t> buf(3llu*N_max*N_max*sizeof(float) + 3*ggml_tensor_overhead() + ggml_graph_overhead());
 
-    // put a bunch of random data in the buffer
-    for (size_t i = 0; i < buf.size(); i++) buf[i] = i;
-
     for (int j = 0; j < (int) sizes.size(); j++) {
         int n_q4_0 = 0;
         int n_q4_1 = 0;
@@ -8304,6 +8301,15 @@ WHISPER_API const char * whisper_bench_ggml_mul_mat_str(int n_threads) {
             struct ggml_tensor * a = ggml_new_tensor_2d(ctx0, wtype,         N, N);
             struct ggml_tensor * b = ggml_new_tensor_2d(ctx0, GGML_TYPE_F32, N, N);
 
+            // set tensor data after allocation so previous iteration results don't corrupt it.
+            {
+                uint8_t * a_data = (uint8_t *) a->data;
+                for (size_t ii = 0; ii < ggml_nbytes(a); ii++) a_data[ii] = ii & 0x3F;
+
+                uint8_t * b_data = (uint8_t *) b->data;
+                for (size_t ii = 0; ii < ggml_nbytes(b); ii++) b_data[ii] = ii & 0x3F;
+            }
+
             struct ggml_tensor * c = ggml_mul_mat(ctx0, a, b);
 
             struct ggml_cgraph * gf = ggml_new_graph(ctx0);