whisper.cpp/benchmark/references
shaihi 40a46f9ed1 benchmark: add harness artifacts and handoff record 2026-03-09 10:13:34 +02:00

README.md

Reference Transcripts

This directory contains canonical reference transcripts used by the benchmark correctness gate.

Expected files:

  • short.txt
  • medium.txt
  • long.txt
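
A quick pre-flight check can confirm that all three files are present before a benchmark run. The sketch below is illustrative only: the helper name `missing_references` and the constant `EXPECTED_REFERENCES` are hypothetical and are not part of benchmark/bench.sh or benchmark/parse_results.py.

```python
from pathlib import Path

# File names taken from the "Expected files" list above.
EXPECTED_REFERENCES = ("short.txt", "medium.txt", "long.txt")


def missing_references(directory):
    """Return the expected reference files that are absent from `directory`.

    Hypothetical helper for illustration; the real harness may check this differently.
    """
    root = Path(directory)
    return [name for name in EXPECTED_REFERENCES if not (root / name).is_file()]
```

Calling `missing_references("benchmark/references")` from the repository root would return an empty list when the directory is complete.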

How they are used:

  • benchmark/parse_results.py extracts transcript text from each measured run log.
  • Text is normalized (case, punctuation, spacing) before comparison.
  • Word error rate (WER) and character error rate (CER) are computed against these reference files.
  • benchmark/bench.sh enforces correctness thresholds by default:
    • MAX_WER=0.02 (at most 2% word error rate)
    • MAX_CER=0.02 (at most 2% character error rate)
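
The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the code in benchmark/parse_results.py: the function names are hypothetical, the normalization rules (lowercasing, stripping ASCII punctuation, collapsing whitespace) are one plausible reading of "case, punctuation, spacing", and the gate is assumed to pass when each rate is less than or equal to its threshold.

```python
import string


def normalize(text):
    """Lowercase, drop ASCII punctuation, and collapse runs of whitespace.

    Assumed normalization; the actual rules in parse_results.py may differ.
    """
    text = text.lower()
    text = text.translate(str.maketrans("", "", string.punctuation))
    return " ".join(text.split())


def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (lists of words or characters)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(ref_units, hyp_units):
    """Edit distance divided by the reference length."""
    if not ref_units:
        return 0.0 if not hyp_units else 1.0
    return edit_distance(ref_units, hyp_units) / len(ref_units)


def wer(reference, hypothesis):
    """Word error rate over normalized text."""
    return error_rate(normalize(reference).split(), normalize(hypothesis).split())


def cer(reference, hypothesis):
    """Character error rate over normalized text."""
    return error_rate(list(normalize(reference)), list(normalize(hypothesis)))


def passes_gate(reference, hypothesis, max_wer=0.02, max_cer=0.02):
    """Assumed gate: both rates must be at or below their thresholds."""
    return wer(reference, hypothesis) <= max_wer and cer(reference, hypothesis) <= max_cer
```

With a 2% threshold, a single wrong word in a four-word reference (WER 0.25) fails the gate, while differences in case, punctuation, or spacing alone do not.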

Notes:

  • Keep the references fixed once a baseline is established.
  • If audio inputs change, regenerate references intentionally and document why.