What the File name too long: 'AAAAGGZ0eXBtcDQy...' error means
That string (AAAAGGZ0eXBtcDQy...) is the base64 encoding of your .m4a file header (an MP4 container). The server is not decoding it as base64 audio. Instead, it treats the inputs string as a file path and tries to open it as one; because the string is thousands of characters long, the OS raises "File name too long".
This is a known failure mode for ASR requests where a base64 string is accidentally interpreted as a path. (Hugging Face Forums)
So: your audio length (15 seconds) is not the issue. The request is being interpreted incorrectly on the server side.
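For context, the failing request most likely has this shape (a reconstruction based on the error message, not your exact payload): a JSON body whose inputs value is the base64 string, which the backend then tries to open as a filename.
{
  "inputs": "AAAAGGZ0eXBtcDQy... (thousands more base64 characters)",
  "parameters": { "generation_parameters": { "do_sample": false } }
}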
Why this happens on this endpoint
1) The ASR schema says inputs is a base64 string (but some backends still treat strings as paths)
Hugging Face’s Inference Providers ASR documentation states:
inputs: a base64-encoded audio string, or raw bytes if you don't send parameters. (Hugging Face)
However, some ASR serving wrappers (and older endpoint implementations) handle a string inputs as “path/URL to an audio file” first, and only decode bytes in other branches. When that happens, base64 gets misread as a path → “file name too long”. (Hugging Face Forums)
2) .m4a decoding and content-type handling can be brittle
For binary-audio tasks, HF has historically relied on “content-type guessing” or backend-specific decoding paths; inconsistencies between serverless inference and other deployments are documented as a practical pitfall. (Hugging Face)
Fixes (choose based on whether you must send parameters)
Fix A — Most reliable: send raw audio bytes (no JSON, no parameters)
Per the ASR docs, if you omit parameters, you can send raw bytes directly. (Hugging Face)
Dart (raw bytes)
// Requires: import 'package:http/http.dart' as http;
final url = Uri.parse(
  '/static-proxy?url=https%3A%2F%2Frouter.huggingface.co%2Fhf-inference%2Fmodels%2Fopenai%2Fwhisper-large-v3-turbo',
);
final req = http.Request('POST', url)
  ..headers['Authorization'] = 'Bearer $_hfToken'
  ..headers['Accept'] = 'application/json'
  // For .m4a in an MP4 container, audio/mp4 is generally safer than audio/m4a.
  ..headers['Content-Type'] = 'audio/mp4'
  ..bodyBytes = audioBytes;

final streamed = await req.send();
final body = await streamed.stream.bytesToString();
if (streamed.statusCode != 200) {
  throw Exception('HTTP ${streamed.statusCode}: $body');
}
Why this works: the server receives bytes and can’t misinterpret them as a filename. (Hugging Face)
Tradeoff: you can’t pass generation_parameters in this “raw bytes” mode (the docs only describe raw bytes when no parameters are provided). (Hugging Face)
Fix B — If you need generation_parameters: pass a URL as inputs (JSON)
If your backend is treating string inputs as a path, then use that intentionally: provide an HTTPS URL to the audio file (ideally a short-lived signed URL), and keep your JSON parameters.
This aligns with HF’s client documentation: ASR inputs can be raw bytes, a local file, or a URL. (Hugging Face)
JSON body
{
  "inputs": "https://<signed-url>/audio.m4a",
  "parameters": {
    "generation_parameters": {
      "do_sample": false,
      "temperature": 0,
      "top_p": 1,
      "num_beams": 1,
      "max_new_tokens": 448
    }
  }
}
Dart (URL input + parameters)
// Requires: import 'dart:convert'; plus package:http as in Fix A.
final payload = jsonEncode({
  "inputs": signedAudioUrl, // https://...
  "parameters": {
    "generation_parameters": {
      "do_sample": false,
      "temperature": 0,
      "top_p": 1,
      "num_beams": 1,
      "max_new_tokens": 448,
    }
  }
});

final req = http.Request('POST', url)
  ..headers['Authorization'] = 'Bearer $_hfToken'
  ..headers['Content-Type'] = 'application/json'
  ..headers['Accept'] = 'application/json'
  ..body = payload;
Why this works: it avoids base64 entirely, and it matches the “string interpreted as path/URL” behavior that is causing your error. (Hugging Face Forums)
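Sending the request and reading the result then works exactly as in Fix A. A minimal sketch, assuming the standard ASR response shape where the transcription comes back in a text field:
final streamed = await req.send();
final body = await streamed.stream.bytesToString();
if (streamed.statusCode != 200) {
  throw Exception('HTTP ${streamed.statusCode}: $body');
}
// Typical ASR responses look like {"text": "..."}.
final transcript = jsonDecode(body)['text'] as String;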
Fix C — If you need to force “transcribe” vs “translate” reliably: use a deployment that exposes Whisper’s task/language controls
Whisper supports explicit generation controls:
task:"transcribe"or"translate"language: tokens like"en"/"english"(Hugging Face)
But the serverless ASR interface you’re calling is a generic ASR wrapper; even when generation_parameters is supported, Whisper-specific task/language may not be exposed the way Transformers exposes them. (Hugging Face)
If you must guarantee “never translate, always transcribe (and optionally force English)”, the robust approach is to run Whisper behind an endpoint where you control the inference code (so you can set task="transcribe" and language="english" explicitly). (Hugging Face)
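For illustration only, a custom handler you write could accept a body like the one below; the task and language keys here are hypothetical and only take effect if your own inference code reads them and passes them to Whisper's generate call.
{
  "inputs": "https://<signed-url>/audio.m4a",
  "parameters": {
    "task": "transcribe",
    "language": "english"
  }
}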
Practical stability tips (to reduce “random languages”)
Even after you fix the request shape, "random language" output is often caused by audio decoding / language-ID instability. The highest-impact changes are:
- Convert to WAV PCM, mono, 16 kHz before sending (then use Content-Type: audio/wav); see the ffmpeg example after this list.
- If you keep .m4a, use audio/mp4 rather than audio/m4a (some stacks handle it more consistently).
- Make decoding deterministic (temperature: 0, do_sample: false), but that requires Fix B (URL input) or an endpoint that accepts parameters with bytes. (Hugging Face)
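If you have ffmpeg available (an assumption; any resampling tool works), the WAV conversion from the first bullet looks like this:
ffmpeg -i input.m4a -ac 1 -ar 16000 -c:a pcm_s16le output.wav
Here -ac 1 forces mono, -ar 16000 resamples to 16 kHz, and pcm_s16le produces 16-bit PCM WAV, which matches the input format Whisper expects.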
Recommendation for your exact situation
- First, switch to Fix A (raw bytes) and confirm transcription works consistently (this isolates request-format issues). (Hugging Face)
- If you need deterministic decoding knobs, move to Fix B (URL input) and keep generation_parameters. (Hugging Face)
- If you need a hard guarantee on transcribe vs translate, use a setup that exposes Whisper's task/language controls directly. (Hugging Face)