Incorrect feature_size in preprocessor_config.json (should be 80)
#199
by alexg1802 - opened
Hello,
It seems the preprocessor_config.json file for openai/whisper-large-v3 currently contains:
"feature_size": 128
However, Whisper models use 80 mel filterbanks, and this causes runtime crashes in pipelines using WhisperProcessor.from_pretrained.
Expected value: 80
Can you please correct this? Thank you!
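Hi, as mentioned in the documentation, Whisper large-v3 uses 128 mel frequency bins for its spectrogram input, in contrast to the other Whisper models, which use 80. So "feature_size": 128 is the expected value for this checkpoint.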
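For anyone hitting a similar crash, one quick sanity check is to compare the feature extractor's `feature_size` (from preprocessor_config.json) with the model's `num_mel_bins` (from config.json), since the two need to agree. The snippet below is only a minimal sketch: the helper name `mel_bins_match` is hypothetical, and the inline JSON strings are trimmed stand-ins for the real files, which you would normally read with `json.load` from your local copy of the checkpoint.

```python
import json


def mel_bins_match(preprocessor_config: dict, model_config: dict) -> bool:
    """Return True when the feature extractor and the model agree on mel bins.

    Hypothetical helper for illustration; it only compares two config keys.
    """
    return preprocessor_config["feature_size"] == model_config["num_mel_bins"]


# Trimmed stand-ins for preprocessor_config.json and config.json (assumption:
# real files contain many more keys; only the relevant ones are shown here).
large_v3_preprocessor = json.loads('{"feature_size": 128}')
large_v3_model = json.loads('{"num_mel_bins": 128}')

# What the configs would look like if feature_size were edited down to 80.
edited_preprocessor = json.loads('{"feature_size": 80}')

consistent = mel_bins_match(large_v3_preprocessor, large_v3_model)
mismatched = mel_bins_match(edited_preprocessor, large_v3_model)

print("large-v3 configs consistent:", consistent)
print("feature_size=80 with a 128-bin model consistent:", mismatched)
```

If the check fails on your setup, the likely culprit is a processor and a model loaded from different checkpoints (for example an 80-bin processor paired with large-v3), rather than the value in this repository.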