Whisper
The Whisper plugin lets you transcribe audio files using OpenAI's Whisper model.
What is Whisper?
Whisper is a general-purpose speech recognition model. It is trained on a large dataset of diverse audio and is also a multitasking model that can perform multilingual speech recognition, speech translation, and language identification.
How to set up the Whisper plugin?
Go to Settings > Plugins > Whisper. Select the Settings tab, then enter your OpenAI API key.
How to use the Whisper Plugin?
To use the Whisper plugin, make sure you've set up the API key. Then:
Start a new chat. Choose an LLM that supports Function Calling (for example GPT-4o)
Enable the Whisper plugin
Drag the audio file to the chat input field (not the chat list) and tell the LLM to transcribe it
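For readers curious what a transcription request looks like outside the chat, here is a minimal sketch using the official `openai` Python package. It assumes the plugin sends a comparable request to OpenAI's transcriptions endpoint, which this page does not confirm; the helper name `transcribe` and the model identifier `whisper-1` (the name the OpenAI API uses for its hosted Whisper model) are choices made for this illustration, not taken from the plugin.

```python
from pathlib import Path


def transcribe(path, api_key):
    """Send one audio file to OpenAI's transcription endpoint and return the text.

    Hypothetical helper for illustration only; it is not the plugin's code.
    """
    audio_path = Path(path)
    if not audio_path.is_file():
        raise FileNotFoundError(f"no such audio file: {audio_path}")

    # Imported here so the file check above works without the package installed.
    from openai import OpenAI

    client = OpenAI(api_key=api_key)
    with audio_path.open("rb") as audio_file:
        result = client.audio.transcriptions.create(
            model="whisper-1",  # assumption: OpenAI's hosted Whisper model name
            file=audio_file,
        )
    return result.text
```

Calling `transcribe("meeting.mp3", api_key="...")` requires `pip install openai`, network access, and a paid OpenAI API account, the same requirements the plugin has.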

FAQ
Can I use this offline? No. This plugin uses the OpenAI API and requires an Internet connection and a paid OpenAI API account.
Which Whisper model does it use? According to OpenAI's documentation: "The Audio API provides two speech to text endpoints, transcriptions and translations, based on our state-of-the-art open source large-v2 Whisper model." [source] To use the better large-v3 model, please use the Whisper via Groq plugin.
What are the limitations? File uploads are currently limited to 25 MB, and the following input file types are supported: mp3, mp4, mpeg, mpga, m4a, wav, and webm. [source]
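If you want to check a file against these limits before dragging it into the chat, a small script along these lines may help. This is a sketch, not part of the plugin: the function name `check_audio_file` is made up for this example, and treating "25 MB" as 25 * 1024 * 1024 bytes is an assumption, since the exact byte threshold is not stated here.

```python
from pathlib import Path

# Limits as stated in the FAQ above. Treating "25 MB" as 25 * 1024 * 1024
# bytes is an assumption; the exact byte threshold may differ.
MAX_BYTES = 25 * 1024 * 1024
SUPPORTED_EXTENSIONS = {"mp3", "mp4", "mpeg", "mpga", "m4a", "wav", "webm"}


def check_audio_file(path, max_bytes=MAX_BYTES):
    """Return a list of problems found; an empty list means the file looks OK."""
    file = Path(path)
    problems = []

    # Compare the extension case-insensitively and without the leading dot.
    extension = file.suffix.lower().lstrip(".")
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported file type: {extension or '(none)'}")

    size = file.stat().st_size
    if size > max_bytes:
        problems.append(f"file is {size} bytes, over the {max_bytes}-byte limit")

    return problems
```

For example, `check_audio_file("interview.m4a")` returns an empty list when the file exists, has a supported extension, and is within the size limit. The check looks only at the file name and size, not at the actual audio encoding.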