Whisper
The Whisper plugin lets you transcribe audio files using OpenAI's Whisper model.
Whisper is a general-purpose speech recognition model. It is trained on a large dataset of diverse audio and is also a multitasking model that can perform multilingual speech recognition, speech translation, and language identification.
Go to Settings > Plugins > Whisper. Select the Settings tab, then enter your OpenAI API key.
To use the Whisper plugin, make sure you've set up the API key. Then:
Start a new chat and choose an LLM that supports Function Calling (for example, GPT-4o)
Enable the Whisper plugin
Drag the audio file into the chat input field (not the chat list) and ask the LLM to transcribe it, as sketched below
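Behind the scenes, the plugin sends the audio file to OpenAI's speech-to-text endpoint. The following is a minimal sketch of the equivalent direct call with the official openai Python library; the file name is a placeholder and OPENAI_API_KEY is assumed to be set in the environment. The plugin handles all of this for you.

```python
# Minimal sketch of a direct call to OpenAI's transcription endpoint,
# roughly what the Whisper plugin does on your behalf.
# Assumes the `openai` package is installed and OPENAI_API_KEY is set.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# "meeting.mp3" is a placeholder; any supported audio file works.
with open("meeting.mp3", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="whisper-1",  # OpenAI's hosted Whisper model
        file=audio_file,
    )

print(transcript.text)  # the transcribed text
```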
Can I use this offline? No. This plugin uses the OpenAI API, so it requires an Internet connection and a paid OpenAI API account.
Which Whisper model does it use?
The Audio API provides two speech to text endpoints, transcriptions and translations, based on our state-of-the-art open source large-v2 Whisper model. [source]
To use the better large-v3 model, use the Whisper via Groq plugin.
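If you want to call large-v3 directly outside the plugin, Groq exposes Whisper through an OpenAI-compatible API. A hedged sketch, assuming the base URL and model name below are still current and GROQ_API_KEY is set; the Whisper via Groq plugin manages this configuration for you.

```python
# Hedged sketch: calling Whisper large-v3 through Groq's OpenAI-compatible API.
# The base URL and model name are assumptions based on Groq's public docs.
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["GROQ_API_KEY"],
    base_url="https://api.groq.com/openai/v1",
)

with open("meeting.mp3", "rb") as audio_file:  # placeholder file name
    transcript = client.audio.transcriptions.create(
        model="whisper-large-v3",
        file=audio_file,
    )

print(transcript.text)
```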
What are the limitations?
File uploads are currently limited to 25 MB and the following input file types are supported: mp3, mp4, mpeg, mpga, m4a, wav, and webm. [source]
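If a transcription request fails, it is worth checking the file against these limits first. A small self-contained sketch based on the 25 MB limit and the extension list quoted above; the file path is a placeholder.

```python
# Quick pre-flight check against the documented limits: 25 MB maximum size
# and a fixed set of supported extensions.
from pathlib import Path

SUPPORTED_EXTENSIONS = {".mp3", ".mp4", ".mpeg", ".mpga", ".m4a", ".wav", ".webm"}
MAX_BYTES = 25 * 1024 * 1024  # 25 MB

def check_audio_file(path: str) -> None:
    p = Path(path)
    if p.suffix.lower() not in SUPPORTED_EXTENSIONS:
        raise ValueError(f"Unsupported file type: {p.suffix}")
    if p.stat().st_size > MAX_BYTES:
        raise ValueError("File is larger than the 25 MB upload limit")

check_audio_file("meeting.mp3")  # placeholder path
```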