Whisper Vamp Plugin 1.0.0 - New Release

Hello,

The first version of the speech-to-text Whisper Vamp plugin is available.

The Whisper plugin is a Vamp plugin implementation of the Whisper speech recognition model developed by OpenAI.

The Whisper plugin transcribes the speech in the audio stream and generates markers corresponding to phrases, words or tokens, depending on the Split Mode parameter. The Suppress Non-Speech Tokens parameter controls whether non-speech tokens are generated; it only applies when Split Mode is set to Tokens.
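To give a rough idea of how the three Split Mode levels relate to each other, here is a small Python sketch that merges token-level markers into word-level markers. This is only an illustration and not the plugin's code: the Marker class and the group_tokens_into_words function are hypothetical names, and the sketch assumes that a token whose text starts with a space begins a new word.

```python
from dataclasses import dataclass


@dataclass
class Marker:
    """A hypothetical marker: a time span in seconds and its text."""
    start: float
    end: float
    text: str


def group_tokens_into_words(tokens):
    """Merge consecutive token markers into word markers.

    Assumption (for illustration only): a token whose text starts
    with a space begins a new word; any other token continues the
    previous word.
    """
    words = []
    for tok in tokens:
        if words and not tok.text.startswith(" "):
            # Continuation token: extend the previous word.
            last = words[-1]
            words[-1] = Marker(last.start, tok.end, last.text + tok.text)
        else:
            # Token starts a new word.
            words.append(Marker(tok.start, tok.end, tok.text))
    # Drop the leading space that marked the word boundary.
    return [Marker(w.start, w.end, w.text.strip()) for w in words]


# Made-up token markers, as a token-level split might produce them.
tokens = [
    Marker(0.00, 0.20, " Hel"),
    Marker(0.20, 0.45, "lo"),
    Marker(0.50, 0.90, " world"),
]
words = group_tokens_into_words(tokens)
for w in words:
    print(f"{w.start:.2f}-{w.end:.2f}: {w.text}")
```

In this sketch, the token level gives the finest time resolution, while the word level yields fewer markers that are easier to read; phrase markers would group words further in the same way.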

The lightweight ggml-base-q5_1 model is embedded in the plugin, and the other q5 models (tiny, small, medium, and large_v2) will be installed on your system. The Model parameter selects which model to use. You can also download and use other models that may be better suited to your needs. Please refer to the following section dedicated to models.

The Whisper Vamp Plugin has been designed for use in the free audio analysis application Partiels.

The Whisper Vamp Plugin is compatible with:

  • Windows 10
  • macOS 10.15 (ARM)
  • Linux