VAX Vamp Plugin 1.0.0 - New Release

guillot · February 27, 2025, 2:00pm

Hello everyone,

The new VAX Vamp Plugin analysis tool is now available!

VAX is a library developed by Guillaume Doras, Yann Teytaut, and Axel Röbel from the Sound Analysis and Synthesis team at IRCAM. It includes neural network models for aligning text with audio. Two models are available: one for speech and one for singing. The speech model was trained on the LibriSpeech ASR corpus and is designed for voice-only content. The singing model was trained on the DALI dataset and is intended for sung vocals with a musical background.

[Image: VAX-demo]

The alignment feature of the VAX Vamp plugin realigns the markers of an input track onto the audio based on their text content. You can select the neural network model that suits the audio content, and group or split the text by words or syllables to improve marker separation. The plugin also outputs a probability matrix for Latin alphabet characters.
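
For readers curious how a per-frame character probability matrix can drive a text-to-audio alignment in general, here is a minimal sketch in Python. It is not the VAX implementation: the `ALPHABET` set, the `align` and `frame` names, and the simplification that every frame belongs to exactly one character (no silence or blank symbol) are all assumptions made for illustration. It uses a simple dynamic-programming pass to find the best monotonic assignment of frames to characters.

```python
import math

# Hypothetical character set; the real plugin's alphabet may differ.
ALPHABET = "abcdefghijklmnopqrstuvwxyz "


def align(text, log_probs):
    """Return the first frame index of each character in `text`.

    log_probs[t][k] is the log-probability of ALPHABET[k] at frame t.
    Assumption: each frame is assigned to exactly one character, in text
    order, and every character gets at least one frame.
    """
    idx = [ALPHABET.index(c) for c in text]
    n_frames, n_chars = len(log_probs), len(idx)
    if n_chars == 0 or n_frames < n_chars:
        raise ValueError("need at least one frame per character")

    neg_inf = -math.inf
    # score[t][j]: best total log-probability with frame t on character j.
    score = [[neg_inf] * n_chars for _ in range(n_frames)]
    # advanced[t][j]: True if character j starts at frame t on the best path.
    advanced = [[False] * n_chars for _ in range(n_frames)]
    score[0][0] = log_probs[0][idx[0]]

    for t in range(1, n_frames):
        for j in range(min(t + 1, n_chars)):
            stay = score[t - 1][j]
            move = score[t - 1][j - 1] if j > 0 else neg_inf
            advanced[t][j] = move > stay
            score[t][j] = max(stay, move) + log_probs[t][idx[j]]

    # Walk back from the last frame and last character to recover onsets.
    starts = [0] * n_chars
    j = n_chars - 1
    for t in range(n_frames - 1, 0, -1):
        if advanced[t][j]:
            starts[j] = t
            j -= 1
    return starts


def frame(ch):
    """Build a toy log-probability row that strongly favors character `ch`."""
    rest = math.log(0.1 / (len(ALPHABET) - 1))
    return [math.log(0.9) if c == ch else rest for c in ALPHABET]


# Two frames of "h" followed by two frames of "i".
toy = [frame("h"), frame("h"), frame("i"), frame("i")]
print(align("hi", toy))  # [0, 2]
```

Multiplying each returned frame index by the analysis hop duration would give marker times in seconds; a practical aligner would also need to handle silence and instrumental passages, which this sketch leaves out.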

Along with the user manual, you will find a document template for the Partiels software. This template generates text from audio data using the Whisper plugin, then segments it into syllables and aligns it with the VAX plugin.

Many thanks to Guillaume Doras, Yann Teytaut, and Axel Röbel for the VAX project, and especially to Guillaume for the updates and fixes that made it possible to integrate these tools into a Vamp plugin!

Feel free to share your feedback and comments.