Train Your Own Neural Audio Model

Create custom neural audio synthesizers using RAVE, AFTER, and DDSP. No coding required.

Supported Models

RAVE

Realtime Audio Variational autoEncoder by IRCAM. Perfect for fast, high-quality timbre transfer and real-time synthesis.

AFTER

Audio Features Transfer and Exploration in Real-time. A diffusion-based model for precise control over timbre and structure.

DDSP

Differentiable Digital Signal Processing by Magenta. Combines DSP with deep learning for realistic instrument modeling.

Why train with us?

  • No Code Required

    Upload your audio, select a model, and let us handle the training pipeline.

  • Cloud GPU Training

    We use powerful GPUs to train your models quickly, so you don't need expensive hardware.

  • Ready for Production

    Export models directly for use in your DAW or live performance setup.

Ready to start?

Join the waitlist for early access.

Frequently Asked Questions

How much audio data do I need?

The data requirements vary depending on the model you choose:
  • DDSP: Approximately 10-15 minutes of clean, monophonic recordings, ideally with a MIDI transcription (or audio that is easy to transcribe). Best for single-instrument timbres.
  • RAVE: 2-3 hours of clean, high-quality, stylistically coherent audio (a single style). Works with diverse sound types and can handle more complex timbres.
  • AFTER: Typically more than 1 hour of audio for good results, ideally with a MIDI transcription (or audio that is easy to transcribe). Supports polyphonic content.
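As a quick sanity check before uploading, you can total the duration of a folder of WAV files with Python's standard library. This is a minimal sketch, not part of the platform; the `dataset_duration_minutes` helper name and the flat-folder layout are assumptions:

```python
import wave
from pathlib import Path

def dataset_duration_minutes(folder):
    """Total duration of all .wav files in `folder`, in minutes.

    Hypothetical helper for checking a dataset against the rough
    minimums above (e.g. ~10 min for DDSP, ~2 h for RAVE).
    """
    total_seconds = 0.0
    for path in sorted(Path(folder).glob("*.wav")):
        with wave.open(str(path), "rb") as wav:
            total_seconds += wav.getnframes() / wav.getframerate()
    return total_seconds / 60.0
```

Note that duration is only part of the requirement: for DDSP you would also want to confirm the recordings are monophonic single-instrument takes, which a channel count alone cannot tell you.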

Make sure you own the rights to the audio you use as training data; you are solely responsible for copyright compliance in the event of an infringement claim.

For detailed best practices, check the documentation for each model: RAVE, AFTER, DDSP.

Who owns the models I train?

You do! Any model you train on the platform is 100% yours, provided you own the training data (be responsible).

How does the training work?

The models are trained using high-performance GPU pipelines. The training process includes:
  • Preprocessing your audio data to extract relevant features
  • Data augmentation using advanced AI models to improve robustness
  • Multi-stage training with optimized hyperparameters for each model type
  • Quality validation to ensure your model meets performance standards

All of this happens automatically in the cloud—you just upload your data, monitor with the available tools, and wait for the results.
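The stages above can be sketched as a simple chain of functions. This is purely illustrative: the platform's actual pipeline is internal, and every name and function body here is a placeholder:

```python
# Illustrative sketch only — stage names mirror the steps listed above;
# all bodies are placeholders, not the platform's real implementation.

def preprocess(audio_files):
    """Extract relevant features from the raw audio."""
    return [{"source": f, "features": None} for f in audio_files]

def augment(examples):
    """Expand the dataset with perturbed copies for robustness."""
    return examples + [dict(e, augmented=True) for e in examples]

def train(examples, model_type):
    """Run multi-stage training with per-model hyperparameters."""
    return {"model_type": model_type, "num_examples": len(examples)}

def validate(model):
    """Check the trained model against quality thresholds."""
    return model["num_examples"] > 0

model = train(augment(preprocess(["take1.wav", "take2.wav"])), "rave")
assert validate(model)
```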

How do I use my model in my DAW?

Each model type requires a compatible VST/AU plugin to run in your DAW:
  • RAVE: Use the Neutone FX VST, the IRCAM RAVE VST, or nn~ for Max/MSP and Pure Data
  • AFTER: Max for Live devices (Ableton only) or nn~ for Max/MSP
  • DDSP: Neutone FX VST or the Magenta DDSP VST

For detailed setup instructions, check the documentation for each model: RAVE, AFTER, DDSP.

How long does training take?

Training time depends on the model's complexity and size. On a single high-end GPU:
  • DDSP: Approximately 6 hours
  • RAVE: Approximately 12 hours
  • AFTER: Approximately 24 hours

You'll receive email notifications when your model is ready for download. Training happens in the background, so you can close your browser and come back later.