Module mimi

mimi model

Mimi is a state-of-the-art audio compression model that uses an encoder/decoder architecture with residual vector quantization. The candle implementation supports streaming: a stream of audio tokens can be encoded or decoded on the fly, which allows low-latency interaction with an audio model.
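To make the term "residual vector quantization" concrete, here is a minimal, standard-library-only sketch of the idea. It is not the candle implementation: the function names (`nearest`, `rvq_encode`, `rvq_decode`) are hypothetical, and the codebooks are assumed to be small hand-written tables rather than learned weights. Each stage picks the closest entry in its codebook, subtracts it, and hands the remaining residual to the next stage; decoding sums the selected entries.

```rust
/// Index of the codebook entry closest to `v` (squared Euclidean distance).
/// Hypothetical helper for illustration only.
fn nearest(codebook: &[Vec<f32>], v: &[f32]) -> usize {
    let mut best = 0;
    let mut best_dist = f32::INFINITY;
    for (i, entry) in codebook.iter().enumerate() {
        let dist: f32 = entry.iter().zip(v).map(|(a, b)| (a - b) * (a - b)).sum();
        if dist < best_dist {
            best = i;
            best_dist = dist;
        }
    }
    best
}

/// Encode `v` as one index per codebook. Each stage quantizes the residual
/// left over by the previous stage.
fn rvq_encode(codebooks: &[Vec<Vec<f32>>], v: &[f32]) -> Vec<usize> {
    let mut residual = v.to_vec();
    let mut codes = Vec::with_capacity(codebooks.len());
    for cb in codebooks {
        let idx = nearest(cb, &residual);
        // Remove the part of the signal this stage explains.
        for (r, c) in residual.iter_mut().zip(&cb[idx]) {
            *r -= c;
        }
        codes.push(idx);
    }
    codes
}

/// Decode by summing the selected entry of every codebook.
fn rvq_decode(codebooks: &[Vec<Vec<f32>>], codes: &[usize], dim: usize) -> Vec<f32> {
    let mut out = vec![0.0; dim];
    for (cb, &idx) in codebooks.iter().zip(codes) {
        for (o, c) in out.iter_mut().zip(&cb[idx]) {
            *o += c;
        }
    }
    out
}
```

In this toy setup, the first codebook gives a coarse approximation and each further codebook refines it, which is why using fewer codebooks trades quality for a lower bitrate.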

§Example

# Generate audio tokens from an audio file.
wget https://github.com/metavoiceio/metavoice-src/raw/main/assets/bria.mp3
cargo run --example mimi \
  --features mimi --release -- \
  audio-to-code bria.mp3 bria.safetensors

# Decode the audio tokens back into a sound file.
cargo run --example mimi \
  --features mimi --release -- \
  code-to-audio bria.safetensors bria.wav

Re-exports§

pub use encodec::load;
pub use encodec::Config;
pub use encodec::Encodec as Model;
pub use candle;
pub use candle_nn;

Modules§

conv
encodec
quantization
seanet
transformer

Enums§

NormType