Module quantized_rwkv_v5

RWKV v5 model implementation with quantization support.

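RWKV v5 is an attention-free language model: it replaces softmax self-attention with a linear-time recurrence, so the memory needed to generate each token stays constant as the sequence grows. This implementation adds quantization support to reduce memory use and compute.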
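
As a rough illustration of where the memory saving comes from, the sketch below applies symmetric 8-bit quantization to a block of weights, storing one `i8` per weight plus a single `f32` scale. The helpers `quantize_q8` and `dequantize_q8` are hypothetical names for this example only. They are not part of this module, and the quantized formats actually used here may differ (for example, block-wise scales).

```rust
/// Hypothetical helper (not part of this module): symmetric 8-bit
/// quantization of a block of weights with one absmax-derived scale.
fn quantize_q8(weights: &[f32]) -> (Vec<i8>, f32) {
    // Largest magnitude in the block determines the scale.
    let absmax = weights.iter().fold(0.0f32, |m, w| m.max(w.abs()));
    // Avoid dividing by zero for an all-zero block.
    let scale = if absmax == 0.0 { 1.0 } else { absmax / 127.0 };
    let q = weights
        .iter()
        .map(|w| (w / scale).round().clamp(-127.0, 127.0) as i8)
        .collect();
    (q, scale)
}

/// Hypothetical helper: map the 8-bit values back to approximate f32 weights.
fn dequantize_q8(q: &[i8], scale: f32) -> Vec<f32> {
    q.iter().map(|&v| v as f32 * scale).collect()
}
```

Each weight then occupies one byte instead of four, at the cost of a rounding error of at most half a quantization step (`scale / 2`) per weight.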

Key characteristics:

  • Linear attention mechanism
  • Group normalization (GroupNorm)
  • Time-mixing layers
  • State-based sequential processing
  • Support for 8-bit quantization
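
The state-based processing listed above can be pictured with a minimal single-head sketch. It is loosely modeled on the RWKV v5 recurrence and is not the code in this module: `HeadState`, `step`, `decay`, and `bonus` are hypothetical names, and the real model adds multiple heads, projections, gating, and normalization around this core.

```rust
/// Hypothetical single-head recurrent state: a head_size x head_size matrix.
/// Its size is fixed, however many tokens have been processed.
struct HeadState {
    s: Vec<Vec<f32>>,
}

impl HeadState {
    fn new(head_size: usize) -> Self {
        Self { s: vec![vec![0.0; head_size]; head_size] }
    }

    /// Consume one token's receptance, key, and value vectors, return the
    /// output vector, and update the state in place.
    /// `decay` and `bonus` are assumed to be per-channel parameters.
    fn step(
        &mut self,
        r: &[f32],
        k: &[f32],
        v: &[f32],
        decay: &[f32],
        bonus: &[f32],
    ) -> Vec<f32> {
        let n = self.s.len();
        let mut out = vec![0.0; n];
        for i in 0..n {
            for j in 0..n {
                let kv = k[i] * v[j];
                // The output reads the old state plus the bonus-weighted
                // contribution of the current token.
                out[j] += r[i] * (self.s[i][j] + bonus[i] * kv);
                // The state decays per channel and accumulates the new
                // key-value outer product.
                self.s[i][j] = decay[i] * self.s[i][j] + kv;
            }
        }
        out
    }
}
```

Because each call to `step` touches only the fixed-size state, generation proceeds one token at a time without keeping a growing cache of past keys and values.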

Re-exports

pub use crate::models::rwkv_v5::Config;
pub use crate::models::rwkv_v5::State;
pub use crate::models::rwkv_v5::Tokenizer;

Structs

Model