Module quantized_rwkv_v6


RWKV v6 model implementation with quantization support.

RWKV is a linear-attention model that combines the inference efficiency of RNNs with the parallelizable training of Transformers. Version 6 builds on version 5, whose Config, State, and Tokenizer types this module re-exports.

Key characteristics:

  • Linear attention mechanism
  • Time mixing layers
  • Channel mixing layers
  • LayerNorm for normalization
  • Support for 8-bit quantization
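
To illustrate what a linear attention mechanism with a recurrent state looks like, here is a minimal sketch in plain Rust. It is not this module's implementation: the type LinearAttnState, its step method, and the single scalar decay are hypothetical simplifications chosen for illustration (RWKV v6 itself uses more elaborate, learned decay terms). The point is that each token updates a fixed-size state, so the cost per token does not grow with sequence length.

```rust
/// Hypothetical single-head linear attention state: a d x d matrix.
/// This is an illustrative sketch, not the candle implementation.
struct LinearAttnState {
    d: usize,
    // Row-major d x d matrix; s[i * d + j] accumulates k[i] * v[j].
    s: Vec<f32>,
}

impl LinearAttnState {
    fn new(d: usize) -> Self {
        Self { d, s: vec![0.0; d * d] }
    }

    /// Process one token: decay the state, add the outer product of the
    /// key and value vectors, then read the state out with `r`.
    fn step(&mut self, r: &[f32], k: &[f32], v: &[f32], decay: f32) -> Vec<f32> {
        let d = self.d;
        assert!(r.len() == d && k.len() == d && v.len() == d);
        for i in 0..d {
            for j in 0..d {
                self.s[i * d + j] = decay * self.s[i * d + j] + k[i] * v[j];
            }
        }
        // out[j] = sum over i of r[i] * s[i][j]
        (0..d)
            .map(|j| (0..d).map(|i| r[i] * self.s[i * d + j]).sum::<f32>())
            .collect()
    }
}

fn main() {
    let mut state = LinearAttnState::new(2);
    let out1 = state.step(&[1.0, 0.0], &[1.0, 0.0], &[2.0, 3.0], 0.5);
    let out2 = state.step(&[1.0, 0.0], &[1.0, 0.0], &[4.0, 1.0], 0.5);
    println!("{:?} {:?}", out1, out2); // prints "[2.0, 3.0] [5.0, 2.5]"
}
```

The second output mixes the new value with half of the first one, which is how earlier tokens influence later ones without storing the whole sequence.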

Re-exports§

pub use crate::models::rwkv_v5::Config;
pub use crate::models::rwkv_v5::State;
pub use crate::models::rwkv_v5::Tokenizer;

Structs§

Model