RWKV v6 model implementation with quantization support.
RWKV is a linear-attention model that combines the efficient, step-by-step inference of RNNs with the parallelizable training of Transformers. Version 6 builds on version 5 and reuses its `Config`, `State`, and `Tokenizer` types.
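To make the RNN-style inference concrete, here is a minimal sketch of a linear-attention recurrence for a single head. It is a simplified illustration, not this module's implementation: the names `LinearAttnState` and `step` are hypothetical, and the real RWKV v6 time mixing has additional terms that are omitted here. The point is only that each token updates a fixed-size state, so the per-token cost does not grow with sequence length.

```rust
/// Hypothetical per-head recurrent state: a `dim x dim` matrix, row-major.
/// Illustration only; this is not the type exported by this module.
pub struct LinearAttnState {
    dim: usize,
    s: Vec<f32>,
}

impl LinearAttnState {
    pub fn new(dim: usize) -> Self {
        Self { dim, s: vec![0.0; dim * dim] }
    }

    /// Consume one token and return its output vector.
    ///
    /// Update: S[i][j] = decay[i] * S[i][j] + k[i] * v[j]
    /// Output: out[j]  = sum_i r[i] * S[i][j]   (using the updated state)
    pub fn step(&mut self, r: &[f32], k: &[f32], v: &[f32], decay: &[f32]) -> Vec<f32> {
        let d = self.dim;
        assert!(r.len() == d && k.len() == d && v.len() == d && decay.len() == d);
        let mut out = vec![0.0f32; d];
        for i in 0..d {
            for j in 0..d {
                let cell = &mut self.s[i * d + j];
                // Decay the old contribution, then add the new outer-product term.
                *cell = decay[i] * *cell + k[i] * v[j];
                // Read the state back through the receptance vector.
                out[j] += r[i] * *cell;
            }
        }
        out
    }
}
```

Calling `step` once per token carries all context in the `dim * dim` state, which is why memory use stays constant during generation.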
Key characteristics:
- Linear attention mechanism
- Time mixing layers
- Channel mixing layers
- LayerNorm for normalization
- Support for 8-bit quantization
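As a rough illustration of what 8-bit quantization means for the weights, the sketch below quantizes a slice of `f32` values to `i8` with a single shared scale and then reconstructs them. This is an assumption-laden toy: the functions `quantize_q8` and `dequantize_q8` are hypothetical, and the on-disk formats this module actually loads may use different block sizes and layouts.

```rust
/// Hypothetical symmetric 8-bit quantizer: one shared scale for the whole slice.
/// Returns the scale and the quantized values in the range [-127, 127].
pub fn quantize_q8(xs: &[f32]) -> (f32, Vec<i8>) {
    // The largest magnitude determines the scale, so it maps to +/-127.
    let max_abs = xs.iter().fold(0.0f32, |m, x| m.max(x.abs()));
    // Avoid dividing by zero when every input is zero.
    let scale = if max_abs == 0.0 { 1.0 } else { max_abs / 127.0 };
    let q: Vec<i8> = xs
        .iter()
        .map(|x| (x / scale).round().clamp(-127.0, 127.0) as i8)
        .collect();
    (scale, q)
}

/// Reconstruct approximate `f32` values from the quantized form.
pub fn dequantize_q8(scale: f32, q: &[i8]) -> Vec<f32> {
    q.iter().map(|&v| v as f32 * scale).collect()
}
```

Each value is stored in one byte instead of four, at the cost of a rounding error of roughly half the scale per element.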
Re-exports
pub use crate::models::rwkv_v5::Config;
pub use crate::models::rwkv_v5::State;
pub use crate::models::rwkv_v5::Tokenizer;