Module quantized_recurrent_gemma

Recurrent Gemma model implementation with quantization support.

Recurrent Gemma is a large language model designed for efficient inference. This implementation runs the model with quantized weights, which reduces memory use and compute cost.
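To illustrate the idea behind 8-bit quantization, here is a minimal sketch of symmetric absmax quantization over one block of values. It is not the kernel this module uses: `BlockQ8`, `quantize_q8`, and `dequantize_q8` are hypothetical names, and the scale is kept as `f32` for simplicity.

```rust
/// Hypothetical quantized block: one shared scale plus one signed byte per value.
#[derive(Debug, Clone, PartialEq)]
pub struct BlockQ8 {
    pub scale: f32,
    pub values: Vec<i8>,
}

/// Quantize with symmetric absmax scaling: scale = max|x| / 127.
pub fn quantize_q8(xs: &[f32]) -> BlockQ8 {
    // Largest magnitude in the block decides the scale.
    let amax = xs.iter().fold(0.0f32, |m, x| m.max(x.abs()));
    // An all-zero block gets a dummy scale to avoid dividing by zero.
    let scale = if amax == 0.0 { 1.0 } else { amax / 127.0 };
    let values = xs
        .iter()
        .map(|x| (x / scale).round().clamp(-127.0, 127.0) as i8)
        .collect();
    BlockQ8 { scale, values }
}

/// Recover approximate f32 values from a quantized block.
pub fn dequantize_q8(block: &BlockQ8) -> Vec<f32> {
    block.values.iter().map(|&q| q as f32 * block.scale).collect()
}
```

Each value costs one byte instead of four, and the round-trip error per value is bounded by about half the block scale.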

Key characteristics:

  • Recurrent blocks with gated recurrent units
  • Convolution and attention blocks
  • RMSNorm for layer normalization
  • Rotary positional embeddings (RoPE)
  • Support for 8-bit quantization
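
As an illustration of the RMSNorm item above, the following sketch normalizes a vector by its root mean square and applies a per-element gain. The function name `rms_norm` and the plain multiplicative gain are assumptions made for this example; the module's own layer operates on tensors and may use a different scaling convention.

```rust
/// Hypothetical helper: y_i = x_i / sqrt(mean(x^2) + eps) * weight_i.
pub fn rms_norm(xs: &[f32], weight: &[f32], eps: f32) -> Vec<f32> {
    assert_eq!(xs.len(), weight.len(), "input and weight lengths must match");
    // Mean of squares over the normalized dimension.
    let mean_sq = xs.iter().map(|x| x * x).sum::<f32>() / xs.len() as f32;
    // eps keeps the division stable when the input is near zero.
    let inv_rms = 1.0 / (mean_sq + eps).sqrt();
    xs.iter().zip(weight).map(|(x, w)| x * inv_rms * w).collect()
}
```

Unlike standard layer normalization, this does not subtract the mean, so it needs one reduction instead of two.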

Re-exports

pub use crate::quantized_var_builder::VarBuilder;

Structs

Model