Yi model implementation.
This candle implementation uses a pre-trained Yi decoder-only large language model for inference. The model was trained by 01.AI and follows a standard transformer architecture similar to LLaMA.
Original code:
- Yi Model
- Yi Modeling Code
- Technical Report, "Yi: Open Foundation Models by 01.AI"
Key characteristics:
- Multi-head attention with rotary positional embeddings
- RMS normalization
- SwiGLU activation in feed-forward layers
- Grouped-query attention for efficient inference
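Three of the characteristics above can be illustrated with a minimal sketch in plain Rust. It does not use the candle API and is not taken from this module: the function names (`rms_norm`, `silu`, `swiglu`, `kv_head_for_query`) are hypothetical, and the sketch assumes the usual textbook definitions of RMS normalization, SwiGLU gating, and grouped-query head sharing.

```rust
/// RMS normalization (assumed textbook form):
/// out[i] = x[i] / sqrt(mean(x^2) + eps) * weight[i]
fn rms_norm(x: &[f32], weight: &[f32], eps: f32) -> Vec<f32> {
    assert_eq!(x.len(), weight.len());
    let mean_sq = x.iter().map(|v| v * v).sum::<f32>() / x.len() as f32;
    let scale = 1.0 / (mean_sq + eps).sqrt();
    x.iter().zip(weight).map(|(v, w)| v * scale * w).collect()
}

/// SiLU (swish) activation: x * sigmoid(x).
fn silu(x: f32) -> f32 {
    x / (1.0 + (-x).exp())
}

/// SwiGLU gating: silu(gate) * up, element-wise.
/// In a full feed-forward layer, `gate` and `up` would come from two
/// separate linear projections of the same input, and the result would
/// pass through a third (down) projection; those are omitted here.
fn swiglu(gate: &[f32], up: &[f32]) -> Vec<f32> {
    assert_eq!(gate.len(), up.len());
    gate.iter().zip(up).map(|(g, u)| silu(*g) * u).collect()
}

/// Grouped-query attention: several query heads share one key/value head.
/// Returns the index of the key/value head used by query head `q_head`.
fn kv_head_for_query(q_head: usize, num_q_heads: usize, num_kv_heads: usize) -> usize {
    assert!(num_kv_heads > 0 && num_q_heads % num_kv_heads == 0);
    assert!(q_head < num_q_heads);
    let group_size = num_q_heads / num_kv_heads;
    q_head / group_size
}
```

With illustrative head counts of 32 query heads and 4 key/value heads, `kv_head_for_query(5, 32, 4)` maps query head 5 to key/value head 0, since each group of 8 query heads shares one key/value head. Sharing heads this way shrinks the key/value cache, which is where the inference savings come from.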