Module persimmon

Persimmon Model

A transformer language model for efficient inference and general-purpose tasks. The model uses a standard transformer architecture with:

  • Layer normalization applied to the attention queries (Q) and keys (K)
  • RoPE embeddings with partial rotary factor
  • ReLU activation
  • Separately configurable numbers of attention heads and KV heads
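
To make the partial rotary factor and the separate head counts more concrete, here is a minimal sketch in plain Rust. It is not this module's implementation: `SketchConfig`, its field names, `apply_partial_rope`, and the `theta` base are hypothetical names chosen for illustration, and the half-split pairing of rotated elements is an assumed convention. The idea shown is that only the first `rotary_dim` elements of each head vector are rotated by position, while the remaining elements pass through unchanged.

```rust
/// Hypothetical stand-in used only for this sketch; not the module's `Config`.
#[derive(Debug, Clone)]
struct SketchConfig {
    hidden_size: usize,
    num_attention_heads: usize,
    num_key_value_heads: usize,
    partial_rotary_factor: f64,
}

impl SketchConfig {
    /// Width of a single attention head.
    fn head_dim(&self) -> usize {
        self.hidden_size / self.num_attention_heads
    }

    /// Number of leading elements per head that receive rotary embeddings.
    fn rotary_dim(&self) -> usize {
        (self.partial_rotary_factor * self.head_dim() as f64) as usize
    }

    /// How many query heads share one key/value head.
    fn queries_per_kv_head(&self) -> usize {
        self.num_attention_heads / self.num_key_value_heads
    }
}

/// Rotates the first `rotary_dim` elements of `x` for position `pos`,
/// pairing element `i` with element `i + rotary_dim / 2` (assumed layout).
/// Elements at index `rotary_dim` and beyond are copied through untouched.
fn apply_partial_rope(x: &[f32], pos: usize, rotary_dim: usize, theta: f32) -> Vec<f32> {
    let half = rotary_dim / 2;
    let mut out = x.to_vec();
    for i in 0..half {
        // Frequency decreases as the pair index grows.
        let freq = theta.powf(-2.0 * i as f32 / rotary_dim as f32);
        let (sin, cos) = (pos as f32 * freq).sin_cos();
        let (a, b) = (x[i], x[i + half]);
        out[i] = a * cos - b * sin;
        out[i + half] = a * sin + b * cos;
    }
    out
}
```

With a hypothetical configuration of `hidden_size = 64`, 4 attention heads, 2 KV heads, and a factor of 0.5, each head has 16 elements, of which the first 8 are rotated, and every KV head serves 2 query heads.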

Structs§

Config

Enums§

PositionEmbeddingType

Constants§

DTYPE