Module implementing the MPT (MosaicML Pretrained Transformer) model.
References:
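The model uses grouped-query attention (several query heads share one key/value head) and ALiBi (Attention with Linear Biases) positional embeddings, which add a per-head linear distance penalty to the attention scores instead of learned position vectors.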
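
As a rough illustration of the two mechanisms named above, here is a minimal sketch in plain Rust with no tensor library. It is not this module's API: the function names `alibi_slopes`, `alibi_bias`, and `kv_head_for` are hypothetical, the head count is assumed to be a power of two, and the slope formula follows the commonly used ALiBi scheme with an assumed `alibi_bias_max` parameter.

```rust
/// Per-head ALiBi slopes (hypothetical helper, not part of this module).
/// Assumes `n_heads` is a power of two. Head `h` (1-based) gets the slope
/// 2^(-alibi_bias_max * h / n_heads), so later heads penalize distance less.
fn alibi_slopes(n_heads: usize, alibi_bias_max: f32) -> Vec<f32> {
    (1..=n_heads)
        .map(|h| 2f32.powf(-alibi_bias_max * h as f32 / n_heads as f32))
        .collect()
}

/// Bias added to one head's attention scores for `seq_len` key positions
/// (hypothetical helper). The most recent key gets 0 and earlier keys get an
/// increasingly negative value: slope * (j - (seq_len - 1)).
fn alibi_bias(slope: f32, seq_len: usize) -> Vec<f32> {
    (0..seq_len)
        .map(|j| slope * (j as f32 - (seq_len as f32 - 1.0)))
        .collect()
}

/// Grouped-query attention: index of the key/value head that a query head
/// reads from (hypothetical helper). Assumes `n_heads` is a multiple of
/// `n_kv_heads`, so each key/value head serves n_heads / n_kv_heads queries.
fn kv_head_for(query_head: usize, n_heads: usize, n_kv_heads: usize) -> usize {
    query_head / (n_heads / n_kv_heads)
}
```

With 8 heads and `alibi_bias_max = 8.0`, the slopes run from 1/2 down to 1/256. With 8 query heads and 2 key/value heads, query heads 0 to 3 share key/value head 0 and heads 4 to 7 share head 1.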