Apply penalty and repeat_kv
Functions
- apply_repeat_penalty
- repeat_kv - Repeats a key or value tensor for grouped query attention. The input tensor should have a shape (batch, num_kv_heads, seq_len, head_dim).
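To make the shape requirement concrete, here is a minimal sketch of the repetition on a flat, row-major buffer using only the standard library. It is not the crate's tensor-based implementation: the name `repeat_kv_flat`, its parameters, and the `n_rep` repetition factor are hypothetical, and the output layout (batch, num_kv_heads * n_rep, seq_len, head_dim), with each head's copies placed next to each other, is an assumption based on how grouped query attention is usually implemented.

```rust
/// Hypothetical plain-Rust sketch of the idea behind `repeat_kv`.
/// `x` is a row-major buffer of shape (batch, num_kv_heads, seq_len, head_dim).
/// Returns a buffer of assumed shape (batch, num_kv_heads * n_rep, seq_len, head_dim).
pub fn repeat_kv_flat(
    x: &[f32],
    batch: usize,
    num_kv_heads: usize,
    seq_len: usize,
    head_dim: usize,
    n_rep: usize,
) -> Vec<f32> {
    // Number of elements that belong to a single (batch, head) pair.
    let head_size = seq_len * head_dim;
    assert_eq!(
        x.len(),
        batch * num_kv_heads * head_size,
        "buffer length does not match the given shape"
    );

    let mut out = Vec::with_capacity(x.len() * n_rep);
    if head_size == 0 {
        // Nothing to copy; also avoids calling chunks_exact with a zero size.
        return out;
    }

    // Heads are contiguous in row-major order, so walking the buffer in
    // head-sized chunks visits every (batch, head) pair in order.
    for head in x.chunks_exact(head_size) {
        // Emit each key/value head n_rep times in a row so that consecutive
        // query heads share the same key/value head.
        for _ in 0..n_rep {
            out.extend_from_slice(head);
        }
    }
    out
}
```

With two key/value heads `[1, 2]` and `[3, 4]` and `n_rep = 2`, the sketch yields `[1, 2, 1, 2, 3, 4, 3, 4]`, that is, four heads in which each original head appears twice in a row.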