Module trocr


TrOCR model implementation.

TrOCR is a Transformer-based optical character recognition (OCR) model that pairs a Vision Transformer (ViT) image encoder with a BART-like autoregressive text decoder.

Key characteristics:

  • Vision Transformer encoder for image processing
  • BART-style decoder for text generation
  • Learned positional embeddings
  • Layer normalization and self-attention
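
To make the "learned positional embeddings" item concrete, here is a minimal, std-only sketch of the idea: a trainable table holds one row per position, and the row for each position is added to the token embedding at that position. The `LearnedPositionalEmbedding` type, its methods, and the `past_len` parameter are hypothetical names chosen for illustration; they are not this module's API, and the sketch leaves out any index offset a particular implementation may apply.

```rust
/// Hypothetical illustration of a learned positional embedding table.
/// Not part of this module's API.
struct LearnedPositionalEmbedding {
    /// Width of each embedding row.
    dim: usize,
    /// Row-major table: one `dim`-sized row per position.
    weights: Vec<f32>,
}

impl LearnedPositionalEmbedding {
    /// Wraps pre-trained weights. Returns `None` if `dim` is zero or the
    /// weight count is not a multiple of `dim`.
    fn new(weights: Vec<f32>, dim: usize) -> Option<Self> {
        if dim == 0 || weights.len() % dim != 0 {
            return None;
        }
        Some(Self { dim, weights })
    }

    /// Number of positions the table covers.
    fn max_positions(&self) -> usize {
        self.weights.len() / self.dim
    }

    /// Returns the embedding row for `pos`, or `None` if out of range.
    fn row(&self, pos: usize) -> Option<&[f32]> {
        let start = pos.checked_mul(self.dim)?;
        let end = start.checked_add(self.dim)?;
        self.weights.get(start..end)
    }

    /// Adds positional rows to `tokens` (one `dim`-sized vector per token).
    /// `past_len` is the number of tokens already processed, so the first
    /// token here is treated as position `past_len`.
    fn add_to(&self, tokens: &[Vec<f32>], past_len: usize) -> Option<Vec<Vec<f32>>> {
        let mut out: Vec<Vec<f32>> = Vec::with_capacity(tokens.len());
        for (i, tok) in tokens.iter().enumerate() {
            if tok.len() != self.dim {
                return None;
            }
            let pos = self.row(past_len + i)?;
            let summed: Vec<f32> = tok.iter().zip(pos).map(|(t, p)| t + p).collect();
            out.push(summed);
        }
        Some(out)
    }
}
```

Unlike fixed sinusoidal encodings, such a table only covers the positions it was trained with, which is why the sketch returns `None` once `past_len + i` reaches `max_positions()`.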

Structs

TrOCRConfig
TrOCRDecoder
TrOCREncoder
TrOCRForCausalLM
TrOCRModel