Contrastive Language-Image Pre-Training
Contrastive Language-Image Pre-Training (CLIP) is an architecture trained on pairs of images and their associated text.
- 💻 GH Link
- 💻 Transformers Python reference implementation
- 🤗 HF Model
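To illustrate how a model trained on image-text pairs is typically used, the sketch below scores one image embedding against several candidate text embeddings with cosine similarity and turns the scores into probabilities with a softmax. It is a minimal, dependency-free illustration, not this crate's API: the function names (`l2_normalize`, `cosine_similarity`, `softmax`, `match_probabilities`) and the `logit_scale` parameter are hypothetical, and the embeddings are assumed to come from the vision and text encoders.

```rust
/// Scale a vector to unit L2 length (returned unchanged if its norm is zero).
fn l2_normalize(v: &[f32]) -> Vec<f32> {
    let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm == 0.0 {
        return v.to_vec();
    }
    v.iter().map(|x| x / norm).collect()
}

/// Cosine similarity between two embeddings of the same dimension.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "embeddings must share a dimension");
    let (a, b) = (l2_normalize(a), l2_normalize(b));
    a.iter().zip(b.iter()).map(|(x, y)| x * y).sum()
}

/// Numerically stable softmax over a slice of logits.
fn softmax(logits: &[f32]) -> Vec<f32> {
    let max = logits.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
    let exps: Vec<f32> = logits.iter().map(|l| (l - max).exp()).collect();
    let sum: f32 = exps.iter().sum();
    exps.iter().map(|e| e / sum).collect()
}

/// Probability that each candidate text matches the image.
/// `logit_scale` is a hypothetical stand-in for a temperature term
/// that sharpens the distribution; its value here is an assumption.
fn match_probabilities(image: &[f32], texts: &[Vec<f32>], logit_scale: f32) -> Vec<f32> {
    let logits: Vec<f32> = texts
        .iter()
        .map(|t| logit_scale * cosine_similarity(image, t))
        .collect();
    softmax(&logits)
}
```

Calling `match_probabilities(&image, &texts, 10.0)` with an image embedding and a list of text embeddings returns one probability per text; the text whose embedding points in the direction closest to the image embedding receives the highest score.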
Modules
- text_model - Contrastive Language-Image Pre-Training
- vision_model - Contrastive Language-Image Pre-Training