ColPali model for text/image similarity scoring.
ColPali combines a vision encoder with an efficient language model (LM) to embed text and images for content retrieval.
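
As an illustration of what text/image similarity scoring can look like, the sketch below implements late-interaction (MaxSim) scoring, the scheme ColPali-style retrievers generally use: each query token vector is matched to its best image patch vector, and the maxima are summed. The function names (`dot`, `maxsim_score`) and the toy 2-dimensional embeddings are assumptions made for this example, not part of any library API; a real model would produce much higher-dimensional vectors.

```python
from typing import Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot product of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return sum(x * y for x, y in zip(a, b))


def maxsim_score(query_vecs: Sequence[Vector], page_vecs: Sequence[Vector]) -> float:
    """Late-interaction score (hypothetical helper for illustration).

    For each query token vector, take the highest dot product against
    any page patch vector, then sum those maxima over the query.
    """
    if not query_vecs or not page_vecs:
        raise ValueError("both inputs need at least one vector")
    return sum(max(dot(q, p) for p in page_vecs) for q in query_vecs)


# Toy embeddings, made up for this example.
query = [[1.0, 0.0], [0.0, 1.0]]                 # two query token vectors
page_a = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]    # three patch vectors
page_b = [[0.5, 0.5], [0.5, 0.5]]                # two patch vectors

score_a = maxsim_score(query, page_a)
score_b = maxsim_score(query, page_b)

# Rank pages by descending score; the best match comes first.
ranking = sorted([("page_a", score_a), ("page_b", score_b)],
                 key=lambda item: item[1], reverse=True)
```

With these toy vectors, `page_a` ranks ahead of `page_b` because it contains a patch that closely matches each query token, whereas `page_b` only matches both tokens partially.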