Bridging the Multimodal Gap: