The Taub Faculty of Computer Science Events and Talks

Decomposing CLIP's Embedding Space: Towards Improved Interpretability and Control
Ehud Gordon (M.Sc. Thesis Seminar)
Monday, 15.09.2025, 12:00
Meyer Building 1061 & Zoom
Advisor: Prof. Guy Gilboa

Vision-Language Models (VLMs) like CLIP have transformed the field by enabling joint reasoning across modalities, zero-shot transfer, and enhanced multimodal alignment. Despite their success and widespread adoption, embeddings derived from CLIP exhibit limitations, including challenges in object binding and relation comprehension, as well as limited interpretability stemming from entangled feature representations. This work, under the supervision of Prof. Guy Gilboa, investigates methods for decomposing and analyzing CLIP's embedding space, employing various statistical and decomposition techniques. Our approach seeks to improve performance, interpretability, and robustness across multiple applications, including image classification and editing. By addressing foundational representational challenges, this research contributes to a deeper understanding of multimodal embedding geometry and advances the interpretability of modern VLMs.