Vision-Language Models (VLMs) such as CLIP have transformed the field by enabling joint reasoning across modalities, zero-shot transfer, and improved multimodal alignment. Despite their success and widespread adoption, embeddings derived from CLIP exhibit notable limitations, including weak object binding, poor relation comprehension, and limited interpretability stemming from entangled feature representations. This work, conducted under the supervision of Prof. Guy Gilboa, investigates methods for decomposing and analyzing CLIP's embedding space using a range of statistical and decomposition techniques. Our approach seeks to improve performance, interpretability, and robustness across multiple applications, including image classification and image editing. By addressing these foundational representational challenges, this research contributes to a deeper understanding of multimodal embedding geometry and advances the interpretability of modern VLMs.
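To make the idea of decomposing the embedding space concrete, the minimal sketch below extracts CLIP text embeddings and applies PCA to inspect their principal directions. This is an illustrative example only, not the project's actual pipeline: the model name ("openai/clip-vit-base-patch32"), the prompt set, and the choice of PCA as the decomposition method are assumptions made for the sake of the example.

```python
# Illustrative sketch: inspect the structure of CLIP embeddings with PCA.
# Assumptions: Hugging Face transformers CLIP checkpoint and scikit-learn PCA;
# the actual project may use different models and decomposition techniques.
import torch
from sklearn.decomposition import PCA
from transformers import CLIPModel, CLIPTokenizer

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-base-patch32")

prompts = [
    "a photo of a cat", "a photo of a dog", "a sketch of a cat",
    "a sketch of a dog", "a photo of a red car", "a photo of a blue car",
]
inputs = tokenizer(prompts, padding=True, return_tensors="pt")

with torch.no_grad():
    # (num_prompts, 512) embeddings from CLIP's text projection head
    emb = model.get_text_features(**inputs)
# CLIP compares embeddings on the unit sphere, so normalize first
emb = emb / emb.norm(dim=-1, keepdim=True)

# Decompose the centered embeddings into principal directions and check how
# much variance each direction explains -- a first look at entanglement.
pca = PCA(n_components=5)
coords = pca.fit_transform(emb.numpy())
print("explained variance ratio:", pca.explained_variance_ratio_)
print("low-dimensional coordinates:\n", coords)
```

In this kind of analysis, the explained-variance spectrum and the per-prompt coordinates give a rough picture of how semantic factors (object category, style, color) are distributed across embedding directions, which motivates the more targeted decomposition techniques studied in the project.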