
Events and Lectures at the Henry and Marilyn Taub Faculty of Computer Science

Explaining, Improving, and Assessing Robustness in Artificial Intelligence Models
Hadas Orgad (Ph.D. seminar lecture)
Monday, 07.07.2025, 12:30
Taub 301
Advisor: Dr. Yonatan Belinkov

Artificial Intelligence (AI), particularly neural networks, has become central to a wide array of applications, from language modeling to text-to-image generation. Despite these achievements, ensuring the robustness of AI models remains a significant challenge. Robustness refers to a model's ability to maintain performance across diverse inputs and to avoid issues such as out-of-distribution failures, generation of harmful or incorrect content, and the propagation of social biases. Addressing robustness is crucial for deploying reliable AI systems in real-world scenarios.

Motivated by these challenges, this thesis aims to improve the understanding, evaluation, and ultimately the robustness of AI models through interpretability-based methods. Interpretability research, which seeks to explain how these models reach their decisions, offers a promising pathway to address robustness challenges with customizable and cost-effective methods. In this seminar, I will present our research on enhancing AI robustness by applying insights from interpretability studies, focusing on mitigating biases, reducing harmful content, improving adaptability, and addressing hallucinations.