Events

Events and Lectures at the Henry and Marilyn Taub Faculty of Computer Science

Explaining, Improving, and Evaluating Robustness in AI Models

Hadas Orgad (Ph.D. seminar lecture)

Monday, 07.07.2025, 12:30

Taub 301

Advisor: Dr. Yonatan Belinkov

Artificial Intelligence (AI), particularly neural networks, has become central to a wide array of applications, from language modeling to text-to-image generation. Despite this progress, ensuring the robustness of AI models remains a significant challenge. Robustness refers to the ability of models to maintain performance across diverse inputs and avoid issues such as out-of-distribution failures, generation of harmful or incorrect content, and the propagation of social biases. Addressing robustness is crucial for deploying reliable AI systems in real-world scenarios.

Motivated by these challenges, this thesis aims to improve the understanding, evaluation, and ultimately the robustness of AI models through interpretability-based methods. Interpretability research, which aims to elucidate the decision-making processes of these models, offers a promising pathway to address robustness challenges with customizable and cost-effective methods. In this seminar, I will present our research on enhancing AI robustness by applying insights from interpretability studies, focusing on mitigating biases, reducing harmful content, improving adaptability, and addressing hallucinations.
