While AI models are becoming an ever-larger part of our lives, our understanding of their behavior in unexpected situations is falling ever further behind. This gap poses significant risks to users, model owners, and society at large.
In the first part of the talk, I will give an overview of my research on detecting unexpected phenomena with and within deep learning models: (i) anomalous samples, (ii) unexpected model behavior, and (iii) unexpected security threats. In the second part, I will dive into my recent research on a specific type of unexpected security threat: attacks on image watermarks. I will review such attacks and present my recent work toward addressing them. I will conclude with a discussion of future research directions.
Bio: Niv Cohen is a postdoctoral researcher at the School of Computer Science & Engineering at New York University. He completed his PhD at the Hebrew University under the supervision of Yedid Hoshen, and his undergraduate degree in Physics at the Technion as part of its excellence program. He is a recipient of the 2024 Blavatnik Prize for Outstanding Israeli Doctoral Students in Computer Science. Niv's research addresses core deep learning questions centered on out-of-distribution phenomena: anomaly detection, watermarking generative AI data, and the limits of erasing concepts from deep models.