Ayelet Kravi, M.Sc. Thesis Seminar
Thursday, 14.11.2019, 11:00
Advisor: Prof. Roy Schwarz and Prof. Seffi Naor
The overlapping correlation clustering problem has multiple real-world applications in diverse areas such as: social networks (community detection), biology (protein analysis) and information retrieval (categorising documents).
Correlation clustering is a problem of assigning data points into distinct groups (clusters) based on similarity. When assigning two similar data points to the same cluster or assigning two different data points to different clusters, we score this assignment as successful, otherwise, we score the assignment as unsuccessful.
When solving the correlation clustering problem, two optional objectives are to minimise the disagreement (the number of unsuccessful assignments) or to maximise the agreement (the number of successful assignments).
In overlapping correlation clustering, we are allowing the clusters to intersect, meaning a single data point may belong to multiple clusters. In the overlapping version, we use the Jaccard similarity coefficient (a function that quantify the similarity between two given sets) to measure how successful is our assignment.
We studied the overlapping correlation clustering problem from theoretical perspective by formulating a liner relaxation and trying to solve it. We focused on a sub-problem of finding overlapping multiway-cut. Using linear relaxation, we managed to achieved an O(1+alpha)-approximation algorithm.