# Technical Report MSC-2021-10

 TR#: MSC-2021-10 Class: MSC Title: Correlation Clustering With Overlaps Authors: Ayelet Kravi Supervisors: Roy Shwartz and Josef (Seffi) Naor PDF Currently accessibly only within the Technion network Abstract: The Overlapping Correlation Clustering problem has multiple real-world applications in diverse areas such as: social networks (community detection), biology (protein analysis) and information retrieval (categorizing documents). Correlation Clustering is a problem of assigning data points into distinct groups (clusters) based on similarity. When assigning two similar data points to the same cluster or assigning two dissimilar data points to different clusters, we score this assignment as successful, otherwise, we score the assignment as unsuccessful. When solving the correlation clustering problem, two well studied objectives are to minimize the disagreement (the number of unsuccessful assignments) or to maximize the agreement (the number of successful assignments). In Overlapping Correlation Clustering, we are allowing the clusters to overlap, meaning a single data point may belong to multiple clusters. In the overlapping case, each data point is assigned to a (non-empty) set of clusters. We use the Jaccard similarity coefficient (a function that quantifies the similarity between two given sets) to measure how successful a given assignment is. In this work we study the Overlapping Correlation Clustering problem from a theoretical perspective and focus on the special case of finding an overlapping multiway-cut. Using a linear relaxation, we obtain an $O(\epsilon)$-additive approximation algorithm, and an $O(1+\epsilon)$-multiplicative approximation algorithm, for minimizing the worst disagreement. Copyright The above paper is copyright by the Technion, Author(s), or others. Please contact the author(s) for more information

Remark: Any link to this technical report should be to this page (http://www.cs.technion.ac.il/users/wwwb/cgi-bin/tr-info.cgi/2021/MSC/MSC-2021-10), rather than to the URL of the PDF files directly. The latter URLs may change without notice.