Technical Report MSC-2012-21

Title: Monitoring General Functions in Distributed Systems with Minimal Communication
Authors: Amir Abboud
Supervisors: Assaf Schuster and Daniel Keren
Abstract: In today's connected, data-driven world, traditional database systems are replaced by data stream systems which are fundamentally distributed. These large-scale and widespread networked systems generate high-volume streams of data that often require processing in real time. Examples for such include: network traffic monitoring systems, real-time analysis of financial data, distributed intrusion detection systems and sensor networks. A principal concern within these distributed systems is threshold monitoring: Determining whether the value of a certain function, evaluated over network-wide data, crosses a certain threshold that may indicate a global phase change which calls for some action. %When dealing with dynamically evolving streams of data, scattered among numerous, possibly resource-constrained nodes, communication poses a major bottleneck, especially when a quick and reliable resolution is required in (near) real-time.

Formally, the threshold monitoring problem over a data stream system, consisting of $n$ nodes, can be described as follows: Given are a function $f$, vector $v_1,v_2,...,v_n$, where $v_i$ is a data tuple (vector) at the $i^{th}$ node, and a threshold $\tau$. The vectors are dynamic, and the system needs to alert whenever the value of $f(v_1,v_2,...,v_n )$ crosses $\tau$ (that is, either changes from a value larger than $\tau$ to a value smaller than $\tau$, or vice-versa). This innocuous-looking condition of threshold crossing covers a very wide range of alerts that complex, distributed, dynamic systems must trigger. %(e.g. a decision to issue a warning of an impending natural disaster).

A common approach to the monitoring problem is to continuously or periodically centralize all data or data summaries, thus transforming a distributed problem into a centralized one. However, such centralization, may simply be infeasible in realistic settings, due to the sheer volume and dynamic nature of the data (implying huge communication overheads and latencies, as well as rapid energy drain, in the case of sensors).

In this thesis we will present three works which propose techniques for reducing communication when monitoring a distributed network. The techniques are based on the following simple observation: nodes in the system should not send a message every time new data arrives, but rather send messages only when "interesting" things happen. We formulate this observation using an idea of \textbf{Safe Zones} (SZs). Each node of the system gets a Safe Zone (SZ), and is asked to communicate only when the data it observes drifts out of this SZ. We will make sure that as long as each node is in its SZ, the global "bad" event (threshold crossing) may not have happened.

A great deal of work exists for the limited cases in which the threshold function ($f$) is either linear or monotonic. However, many functions of interest are neither linear nor monotonic and it is impossible to extend work on these cases to general functions. We propose novel generic techniques for monitoring \textit{arbitrary} functions.

In Chapter 1 we introduce and formalize the idea of SZs, define the problem of finding the optimal SZs, and try to solve this problem with Geometric tools. In chapter 2 we present a different approach for the problem of finding the optimal SZs, which is more appropriate when the data sampled at the nodes is of a Discrete nature. In chapter 3 we present communication efficient algorithms for determining whether a global "bad" event happened, when a node drifts out of its SZ.

The applicability of these techniques to various real problems is provided in experiments, which also demonstrate their advantage, by orders of magnitude, over previous work. Both the experiments and theoretical analysis show that this advantage increases with the dimension of the data.

CopyrightThe above paper is copyright by the Technion, Author(s), or others. Please contact the author(s) for more information

Remark: Any link to this technical report should be to this page (, rather than to the URL of the PDF files directly. The latter URLs may change without notice.

To the list of the MSC technical reports of 2012
To the main CS technical reports page

Computer science department, Technion