Reverse nearest neighbor aggregates over data streams.
Flip Korn, S. Muthukrishnan and Divesh Srivastava.
Reverse Nearest Neighbor (RNN) queries have been studied for finite,
stored data sets and are of interest for decision support. However,
in many applications such as fixed wireless telephony access and
sensor-based highway traffic monitoring, the data arrives in a
stream and cannot be stored. Exploratory analysis on this data
stream can be formalized naturally using the notion of RNN aggregates
(RNNAs), which involve the computation of some aggregate (such as
COUNT or MAX DISTANCE) over the set of reverse nearest neighbor
"clients" associated with each "server".
In this paper, we introduce and investigate the problem of computing
three types of RNNA queries over data streams of "client" locations:
(i) Max-RNNA: given K servers, return the maximum RNNA over all
clients to their closest servers;
(ii) List-RNNA: given K servers, return a list of RNNAs over all
clients to each of the K servers; and
(iii) Opt-RNNA: find a subset of at most K servers for which their
RNNAs are below a given threshold.
While exact computation of these queries is not possible in the data
stream model, we present efficient algorithms to approximately answer
these RNNA queries over data streams with error guarantees. We
provide analytical proofs of constant factor approximations for many
RNNA queries, and complement our analyses with experimental evidence
of the accuracy of our techniques.