Date: Wednesday, 13 September 2006, 6:30 PM
Location: SAP LABS, Building D, 3410 Hillview Avenue, Palo Alto, CA
Cost: Free and open to all who wish to attend, but membership is
only $10/year.
Topic
A holistic approach to prospective anomaly detection for a massive number of streams is proposed. The method works by building a baseline model to capture normal behavior. Any baseline model that provides a p-value for the observed value relative to the predicted value can be used. Anomalies are detected by tracking normal scores derived from these p-values. A flexible and fast five-parameter Bayesian model adjusts for multiple testing at each time point. Methods to delete uninformative streams from the monitoring process are also discussed. The method is illustrated on a real application in which the baseline model is built using a state space approach.
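As a rough illustration of the "normal scores derived from p-values" step, the sketch below maps one p-value per stream to a standard normal score. It is not the speaker's method: the sign convention (small p-value gives a large positive score), the function names `normal_scores` and `flag_streams`, the example p-values, and the fixed threshold of 3.0 are all assumptions made for illustration. In particular, the fixed threshold is only a naive stand-in for the five-parameter Bayesian multiple-testing adjustment, which is not reproduced here.

```python
from statistics import NormalDist


def normal_scores(p_values):
    """Map one-sided p-values to standard normal scores, z = Phi^{-1}(1 - p).

    Assumed convention: a small p-value yields a large positive score.
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    eps = 1e-12  # keep probabilities strictly inside (0, 1) for inv_cdf
    return [
        std_normal.inv_cdf(1.0 - min(max(p, eps), 1.0 - eps))
        for p in p_values
    ]


def flag_streams(p_values, threshold=3.0):
    """Return indices of streams whose normal score exceeds the threshold.

    The fixed threshold is a hypothetical placeholder, not the Bayesian
    multiple-testing adjustment described in the abstract.
    """
    return [i for i, z in enumerate(normal_scores(p_values)) if z > threshold]


# Hypothetical p-values for four streams at a single time point.
p_values = [0.5, 0.2, 0.0001, 0.9]
scores = normal_scores(p_values)
flagged = flag_streams(p_values)
```

With thousands of streams, a fixed cutoff like this would flag many streams by chance alone at every time point, which is the multiple-testing problem the Bayesian adjustment in the talk is meant to address.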
About the Speaker
Deepak Agarwal is a senior research scientist at Yahoo! Research Labs. Prior to joining Yahoo!, he was a senior technical staff member in the statistics department at AT&T Research Labs. Deepak obtained his PhD in statistics from the University of Connecticut under the guidance of Professor Alan Gelfand. His thesis focused on building multi-level hierarchical Bayesian models for the large, misaligned spatial data that are part of most GIS systems. At AT&T, Deepak worked on methods for mining massive graphs, anomaly detection using a time series approach, and computational approaches for the spatial scan statistic. Deepak won the best applications paper award at SIAM Data Mining 2004 and the best student paper award at the Joint Statistical Meetings, 2001. He has served on a couple of NSF panels and several program committees in data mining and statistics.

