Confidence: Increasing the Quantity and Quality of ENS Data
Our approach is based on the observation that many faults affecting embedded sensing systems (ENS) require some kind of human intervention, whether calibrating a sensor or moving a tree that has fallen on a node. Thus, while most current work focuses on enabling autonomous systems, completely removing the human from the loop is neither practical nor possible. Relying solely on a human to manually monitor and administer a large number of nodes and sensors is not feasible either. Our goal is to design a system that can intelligently guide a user to take actions that demonstrably improve data and network quality. In short, sensor networks demand systems administration.
This insight motivated the design of Sympathy, a system that monitors and diagnoses network faults. Like most traditional systems that attempt to guide a user in fixing a problem, Sympathy uses a knowledge-based (or expert system) approach: a decision tree analyzes metrics periodically collected from the network and diagnoses faults. Sympathy has proven successful, accompanying numerous deployments over the past several years. However, through the design of this system we learned two key lessons: 1) diagnoses that are too specific are often wrong and delay the detection of the true fault; and 2) statically defined thresholds used in decision trees are difficult to set and do not easily adapt to different environments and systems.
Machine and statistical learning techniques are designed to adapt to dynamic environments by learning from experience. However, not all learning approaches are applicable. Supervised learning methods require a training dataset consisting of a set of inputs and desired outputs, and creating this labeled dataset depends on the user having access to ground truth. Unfortunately, users do not have such knowledge in many ENS deployments, where sensors are placed in new and unexplored environments. Unsupervised learning mechanisms are designed for such scenarios, inferring models or patterns in data without external feedback.
We are incorporating these lessons into the design of Confidence, the first unified system geared towards increasing the quantity and quality of data collected from a large network. Confidence enables users to effectively administer large numbers of sensors and nodes by automating key tasks and intelligently guiding a user to take actions that demonstrably improve data and network quality. Confidence uses K-means clustering, an unsupervised learning technique that shares the advantages of knowledge-based systems: it is simple to implement, scales to large amounts of data, and is relatively transparent. K-means can also be modified with little effort to adapt dynamically to data streams, giving it a significant advantage over statically configured techniques. Clustering requires neither pre-set thresholds on data values nor supervision during learning. Instead, the pattern recognition algorithm automatically groups together points that appear similar in an N-dimensional feature space.
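The text does not spell out how K-means is modified for data streams. One common way to do it is a sequential (online) update, in which each arriving point pulls its nearest centroid toward it by a step that shrinks as the cluster grows. The sketch below illustrates that idea only; the function names, the two-dimensional feature vectors, and the starting centroids are assumptions for illustration, not details of Confidence.

```python
import math

def nearest(centroids, point):
    """Return the index of the centroid closest to point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(centroids[i], point))

def online_kmeans_update(centroids, counts, point):
    """Assign point to its nearest centroid, then move that centroid toward it.

    The step size 1/count keeps each centroid equal to the running mean
    of the points assigned to it so far (including its initial position).
    """
    i = nearest(centroids, point)
    counts[i] += 1
    step = 1.0 / counts[i]
    centroids[i] = [c + step * (x - c) for c, x in zip(centroids[i], point)]
    return i

# Hypothetical starting state: two centroids, each counted as one point.
centroids = [[0.0, 0.0], [10.0, 10.0]]
counts = [1, 1]

# Hypothetical stream of 2-D feature vectors arriving one at a time.
stream = [[0.2, -0.1], [9.5, 10.5], [0.1, 0.3], [10.2, 9.8]]
labels = [online_kmeans_update(centroids, counts, p) for p in stream]
print(labels, centroids)
```

Because each update touches a single centroid, the cost per point is independent of how much data has already been seen, which is what makes this style of update suitable for streams.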
In our modified version of K-means, classification is a two-part process: (1) points that join an "unpopular" or anomalous cluster in the feature space are considered faulty; (2) faulty points trigger user notification of the actions associated with the faulty cluster, namely those known to have fixed previous points in that cluster. Clusters are learned over time and adapt easily to different environments and sensors, so the same mechanism applies to improving both network and sensor quality. We show that Confidence outperforms similar systems by avoiding precise diagnoses and decision trees that employ statically set thresholds. We reduce the problem of identifying and diagnosing system faults to that of identifying the correct feature space in which faults appear anomalous. We show that this system is generalizable by using it to address both network and data faults.
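The two-part classification can be sketched as follows. This is a minimal illustration under stated assumptions: the text does not define when a cluster counts as "unpopular", so the sketch treats a cluster as unpopular when it holds less than a chosen fraction of all points seen (the `unpopular_fraction` parameter is hypothetical). The feature values, cluster counts, and action strings are likewise made up for the example.

```python
import math

def classify(point, centroids, counts, actions, unpopular_fraction=0.1):
    """Return (is_faulty, suggested_actions) for one feature vector.

    Part 1: the point is faulty if its nearest cluster is unpopular.
    Part 2: a faulty point returns the actions recorded for that cluster.
    """
    i = min(range(len(centroids)), key=lambda k: math.dist(centroids[k], point))
    is_faulty = counts[i] < unpopular_fraction * sum(counts)
    return is_faulty, (actions.get(i, []) if is_faulty else [])

# Hypothetical learned state: cluster 0 is popular, cluster 1 is rare.
centroids = [[0.9, 0.1], [0.1, 0.9]]
counts = [95, 5]
# Hypothetical actions that fixed earlier points in cluster 1.
actions = {1: ["check battery"]}

print(classify([0.8, 0.2], centroids, counts, actions))  # lands in popular cluster
print(classify([0.2, 0.8], centroids, counts, actions))  # lands in rare cluster
```

Note that the popularity cutoff is a threshold on cluster membership, not on any individual sensor or network metric, so it does not need retuning when the measured quantities change.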
Because of the finite feature space, knowledge and experience from one deployment should be more easily transported to another deployment simply by initializing the cluster locations in the new deployment to match those of the old deployment.
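A minimal sketch of such a transfer, assuming the learned state is kept as a dictionary of centroids and per-cluster actions (a hypothetical layout, not the Confidence data format). Resetting the popularity counts to zero is also an assumption; a real deployment might prefer to carry them over.

```python
import copy

def init_from_previous(old_model):
    """Seed a new deployment with the cluster locations of an old one.

    Centroids and actions are deep-copied so that updates in the new
    deployment never modify the old model. Popularity counts start at
    zero (an assumption) so they reflect only the new environment.
    """
    return {
        "centroids": copy.deepcopy(old_model["centroids"]),
        "actions": copy.deepcopy(old_model["actions"]),
        "counts": [0] * len(old_model["centroids"]),
    }

# Hypothetical model learned in an earlier deployment.
old_model = {
    "centroids": [[0.9, 0.1], [0.1, 0.9]],
    "actions": {1: ["check battery"]},
    "counts": [95, 5],
}
new_model = init_from_previous(old_model)
new_model["centroids"][0][0] = 0.5  # adapting the new model leaves the old one intact
print(old_model["centroids"][0], new_model["counts"])
```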
We demonstrate in simulation and in practice that this approach meets our performance criteria: detecting and diagnosing faults quickly and accurately. False positives can lead to too much interaction, which can be as detrimental to system performance as insufficient action; one paper from UC Berkeley reports that 40% of the time a component in an embedded sensing system is touched, something breaks. In our experience, this number can be even higher. In some instances, common problems, such as those caused by an intermittently faulty wireless channel, resolve themselves. Detection accuracy is therefore our primary metric in evaluating the performance of Confidence.
In one experiment, we inject a fault into each node in a 25-node network, one node at a time. The figure below is an empirical CDF of the time it takes Confidence and Sympathy to identify the fault. The automatic clustering used in Confidence performs almost 3 times better than the static decision trees used in Sympathy. False positive rates show similar results.
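For readers unfamiliar with this kind of plot, an empirical CDF of detection times can be computed as in the sketch below. The times used here are placeholder values chosen for illustration, not measurements from the experiment, and the function names are hypothetical.

```python
def empirical_cdf(samples):
    """Return (value, cumulative fraction) pairs for the sorted samples."""
    xs = sorted(samples)
    n = len(xs)
    return [(x, (i + 1) / n) for i, x in enumerate(xs)]

def fraction_detected_by(samples, t):
    """Fraction of faults whose detection time is at most t."""
    return sum(1 for x in samples if x <= t) / len(samples)

# Placeholder detection times in seconds (illustrative only).
times = [30, 10, 20, 40]
cdf = empirical_cdf(times)
print(cdf)
print(fraction_detected_by(times, 25))
```

A curve that rises further to the left means faults are identified sooner, so two systems can be compared by reading off the time at which each reaches a given fraction of detected faults.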

Graduate Students: