On-line In-field Fault Detection and Correction in Wireless Sensor Networks
We have been addressing one of the most important and rapidly emerging problems in wireless sensor networks: on-line testing and fault detection. The accuracy of all subsequently applied techniques for sensor network and computational sensing optimization depends crucially on our ability to detect faults accurately. The problem is complex because of noise, calibration issues, and the lack of ground truth. We have identified three phases of on-line sensor testing and fault detection, and we have formally defined the sensor testing and fault detection task. We have also established the complexity of the testing problem by proving that it is NP-complete. We introduced an approach for detection and correction of transient faulty sensor data that employs the principle of separation of concerns to identify and address three modular phases of the testing problem. The phases are: (i) intersensor model building and validation; (ii) fault detection using local and global consistency checking; and (iii) data correction using a novel boosting-based procedure for missing data recovery. Experimental evaluations on traces of an embedded sensor network with temperature and humidity sensors indicate that the new method can almost always detect and correct faulty sensor data with less than 2% error. Finally, we have developed a classification of the detected types of sensor faults, which is one of our starting points for future research.
Our approach has three main components. First, using a nonparametric isotonic regression approach, we develop a model for every pair of nodes in which one node can be used to predict the other. Using this predictive model, we build a graph, called a prediction graph, in which a directed edge from sensor node s_X to node s_Y exists only if s_X can predict the value that s_Y senses (e.g., a temperature reading) to within a target error rate. Second, at each point in time, the testing approach exploits this transitive prediction ability: it checks the consistency between the current value of a sensor and the predictions for it from the graph, and classifies the value as correct or faulty. The testing procedure assigns colors to the edges of the prediction graph, such that a faulty edge is red and a correct edge is blue. We propose testing for faulty sensors by finding the dominating set of nodes that are incident to the red edges, using an ILP-based procedure. Third, we remove the values identified as faulty by the testing procedure and substitute correct values for them. The substitution procedure applies a new variant of the boosting algorithm to find the best substitute for the removed data.
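The edge-coloring and testing steps can be illustrated with a small sketch. It is not the project's implementation: the function names, the identity predictors, and the fixed tolerance are hypothetical, and an exhaustive search over node subsets stands in for the ILP-based procedure. The sketch also assumes that "explaining" the red edges means choosing the smallest set of nodes such that every red edge touches at least one chosen node.

```python
from itertools import combinations

def color_edges(edges, readings, tolerance):
    """Return the set of red (inconsistent) edges.

    edges maps (src, dst) to a predictor function for dst given src's reading.
    An edge is red when the prediction misses dst's reading by more than tolerance.
    """
    red = set()
    for (src, dst), predictor in edges.items():
        if abs(predictor(readings[src]) - readings[dst]) > tolerance:
            red.add((src, dst))
    return red

def smallest_explaining_set(red_edges):
    """Smallest node set touching every red edge (brute force, not an ILP)."""
    nodes = sorted({n for edge in red_edges for n in edge})
    for size in range(len(nodes) + 1):
        for candidate in combinations(nodes, size):
            chosen = set(candidate)
            if all(s in chosen or d in chosen for s, d in red_edges):
                return chosen
    return set()

# Hypothetical example: three co-located sensors that should read alike.
identity = lambda value: value
sensors = ["a", "b", "c"]
edges = {(s, d): identity for s in sensors for d in sensors if s != d}
readings = {"a": 20.0, "b": 20.3, "c": 35.0}

red = color_edges(edges, readings, tolerance=1.0)
suspects = smallest_explaining_set(red)
print(sorted(red), suspects)
```

Here every red edge involves sensor c, so c alone explains all inconsistencies. The exhaustive search is exponential in the number of nodes and is only meant for toy instances.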
Given a time series of data measurements from two sensors, it is natural to ask whether the values sensed (e.g., temperature) by one sensor can be predicted from the other, i.e., whether sensor Y can be predicted via some function of sensor X’s data, Y = f(X). Regression analysis uses data samples from both X and Y to find the function f. There are many forms of regression analysis in the statistics literature. We propose the use of isotonic regression, which restricts f to be monotonic. We find that adding the isotonic constraint effectively guides the search for a good predictor function f. This constraint has a very natural interpretation in sensor networks: if the phenomenon being sensed (e.g., temperature, humidity, light) increases, then both sensors will experience an increase in their measurements.
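As an illustration of the isotonic constraint, the following sketch fits a nondecreasing step function f with the classic pool-adjacent-violators procedure under squared error. This is a textbook stand-in, not the combinatorial isotonic regression developed in the project; the names `fit_isotonic` and `predict` are hypothetical, and ties in the X values are not given special treatment.

```python
from bisect import bisect_left

def fit_isotonic(xs, ys):
    """Least-squares nondecreasing fit of ys against xs (pool adjacent violators)."""
    # Each block stores [sum of y, number of points, largest x in the block].
    blocks = []
    for x, y in sorted(zip(xs, ys)):
        blocks.append([y, 1, x])
        # Merge backwards while the previous block's mean exceeds the last one's.
        while (len(blocks) > 1 and
               blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count, x_hi = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
            blocks[-1][2] = x_hi
    knots = [b[2] for b in blocks]          # upper x bound of each block
    levels = [b[0] / b[1] for b in blocks]  # fitted value on each block
    return knots, levels

def predict(model, x):
    """Evaluate the fitted step function; values beyond the last knot are clamped."""
    knots, levels = model
    i = bisect_left(knots, x)
    return levels[min(i, len(levels) - 1)]

# Hypothetical paired readings from sensors X and Y.
model = fit_isotonic([1, 2, 3, 4, 5], [1, 3, 2, 4, 5])
print(model, predict(model, 2))
```

The out-of-order pair (3, 2) is pooled into a single level of 2.5, so the fitted function never decreases as X increases.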
Using the modeling method, we develop multiple prediction models for each sensor under test (the response sensor); in each model, the response sensor's values are predicted from the measurements at another sensor. A faulty reading is identified by checking the consistency between the actual sensor value and the values predicted for that sensor. The faulty reading is removed and then replaced by leveraging the predicted values from the multiple models. The recovery procedure combines the individual predictions using a boosting algorithm to select the most likely value.
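The check-then-recover flow can be sketched as follows. The text does not specify the boosting variant, so this sketch substitutes a plainly different combiner: a weighted median of the individual predictions, with each model weighted by the inverse of its validation error. The function names, the weighting rule, and the tolerance are all assumptions made for illustration.

```python
def weighted_median(values, weights):
    """Smallest value at which the cumulative weight reaches half the total."""
    half = sum(weights) / 2.0
    running = 0.0
    for value, weight in sorted(zip(values, weights)):
        running += weight
        if running >= half:
            return value

def check_and_recover(observed, predictions, errors, tolerance):
    """Return (value to keep, was_faulty) for one reading of the response sensor.

    predictions: values predicted for the response sensor by the other sensors.
    errors: positive validation error of each prediction model (assumed known).
    """
    weights = [1.0 / e for e in errors]  # more accurate models count more
    consensus = weighted_median(predictions, weights)
    if abs(observed - consensus) <= tolerance:
        return observed, False           # consistent: keep the reading
    return consensus, True               # inconsistent: substitute the consensus

# Hypothetical example: three models predict the response sensor's value.
predictions = [21.0, 21.4, 30.0]
errors = [0.5, 0.5, 2.0]
print(check_and_recover(21.2, predictions, errors, tolerance=1.0))
print(check_and_recover(45.0, predictions, errors, tolerance=1.0))
```

A median-style combiner is used here because a single poor predictor (the 30.0 above) cannot pull the substituted value far from the majority.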
We have been analyzing several data sets, including ones collected at the UCLA Botanical Garden, at the James Reserve, and in several indoor environments. We have developed several new software packages for data modeling, including ones for combinatorial isotonic regression, symmetric combinatorial isotonic regression, and 2D versions of several regression and kernel smoothing techniques. We have also developed clique detection-based and ILP-based maximum likelihood fault detection techniques.
Our accomplishments can be traced along five directions: (i) practical and application-oriented; (ii) statistical modeling; (iii) optimization techniques; (iv) theoretical; and (v) conceptual.
The main practical accomplishment is that we created an approach, consisting of several statistical and optimization techniques and software tools, that accurately identifies and corrects faults in several different types of sensor networks. In terms of statistical modeling, we introduced several new types of classifiers, regressions, and density smoothing techniques, and developed fast algorithms and reliable software implementations for them. We have also developed a family of optimization algorithms that leverage statistical and pattern properties of a specific instance of the targeted problem to better address computationally intractable problems. It is important to note that the new algorithmic approach can be used to produce both heuristic solutions in provably polynomial time and provably optimal solutions in heuristically short execution times. From the theoretical point of view, we created several abstractions of faults that are rich enough to cover many types of common faults and, at the same time, simple enough to facilitate optimization-intensive fault detection. We have also established the complexity of several fault detection phases.
Finally, to gain conceptual insight into the sources and nature of faults, we classified all detected faults into six categories: (i) random and high-intensity oscillations; (ii) saturation; (iii) stuck at recent previous value; (iv) stuck at random value; (v) calibration errors; and (vi) others. The most common were random oscillations (82%). Interestingly, in 14% of the cases, random oscillations were interleaved with periods of correct readings. Saturation faults are those where the readings of a sensor are saturated at the maximum of its physical range, although the readings should have been even higher because all other sensors kept increasing their values. These faults were common for sensors that were directly exposed to the sun; they constituted 5% of all faults. Stuck at the previous value was also a common fault (6% of all faults), while stuck at a random value was an order of magnitude less common. Finally, almost all other faults were calibration errors, mainly occurring when the supply voltage was relatively low.
We plan to pursue five main application and data-driven research directions: