Skip Header NavigationIntranet 
CENTER FOR EMBEDDED NETWORKED SENSINGContactDirectionsEmploymentEventsNews
HomeAbout UsResearchEducationResourcesPeople

Research Project


Macroprogramming Sensor Networks Using Kairos Framework

Technology > Systems : Programming and Storage > Macroprogramming Sensor Networks Using Kairos Framework

On this page: Overview | Approaches | Systems/Experiments | Accomplishments | Future Directions | People

OVERVIEW

Sensor data analysis includes a wide range of operations, from attribute-based searching for named events, to complex sensor data manipulation, including searching for new patterns or trends and looking for correlation structures. Conventional approaches to such monitoring have involved wired and sparsely deployed networks that transfer all data from sensors to a central data repository for persistent storage.

Unfortunately, wireless sensor networks are highly resource-limited, severely limiting deployment lifetime if all raw data must be transmitted to a central location. The long-term storage requirements of such systems add an additional dimension to optimize, since the storage capacity of low-end sensor nodes (Motes, Smart-Dust) will be limited by cost and form factor. The problem of insufficient storage is exacerbated by the fact that such systems are long-lived and operate unattended for many years.

In this project, we explore data storage techniques to facilitate search in sensor networks. In last year’s work, we described different storage schemes intended for applications with different search requirements. These included a distributed wavelet-based summarization and data aging system, DIMENSIONS, and two distributed indexing schemes, DIFS and DIMS, that were optimized for range-sum queries. Our work during this year has focused on three aspects of in-network data storage that was not covered in the previous year:

In the rest of this section, we describe these three new pieces of work in greater detail. The first part describes our mote implementation that is designed for the structural health monitoring project led by Ramesh Govindan. The second part discusses the cause and effect of spatio-temporal irregularity, and how we can deal with them.

APPROACHes

Progressive Storage and Transmission

In this section, we outline some results in building a wavelet-based in-network storage and progressive transmission scheme for Wisden, a system for Wireless Sensor Network for Structural Monitoring. While most prior research (e.g.: Directed Diffusion) has focused on data aggregation in order to increase network lifetime, our primary motivation for considering compression is to scale Wisden to many nodes.

The data rate requirements for structural monitoring can be a significant fraction of radio bandwidth. Another way to describe this limitation is to consider the latency of data acquisition; Even if each of 20 nodes generated only 10 minutes worth of 3-channel vibration data, it would take almost an hour transmit the data to the base station assuming a nominal radio bandwidth of 2 KBps.

To address the latency of data acquisition, we have designed and implemented a progressive storage and transmission strategy on the motes. This approach uses local storage on the motes as an in-network cache for raw data and transmits low-resolution compressed summaries of data in near-real time. The user can visualize this summarized event data and the raw data can be collected from the distributed caches when required. This on-demand data collection has to occur within the time window before which data in the cache is replaced by newly generated samples. As we show below, such an approach can compress vibration data by a factor of 20; when coupled with event detection, it can reduce the acquisition latency to less than in a minute in many cases.

Figure 1: Power Spectral Density of vibration signal shows that much of the energy is concentrated in the lower freguency subbands

Our progressive transmission strategy for vibration data uses wavelet compression. The applicability of wavelet techniques follows from characteristics of large structures, whose frequency response is usually focused in the low-frequency components. Figure 1, which shows the power spectral density of a vibration signal, illustrates this clearly.

Wavelet Codec Internals

We use an optimized implementation for the motes due to memory and computation constraints. For this reason, many design choices that we make are simpler than other progressive codecs such as JPEG2000. The component diagram of our implementation is shown in Figure 2. We now describe the individual system components in more detail.

Figure 2: Component diagram of wavelet codec

Integer-Integer Wavelet decomposition: Our implementation uses the bi-orthogonal Cohen-Daubechies-Feauveau (2,2) (CDF(2,2)) integer wavelet lifting transform that relies solely on integer addition and bit shifting operations. Wavelet lifting involves two steps: (a) a prediction step when the odd values of the time-series are predicted from the even values, and (b) an update step when the even values are updated to capture the error in the prediction step. The predict and update operations for the CDF(2,2) lifting transform are:

Equation 1: d sub i equals d sub i minus one half times open paren s sub i + s sub (i + 1) close paren; Equation 2: s sub i equals s sub i plus one quarter times open paren d sub (i-1) plus d sub i close paren


The choice of the above lifting transform over other kernels was based on two factors: computation overhead and compression performance. Using longer wavelet filters involves more computation overhead but does not provide significant compression improvement over the chosen filter, at least for the building vibration dataset that we studied (CUREE- Kajima research program).

While the lifting transform itself is very efficient, normalization of coefficients at various subbands involves floating point operations. The normalization coefficients for the CDF(2,2) transform are:


Equation 3: n sub H equals the square root of 2; Equation 4: n sub L equals the square root of 2


where nH is the higher frequency subband and nL is the lower frequency subband. We perform the normalization operations during the wavelet thresholding step rather than during wavelet decomposition to be more computationally efficient. The wavelet codec operates on buffers of length 2n, where n is a positive integer. To avoid blocking artifacts at the buffer boundaries, we pad each buffer with a few samples at either end.

Quantization: Quantization involves representing a range of values in a signal by a single value. This reduces the number of symbols that are required to represent a signal, and hence makes the signal more compressible. We implemented a simple uniform quantizer that can be used to reduce the resolution of data depending on the range of the signal and the number of bits allocated to each sample. Figure 4 shows the quantized and thresholded version of the signal in Figure 3.

Signal Thresholding: Thresholding is a technique used to modify the wavelet decomposed signal such that the resulting signal contains long sequences of zeros that can be efficiently compressed by an entropy coding scheme. We use a hard thresholding scheme in which if the absolute value of any wavelet falls below the threshold, it is set to zero (shown in Figure 4). We maintain a probability density function (pdf) of the signal to facilitate the selection of an appropriate threshold. The user specifies what percentage of the signal need to be zeros in the lossy version, and the pdf can be used to determine the appropriate threshold.

Figure 3: Wavelet Decomposed Signal. Figure 4: Quantized and Thresholded Signal

The thresholds for different subbands are normalized using the coefficients shown in the above equations. This operation needs to be done only once, hence, it reduces the computation requirements of normalizing the signal.

Run-length encoding: Wavelet decomposition, quantization and thresholding process the signal to make it more amenable for compression by an entropy coding scheme, but no compression has yet occurred. An entropy coding scheme is typically designed such that the symbols that occur most frequently use the least amount of bits. Run length coding is the simplest of such schemes that exploits runs of a particular value in the signal. The thresholded and quantized signal in Figure 4 shows many runs of zeros that can be exploited by such an encoding scheme.

BitStream: The run-length encoded signal is a series of symbols of different lengths depending on the number of bits used in quantization and the lengths of the special symbols used in the run length encoding process. A bitstream module is used to pack these variable length symbols into the data segment of a TinyOS message packet. Each time the data segment of a packet is filled up, it can be scheduled for transmission to the base station using a reliable transport mechanism.

Operation Description

The progressive transmission operation involves three steps: event detection, local data storage, and progressive coding. The event detection scheme (described in greater detail in [1]) runs continually, and triggers when an event is detected. The event signal then undergoes wavelet decomposition, and the decomposed signal is written to the persistent external flash memory on the mote. Until this point, no compression has occurred, hence, the lossless event data is available on flash.

A separate periodic task reads the flash memory and compresses the signal using the five step process described in the previous section, after which it transmits the bitstream to the base-station. The choice of signal threshold and number of quantization bins is assumed to be determined by a priori analyis of training data to obtain maximum compression benefit within the specified error bounds.

A user at the base-station can analyze the low-resolution signal in real-time and request either the raw data or additional detail in the signal. Since the raw data is stored on flash, the difference between the previously transmitted resolution and the requested resolution is coded by the steps described in the previous section and transmitted to the base-station. This progressive transmission procedure should be completed before the data on flash is overwritten by future events.

SYSTEMS / EXPERIMENTS

We evaluate the performance of our system on two fronts: (a) the computation and memory overhead of doing the wavelet compression in real-time on a mote, and (b) the compression gain by using our scheme, which translates to the latency of data acquisition. We used data from shaker table tests at CUREE-Kajima.

Table 1: Computation and Memory requirements of mote implementation

Table 1 shows the computation and memory requirements of the core components of our compression system. The computation time is low, and enables us to perform the entire operation in real-time. We were able to perform sensor data sampling, 128 sample CDF(2,2) wavelet lifting transform as well as writing the decomposed buffer to the EEPROM for sampling rates upto 250Hz. As can be seen in Table 1, the memory requirements are low as well, making it easier to integrate with other components of the system.

The compression and error results from using such a scheme are shown in Figure 5 and Figure 6 respectively. Both graphs use a threshold that sets 80% of the decomposed signal to zero. The trends of both graphs are as expected; as the number of quantization bits increases, both the compression ratio and the RMS error reduce. One sweet spot that emerges from these two graphs is a 4-bit quantization scheme. This choice gives us about 20-fold reduction in data size with very low RMS error of 3.1. The peak-to-peak signal to noise ratio (PSNR) for the above choice of parameters is 30dB.

Figure 5: Compression Ratio. Figure 6: Root Mean Square Error

These results are very promising are indicate that such an approach can be used to allow near real-time structural data acquisition from tens of sensors. We expect to more directly validate the latency benefits when we integrate our implementation into Wisden.

Studying the impact of spatio-temporal irregular sampling on data storage
A central issue that cuts across many research thrusts in sensor networks is the impact of irregular sampling. We argue that spatio-temporal irregularity is fundamental to wireless sensor networks and must be considered by sensor network algorithm, protocol, and application designers. Such irregularity is inevitable due to terrain, deployment, and hardware practicalities.

Why is irregularity inevitable?
Spatially Irregular Deployments: Can we expect sensor network deployments to be uniformly regular?
Most sensor network deployments will have irregular spatial configurations for two fundamental reasons: (a) the phenomena of interest are not uniformly distributed and the deployment of sensor resources will be variable in order to achieve denser sensing where there is greater spatial variability (e.g., on the edge of biological regions as shown in Figure 7), and (b) terrain and other deployment practicalities bias deployment locations to where necessary power sources, communication or access can be achieved.

Figure 7: Irregular spatial placement for the micro-climate monitoring deployment at James Reserve is inevitable due to terrain and deployment practicalities (power, GPS) and limited number of nodes available

Temporally Irregular Sampling: Can we expect clocks at different nodes in a sensor network to be continually, synchronized to the precision required by the application?
Recent research into time-synchronization for such GPS-less sensor networks have shown that distributed, precise synchronization is indeed feasible. Such a procedure, however, comes with the associated cost of transmitting periodic beacons for noise reduction and multi-hop synchronization. Thus there is a fundamental tradeoff: more energy is required for finer synchronization for high-frequency sensing (e.g., seismic applications which sample at 100Hz, acoustic at 48KHz), while nodes with constrained energy budgets must emphasize energy conservation. For instance, in Figure 7 and 8, a cluster of ipaqs lose synchronization upto 150µs (5 acoustic samples) within a matter of seconds, requiring high rate of time-synchronization and therefore, high communication overhead. Variability in time-synchronization will result from precisely this need to sacrifice synchronization guarantees for energy consumption. Additionally, many external factors will contribute to such variability including changing ambient and system noise levels, loss of synchronization beacons, node failures and other system dynamics.

Figure 8: A group of 9 proximate synchronized lpaqs lose sync to a maximum of 150us (5 acoustic samples) within 5 seconds. Figure 9: Lag between two lpaqs increases from 4 samples to 9 samples over three seconds.

In a position paper, we discuss different sets of solutions that can be used to deal with spatio-temporal irregularity.

ACCOMPLISHMENTS

We have submitted a paper on the Wisden system that includes the work on progressive compression and storage and published one position paper on our work on irregular sensor sampling.

FUTURE DIRECTIONS

PEOPLE

Faculty:

Prof. Ramesh Govindan

Graduate Students:

Ramakrishna Gummadi
Omprakash Gnawali