Technology > Multiscaled Actuated Sensing > Adaptive Sampling by Using a Robotic Boat and a Static Sensor Network
Gaurav S. Sukhatme
Sensor networks provide new tools for observing and monitoring the environment. In aquatic environments, accurately measuring quantities such as temperature, chlorophyll, salinity, and concentration of various nutrients is useful to scientists seeking a better understanding of aquatic ecosystems, as well as government officials charged with ensuring public safety via appropriate hazard warning and remediation measures. Broadly speaking, these quantities of interest are scalar fields. Each is characterized by a single scalar quantity which varies spatiotemporally. A characteristic of such scalar fields is that the sensor readings are only valid locally. That is, the correlation between the sensors at different locations decreases rapidly with increasing distance between the sensor nodes. To estimate the scalar field where no sensor nodes are deployed, we need to interpolate the data. Intuitively, the more the readings near the location where a field estimate is desired, the less the reconstruction error. In other words, the spatial distribution of the measurements (the samples) affects the estimation error.
In many cases, it may not be feasible to move the static sensor nodes after deployment. In such cases, one or more mobile robots could be used to augment the static sensor network, hence forming a sensor-actuator network or a robotic sensor network. Static nodes and mobile robots both have their advantages and constraints. Static nodes generally consume less energy than mobile robots and their batteries last longer. However, static nodes cannot move and cannot change the spatial distribution of the sensor readings while mobile robots are able to move and take sensor readings at different locations.
An immediate question to ask is how to coordinate the mobile robots and the static nodes such that the error associated with the estimation on the scalar field is minimized subject to the constraint that the energy available to the mobile robot(s) is bounded. Specifically, if each static node makes a measurement in its vicinity, and the total energy available to the mobile robot is known, what path should the mobile robot take to minimize the mean square integrated error associated with the reconstruction of the entire field? Here we assume that the energy consumed by communications and sensing is negligible compared to the energy consumed in moving the mobile robot. We also assume that the mobile robot can communicate with all the static nodes and acquire sensor readings from them. Finally, we focus on reconstructing phenomena which do not change temporally (or change very slowly compared to the time it takes the mobile robot to complete a tour of the environment).
We developed a general solution to the above problem and test it on the NAMOS test bed, which consists of anchored buoys (the static nodes) and a robotic boat (the mobile robot) capable of measuring temperature and chlorophyll concentrations.
Non-parametric regression has been well studied in statistics and many methods have been proposed. We use a local linear regression to estimate the phenomenon. The asymptotic properties of 1 dimensional local linear Regression were first studied by Fan and later extended by Ruppert to the multi-dimensional case. The main take away message from these two studies is that the Integrated Mean Square Error (IMSE) of the estimator is related to the second derivatives of the phenomenon investigated.
If
is the function to be estimated (reconstructed), we assume the following model
, (1)
where i = 1, ..., n, d is the number of dimensions,
are
-valued predictor variables,
are scalar response variables,
is finite and the
are mutually independent and identically distributed random variables with zero mean and unit variance and are independent of
.
Suppose
is the d-dimensional bandwidth matrix and K is a d-variate kernel which satisfies: 1.
; 2.
and 3.
for all non-negative integers
such that their sum is odd, it has been proved that the estimation error associated with the local linear regression is given by the following equation:
(2)
where
, f(x) is the density function with
, and
is the Hessian matrix of
.
In equation 2, when
is big enough and H is small enough, the MSE can be approximated without the infinitesimal. If we assume the Hessian matrix of
is known, we can determine the optimal bandwidth and optimal density function by minimizing the Integrated MSE with the constraint
. By applying the Lagrange-Euler differential equation, we have the optimal density function as follows:
(3)
where we assume the bandwidth matrix is defined as
.
is called the optimal design. When the cost of moving from one sample location to another is small compared to the cost of taking sensor readings, we can use
to generate the sample locations. When samples are to be taken by a mobile robot, optimal design alone is not feasible because now the constraint is not the number of samples to be taken but the distance that the mobile robot can travel. This limits how many samples can be taken and where those samples can be taken.
Assume that initially there are
readings from static sensor nodes
, the path of the mobile robot passes through the points
, and then the optimal path should minimize IMSE, which can be estimated as follows.
, (4)
where
is the estimation of the density function. Similar to the information gain defined in robot exploration literature, we define the gain for each point as
,
and the gain for a path p as
.
Note that the gain of the whole path is not the sum of the gain of all points on the path because the gain at each location is not independent with each other. On the one hand, one new sampling point not only increases the sampling density at/near that point. On the other hand, MSE is not linear in the density function. The higher the initial density function at the location x is, the less the gain that is achieved by taking one more sample at x.
However, if the whole path is taken as the state, the size of the state space is exponential in the available energy and hence is computationally intractable. By examining the effect of dependency closely, we find that the effect is local and the range it affects depends on the bandwidth. Therefore, if we discretize the sensing field sparsely enough, we can approximate the gain of the path by using the sum of the gains of all the points on the path. In this case, the state can be represented by the position of the node, current energy available to the robot, and the gain achieved. Now the search space is linear in both the size of the environment and the energy level and a breadth first search can be applied to find the optimal path.

(a) (b)
Figure 1 (a) The robotic boat (b) The energy consumption model for the robotic boat
To apply an algorithm to search for the optimal path for sampling, an energy consumption model is necessary to determine if two states are adjacent to each other and the energy consumed when the mobile node transit from one state to another. The energy consumption model used is depicted in Figure 1(b). As shown in the figure, from each state, the mobile node is able to transit to 5 other states. We assume that the energy caused by state transition is proportional to the distance the robot traveled. Circular arcs are used to approximate the optimal curves. As a result, the energy consumed if the boat moving forward is 1 unit and all other state transition cost 1.6 units.


(a) (b)
Figure 2 (a) The raster scan data set. (b) The interpolated temperature field by using whole raster scan data set.
Our algorithm has been tested with the raster scan data taken by the robotic boat in Lake Fulmor, CA, during a period of 40 minutes in the afternoon of May 9th, 2006. Figure 2(a) shows the points in the entire raster scan data set. The solid curve is the boundary of the sensing field and the black dots are the data points. Temperature readings are used in the tests and Figure 2(b) shows the temperature field interpolated with the whole data set. We mine this data set (the ground truth) as follows to test the algorithm via a data-denial simulation. We assume that the boat is able to follow the optimal path generated by the adaptive sampling algorithm accurately. When one sampling location x is generated, we search in the raster scan data set for the point closest to x and take the corresponding temperature reading as the sensor readings at location x. Once the process of sampling is done, the temperature field is estimated and the result is compared with the raster scan data instead of current temperature on the surface of the lake. As a result, even if the temperature field changes during the time when the robotic boat is doing raster scan, the data set still can be used as taken from a static field. It is also assumed that there exist 15 static nodes to provide the initial data set and the locations of those static nodes are shown as small circles in Figure 2(a). Because we are going to use some of the temperature readings in the raster scan data set as the readings of the static sensors, the locations of the static nodes are picked so that they coincide with some of the points in the raster scan data set.

Figure 3 The square of the trace of the estimated Hessian matrix.
We assume that the boat starts from the top right corner of the lake. The initial direction of the boat is west, i.e., facing toward the center of the lake. Figure 3 shows the square of the trace of the estimated Hessian matrix. The higher the trace, the higher the gain can be achieved by taking a reading there. As shown in Figure 3, the trace of the estimated Hessian matrix is higher in the bottom left part of the lake than in the top right part. As a result, the boat tends to take more samples in the bottom left part. Figure 4(a) and 4(b) show two typical paths generated by the algorithm. The triangles indicate the directions of the boat. The Figure 4(a) shows the path with initial energy 25 units and Figure 4(b) shows the path with the initial energy 50 units. When the initial energy is high, the path planner tends to allow the boat to wander a little on its way to the place with maximum gain so that it does not have to move back later. When the initial energy is low, the path planner requires the boat to go straight to those locations with maximum gain.


(a) (b)
Figure 4. Path generated by adaptive sampling. (a) Initial energy is 25; (b) Initial energy is 50.

Figure 5. Comparison between adaptive sampling and random walk.
Our algorithm is compared to a random walk with same initial state. For each method, we generate a set of new data points and then estimate the temperature with the new data set together with the initial readings from the static sensors, which are still part of the raster scan data. Then the estimation is compared to the whole raster scan data set assuming that the raster scan is the ground truth. The IMSE is approximated by summing the square error at each data point of the raster scan. For each initial energy value, the random walk runs for 50 trials and the IMSE reported is the average over these 50 trials. Figure 4 shows the comparison of the IMSE between the two methods with the initial energy being varied from 5 to 60 units. From the figure, we can see that with the adaptive sampling algorithm, the IMSE is about 20% less than a random walk. Since
for the best case in 2D problem, a 20% improvement is significant.
Our algorithm has also been tested on the boat in a lake in Echo Park in Los Angeles and the target scalar field is the temperature field. Once again, we use the raster scan data as the ground truth. The boat was first used to quickly perform a raster scan in a 40m x 70m area, which is part of the lake. We assume that the temperature field did not change significantly during the period when the experiments are carried out. Figure 6 shows the results of one experiment. The solid lines in Figures 6(a) show the boundary of the test area and the circles are the locations where initial temperature readings are available.
The boat starts from the right bottom corner facing north in the first experiment and top right corner facing west in the second experiment. The triangles in Figure 6(a) are the planned locations where the boat should go and take temperature readings and the dots show the GPS record of the boat, which show the actual track. Note that the actual track did not follow the planned path strictly. This is due to the GPS resolution and the effect of the wind. Currently, there is no battery monitor on the boat and hence we cannot measure the energy consumed by the boat directly. However, during the experiments, the speed of the propeller on the boat is approximately constant. That is, the thrust generated by the propeller is approximately constant and hence we can approximate the energy consumption by measuring the distance the boat traveled. With the dense GPS readings along the track, it is easy to compute the distance. Figure 6(b) show the IMSE of the estimation vs. the distance the boat traveled in both experiments respectively.


(a) (b)
Figure 6. (a) shows the planned path and the GPS trace of the actual track the boat took. (b) shows estimation error v.s. distance the boat traveled (the energy consumed by the boat is approximately proportional to the distance traveled).
We have proposed an adaptive sampling algorithm for a mobile sensor network consisting of a set of static nodes and a mobile robot tasked to reconstruct a scalar field. Our algorithm is based on local linear regression. Sensor readings from static nodes (a set of buoys) are sent to the mobile robot (a boat) and used to estimate the Hessian Matrix of the scalar field (the surface temperature of a lake), which is directly related to the estimation error. Based on this information, a path planner generates a path for the boat such that the resulting integrated mean square error (IMSE) of the field reconstruction is minimized subject to the constraint that the boat has a finite amount of energy which it can expend on the traverse. Data from extensive traverses in the field as well as simulations validate the performance of our algorithm.
We are currently working on how to determine the appropriate resolution to discretize the sensed field. One interesting observation from the simulations and experiments is that when the initial available energy is increased, the estimation errors decrease rapidly and level off instead of decreasing to zero. Theoretically, when the energy available to the mobile node increases, more sensor readings can be taken and hence the estimation errors should keep decreasing. By examining the path generated by the adaptive sampling algorithm, we found that when the initial energy is enough for the mobile node to go through all the 'important' locations, increasing the initial energy does not have much effect on the estimation error. We plan to investigate advanced path planning strategies and alternative sampling design strategies in future work.
Faculty:
Staff:
Graduate Students: