CENTER FOR EMBEDDED NETWORKED SENSING

Research Project


Debugging



OVERVIEW

Due to their resource-constrained nature and tight physical coupling, wireless sensor networks afford limited visibility into the application. As a result, it is often difficult to debug issues that arise during their development and deployment. In addition, interactions between sensor hardware, protocols, and environmental characteristics are impossible to predict, so sensor network application design is an iterative process that alternates between debugging and deployment.

A rich set of debugging tools is vitally important for helping relative novices, such as students, make good progress on building real sensor network deployments. Current debugging techniques fall short for these systems, which contain bugs characteristic of both distributed and embedded systems. Such bugs can be difficult to track because they are often multicausal, non-repeatable, and timing-sensitive, and have ephemeral triggers such as race conditions, decisions based on asynchronous changes in distributed state, or interactions with the physical environment. Visualization and data logging tools aid greatly in debugging, but their general approach of displaying all accessible information falls short during post-deployment debugging and even in many simulation and emulation scenarios. Furthermore, it is a challenge to extract debugging information from a running system without introducing the probing effect (alteration of normal behavior due to instrumentation) or draining excessive energy.

We therefore plan to develop Sympathy, a tool that will collect relevant narrow metrics, highlight potentially interesting events, and determine correlations between even seemingly unrelated events, in order to help debug issues. Sympathy will enhance visibility into the system while preserving resources, non-intrusively observing the network, and facilitating seamless debugging between simulation, emulation, and post-deployment phases.

Sympathy’s strength lies in its support for highlighting events and correlating them with metrics in their spatiotemporal context. This is an improvement over traditional debugging techniques in three ways: it facilitates discovery of correlations by associating context with a specific event; it provides event tracking, which involves maintaining state; and it determines which events are important to track (only a finite number of events can be tracked). In addition to highlighting correlations, Sympathy avoids several iterations of debugging and re-running that would otherwise be needed to capture and analyze metrics in order to find events.

APPROACH

Sympathy’s primary goal is to aid in debugging application failures by highlighting the occurrence of certain events while an application is still running, and to provide context for these events to help isolate their cause. To do so, Sympathy processes metrics continuously collected from surrounding nodes and highlights the occurrence of pre-defined events observed in those metrics. The metrics are enumerated below. Spatiotemporal context for each event is derived from the collected metrics. Finally, because bugs are often caused by interacting events, the system aids in determining correlations between various events.


For example, if a node is experiencing route flapping, this can be identified by examining the next-hop metric collected at each node. Currently a route-flapping event is defined as a change in next hop. Once this event is identified, Sympathy logs it to a file. Sympathy will then provide temporal context by printing to the log file the metrics it has been collecting from the node where the event was observed for the past 200 time units (seconds). All events will be logged to the same file in order to highlight correlations based on temporal or spatial collocation.
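
The route-flapping example can be sketched as follows. This is a minimal illustration and not Sympathy's actual code: the class `NextHopWatcher`, the `next_hop` metric key, and the shape of the returned log entry are all hypothetical. Only the event definition (a change in next hop) and the 200-second window come from the description above.

```python
from collections import deque

WINDOW = 200  # seconds of per-node history to report (value from the text)


class NextHopWatcher:
    """Hypothetical helper: keeps a rolling window of samples for one node
    and flags a route change whenever the reported next hop differs from
    the previous sample."""

    def __init__(self, window=WINDOW):
        self.window = window
        self.history = deque()      # (timestamp, metrics dict) pairs
        self.last_next_hop = None

    def observe(self, timestamp, metrics):
        """Record a sample; return a log entry if the next hop changed."""
        self.history.append((timestamp, metrics))
        # Discard samples that fall outside the temporal-context window.
        while self.history and timestamp - self.history[0][0] > self.window:
            self.history.popleft()

        previous = self.last_next_hop
        self.last_next_hop = metrics["next_hop"]
        if previous is not None and metrics["next_hop"] != previous:
            return {
                "event": "route_change",
                "time": timestamp,
                "previous_next_hop": previous,
                "current_next_hop": metrics["next_hop"],
                "context": list(self.history),  # temporal context
            }
        return None


watcher = NextHopWatcher()
watcher.observe(0, {"next_hop": 4})
watcher.observe(150, {"next_hop": 4})
entry = watcher.observe(300, {"next_hop": 7})  # next hop changed: 4 -> 7
```

In this run the sample taken at time 0 has aged out of the window by time 300, so the context attached to the event holds only the samples from times 150 and 300.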


Sympathy’s general architecture is as follows: Sympathy collects metrics from all nodes and watches the metrics for indications of events, which are metric changes that often indicate important changes in application state. On inferring an event, Sympathy:

  1. Stores all metrics it has collected from the past 200 time units for the node causing the trigger, providing temporal context.
  2. Stores all metrics it has collected from the past 200 time units for the nodes neighboring the node where the event was detected, providing spatial context.
  3. Prints event and context information to a log file, which can aid in correlating events.
  4. Calls applications interested in the event.
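
The four steps above can be sketched as a single handler. This is an illustrative sketch under stated assumptions: the function `handle_event`, its parameters, the log line format, and the callback signature are hypothetical and are not taken from Sympathy or from Emstar/Emview.

```python
import io

WINDOW = 200  # time units of history to keep, per the steps above


def handle_event(event, node, now, samples, neighbors, log, subscribers):
    """Apply the four steps to one inferred event (hypothetical interface).

    samples:     dict node_id -> list of (timestamp, metrics) tuples
    neighbors:   dict node_id -> list of neighboring node ids
    log:         writable text stream shared by all events
    subscribers: dict event name -> list of callbacks
    """
    def recent(node_id):
        return [(t, m) for t, m in samples.get(node_id, []) if now - t <= WINDOW]

    temporal = recent(node)                                     # step 1
    spatial = {n: recent(n) for n in neighbors.get(node, [])}   # step 2

    log.write(f"EVENT {event} node={node} time={now}\n")        # step 3
    for t, m in temporal:
        log.write(f"  node={node} time={t} metrics={m}\n")
    for n, rows in spatial.items():
        for t, m in rows:
            log.write(f"  neighbor={n} time={t} metrics={m}\n")

    for callback in subscribers.get(event, []):                 # step 4
        callback(event, node, temporal, spatial)
    return temporal, spatial


samples = {
    1: [(50, {"next_hop": 2}), (400, {"next_hop": 3})],
    2: [(390, {"next_hop": 0})],
}
neighbors = {1: [2]}
log = io.StringIO()
seen = []
subscribers = {"route_change": [lambda ev, node, t, s: seen.append((ev, node))]}

temporal, spatial = handle_event(
    "route_change", 1, 400, samples, neighbors, log, subscribers
)
```

Writing every event and its context to one shared stream keeps temporally or spatially adjacent events next to each other in the log, which is what makes correlations visible.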

Metrics Collected by Sympathy

| Metric Name | Metric Description |
| --- | --- |
| Neighbor lists | List of neighbors and associated incoming link qualities. Neighbors are identified by ID, link qualities as delivery rates between 0 (100% loss) and 100 (100% delivery). If link quality has changed, the previous link quality is indicated in parentheses. |
| Outgoing link quality | Average link quality received from the current node, over all its neighbors. |
| Next hop (Routing table) | The next hop chosen by this node. |
| Path loss (Routing table) | The path loss rate associated with the next hop, calculated over the entire path based on pair-wise link qualities. Path loss is the inverse of link quality: lower values mean better quality. |
| Second-best next hop and path loss (Routing table) | The second-best choice for next hop and its associated path loss. If a node gets multiple gradient messages, it sets its next hop to the node providing the lowest path loss; this metric provides the second best node. |
| Byte counts | The number of bytes transmitted and received by this node. |
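
The relationship between the next hop and second-best next hop metrics can be illustrated with a short sketch. It assumes that each gradient message reduces to a (neighbor ID, advertised path loss) pair; the function name `rank_next_hops` and the input format are hypothetical.

```python
def rank_next_hops(gradients):
    """Pick the best and second-best next hop from gradient messages.

    gradients: dict neighbor_id -> path loss advertised via that neighbor
               (lower is better). This input shape is an assumption.
    Returns (best, second_best), each a (neighbor_id, path_loss) tuple,
    or None when there are not enough candidates.
    """
    ranked = sorted(gradients.items(), key=lambda item: item[1])
    best = ranked[0] if ranked else None
    second_best = ranked[1] if len(ranked) > 1 else None
    return best, second_best


best, second_best = rank_next_hops({3: 40, 5: 12, 9: 25})
# best is neighbor 5 (path loss 12); second_best is neighbor 9 (path loss 25)
```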

Events Inferred & Collected by Sympathy

| Event Name | Description (Additional Information Provided in Log File) | Metrics Used to Recognize Event |
| --- | --- | --- |
| Missing node | No node reports a node n as a neighbor. Logs n. | All neighbor lists |
| Isolated node | Node n has no neighbors. Logs n. | n’s neighbor list |
| Route change | n’s next hop changes at least once. Logs the previous and current next hop, the associated path loss for the top two choices for next hop, and the number of gradient messages received in this round. | n’s routing table information (next hop / path loss) |
| Neighbor list change | Node n2 joins or is dropped from n1’s neighbor list. Logs n1, n2, and current and previous link qualities. | n1’s neighbor list |
| Link quality change | Node n2’s link quality to n1 drops below a statically defined threshold. Logs n1, n2, and current and previous link qualities. | Link quality |
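
The neighbor-list-based events above can be sketched as simple checks over collected neighbor lists. This is a hedged illustration only: the function names, the dictionary representation of a neighbor list (neighbor ID mapped to incoming link quality on the 0 to 100 scale), and the threshold value of 50 are assumptions, since the source says only that the threshold is statically defined.

```python
LINK_QUALITY_THRESHOLD = 50  # hypothetical value; the source gives no number


def missing_nodes(all_nodes, neighbor_lists):
    """Nodes that no node reports as a neighbor."""
    reported = set()
    for neighbors in neighbor_lists.values():
        reported.update(neighbors)
    return sorted(set(all_nodes) - reported)


def isolated_nodes(all_nodes, neighbor_lists):
    """Nodes whose own neighbor list is empty or absent."""
    return sorted(n for n in all_nodes if not neighbor_lists.get(n))


def neighbor_list_changes(n1, previous, current):
    """Neighbors that joined or were dropped from n1's list.

    previous, current: dict neighbor_id -> link quality (0..100).
    Returns (n1, n2, change, previous_quality, current_quality) tuples.
    """
    changes = []
    for n2 in sorted(set(previous) | set(current)):
        if n2 not in previous:
            changes.append((n1, n2, "joined", None, current[n2]))
        elif n2 not in current:
            changes.append((n1, n2, "dropped", previous[n2], None))
    return changes


def link_quality_drops(n1, previous, current, threshold=LINK_QUALITY_THRESHOLD):
    """Neighbors whose link quality to n1 fell below the threshold."""
    drops = []
    for n2 in sorted(set(previous) & set(current)):
        if previous[n2] >= threshold > current[n2]:
            drops.append((n1, n2, previous[n2], current[n2]))
    return drops


all_nodes = [1, 2, 3, 4]
neighbor_lists = {1: {2: 90}, 2: {1: 85}, 3: {}, 4: {1: 60}}
before = {2: 90, 3: 70}   # node 1's neighbor list in the previous round
after = {2: 40, 5: 80}    # node 1's neighbor list in the current round

missing = missing_nodes(all_nodes, neighbor_lists)
isolated = isolated_nodes(all_nodes, neighbor_lists)
changes = neighbor_list_changes(1, before, after)
drops = link_quality_drops(1, before, after)
```

In this sample, node 4 reports a neighbor of its own but is reported by no one, so it counts as missing without being isolated; node 3 is both.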

Sympathy is implemented using the Emview/Emstar infrastructure to collect state from each node. However, it is important to note that neither Emstar nor Emview is inherent to Sympathy’s functionality, and it can be implemented in other ways.

ACCOMPLISHMENTS

FUTURE DIRECTIONS

Shorter Term Goals

A debugging tool that will cover both pre- and post-deployment debugging. Post-deployment debugging will rely more on inferences of system state based on externally observable metrics, such as messages, and will not be as precise as the pre-deployment techniques. This entails:

Longer Term Goals

Move towards self-monitoring and self-healing networks. We need to: