Generalized software tools: SensorBase
Sensor networks use a variety of data storage and management mechanisms. In particular, the ESS2 mechanism at UCLA forwards raw environmental data and system metrics from low-power motes to one or more sinks running Stargate/Linux. Each sink writes the incoming data streams to its storage device, often a compact flash card or a RAM file system, where they wait for users to retrieve them manually. Where a constant or even intermittent Internet connection is available in the field, the sink uploads log files to one or more reliable servers located in secure server farms via 802.11b/T1, GPRS, or other well-known mechanisms.
While this data storage and management mechanism is straightforward to implement, publishing and sharing the data is a challenge. For example, an ecologist who needs to find correlations in data from different sets of sensor networks has to retrieve compressed log files from different servers. The ecologist also needs to understand the fields in each file format and apply different calibration equations to convert raw data values to actual data values.
SensorBase.org is a platform for common data storage and management for sensor networks. It gives users a uniform and consistent method for publishing sensor network data, and lets them define data types, groups, and permission levels. It also serves as a sensor-network-specific search engine: users can query for specific data sets by geographic location, sensor type, date/time range, and other relevant fields.
Users who want to publish data sets create a project description and define which types of sensor readings are allowed for the project. Users can also create new measurement types, sensor types, raw-to-actual calibration equations, and permission levels, add users to the group, and add trust reference counts to users.
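As an illustration of the raw-to-actual calibration step, the sketch below applies a calibration equation selected by sensor type. The linear form of the equation, the coefficients, the sensor type names, and all identifiers are assumptions made for this example; they are not taken from SensorBase.org.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinearCalibration:
    """Hypothetical calibration equation: actual = gain * raw + offset."""
    gain: float
    offset: float

    def apply(self, raw: float) -> float:
        return self.gain * raw + self.offset


# Hypothetical registry mapping a sensor type name to its calibration.
CALIBRATIONS = {
    "example-thermistor": LinearCalibration(gain=0.5, offset=-40.0),
    "example-humidity": LinearCalibration(gain=0.25, offset=0.0),
}


def to_actual(sensor_type: str, raw: float) -> float:
    """Convert a raw reading to an actual value for the given sensor type."""
    return CALIBRATIONS[sensor_type].apply(raw)
```

Keeping the equation with the sensor type, rather than with each data file, means a reader of the data does not need to know which equation applies to which log.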
The core of SensorBase.org is a sensor network database schema, accessed through front-end text and HTML interfaces over HTTP and HTTPS. The main tables of the relational schema are described briefly below:

Project descriptions (project, measurement, sensor tables): the project table contains information relevant to the projects that users create. Each user can create one or more projects, and each project can be read or written by one or more users. Each project contains one or more measurement types, such as “humidity”, “temperature”, and “voltage”. Each measurement type contains one or more sensor types, such as “C129CX 3V thermister” and “Wind-o-matic 2000.”
Access control (user table): the user table contains general information and the access control permissions of each user. A parent user with the root or modify-user privilege can create new accounts; new accounts are managed by their parents or by users with the root privilege.
Pre-aggregation tables: in previous deployments, users accessed data at a granularity much coarser than the per-second basis that SensorBase.org uses. For example, one biologist at James Reserve is interested in weekly min/max/average values and frequently computes these aggregates from the per-second stream on demand. Processing millions of rows in the database is very I/O intensive and slow. When users specify rules for pre-aggregation, however, SensorBase.org periodically launches a separate process that computes and caches the aggregates ahead of time to reduce access latency.
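The weekly min/max/average case can be sketched as a small batch job. This is a minimal illustration, not the SensorBase.org implementation: the function name, the `(timestamp, value)` row shape, and the choice of ISO calendar weeks as the bucket are all assumptions.

```python
from collections import defaultdict
from datetime import datetime


def weekly_aggregates(rows):
    """Group (iso_timestamp, value) rows by ISO week and summarize each week.

    Returns a dict keyed by (iso_year, iso_week) whose values hold the
    min, max, average, and row count for that week.
    """
    buckets = defaultdict(list)
    for timestamp, value in rows:
        iso = datetime.fromisoformat(timestamp).isocalendar()
        buckets[(iso[0], iso[1])].append(value)  # (ISO year, ISO week)

    return {
        week: {
            "min": min(values),
            "max": max(values),
            "avg": sum(values) / len(values),
            "count": len(values),
        }
        for week, values in buckets.items()
    }
```

A periodic process could run a function like this over newly inserted rows and store the result in a summary table, so that a weekly query reads a handful of cached rows instead of scanning the per-second data.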
In addition to the schema described above, the SensorBase.org front-end lets users retrieve data as tab- or comma-separated files using near-natural-language queries. Sample queries include “get all the data points from user ‘%Richard%’”, “get all the data points from project ‘Cold Air Drainage’ from 2006-01-01 to 2006-01-05”, and “get x,y,rawValue from project ‘Botanical Garden’”. The front-end also lets applications retrieve data points with stateless, REST-style HTTP GET requests, which makes it easy to interface with applications such as Google Maps and Google Earth, as well as with other web services and data management systems.
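A client for such a GET interface might look like the sketch below. The base URL, the parameter names (`project`, `from`, `to`, `fields`, `format`), and the response layout are hypothetical, chosen only to mirror the sample queries; the actual SensorBase.org request format is not specified here.

```python
import csv
import io
from urllib.parse import urlencode

# Hypothetical endpoint; not the real SensorBase.org URL.
BASE_URL = "http://sensorbase.example/query"


def build_query_url(project, start, end, fields=("x", "y", "rawValue"), fmt="csv"):
    """Build a stateless GET URL for one project and date range."""
    params = {
        "project": project,
        "from": start,
        "to": end,
        "fields": ",".join(fields),
        "format": fmt,
    }
    return BASE_URL + "?" + urlencode(params)


def parse_rows(text, delimiter=","):
    """Parse a comma- or tab-separated response with a header row into dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))
```

Because every request carries its full query in the URL, the same call works from a script, a browser, or a mapping application without any session state.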
We are currently uploading data intermittently from a few selected deployments. We are creating benchmarks to test the current schema, so that we can understand users' access patterns and better optimize the system. The Cold Air Drainage transect experiment at James Reserve shows a few million rows inserted per month, and we expect to ramp up the usage of SensorBase.org after some of the benchmarks and experiments.
The basic functionality of SensorBase.org is working. This includes signing in, creating a new project, creating a new measurement, creating a new sensor type, uploading data in the ESS2-specific format, and retrieving data points using simple queries.
Work planned in the coming year