Overview

We are increasingly surrounded by IoT devices that sense a wide variety of information in their vicinity, both to support existing building operations (e.g., climate control and occupant comfort, security, equipment fault diagnosis, energy efficiency) and to drive emerging smart building applications (e.g., context sensing for occupant wellness, sustainable operation, locating and managing available resources such as conference rooms, and proactively detecting and managing building faults for building managers). However, real-world deployment of such systems is challenging: it requires first-class system support for security and privacy controls for building occupants, and it must address the numerous deployment, scalability, reliability, and management challenges that come with a sensing infrastructure for buildings.

We present the design of Mites, a scalable end-to-end hardware-software system for supporting and managing high-fidelity, distributed, general-purpose sensing in buildings, with fundamental primitives for privacy and security, scalable data management, and machine learning.

Our lab has carefully designed and developed the Mites platform to sense several ambient environmental parameters using nine different sensors (covering vibration, sound, movement, light intensity and color, humidity, temperature, pressure, thermal, and magnetic fields), creating one all-purpose “Mite” sensor that can be powered by a USB wall plug or by Power-over-Ethernet (PoE) in our TCS Hall building testbed. When using PoE, a Mites device can be remotely power cycled, or even powered off to disable it entirely, depending on the preferences of the office occupants.

System Design

We built an end-to-end full-stack system with the goal of providing high-fidelity sensing of various ambient environmental facets in physical spaces in a building, ultimately enabling a diverse set of IoT applications.

Our Mites sensor package, comprising nine sensors that capture twelve sensing modalities (e.g., temperature, humidity, light, movement, and audio), is installed throughout the building, mounted in the ceiling or wall or plugged into a powered wall socket. Each Mites device acquires data from its onboard sensors, performs signal processing, and extracts features, all on the device itself, both for privacy and to reduce data dimensionality. The featurized data from each Mites device are sent over WiFi, using an end-to-end encrypted connection, to our custom Mites software backend, hosted on secure on-campus servers. Our backend provides scalable data collection, management features, security and privacy primitives, and APIs for access control and management. We have implemented several UIs and a cross-platform web application to support various building stakeholders and use cases.

We designed and implemented a three-tier architecture comprising a Gateway Layer (GL), Request Management Layer (RML), and Device Management Layer (DML) to provide scalability, extensibility, and reliability of our entire system (as shown in the above figure). The GL and RML manage all connected Mites devices, routing and load balancing the various data streams to different nodes of the DML depending on the available compute resources.
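As a rough illustration of routing streams by available compute resources, the sketch below assigns each incoming device stream to the worker with the most spare capacity. It is not the Mites implementation: the Worker class, the assign function, the device and worker names, and the use of a simple per-worker stream count as the measure of available resources are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Worker:
    """Hypothetical DML worker node (not the actual Mites data model)."""
    name: str
    capacity: int  # assumed metric: maximum number of device streams
    devices: list = field(default_factory=list)

    @property
    def free(self) -> int:
        """Remaining stream slots on this worker."""
        return self.capacity - len(self.devices)


def assign(device_id: str, workers: list) -> Worker:
    """Route a device stream to the worker with the most spare capacity."""
    best = max(workers, key=lambda w: w.free)  # ties go to the first worker
    if best.free <= 0:
        raise RuntimeError("no DML worker has spare capacity")
    best.devices.append(device_id)
    return best


# Illustrative names only.
workers = [Worker("dml-a", capacity=2), Worker("dml-b", capacity=3)]
placement = {
    device: assign(device, workers).name
    for device in ["mite-1", "mite-2", "mite-3", "mite-4"]
}
```

A production router would likely weigh CPU, memory, and stream rate rather than a stream count, and would rebalance when workers join or fail; the greedy rule here only shows the basic idea.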

In addition, the GL and the RML store information about their operation in an existing open-source distributed data store called BuildingDepot, which we have extended to support the Mites infrastructure. Specifically, our extensions to BuildingDepot include new functionalities for scalability, privacy (obfuscation of metadata such as location, DeviceID, etc.), and extensibility, which are essential for a large-scale sensing infrastructure for buildings. Each DML worker node handles streams of featurized data from a set of Mites devices at configurable rates from 1 to 10 Hz.

Finally, to enable scalable machine learning, we integrate our Mites system with MLIoT, an ML platform designed specifically for IoT use cases. The DML and RML interact with MLIoT to allow visualization of data from Mites devices and to train and serve ML models efficiently at scale.

Hardware

We designed a custom, highly integrated Mites device with nine distinct physical sensors, and we specifically decided not to include a camera. Our integrated “single device” design serves as an exemplary embodiment of board design using many low-level sensor modalities to provide an exciting vehicle for IoT investigation.

We strategically placed sensors on the PCB to ensure optimal performance (e.g., the ambient light sensor faces outwards), and we spatially separated analog and digital components to keep unintended electrical noise from affecting the performance of neighboring components. For connectivity, we considered industry standards such as Ethernet and ZigBee but ultimately chose a combination of WiFi and Bluetooth for their ubiquity, ease of setup, range, and high bandwidth.

Firmware

Our firmware featurizes data on-board the Mites device. Not only does this reduce network overhead, but it also denatures the data, better protecting privacy while still preserving the essence of the signal. In particular, we selected features that do not permit reconstruction of the original signal.

Data from our high-sample-rate sensors are transformed into a spectral representation via a 256-sample sliding-window FFT (10% overlap), ten times per second. We also discard phase information. Our raw 8x8 GridEye matrix is flattened into row and column means (16 features). For our other low-sample-rate sensors, we compute seven statistical features (min, max, range, mean, sum, standard deviation, and centroid) on a rolling one-second buffer (at 10 Hz). The featurized data for every sensor are concatenated and sent to our secure server (located on campus) as a single data frame, encrypted with 128-bit AES.
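The three featurization steps described above can be sketched in a few lines of Python. This is an illustrative approximation rather than the Mites firmware: the function names are hypothetical, the standard deviation is taken as the population form, "centroid" is assumed to mean the value-weighted mean sample index, and a direct DFT stands in for the FFT so the example needs only the standard library. Windowing, overlap, and encryption are omitted.

```python
import cmath
import math
import statistics


def statistical_features(buffer):
    """Seven features over one buffer (e.g., ten samples collected at 10 Hz)."""
    total = sum(buffer)
    # Assumption: "centroid" is the value-weighted mean sample index.
    centroid = sum(i * x for i, x in enumerate(buffer)) / total if total else 0.0
    return {
        "min": min(buffer),
        "max": max(buffer),
        "range": max(buffer) - min(buffer),
        "mean": total / len(buffer),
        "sum": total,
        "std": statistics.pstdev(buffer),  # population form, an assumption
        "centroid": centroid,
    }


def grid_features(grid):
    """Flatten an 8x8 thermal grid into 8 row means followed by 8 column means."""
    rows = [sum(row) / len(row) for row in grid]
    cols = [sum(col) / len(col) for col in zip(*grid)]
    return rows + cols


def magnitude_spectrum(window):
    """Magnitudes of the one-sided DFT of a real window.

    Taking abs() keeps only the magnitude, so phase is discarded.
    A direct O(n^2) DFT is used here purely for self-containment.
    """
    n = len(window)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(window)))
        for k in range(n // 2 + 1)
    ]


# A 256-sample test tone with exactly 8 cycles per window.
tone = [math.sin(2 * math.pi * 8 * t / 256) for t in range(256)]
spectrum = magnitude_spectrum(tone)
peak_bin = max(range(len(spectrum)), key=spectrum.__getitem__)
```

For the test tone, the energy concentrates in bin 8, and the magnitude values alone do not determine the original waveform's phase.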

We tune our raw sensor sampling rates over the course of deployment, collecting data at the speed needed to capture environmental events, but with no unnecessary fidelity. Specifically, we sample the temperature, humidity, pressure, light color, light intensity, magnetometer, WiFi RSSI, GridEye, and PIR motion sensors at 10 Hz. All three axes of the accelerometer are sampled at 4 kHz, our microphone at 17 kHz, and our EMI sensor at 500 kHz. Note that when accelerometers are sampled at high speed, they can detect minute oscillatory vibrations propagating through structural elements in an environment (e.g., drywall, studs, joists), very much like a geophone.
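To make these sampling rates concrete, the following sketch works out what a 256-sample window implies at each of the three high rates, using the standard relations that DFT bin spacing is the sample rate divided by the window length and that the highest representable frequency is half the sample rate. The helper names are hypothetical, and the calculation assumes the 256-sample window is applied unchanged to all three sensors.

```python
WINDOW = 256  # samples per FFT window, as stated in the firmware description


def bin_width_hz(sample_rate_hz, window=WINDOW):
    """Spacing between adjacent frequency bins."""
    return sample_rate_hz / window


def window_ms(sample_rate_hz, window=WINDOW):
    """Time spanned by one window, in milliseconds."""
    return 1000.0 * window / sample_rate_hz


# Sampling rates from the text, in Hz.
rates = {"accelerometer": 4_000, "microphone": 17_000, "EMI": 500_000}

summary = {
    name: {
        "bin_width_hz": bin_width_hz(fs),
        "nyquist_hz": fs / 2,
        "window_ms": window_ms(fs),
    }
    for name, fs in rates.items()
}

for name, row in summary.items():
    print(f"{name}: {row['bin_width_hz']:.3f} Hz/bin, "
          f"up to {row['nyquist_hz']:.0f} Hz, {row['window_ms']:.3f} ms/window")
```

Under these assumptions the accelerometer resolves 15.625 Hz per bin up to 2 kHz, while the EMI sensor spans up to 250 kHz at roughly 1.95 kHz per bin.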

Privacy and Security

Firmware Featurization

Our firmware featurizes data on-board the Mites device. Not only does this reduce network overhead, but it also denatures the data, better protecting privacy while still preserving the essence of the signal. The data from the sensors on each Mites device is processed in a series of steps that essentially convert it into a non-reconstructable featurized representation that consists of basic statistical features (min, max, range, average, sum, standard deviation, and centroid) and aggregated frequency representation values (using a Fast Fourier Transform (FFT)). This featurization and denaturing of data is done specifically to mitigate any privacy concerns such that the essence of the signals can be extracted while preventing the reconstruction of the original signals. Notably, all this processing and denaturing happens on the Mites device itself in its secure firmware; thus, the raw sensor data never leaves the Mites device.

Location obfuscation

We also designed and implemented a novel privacy-aware data collection method that reduces the risk of sensor data from an office being indirectly associated with the behavior of one or more of its occupants. For offices whose occupants have not yet consented, we obfuscate the location by grouping a set of offices together (e.g., all offices in the N/W corner of the 3rd floor). These obfuscated locations still support applications that need aggregate data (e.g., average humidity and temperature in the 3rd floor N/W corner) while preventing indirect association of the sensor data with the office occupant(s).
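A minimal sketch of this grouping idea follows. The office numbers, zone label, consent flags, and function names are all invented for illustration, and the actual Mites and BuildingDepot metadata model may differ: readings from non-consented offices are reported only under a coarse zone label, so aggregates remain available while individual offices are not identifiable.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical office metadata; values are illustrative only.
OFFICES = {
    "3101": {"zone": "floor3-NW", "consented": True},
    "3102": {"zone": "floor3-NW", "consented": False},
    "3103": {"zone": "floor3-NW", "consented": False},
}


def reported_location(office):
    """Return the exact office only with consent, otherwise the coarse zone."""
    info = OFFICES[office]
    return f"office-{office}" if info["consented"] else info["zone"]


def aggregate(readings):
    """Average (office, value) readings per reported location."""
    groups = defaultdict(list)
    for office, value in readings:
        groups[reported_location(office)].append(value)
    return {location: mean(values) for location, values in groups.items()}


# Temperatures in degrees Celsius (made-up values).
averages = aggregate([("3101", 21.0), ("3102", 22.0), ("3103", 24.0)])
```

Here the two non-consented offices contribute to a single "floor3-NW" average, and no output key reveals which office produced which reading. A group containing only one office would offer little protection, so a real deployment would presumably enforce a minimum group size.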

Data Model Views of Sensor Data to enable Privacy

Applications may need data from one or more sensors on a Mites device (e.g., occupancy detection may need PIR and GridEye thermal data). Similarly, once occupants have given their consent, they may want to share data from a subset of their sensors with other users or applications. The Mites system provides fine-grained mechanisms to enable or disable access to specific sensors on a Mites device, and to specify the level of access (Read, Write), which is necessary to prevent over-privileged apps. Our goal with these primitives is to give occupants transparency and control over who has access to the data from their personal spaces, and for which applications and purposes.
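The per-sensor access model can be pictured as a table of grants keyed by principal, device, and sensor, with access denied by default. The sketch below is a hypothetical illustration, not the Mites API: the Access enum, the grants table, the is_allowed function, and all identifiers in it are assumptions made for the example.

```python
from enum import Flag


class Access(Flag):
    """Access levels named in the text; the encoding is an assumption."""
    NONE = 0
    READ = 1
    WRITE = 2


# Hypothetical grants: (principal, device, sensor) -> access level.
grants = {
    ("occupancy-app", "mite-42", "pir"): Access.READ,
    ("occupancy-app", "mite-42", "grideye"): Access.READ,
    ("facilities-admin", "mite-42", "pir"): Access.READ | Access.WRITE,
}


def is_allowed(principal, device, sensor, needed):
    """Deny by default; allow only if every requested level was granted."""
    granted = grants.get((principal, device, sensor), Access.NONE)
    return needed != Access.NONE and (granted & needed) == needed


can_read_pir = is_allowed("occupancy-app", "mite-42", "pir", Access.READ)
can_read_mic = is_allowed("occupancy-app", "mite-42", "microphone", Access.READ)
```

In this sketch the occupancy app can read the PIR and GridEye sensors it needs but nothing else on the same device, which is the kind of least-privilege behavior the text describes.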

Fine-grained Access Controls for Users

We provide extensive privacy controls that let authenticated occupants of these offices disable any (or even all) of the sensors using the Mites Mobile App. Users who do not use the Mites Mobile App can email us to disable any or all of the sensors on the Mites device in their office, or to request that it be powered off completely.

Demonstration Video