Vid2Doppler: Synthesizing Doppler Radar Data from Videos for Training Privacy-Preserving Activity Recognition

Future smart homes and offices will need to sense the activities of their occupants in order to intelligently adapt to the environment and respond to their users’ needs. An incredible variety of technical approaches for recognizing users’ actions has been considered over many decades of research. While tagging every object in the environment with sensors can be used to infer activity, this approach is expensive, hard to maintain, and often visually obtrusive. Therefore, the trend has been towards centralized sensing, either with a worn device or with microphones or cameras operating in the environment. While more practical, these high-fidelity sensors also raise significant privacy concerns. Indeed, many users are wary of microphones and cameras recording them in their homes, especially after recent data leaks. For this reason, there is renewed interest in identifying and exploring sensing modalities that are inherently more privacy-preserving, yet sufficiently rich to enable fine-grained activity recognition.

In this work, we explore one such sensor: the millimeter wave (mmWave) Doppler radar. Owing to the extensive use of these sensors in security and automobile applications, their price has fallen dramatically, to just a few dollars for basic units. More sophisticated frequency-modulated continuous wave (FMCW) sensors cost around $30 USD. Both types of radar sensors are solid state and small enough to be integrated into consumer devices, such as smart speakers and smartphones. These radar sensors emit a known RF signal, and any motion in the scene (either from users or objects) causes reflected signals to be Doppler-shifted, which can then be used to create a 1D Doppler plot. In the case of FMCW sensors, a 2D plot of range vs. the Doppler shift of signals can be produced. Although some biomechanical attributes are expressed in the Doppler signal (e.g., limb gait while walking), these have only been shown to recognize people from a small set of users, and not from the population at large. Indeed, it would seem hard to be embarrassed by leaked Doppler data, in contrast to a video or audio recording that can easily reveal identity and capture sensitive content.
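As a rough illustration of the Doppler shift described above, the following minimal sketch applies the standard two-way radar relation f_d = 2·v·f_c / c. The 60 GHz carrier and the 1 m/s velocity are assumed example values, not figures from the paper, and the function name is hypothetical.

```python
SPEED_OF_LIGHT = 299_792_458.0  # metres per second


def doppler_shift_hz(radial_velocity_mps, carrier_hz):
    """Two-way Doppler shift seen by a monostatic radar.

    radial_velocity_mps: speed of the reflector along the line of sight
    (positive = approaching the sensor in this sketch's convention).
    carrier_hz: transmitted carrier frequency.
    """
    return 2.0 * radial_velocity_mps * carrier_hz / SPEED_OF_LIGHT


# Assumed example: a hand moving at 1 m/s toward a 60 GHz sensor.
shift = doppler_shift_hz(1.0, 60e9)
print(f"Doppler shift: {shift:.1f} Hz")
```

Under these assumptions, everyday body motion produces shifts of a few hundred hertz, and the sign of the shift indicates whether the reflector is approaching or receding.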

However, Doppler radar faces a significant challenge: bootstrapping machine learning classifiers. Unlike audio and computer vision approaches that can draw from huge libraries of videos to train machine learning models, Doppler radar has no existing large datasets. All prior Doppler sensing work we could find in the literature had to collect its own bespoke training data for recognition tasks. The scale of data appears to be so limited that the full potential of techniques like deep learning remains to be seen. Thus, in this research, we propose a unique software pipeline that allows unstructured videos to be transformed into synthetic Doppler radar data that can then be used for training. This process opens up an unparalleled volume of training data for Doppler sensors, closing an important gap and elevating the feasibility of Doppler sensing for activity recognition.
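To make the idea of deriving Doppler-like data from video more concrete, here is a minimal sketch, not the authors' pipeline. It assumes that 3D positions of body points have already been recovered from two consecutive video frames (that step is not shown), computes each point's velocity along the line of sight to a virtual sensor, and bins those velocities into a 1D profile. All function names, the bin layout, and the use of unweighted counts in place of reflected signal strength are assumptions made for illustration.

```python
import math


def radial_velocities(prev_pts, curr_pts, sensor, dt):
    """Line-of-sight velocity of each tracked point, in m/s.

    Positive values mean the point moved away from the sensor
    between the two frames; negative values mean it approached.
    """
    velocities = []
    for p0, p1 in zip(prev_pts, curr_pts):
        d0 = math.dist(p0, sensor)  # range in the previous frame
        d1 = math.dist(p1, sensor)  # range in the current frame
        velocities.append((d1 - d0) / dt)
    return velocities


def doppler_histogram(velocities, v_max=2.0, n_bins=8):
    """Count points per velocity bin over [-v_max, v_max)."""
    hist = [0] * n_bins
    width = 2.0 * v_max / n_bins
    for v in velocities:
        if -v_max <= v < v_max:
            idx = min(int((v + v_max) // width), n_bins - 1)
            hist[idx] += 1
    return hist


# Hypothetical tracked points (metres), virtual sensor at the origin,
# frames 0.1 s apart.
sensor = (0.0, 0.0, 0.0)
prev_pts = [(2.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 3.0, 0.0)]
curr_pts = [(2.125, 0.0, 0.0), (0.9375, 0.0, 0.0), (0.0, 3.0, 0.0)]

vels = radial_velocities(prev_pts, curr_pts, sensor, dt=0.1)
profile = doppler_histogram(vels)
print(profile)
```

Repeating this for every pair of consecutive frames and stacking the resulting profiles over time would give a velocity-versus-time image loosely analogous to a Doppler plot; a real system would also need to account for occlusion, reflectivity, and sensor noise.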

Download

Paper PDF

Reference

Karan Ahuja, Yue Jiang, Mayank Goel, and Chris Harrison. 2021. Vid2Doppler: Synthesizing Doppler Radar Data from Videos for Training Privacy-Preserving Activity Recognition. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems (CHI '21). Association for Computing Machinery, New York, NY, USA, Article 292, 1–10. DOI:https://doi.org/10.1145/3411764.3445138

© Chris Harrison