Simple Optical Technique to Extract Audio Information from High-Speed Video Recordings

Those formerly silent walls can "talk" now: Researchers have demonstrated a simple optical technique by which audio information can be extracted from high-speed video recordings.

The method uses an image-matching process based on vibration from sound waves, and is reported in an article appearing in the November issue of the journal Optical Engineering, published by SPIE, the international society for optics and photonics.

"One of the intriguing aspects of the paper is the ability to recover spoken words from a video of objects in the room," said journal Associate Editor Reiner Eschbach, a Research Fellow at Xerox Corp. "The paper shows that the sound creates minute vibrations in objects and that these vibrations, given the right equipment, can be picked up from a video signal. This is an interesting foray into a new application space and will, in my view, trigger interesting research in the field."

The article, "Audio extraction from silent high-speed video using an optical technique," was authored by Zhaoyang Wang, Hieu Nguyen, and Jason Quisberth of the Department of Engineering of the Catholic University of America, and is available from the SPIE Digital Library.

The technique is based on the fact that sound waves are mechanical waves that cause air to vibrate when traveling, the paper notes. That vibration through air can cause vibration of objects located in its traveling path, especially if the objects are lightweight, thin, and flexible, such as a piece of paper. The vibrations, although usually with small amplitudes, can be detected and analyzed algorithmically, and audio reconstructed based on those calculations.
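The last step described above, turning a measured vibration into audio, can be illustrated with a minimal sketch. It is not the authors' reconstruction model, which the article does not detail. It assumes one displacement sample per video frame, so the audio sample rate equals the camera frame rate. The function names `displacement_to_pcm` and `write_wav` are hypothetical.

```python
import wave

import numpy as np


def displacement_to_pcm(displacement):
    """Convert a per-frame displacement series to 16-bit PCM samples.

    Assumption: the displacement is proportional to the sound pressure,
    so removing the static offset and rescaling is enough for playback.
    """
    d = np.asarray(displacement, dtype=float)
    ac = d - d.mean()                 # drop the static position of the point
    peak = np.max(np.abs(ac))
    if peak == 0:                     # no motion detected: return silence
        return np.zeros(d.shape, dtype="<i2")
    return np.round(ac / peak * 32767).astype("<i2")


def write_wav(path, pcm, frame_rate):
    """Write mono 16-bit PCM to a WAV file at the camera frame rate."""
    with wave.open(path, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)             # 2 bytes per sample (16-bit)
        f.setframerate(int(frame_rate))
        f.writeframes(pcm.tobytes())
```

Because each frame yields one audio sample, the frame rate limits the highest recoverable frequency to half that rate, which is why a high-speed camera is needed.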

The authors used a subset-based image-correlation approach to detect the motions of points on the surface of an object, capturing target images with a high-speed camera and applying the Gauss-Newton algorithm and a few other measures to achieve very fast and highly accurate image matching. Because the detected vibrations are directly related to sound waves, a simple model was used to reconstruct the original audio information of the sound waves.
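The subset-matching idea can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: it estimates only a rigid sub-pixel translation of one square subset, uses bilinear interpolation, and omits the "other measures" mentioned in the article. The names `bilinear` and `track_subset` and the parameters `half` and `iters` are hypothetical.

```python
import numpy as np


def bilinear(img, ys, xs):
    """Sample img at fractional (row, column) coordinates."""
    y0 = np.clip(np.floor(ys).astype(int), 0, img.shape[0] - 2)
    x0 = np.clip(np.floor(xs).astype(int), 0, img.shape[1] - 2)
    fy, fx = ys - y0, xs - x0
    top = img[y0, x0] * (1 - fx) + img[y0, x0 + 1] * fx
    bot = img[y0 + 1, x0] * (1 - fx) + img[y0 + 1, x0 + 1] * fx
    return top * (1 - fy) + bot * fy


def track_subset(ref, cur, center, half=10, iters=20):
    """Gauss-Newton estimate of the (dy, dx) shift of a subset.

    Finds p such that cur sampled at (y + dy, x + dx) best matches the
    reference subset centered at `center`, in the least-squares sense.
    """
    cy, cx = center
    ys, xs = np.mgrid[cy - half:cy + half + 1,
                      cx - half:cx + half + 1].astype(float)
    template = bilinear(ref, ys, xs)
    p = np.zeros(2)                                   # (dy, dx)
    for _ in range(iters):
        wy, wx = ys + p[0], xs + p[1]
        warped = bilinear(cur, wy, wx)
        # Central-difference gradients of the current image at the warp.
        gy = bilinear(cur, wy + 0.5, wx) - bilinear(cur, wy - 0.5, wx)
        gx = bilinear(cur, wy, wx + 0.5) - bilinear(cur, wy, wx - 0.5)
        J = np.stack([gy.ravel(), gx.ravel()], axis=1)
        r = (template - warped).ravel()
        dp, *_ = np.linalg.lstsq(J, r, rcond=None)    # Gauss-Newton step
        p += dp
        if np.linalg.norm(dp) < 1e-6:
            break
    return p
```

Calling `track_subset` on each frame against a fixed reference frame would give a displacement time series for the tracked point, which is the vibration signal the article refers to.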

While other recent work in the area reports on more sophisticated techniques to compute motion signals, the authors chose a simpler image-matching approach to measure vibration. Because light can travel through air considerably farther than sound and can pass through glass, they anticipate that the technique may find applications such as the passive detection of conversations inside a building from a long distance, Wang said. "We are currently improving the technique to increase its accuracy and sensitivity, make the measurements in real-time, and remove interference from other sources."
