MIT, Microsoft and Adobe pull sounds from 'soundless' videos using vibrations
Can hear a crisp packet crackle through bulletproof glass
Researchers from Massachusetts Institute of Technology (MIT), Microsoft and Adobe have developed a system that can pull sounds from a seemingly 'soundless' video using vibrations.
Together they have built an algorithm that can analyse the vibrations of an object in a video and reconstruct the original sounds. Studies so far have allowed them to hear a crisp packet rustling from behind bulletproof glass, at some distance.
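To give a feel for the idea of reading sound from visible vibrations, here is a toy sketch in Python. It is not the researchers' algorithm, which this article does not describe in detail. It assumes a one-dimensional "frame" of a striped texture that a tone nudges by a tiny fraction of a pixel, and it estimates that nudge in each frame with a simple least-squares fit. The function names, the 2,000fps frame rate, the 100Hz tone and the 0.01-pixel amplitude are all illustrative assumptions.

```python
import math

def make_frame(shift, width=64, cycles=2):
    """One 1-D 'frame': a sinusoidal texture displaced by `shift` pixels."""
    k = 2 * math.pi * cycles / width
    return [math.sin(k * (x - shift)) for x in range(width)]

def estimate_shift(reference, frame):
    """Least-squares estimate of a small displacement relative to `reference`.

    Relies on the linearisation frame(x) ~ reference(x) - shift * gradient(x),
    which only holds for displacements much smaller than one pixel.
    """
    n = len(reference)
    # Central-difference spatial gradient, wrapping around at the edges.
    grad = [(reference[(x + 1) % n] - reference[(x - 1) % n]) / 2 for x in range(n)]
    num = sum(g * (f - r) for g, f, r in zip(grad, frame, reference))
    den = sum(g * g for g in grad)
    return -num / den

def recover_signal(frames):
    """Turn a frame sequence into one displacement sample per frame."""
    reference = frames[0]
    return [estimate_shift(reference, f) for f in frames]

# Synthetic input (assumed values): a 100Hz tone moves the texture by at most
# 0.01 pixel while a hypothetical camera records 2,000 frames per second.
fps, tone_hz, amplitude = 2000, 100, 0.01
true_motion = [amplitude * math.sin(2 * math.pi * tone_hz * t / fps) for t in range(200)]
frames = [make_frame(d) for d in true_motion]

recovered = recover_signal(frames)
max_error = max(abs(a - b) for a, b in zip(true_motion, recovered))
print(f"largest error: {max_error:.6f} pixels")
```

Each recovered value plays the role of one audio sample, so the frame rate of the camera acts as the sampling rate of the reconstructed sound. Real footage would be far noisier and two-dimensional, and would need a much more careful motion analysis than this sketch uses.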
"When sound hits an object, it causes the object to vibrate," said Abe Davis, a graduate student in electrical engineering and computer science at MIT and one of the paper's authors. "The motion of this vibration creates a very subtle visual signal that's usually invisible to the naked eye. People didn't realise that this information was there."
Davis was joined by Michael Rubinstein of Microsoft Research, Gautham Mysore of Adobe Research, MIT professors Frédo Durand and Bill Freeman, and fellow student Neal Wadhwa.
The 'visual microphone' opens up a range of audio possibilities, according to Davis, who said that it gives a fresh perspective on objects. The report names law enforcement and forensics as likely uses, but he expects wide adoption.
"We're recovering sounds from objects," he added. "That gives us a lot of information about the sound that's going on around the object, but it also gives us a lot of information about the object itself, because different objects are going to respond to sound in different ways."
The studies used a high-speed camera, capturing video at 2,000 to 6,000 frames per second (fps), to catch the vibrations, as well as standard 60fps cameras. Both types of camera delivered usable results, the researchers said.
"While this audio reconstruction wasn't as faithful as it was with the high-speed camera, it may still be good enough to identify the gender of a speaker in a room; the number of speakers; and even, given accurate enough information about the acoustic properties of speakers' voices, their identities," the report said of the 60fps footage.
The researchers will present more details of their work at the Siggraph conference, which takes place in Vancouver from 10 August.