Device turns gestures into song

Researchers at the University of British Columbia demonstrate a gesture-controlled artificial speech system that's good enough to sing.



Researchers have created a system that converts hand gestures into speech, and yes, into song as well. Although the system isn't yet ready for a shot at "American Idol," its name — Digital Ventriloquized Actor, or DiVA — gives you an idea where the technology is going.


"It is a singing synthesizer," said Sidney Fels, director of the University of British Columbia's Media and Graphics Interdisciplinary Center, or MAGIC. Fels explained how DiVA does its thing today in Vancouver at the annual meeting of the American Association for the Advancement of Science.

With gestures of the right hand, DiVA's operator controls the pitch and the character of the sounds. Closed-hand gestures produce consonants. Open-hand gestures produce vowels. Meanwhile, the left hand is fitted with finger contacts that trigger stop sounds such as "buh." "We designed a gestural space that mimics the vocal tract," Fels explained.
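For readers who think in code, here's a rough way to picture that mapping. This is only an illustrative sketch, not DiVA's actual software: the `GestureFrame` fields, the 0-to-1 "openness" scale, the 0.5 threshold and the contact-to-sound table are all assumptions of mine, and the real system presumably blends sounds continuously rather than sorting them into bins.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical cutoff: the article does not say how "open" vs. "closed" is decided.
OPEN_THRESHOLD = 0.5

# Hypothetical contact table; "buh" is the only stop sound named in the article,
# and which finger triggers it is an assumption.
STOP_SOUNDS = {"index": "buh"}


@dataclass
class GestureFrame:
    right_hand_openness: float   # assumed scale: 0.0 = closed fist, 1.0 = fully open
    left_contact: Optional[str]  # name of a touched left-hand contact, or None


def classify_sound(frame: GestureFrame) -> str:
    """Return the broad class of sound a single frame would trigger."""
    # In this sketch, a left-hand contact takes priority and fires a stop sound.
    if frame.left_contact is not None:
        return "stop:" + STOP_SOUNDS.get(frame.left_contact, "unknown")
    # Otherwise the right hand decides: open hand -> vowel, closed hand -> consonant.
    if frame.right_hand_openness >= OPEN_THRESHOLD:
        return "vowel"
    return "consonant"


if __name__ == "__main__":
    print(classify_sound(GestureFrame(0.9, None)))
    print(classify_sound(GestureFrame(0.1, None)))
    print(classify_sound(GestureFrame(0.9, "index")))
```

Pitch control is left out here because the article doesn't say which aspect of the right-hand gesture sets it.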

The result is eerie: In the video above, you'll see a singer accompanying herself with the DiVA's voice. (I'm not ready to put it on my playlist just yet.) And in a series of videos, DiVA operator Sageev Oore synthetically sings the alphabet song and recites Dr. Seuss' "Green Eggs and Ham" verse as if he were playing two characters. (Which is kind of like Gollum talking to himself in the "Lord of the Rings" movies.)

If DiVA goes commercial, it could provide a new way for people with speech disabilities to make themselves heard. But why go to all that trouble when there are other speech synthesizers out there, including the electronic voice made famous by physicist Stephen Hawking?

"The problem with that is, you won't be able to sing. You won't be able to be expressive," Fels said.

One of the intended applications for the technology is to create new types of singing musical instruments that can be played in real time. Fels said there have been five compositions written for DiVA so far, played by musicians trained to use the device. "It takes about 100 hours for a performer to learn how to speak and use the system," Fels said in a news release.

The gloves, the volume-control foot pedals, the magnetic-sensor system and other components that bring DiVA to life can get rather unwieldy. "It's a backpack full of equipment," Fels told journalists. "I wouldn't walk around the restaurant and order sushi with it." But Fels and his MAGIC team are developing a version that can be operated with a computer tablet.

That hints at what may be more important applications in the longer run. The DiVA project got started as a way to teach people how to control a complex system with gestures and give them auditory feedback to let them know when they're doing the gestures right.

"Other possible applications for this discovery are interfaces to make certain tasks easier, such as controlling cranes or other heavy machinery," Fels said. It's also conceivable that gesture-based training might offer an alternative way to learn and practice foreign languages, particularly Asian dialects that depend on precise tonal control.

Gesture-controlled input devices ranging from Nintendo's Wii to Microsoft's Kinect have already revolutionized the gaming industry. Will DiVA, or other devices like it, open up a whole new frontier for the field? Does the future belong to gestures? Feel free to weigh in with your comments below.


Since I mentioned Kinect, I should note that msnbc.com is a joint venture involving Microsoft as well as NBC Universal.

Alan Boyle is science editor at msnbc.com. Connect with the Cosmic Log community by "liking" the log's Facebook page, following @b0yle on Twitter or adding the Cosmic Log Google+ page to your circles. You can also check out "The Case for Pluto," my book about the controversial dwarf planet and the search for new worlds.