Researchers at Newcastle University and Microsoft Research Cambridge (MSR) have developed a sensor the size of a wristwatch that tracks the 3D movement of the hand and allows the user to remotely control any device.
Mapping finger movement and orientation, it gives the user remote control, allowing them to answer their phone while it's still in their pocket, for example.
Being presented this week at the 25th Association for Computing Machinery Symposium on User Interface Software and Technology, 'Digits' allows, for the first time, 3D interaction without the user being tied to any external hardware. It has been developed by David Kim, an MSR-funded PhD student from Newcastle University's Culture Lab; Otmar Hilliges, Shahram Izadi, Alex Butler, and Jiawen Chen of MSR Cambridge; Iason Oikonomidis of Greece's Foundation for Research & Technology; and Professor Patrick Olivier of Newcastle University's Culture Lab.
"The Digits sensor doesn't rely on any external infrastructure so it is completely mobile," explained David Kim of Newcastle University. "This means users are not bound to a fixed space. They can interact while moving from room to room or even running down the street.
"What Digits does is finally take 3D interaction outside the living room."
The researchers said that to enable 3D spatial interaction anywhere, Digits had to be lightweight, consume little power, and have the potential to be as small and comfortable as a watch.
At the same time, Digits had to deliver superior gesture sensing and "understand" the human hand, from wrist orientation to the angle of each finger joint, so that interaction would not be limited to 3D points in space. Digits had to understand what the hand is trying to express – even while inside a pocket.
"We needed a system that enabled natural 3D interactions with bare hands, but with as much flexibility and accuracy as data gloves," said Kim.
The current prototype, which is being showcased at the ACM UIST 2012 conference this week, includes an infrared (IR) camera, an IR laser line generator, an IR diffuse illuminator, and an inertial measurement unit (IMU).
"We wanted users to be able to interact spontaneously with their electronic devices using simple gestures without even having to reach for them," Kim added.
"Can you imagine how much easier it would be if you could answer your mobile phone while it's still in your pocket or buried at the bottom of your bag?"
One of the project's main contributions is a real-time signal-processing pipeline that robustly samples key parts of the hand, such as the tips and lower regions of each finger. Other important research achievements are two kinematic models that enable full reconstruction of hand poses from just five key points.
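The Digits paper describes the team's own kinematic models; purely as an illustration of the underlying idea of recovering joint angles from sparse point samples, the sketch below solves a minimal two-link planar finger model from a single base-to-tip distance measurement. The function name, the planar simplification, and the use of the law of cosines are assumptions for illustration, not the authors' actual method.

```python
import math

def finger_angles(l1, l2, d):
    """Recover joint angles for a two-link planar finger model.

    l1, l2 -- proximal and distal link lengths (assumed calibrated per user)
    d      -- measured base-to-fingertip distance (e.g. from a camera sample)

    Returns (base flexion relative to the base-to-tip line, middle-joint
    flexion), both in radians, computed with the law of cosines. Both are
    zero when the finger is fully straight.
    """
    # Clamp the measurement to the mechanically reachable range.
    d = max(abs(l1 - l2), min(l1 + l2, d))
    # Middle-joint flexion: pi minus the interior angle between the links.
    cos_mid = (l1**2 + l2**2 - d**2) / (2 * l1 * l2)
    mid = math.pi - math.acos(cos_mid)
    # Base flexion: angle of the proximal link against the base-to-tip line.
    cos_base = (l1**2 + d**2 - l2**2) / (2 * l1 * d)
    base = math.acos(cos_base)
    return base, mid
```

For example, with unit-length links and a base-to-tip distance of sqrt(2), the model reports a right-angle bend at the middle joint, which matches the geometry of a finger folded to 90 degrees.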
The project posed many challenges, but the researchers agreed that the hardest was extrapolating natural-looking hand motions from a sparse sampling of the key points sensed by the camera.
"We had to understand our own body parts first before we could formulate their workings mathematically," Izadi said. "We spent hours just staring at our fingers. We read dozens of scientific papers about the biomechanical properties of the human hand. We tried to correlate these five points with the highly complex motion of the hand. In fact, we completely rewrote each kinematic model about three or four times until we got it just right."
The team said that the most exciting moment of the project came when they saw the models succeed. "At the beginning, the virtual hand often broke and collapsed. It was always very painful to watch," Kim said.
"Then, one day, we radically simplified the mathematical model, and suddenly, it behaved like a human hand. It felt absolutely surreal and immersive, like in the movie 'Avatar'. That moment gave us a big boost."
Digits isn't meant to be a general-purpose interaction platform. To prove the utility of the technology, both the Digits technical paper being presented at UIST 2012 and the accompanying video show Digits in a variety of applications, with particular emphasis on mobile scenarios such as interacting with mobile phones and tablets.
The researchers also experimented with eyes-free interfaces, which enable users to leave mobile devices in a pocket or purse and interact with them using hand gestures.
Another exciting application area for Digits is in gaming. Currently, gaming systems on the market do not support hand sensing at a high level of fidelity. Because of the technical challenges in sensing a full 3D hand pose, most systems constrain the problem by limiting hand tracking to 2D input only or by supporting interaction through surfaces and other tangible mediators.
Digits could be complementary to these existing sensing modalities; one option could be to combine Kinect's full-body tracker with Digits' high-fidelity freehand interaction.
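As a rough sketch of what such a combination might look like, the snippet below takes a wrist position from a body tracker, an orientation from a wrist-worn IMU, and finger points expressed in the wrist's own frame, and places the hand into the world via a rigid transform. The function name and data layout are hypothetical, not an actual Kinect or Digits API.

```python
def hand_points_in_world(wrist_pos, wrist_rot, finger_pts_local):
    """Place wrist-local finger points into a body tracker's world frame.

    wrist_pos        -- (x, y, z) wrist joint position from the body tracker
    wrist_rot        -- 3x3 row-major rotation matrix for wrist orientation
                        (e.g. derived from the wrist-worn IMU)
    finger_pts_local -- list of (x, y, z) finger points in the wrist frame

    Returns the finger points in the world frame: p_world = R * p + t.
    """
    out = []
    for p in finger_pts_local:
        out.append(tuple(
            sum(wrist_rot[i][j] * p[j] for j in range(3)) + wrist_pos[i]
            for i in range(3)))
    return out
```

The design point is simply that the coarse tracker resolves where the hand is, while the wrist-worn sensor resolves what the hand is doing, and a single rigid transform fuses the two.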
"By understanding how one part of the body works and knowing what sensors to use to capture a snapshot," Izadi said. "Digits offers a compelling look at the possibilities of opening up the full expressiveness and dexterity of one of our body parts for mobile human-computer interaction."
By instrumenting only the wrist, Digits leaves the user's entire hand free to interact without data gloves: glove-style input devices, most often used in virtual reality applications, that facilitate tactile sensing and fine-motion control.
The Digits prototype, whose electronics are self-contained on the user's wrist, optically images the entirety of the user's hand, enabling freehand interactions in a mobile setting.