Kinect Tuner: Tracking and Audio
Another great blog post from the Microsoft Engineering Team
Theo Michel (SDE Lead) wrote:
Introducing ... the Kinect Tuner!
The Kinect Tuner (the "Tuner") is a new feature of the Xbox 360 console that allows users to "tune" their play space and audio settings in order to get the most out of Kinect. If you suspect that Kinect is not tracking you or hearing you as well as it should, the Tuner is the place to go.
Why do we call this helpful feature the Tuner? In many ways, the Tuner is a troubleshooter, but it's called the Tuner because "troubleshooter" sounds like something in the back of a motherboard manual, and other proposed names were too confusing, too dry, or otherwise objectionable. (Things can get out of hand when you let people anonymously suggest names on a whiteboard.)
The Kinect Tuner is accessible from the Xbox Dashboard and through menus in Kinect games, but the easiest way to get to it is to press the Guide button and then select "Kinect Tuner" from the list on the default Guide page. Of course, to use the Kinect Tuner, you'll first need to buy a Kinect and plug it into your Xbox 360 console. Go ahead, you know you want to!
There are three main areas of the Tuner, available from the main menu:
- Tracking makes sure that the Kinect Sensor's cameras can see you clearly and detect your movements.
- Audio makes sure that the Kinect Sensor's microphone array can hear you clearly.
- Depth Calibration calibrates the Kinect Sensor's notion of "depth" so that it knows as precisely as possible how far away you are.
In this post, I will mostly discuss the internals and history of the Tracking Tuner, touch a bit on the Audio Tuner, and discuss some tips and tricks for getting the most out of both. The Depth Calibration part of the Tuner will be covered in a later post.
Helping make sure Kinect is tracking you correctly is the number one job of the Tuner. Even before you select the "tracking" option from the Tuner's menu, we show (at the right side of the screen) a live feed of Kinect's view of your play space. (Play space is our term for the area where you're playing.) The idea here is that you can immediately see what Kinect sees, and maybe spot the problem right away. For example, your Halo 3 Cat Helmet might be sitting in front of the Kinect sensor, or the sensor could be pointed in the wrong direction.
In case you're not sitting in front of a Kinect-enabled Xbox right now, the Tracking Tuner is a short wizard that walks you through the following:
- Asks you to make sure your Kinect sensor is positioned correctly; basically, two to six feet off the ground, centered above or below your TV.
- Resets the camera to a saved tilt angle. This part doesn't take into account the user's position - just the position and angle of the camera relative to the floor.
- Lets you adjust the angle if you need to.
- Runs some tests to make sure Kinect can see you clearly.
- Saves the new angle to your console.
- Compliments you on your good looks. You look great!
The Depth Map
The most interesting part of the Tracking Tuner is the part that shows you what Kinect sees. This is a live feed of the depth information that we call the "depth map", which is streamed straight from the Kinect Sensor into the Kinect Tuner. The depth map is a normal two-dimensional video, except that the color of each pixel in the video stream represents distance from the camera (whereas pixel color normally represents ... color). So, you may ask, why is it always greenish? This is because we chose to map object distance to a range of green colors. Here's the spectrum of colors we currently use.
You might notice that there's a specific area where the green is lighter - this color corresponds to the "sweet spot" of distance for skeletal tracking.
Why not use the full color spectrum (red > blue)? Mainly because having multiple colors makes the image hard to interpret. Also we're Xbox, so we like green. A lot.
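To make the distance-to-green idea concrete, here is a minimal Python sketch. It is not the Tuner's actual code: the distance limits, the sweet-spot range, the direction of the gradient (nearer is brighter), and the function name depth_to_green are all assumptions made for illustration.

```python
# Illustrative only: every constant below is an assumed value, not the Tuner's.
NEAR_MM = 800                 # assumed nearest distance drawn
FAR_MM = 4000                 # assumed farthest distance drawn
SWEET_SPOT_MM = (1800, 3000)  # assumed range that is best for skeletal tracking


def depth_to_green(depth_mm):
    """Map one depth reading (millimeters) to an (r, g, b) shade of green."""
    if depth_mm <= 0:
        return (0, 0, 0)  # treat non-positive values as "no reading": black

    # Clamp to the drawable range, then scale to 0.0 (near) .. 1.0 (far).
    clamped = min(max(depth_mm, NEAR_MM), FAR_MM)
    t = (clamped - NEAR_MM) / (FAR_MM - NEAR_MM)

    # Nearer objects get a brighter green, farther objects a darker one.
    green = int(255 - t * 175)

    # Inside the sweet spot, mix in some red and blue to lighten the green.
    if SWEET_SPOT_MM[0] <= depth_mm <= SWEET_SPOT_MM[1]:
        return (green // 2, green, green // 2)
    return (0, green, 0)
```

Applying a function like this to every pixel of a depth frame would give a single-hue image in which the lighter band marks where to stand.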
As you're standing in front of the sensor and looking at the depth map, you may either notice some purple spheres that follow your head and hands, or a full stick figure superimposed on you. These indicate that Kinect is successfully tracking you. Within our team, the skinny fellow is affectionately known as StickBallMan. The purple spheres aren't affectionately known as anything, but I'll call them The Spheres for now.
If you see StickBallMan, this indicates that Kinect is in "full" tracking mode. This will happen when you run Kinect Tuner over games, like Kinect Adventures, Kinect Sports, or Dance Central.
If you see The Spheres, this indicates that Kinect is in "hands" tracking mode. This will happen when you run Kinect Tuner over the Xbox Dashboard and related experiences, like the Avatar Editor or Zune.
StickBallMan and The Spheres can be very useful tools for spotting any issues Kinect may be having seeing you. For example, if your feet are out of view, StickBallMan's lower legs may jitter slightly, or be removed altogether. (Ouch!) If you see that, you may need to back up, or adjust the camera angle.
The Evolution of the Tracking Tuner
One might think that software development would follow a blueprint, similar to the construction of a bridge. An engineer lays out exactly what it will look like, and then the construction crew comes along and makes it happen.
But with the Kinect Tuner, as with many software projects, the development followed more of an evolutionary cycle. Here's the path we took, and some interesting stops we made along the way.
The Primordial Sludge
When our team started working on the tracking part of the Kinect Tuner (then known as the "Natal Troubleshooter" or "Nui Troubleshooter"), we had no idea what it was going to be. We just knew there needed to be some system-wide way for people to figure out their tracking issues.
We wanted the experience to be a part of the Xbox Guide so that users could figure out issues while a title was running, because there would be nothing worse than having to bail out of a game to correct some minor problem with where you're standing. So the team got to work figuring out what could be done in the Guide.
Working in the Guide is quite limiting because it gets only a small portion of the Xbox 360's resources - the rest understandably go to the currently running game. Fortunately, with coding magic, we were quickly able to get a depth map feed displaying in real time in the Guide (albeit with very little smoothing or other graphical enhancements), which proved that what we wanted was in the realm of possibility.
Dinosaurs and Mammals
Once we had something running, we tried a few different approaches.
We spent quite a bit of time trying to give the Kinect Tuner the look and feel of the current Out Of Box Experience (OOBE) - the little play space setup that you do when you plug in Kinect for the first time. OOBE has an interesting approach - they put a virtual "camera" behind the user, so the screen shows the user's avatar (from behind) with a mockup of the TV in the background.
Here's what the play space looks like in OOBE:
The goal here was to get the user to stand on the big green circle, so the Tuner would know the user was being tracked and ready to proceed. One of the interesting issues with this type of visualization is how the user relates their movements to the avatar on the screen. We found that it's absolutely essential to show the purple arrow on the ground and "virtual" TV in the background. Without these, even experienced Kinect users (members of our team!) were unsure of which direction they needed to walk to arrive at the circle.
We were able to get this to work quite nicely, although 3D-rendered avatars were not feasible in the Xbox Guide. So we used StickBallMan instead. It looked pretty good (see below), but we ran into problems with what happens when Kinect loses tracking - how does the user know where they are in relation to the imaginary TV and how to fix the problem? One solution was to show the depth map if tracking was lost, but that kind of switching back-and-forth is pretty jarring.
Another avenue we explored was enhancing the depth map. One of our awesome partners in XNA was willing to try out some serious enhancements on the depth map, to make it look prettier while giving the user context on where to stand. Here's an early prototype that he came up with:
Notice the red "wall" on the left - this was a plane that moved side-to-side, giving the appearance of Kinect scanning you for data. This approach was very cool looking, but a bit too "techy" and potentially confusing. It was also quite resource-intensive, and would have exceeded our Guide resource limitations.
While experimenting with these approaches, we kept the plain depth map around, and slowly realized that people liked it. In every meeting where the Kinect Tuner was present, some people (who shall remain nameless) were unable to resist dancing in front of it, and watching StickBallMan rock out. So eventually we pulled out the OOBE-like view and made the depth map the sole play space troubleshooting tool.
The Kinect Tuner went through a round of usability testing soon after that change, and we were very excited (and a bit surprised) to hear that seeing the depth map was one of the "top moments of joy" during the usability tests. (This was not a test that involved any gameplay ... but still that's cool.)
While figuring out how to represent the play space, we were also working on how to automatically "test" whether the user was well positioned within the play space and could be seen clearly. Once we had settled on using the depth map, the team was able to focus much more on this part. We ended up using tech provided by multiple teams working on portions of the Kinect platform in order to test different aspects of the play space.
Some tests didn't end up making the final cut because they were unreliable or unnecessary. For example, at one point we had a test that checked whether your body was being obscured. It turned out that this frequently failed due to the user crossing their arms, so it had to go.
Here are some of the things the Tuner will automatically detect and instruct you to correct:
- "Too close": We'd like at least 1.8 meters (six feet) between the sensor and the player.
- "Head not detected": This usually indicates that your head is too close to the top of the camera's view. The best bet here is to back up, or tilt the camera up.
- "Body turned": This indicates that the player's body is not directly facing the sensor.
- "Face not clear": This indicates that Kinect can't see the user's face well enough. This could be due to a reflection off of glasses, hair over the face, or bad lighting, among other causes.
- "Too much/Not enough light": Lighting is too dim or too bright.
- "No player detected": Kinect isn't seeing anyone. If you hit this, just wave to Kinect!
If one of these tests fails, the Kinect Tuner may show a color picture of what Kinect was seeing at the time of the failure to help you understand the source of the problem. Don't worry, those pictures aren't saved anywhere!
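As a rough illustration of how a few of the checks listed above might be chained, here is a short Python sketch. Only the 1.8 meter minimum comes from this post; the PlaySpaceReading type, its fields, the first_problem function, and the order of the checks are hypothetical, and the real tests rely on tech from several Kinect platform teams.

```python
from dataclasses import dataclass
from typing import Optional

MIN_DISTANCE_M = 1.8  # from the post: at least 1.8 meters (six feet)


@dataclass
class PlaySpaceReading:
    """Hypothetical summary of one frame of tracking data."""
    player_distance_m: Optional[float]  # None when nobody is detected
    head_visible: bool
    facing_sensor: bool


def first_problem(reading: PlaySpaceReading) -> Optional[str]:
    """Return the first failed check as a message, or None if all pass.

    The ordering is an assumption: the player has to be found before
    any of the other checks can mean anything.
    """
    if reading.player_distance_m is None:
        return "No player detected"
    if reading.player_distance_m < MIN_DISTANCE_M:
        return "Too close"
    if not reading.head_visible:
        return "Head not detected"
    if not reading.facing_sensor:
        return "Body turned"
    return None
```

Reporting only the first failure keeps the instruction to the player simple: fix one thing, then test again.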
Play Space Tips and Tricks
Here are some little-known facts about the play space tuner, and tricks for getting the most out of it:
- On the "Adjust camera angle" screen, you can use the right thumbstick to adjust the camera, instead of using the up/down buttons.
- When the Kinect camera angle is adjusted, Kinect temporarily loses skeletal tracking, so StickBallMan and The Spheres may disappear. Just wave hello to your depth map, and StickBallMan or The Spheres should show up, so you'll know you are being tracked once again.
- The Tuner is intended to handle one person at a time. If multiple people are in view, StickBallMan will adhere to whoever is closest to the Kinect Sensor.
- If you need to tune your play space for multiple people, have the tallest and shortest people stand side by side while adjusting the camera angle and position, and try to get them both fully in view. (Then have one person go through the rest of the Tuner.)
- At the end of the Tracking Tuner, your camera angle is saved to your Xbox. If the camera tilt angle is ever changed (for example, by a game), the camera will tilt back to this saved angle next time you return to the Xbox Dashboard, or start another game.
- Wherever you are within the play space Tuner, you can always get back to your game with a single press of the Xbox Guide button. (But this won't save any changes you've made to the camera angle!)
If you find yourself yelling "XBOX, WHY CAN'T YOU HEAR ME?" or if your hard-core FPS buddies are hearing your Viva Pinata background music when you are using Kinect for party chat, then you may need to make a visit to the Audio portion of Kinect Tuner.
The Audio Tuner is a short wizard that walks you through:
- Checking your background noise level: The microphone array on the Kinect Sensor is quite sensitive, so if it's close to anything loud, that might interfere with normal operation.
- Turning it up to 11!: This step asks you to turn up your speaker volume. We need to be sure your volume is turned up high enough for the next step.
- MEC Calibration: The Xbox will play a series of sounds through your speakers.
- Configuration: The tuner determines whether Kinect is enabled for in-game and party chat.
- Voice: The tuner makes sure Kinect can hear your voice, by asking you to read some numbers.
Whereas the Tracking Tuner is an Xbox Guide application, the Audio Tuner is an Xbox Dashboard application. Being a Dashboard app gives us more power to do some of the resource-intensive things we needed to do in the Audio Tuner - primarily, MEC calibration.
If you're giving voice commands to your Xbox console or chatting with friends online without using a headset, you are using the Kinect Sensor's built-in set of microphones. But in order for this to work properly, it's important that the sounds of the game you're playing (or movie you're watching, or song you're listening to) don't drown out your voice. The solution to this problem is a process called multichannel echo cancellation, or "MEC". MEC takes the audio stream being received by the Kinect Sensor's microphones, and removes the sounds being output by the Xbox console.
MEC is "multichannel" because Xbox may output multiple channels of sound (e.g., 5.1 channels of surround sound), and it is "echo cancellation" because the sounds received by the Kinect Sensor are not exactly the same as the sounds the Xbox console sends to the stereo system (due to multiple factors, including echoes).
These factors make the game/movie/music sounds tricky to remove from the audio stream being "heard" by the Kinect Sensor. The MEC Calibration step of the Audio Tuner solves this problem by teaching your Xbox console the audio characteristics of your stereo and room. In the words of Rob, one of the developers of MEC:
"For our multichannel echo cancellation to work properly, we need to acoustically measure the room. To accomplish this, we play a series of calibration tones through separate speakers that allows us to measure the latency of the speakers as well as the physical characteristics of the room. This information effectively allows us to simulate the acoustical characteristics of the speakers and room and remove any sounds played through the console from the Kinect microphone."
Rob is working on a blog post on chat and audio, so stay tuned to the Engineering Blog for more info on this!
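Until then, here is a heavily simplified Python sketch of the idea behind calibration. It is not the MEC implementation: it assumes one speaker, one microphone, and a room that only delays and scales the sound, whereas real MEC handles multiple channels and a full room response. The function names estimate_delay, estimate_gain, and cancel_echo are made up for this example.

```python
def estimate_delay(reference, recorded, max_delay):
    """Find the lag (in samples) where the recording best matches the tone."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(max_delay + 1):
        # Cross-correlate the played tone with the recording at this lag.
        score = sum(r * recorded[i + lag]
                    for i, r in enumerate(reference)
                    if i + lag < len(recorded))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def estimate_gain(reference, recorded, lag):
    """Least-squares estimate of how much the room scales the tone."""
    pairs = [(r, recorded[i + lag])
             for i, r in enumerate(reference)
             if i + lag < len(recorded)]
    energy = sum(r * r for r, _ in pairs)
    return sum(r * m for r, m in pairs) / energy if energy else 0.0


def cancel_echo(playback, mic, lag, gain):
    """Subtract the predicted speaker sound from the microphone signal."""
    cleaned = []
    for n, sample in enumerate(mic):
        source = n - lag
        echo = gain * playback[source] if 0 <= source < len(playback) else 0.0
        cleaned.append(sample - echo)
    return cleaned


# Calibration in a quiet room: play a known tone and record what comes back.
tone = [1.0, -1.0, 1.0, 1.0, -1.0, -1.0, 1.0, -1.0]
recording = [0.0] * 5 + [0.5 * s for s in tone] + [0.0] * 3
lag = estimate_delay(tone, recording, max_delay=8)
gain = estimate_gain(tone, recording, lag)
```

Once the delay and gain are known, cancel_echo(playback, mic, lag, gain) removes the console's own output from whatever the microphone hears, leaving (in this toy model) just the voice.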
Audio Tips and Tricks
Here are some tips on getting the most out of the Audio Tuner and Kinect Audio in general:
- The Audio Tuner is automatically run as part of OOBE (the Out Of Box Experience) the first time you use Kinect. So if you've run through OOBE, you should be good to go with audio.
- When running the MEC portion of the Audio Tuner, be sure that the room is as quiet as possible.
- Always calibrate at the highest volume at which you will play games, watch movies, or listen to music. If you calibrate at a lower volume and then turn it up to play, the echo cancellation performance could suffer and Kinect won't be able to hear you as well. Turn it up to 11!
- If you move Kinect to a different location, change the furniture in your room, or change the settings or speakers on your stereo system, you should re-run the Audio Tuner to recalibrate.
- If possible, place your Kinect out in the open. Being in an enclosed space can interfere with MEC.
- Don't place Kinect on or near a speaker.
We hope the Kinect Tuner is helping you get more enjoyment out of your Kinect, and that this blog post gave you an interesting peek into the adventure we had developing the Tuner.
Stay tuned (ha!) to the Xbox Engineering blog for more info on the Tuner and other topics. And remember - you look great!