Kinect – First steps

On Tuesday 13 September 2011, I attended my first K-day at Jayway Stockholm. The day consisted of stand-ups, awesome technical presentations, round-table, labs and, of course, viewing of the Microsoft BUILD conference from Anaheim.

During the labs I joined a group who wanted to have a look at Kinect. This nice device features a depth sensor (using an infra-red laser), an RGB camera and an array of microphones. Some months ago Microsoft published the Kinect SDK, which lets you write applications that interact with it. It’s in beta phase and only for non-commercial use. The SDK comes with two example applications, namely the SkeletalViewer and ShapeGame applications. The former tracks a captured person and renders a skeleton image (“stick figure”), the RGB image and a depth image. The latter is a simple game where you use your body to catch shapes falling from the sky.

After installing the SDK, the sample code for the aforementioned applications is placed in C:\Users\Public\Documents\Microsoft Research KinectSDK Samples\NUI on a Windows 7 system.

Our team wanted to test whether the Kinect could track user movements when placed behind glass, which it proved able to do. This was also shown in this demo.

Next, I wanted to experiment a bit with the code to get a feel for the APIs. My intention was simply to have an image follow the movements of the right hand. There are two APIs for controlling the Kinect: the NUI API (Natural User Interface) and the Audio API. You can use C++ or C# to build applications for the Kinect, and though I have a lot of love for C++ I chose C# since, well, it’s faster to get up and running with.

To use the API you simply reference Microsoft.Research.Kinect.dll on the .NET tab in the Add Reference dialog and then add using Microsoft.Research.Kinect.Nui; in your source file. The main NUI API class is called Runtime and gives you access to the video stream, the depth image stream and three events, namely DepthFrameReady, SkeletonFrameReady and VideoFrameReady. Using the stream getters or using the events is just a matter of choosing to retrieve image data with a polling model or an event model.
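As a minimal sketch of the event model with the beta SDK (written from memory, so details such as the exact RuntimeOptions flags and stream parameters may differ in your SDK version):

```csharp
using Microsoft.Research.Kinect.Nui;

// Create and initialize the NUI runtime for color video and skeletal tracking.
var nui = new Runtime();
nui.Initialize(RuntimeOptions.UseColor | RuntimeOptions.UseSkeletalTracking);

// Open the RGB video stream; with the event model, VideoFrameReady
// fires once per captured frame.
nui.VideoStream.Open(ImageStreamType.Video, 2,
                     ImageResolution.Resolution640x480, ImageType.Color);

nui.SkeletonFrameReady += (sender, e) =>
{
    // e.SkeletonFrame.Skeletons holds the data for tracked persons.
};
```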

For inspiration and helper methods I used the SkeletalViewer application, but instead of creating the UI using XAML I just launched a WPF GUI from a console application. Instead of .NET events I chose to use Reactive Extensions’ Observables, which give you much greater flexibility and power when reacting to different user behavior. The main method code looks like this:
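A sketch of what such a Main method could look like, assuming a plain Window holding a Canvas (the window and canvas names here are illustrative placeholders, not taken from the original listing):

```csharp
using System;
using System.Windows;
using System.Windows.Controls;
using Microsoft.Research.Kinect.Nui;

class Program
{
    [STAThread] // WPF requires a single-threaded apartment
    static void Main()
    {
        var nui = new Runtime();
        nui.Initialize(RuntimeOptions.UseColor | RuntimeOptions.UseSkeletalTracking);
        nui.VideoStream.Open(ImageStreamType.Video, 2,
                             ImageResolution.Resolution640x480, ImageType.Color);

        // Launch a WPF window directly from the console application,
        // instead of going through XAML and App.xaml.
        var canvas = new Canvas();
        var window = new Window { Title = "Kinect demo", Content = canvas };

        // ... convert the frame ready events to observables and subscribe here ...

        new Application().Run(window); // blocks until the window is closed
        nui.Uninitialize();
    }
}
```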

After the NUI API and GUI setup, the frame ready events for the skeleton and the RGB video are converted to observable streams. The former is then used for filtering out the right hand position and attaching an image to it, while the latter is used for getting each image frame of the video and showing it on the canvas. The result is shown here:
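A sketch of the skeleton half of that conversion with Rx, assuming nui, window and canvas from the setup above; MoveImageTo is a hypothetical helper, not part of the SDK:

```csharp
using System;
using System.Reactive.Linq;
using Microsoft.Research.Kinect.Nui;

// Turn the .NET event into an observable stream of skeleton frames.
var skeletonFrames = Observable
    .FromEventPattern<SkeletonFrameReadyEventArgs>(
        h => nui.SkeletonFrameReady += h,
        h => nui.SkeletonFrameReady -= h)
    .Select(ep => ep.EventArgs.SkeletonFrame);

// Filter out the right hand position of any tracked skeleton.
var rightHand = skeletonFrames
    .SelectMany(frame => frame.Skeletons)
    .Where(skeleton => skeleton.TrackingState == SkeletonTrackingState.Tracked)
    .Select(skeleton => skeleton.Joints[JointID.HandRight].Position);

// Reposition the image on the canvas on the UI thread.
// MoveImageTo is a hypothetical helper that maps the joint position
// to canvas coordinates and moves the image there.
rightHand.ObserveOn(window.Dispatcher)
         .Subscribe(position => MoveImageTo(canvas, position));
```

The VideoFrameReady event can be converted the same way and its frames painted onto the canvas background.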

Jay follows hand's movement

To conclude: it is really easy to program against the Kinect API and really fun too! The Kinect sensor is quite inexpensive (~800 Skr) in comparison to what you get: endless possibilities and awesome technology! :)
