Kinect Hacks for Dummies



Slide 1

Kinect Hacks for Dummies
Tomoto Shimizu Washio
Twitter: @tomoto335e (en) / @tomoto335 (ja)
Rev 1: 3/11/2011 (JTPA Geek Saloon)
Rev 2: 6/19/2011

Slide 2

Table of Contents
- Introduction
  - Who is the author?
  - Overview
  - Kinect basics
- Chapter 1: Tech Side
  - Hardware/software preparation
  - Hacks and tips
- Chapter 2: Biz Side
  - Original intention and actual feedback
  - Video view analyses
  - What else happened
- Conclusion

Slide 3

Who is the Author?
(Timeline figure with axis ticks at 1973, 1979, 1985, 1991, 1997, 2003, 2009 and three tracks)
- CPUs I can speak with: MN1610, Z80, SC62015, x86, ARM, other RISCs
- Programming languages I can speak: BASIC, LISP, Perl, C, VB, C++, Java, Tcl/Tk, Ruby, C#, Haskell, ActionScript, JavaScript, Python
- Where/what I was: born in Japan; elementary school; high school; part-time programmer 30% / guitar player and composer 65% / university student 5%; Hitachi; California at Stanford!; California again at Hitachi Data Systems!

Slide 4

Overview (1)
- What is this presentation all about?
  - My Kinect Hacks as a holiday project: how much fun Kinect hacking can be
- What is Kinect Hacks?
  - Creating your own cool stuff using Kinect, the motion-sensing gaming system for Xbox 360
- When and how did I start Kinect Hacks?
  - In Dec 2010, a month after Kinect’s release
  - From my friend’s tweet about Kinect and Kinect Hacks
  - Me: “Wow, I’ve gotta do this! I don’t mind spending all my winter holidays!”

Slide 5

Overview (2)
- What is interesting about my Kinect Hacks?
  - Intensive crash project
    - Major part(*) done in a week (before my wife ran out of patience)
    - No special knowledge of motion recognition or 3D CG at the beginning
  - Challenge for “the silliest thing ever”
    - Me: “I take my hat off to the smart hacks created by brilliant people all around the world. Then, I’ll create something silly nobody ever thought of, dedicating the best of my intelligence, energy, and CPU & GPU power. It must be fun!”
  - Got an unexpectedly huge response from the public
    - Huge view counts on YouTube & Nicovideo (300k in the 1st week)
    - Appeared on news blogs, newspapers, TV, and other media
    - Contest-awarded
    - Contacted by an investor for commercialization
    - …
(*) kinect-ultra V1, which earned the largest public response

Slide 6

Overview (3)
- What you may learn today
  - How to start cool Kinect Hacks by yourself → Chapter 1: Tech Side
  - Some hints for a geek to make a “hit” (well, I hope so) → Chapter 2: Biz Side
- Disclaimers
  - I am a total amateur in image recognition, motion recognition, and 3D CG
    - I know only the things that are interesting and/or necessary for me
    - I do not care much about academic accuracy (be careful, I may be lying)
  - I am a geek, not a business person

Slide 7

Kinect Basics (1)
- What is Kinect actually?
  - Gaming system for Xbox 360 that enables intuitive and natural game play using gestures and voice, without controllers
  - Released in Nov 2010
- What is Kinect Sensor?
  - Input device with RGB camera, IR depth sensor, and some other auxiliary sensors
    - 640x480@30fps, 1280x1024@10fps(*)
    - Internals developed by PrimeSense
  - Connectable to PC via USB
    - Drivers and libraries available for free
  - In this presentation, “Kinect” refers to Kinect Sensor
(*) With Avin’s Windows driver

Slide 8

Kinect Basics (2)
- What can you do with Kinect? Generally speaking…
  - RGB camera + depth sensor: Kinect provides the color of, and the distance to, the object at each pixel
  - 3D object recognition by PC
  - Skeleton recognition by PC (so you get a 3D position for each joint)
- Don’t you see you can build any cool stuff on this? Let’s hack!
(Figure: depth image with legend “Far”, “So far”, “Near”)

Slide 9

Chapter 1: Tech Side
- This chapter explains the nuts and bolts behind this crash project
  - Like the trick behind a magic show, it is nothing surprising once you know it
  - General mathematics (especially geometry) required
- How much time did I spend?
  - Study: 3 days
  - kinect-ultra: 7 days (for V1, the version that got the huge public response) + 2 days (for V2)
  - kinect-kamehameha: 1 day (for V1) + 1 day (for V2)
  - Actually, I think I should count “nights” rather than “days”

Slide 10

Hardware Preparation
- Kinect, of course!
  - Caution: buy the standalone package, not the Xbox bundle
    - The Xbox bundle does not include the adapter for the USB connector
- Windows PC
  - With a fairly fast CPU and GPU
    - The more powerful your hardware is, the more energy you can spend on the cool essential stuff rather than on performance optimization
    - Mine: Core i7 2600 + GeForce GTX 285
  - How about Mac and Linux?
    - I am not so familiar with them, but Windows is probably safer because of good driver support(*) and Microsoft’s official SDK in the future
- You don’t need an Xbox
(*) Avin’s Windows driver can automatically calibrate the RGB camera and the IR depth sensor, but I was not able to find the same feature in the Linux drivers when I tried. It could be better now.

Slide 11

Software Preparation (1)
- OpenNI + NITE + Avin’s SensorKinect
  - Basic software component set for sensor information access and recognition algorithms
    - OpenNI: Framework
    - NITE: OpenNI-compatible implementation
    - Avin’s SensorKinect: OpenNI-compatible Kinect driver
  - Advantages over other options (such as OpenKinect)(*)
    - Released by PrimeSense
    - Player recognition and skeleton tracking available out of the box!
      - Actually, this was the key success factor that let me get this project done so quickly without any special knowledge of motion recognition
    - Auto calibration between the RGB camera and the IR sensor
      - Thanks to Avin for the nice driver implementation
  - In this presentation, “OpenNI” refers to all of these software components as a set
(*) Microsoft’s official SDK has just come out, but I have not evaluated it yet

Slide 12

Software Preparation (2)
- OpenGL support libraries
  - Chose OpenGL as my first 3D CG API to learn
  - Just followed “OpenGL SuperBible 5th Edition”
    - Standard support libraries (e.g. freeglut)
    - Original library from this book (GLTools)
- Others
  - OpenCV
    - Only used for reading image files and generating Gaussian random numbers

Slide 13

Hack 0: Study with Sample Programs
- Studied for 3 days before starting kinect-ultra
  - Surveyed both OpenKinect and OpenNI, and chose the latter
  - Learned basic pixel information access and OpenGL usage from OpenNI’s sample programs
- First practice piece: depth-aware delayed overlay
  - See “Algorithm March by Kinect”

Slide 14

Hack 1: Transformation
- Use the “calibration complete” event to trigger the transformation
  - Calibration by the “psi pose” (Ψ) is the common way for Kinect apps to start skeleton tracking
  - “Something happens on calibration complete” is Kinect-ish entertainment
- Modulate the color of the player area to represent the superhero suit
  - OpenNI reports “hey, this pixel seems to be a part of player #1”, so the app easily knows which pixels should be modulated
  - Switch the color (red or gray) of each pixel based on its distance from the head
    - The app can calculate the Euclidean distance between any pixels/joints in real world coordinates
    - It is slow, however; some optimization is required
  - You: “Isn’t it too rough?” → Me: “Well, that’s OK, this is meant to be funny after all!”
    - Skinning would be ideal, but it is too serious and challenging
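The per-pixel color switch described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the names `suit_color` and `modulate_player`, the 300 mm radius, and the choice of gray near the head and red elsewhere are all hypothetical. The slide only says the color depends on the distance from the head.

```python
import math

def suit_color(point, head, gray_radius=300.0):
    """Pick a suit color from the real-world distance to the head joint.

    gray_radius (in mm) and the gray-near-head rule are assumed values,
    not taken from the original app.
    """
    return "gray" if math.dist(point, head) <= gray_radius else "red"

def modulate_player(points, labels, head, player_id=1):
    """Color only the pixels that are labeled as belonging to the player.

    points: real-world (x, y, z) for each pixel
    labels: player label for each pixel (0 = background)
    Returns a color name for player pixels and None for the rest.
    """
    return [
        suit_color(point, head) if label == player_id else None
        for point, label in zip(points, labels)
    ]
```

Calling `modulate_player` on every pixel of every frame is the slow part mentioned on the slide, so a real app would probably down-sample or use a cheaper distance.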

Slide 15

TIP: A Bit about Coordinate Systems
(Diagram of three coordinate systems and the conversions between them)
- Kinect coordinates
  - Raw pixel color & depth data from Kinect
  - Each XY plane: (0, 0)~(640, 480)
  - Depth (Z): 0 to 10000~ (seems linear)
- Real world coordinates
  - Skeleton positions from OpenNI
  - Virtual 3D polygon objects
- OpenGL coordinates
  - Vertex data finally given to OpenGL
  - Each XY plane: (-1.0, -1.0)~(1.0, 1.0)
  - Z-buffer: 0.0 to 1.0 (non-linear)
- Conversions
  - Kinect → real world: transformed by OpenNI API (a little slow)
  - Real world → OpenGL: projected by OpenGL API

Slide 16

Hack 2: Detect Pose → Shoot Laser
- No motion detection, only pose detection!
  - Calculation is tremendously easy without time derivatives
  - Once the positions of the skeleton parts are given, elementary vector operations (distance, dot product, cross product) work very well
  - Trial and error to decide good parameters (e.g. thresholds)
- Spawn a laser while a specific pose is detected
  - The laser is a flat rectangle object in 3D virtual space with an alpha texture, laid over the image from the RGB camera
  - Position/direction/initial velocity are calculated from the pose
- Same approach for shooting the Eye Slugger
  - With an additional stability check
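As a rough illustration of pose detection with elementary vector operations, the following Python sketch checks two hypothetical poses from joint positions. The joint names, the function names, and the thresholds (`min_cos`, `max_dist`) are assumptions for the example. The actual poses and parameters of kinect-ultra were found by trial and error and are not given in the slides.

```python
import math

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

def is_arm_straight(shoulder, elbow, hand, min_cos=0.9):
    """True when upper arm and forearm point in nearly the same direction."""
    upper, fore = sub(elbow, shoulder), sub(hand, elbow)
    denom = norm(upper) * norm(fore)
    if denom == 0.0:
        return False  # degenerate skeleton data
    return dot(upper, fore) / denom >= min_cos

def hands_together(left_hand, right_hand, max_dist=150.0):
    """True when both hands are within max_dist (assumed mm) of each other."""
    return norm(sub(left_hand, right_hand)) <= max_dist

def laser_direction(elbow, hand):
    """Unit vector along the forearm, usable as the initial laser direction."""
    fore = sub(hand, elbow)
    length = norm(fore)
    return tuple(c / length for c in fore)
```

Because each check looks at a single frame only, no velocity or time derivative is needed, which matches the "pose, not motion" point above.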

Slide 17

Hack 3: Hidden Surface Processing
- Place each pixel from Kinect as a point object in 3D virtual space
  - Not texture mapping
  - So pixels and other 3D virtual objects hide each other
- Handle pixels in projective coordinates for good performance
  - 3D virtual objects basically reside in real world coordinates, but mapping all pixels into the real world is too slow
  - Instead, directly map pixels from Kinect coordinates to raw OpenGL coordinates by transforming the depth value into an OpenGL Z-buffer value
    - See the next page; it was a hack

Slide 18

TIP: Fast Depth Transformation
(Diagram: the same three coordinate systems as in Slide 15)
- Kinect coordinates: raw pixel color & depth data from Kinect; each XY plane (0, 0)~(640, 480); depth 0 to 10000~ (seems linear)
- Real world coordinates: skeleton positions from OpenNI; virtual 3D polygon objects
- OpenGL coordinates: vertex data finally given to OpenGL; each XY plane (-1.0, -1.0)~(1.0, 1.0); Z-buffer 0.0 to 1.0 (non-linear)
- Kinect → real world: transformed by OpenNI API (a little slow)
- Real world → OpenGL: projected by OpenGL API
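A direct mapping from Kinect coordinates to OpenGL coordinates might look like the Python sketch below. The slides do not show the author's formula, so this uses the standard depth mapping of an OpenGL perspective projection with the default depth range [0, 1]. The near/far values, the Y flip, and the function names are assumptions.

```python
def pixel_to_ndc(x, y, width=640, height=480):
    """Map a Kinect pixel position to OpenGL XY in [-1.0, 1.0].

    The Y axis is flipped on the assumption that image rows grow downward.
    """
    return (2.0 * x / width - 1.0, 1.0 - 2.0 * y / height)

def depth_to_zbuffer(depth, near, far):
    """Map a linear depth to the non-linear Z-buffer value in [0.0, 1.0].

    Standard perspective mapping: z = far / (far - near) * (1 - near / depth).
    near and far must match the projection used for the virtual objects.
    """
    if depth <= 0:
        raise ValueError("depth must be positive")
    z = (far / (far - near)) * (1.0 - near / depth)
    return min(1.0, max(0.0, z))  # clamp points outside the view volume
```

With near = 500 and far = 10000, a depth of 1000 already maps to about 0.53, which shows how strongly the Z-buffer compresses distant depths. Using the same near/far as the projection matrix is what lets Kinect pixels and polygon objects hide each other correctly.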

Slide 19

Hack 4: Hit Testing
- Hit-test between lasers (= rectangles in 3D virtual space) and real objects (= points in 3D virtual space), and convert lasers into sparks
- Impractical to check the distance between all the objects
  - Instead, divide the real world space into coarse 1-bit voxels, and mark the voxels that contain points
  - No distance calculation; a voxel lookup is enough for hit testing
- Mark voxels with down-sampled pixels
  - Marking voxels needs to be done in real world coordinates, and is thus slow
- Maybe inaccurate, but fun!
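The voxel idea can be sketched in Python as follows. This is a minimal illustration, not the original implementation: it stores occupied voxels in a `set` rather than a real bit array, and the class name, the 100 mm voxel size, and the down-sampling step are assumed values.

```python
import math

class VoxelGrid:
    """Coarse occupancy grid over real world coordinates (1 bit per voxel)."""

    def __init__(self, voxel_size=100.0):
        self.voxel_size = voxel_size  # edge length, assumed to be in mm
        self.occupied = set()

    def _key(self, point):
        # Integer voxel index along each axis
        return tuple(math.floor(c / self.voxel_size) for c in point)

    def mark(self, points, step=4):
        """Mark voxels using only every step-th point (down-sampling)."""
        for point in points[::step]:
            self.occupied.add(self._key(point))

    def hit(self, point):
        """Hit test is a single lookup; no distance calculation is needed."""
        return self._key(point) in self.occupied
```

A laser would then be tested by calling `hit` on a few sample points of its rectangle each frame, after rebuilding the grid from the current depth image.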

Slide 20

TIP: How Does Kinect Work in Darkness?
- IR laser depth sensing works even in a dark room
- Casts a random dot pattern and analyzes the parallax
(capture from above URL)

Slide 21

Hack 5: Light Ball
- Drawing a white circle does not look like a light ball at all… What can I do?
- Instead, brighten the surroundings according to the distance from the center of the light ball
  - You feel dazzling light and heat! (Thanks to human illusion)
- Use an approximation, because calculating the real Euclidean distance for all pixels is slow
  - Calculate a “pseudo” distance in projective coordinates (with the Z value tweaked a bit)
  - Trial and error to decide how to modulate brightness by the pseudo distance
- Not 100% scientific and realistic, but good enough and, most importantly, fun!
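One possible form of the pseudo distance and the brightness modulation is sketched below in Python. The slides do not say which approximation was used, so everything here is an assumption: the sum of absolute differences as the cheap distance, the `z_scale` tweak, the linear falloff, and the `radius` and `gain` values.

```python
def pseudo_distance(px, py, pz, cx, cy, cz, z_scale=0.5):
    """Cheap stand-in for the Euclidean distance in projective coordinates.

    Uses absolute differences (no square root); z_scale is a hand-tuned
    tweak because depth and pixel units differ.
    """
    return abs(px - cx) + abs(py - cy) + z_scale * abs(pz - cz)

def brighten(rgb, dist, radius=200.0, gain=2.0):
    """Brighten a pixel more the closer it is to the light ball center."""
    if dist >= radius:
        return rgb  # outside the glow, leave the pixel unchanged
    factor = 1.0 + gain * (1.0 - dist / radius)
    return tuple(min(255, int(c * factor)) for c in rgb)
```

In practice the falloff curve and the constants would be tuned by eye until the glow looks right, as the slide describes.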

Slide 22

Hack 6: Energy Wave (1)
- Represented by a long-stretched polygon sphere
- Decide transparency by the dot product between the normal of the polygon surface and the sight vector (for a nebular effect)
  - Solid around the center, transparent around the edge
  - Implemented in GLSL (shading language)
    - Although it was my first time working with this language, it was done in about 30 minutes by tweaking sample code from a book
- Add random fluctuation to the normal (for a misty/swirly effect)
  - Accidentally discovered from a bug

Slide 23

Hack 6: Energy Wave (2)
(Figure: surface normal n and sight vector v)
- Simple reflection (in the sample code): rgb = rgb · (n·v / |v|)^k
  - The factor acts as brightness
- After a quick tweak… Nebular effect: a = (n·v / |v|)^k
  - The factor acts as transparency
- Add random fluctuation to the normal so that the transparency is roughly modulated by position and time. This makes the energy wave look misty or swirly
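The nebular formula a = (n·v / |v|)^k can be checked with a small Python function. The original was written in GLSL; this plain-Python version is only a sketch for trying out values. The clamp of negative dot products to zero and the default `k` are assumptions, and the normal is assumed to be unit length as in the formula.

```python
import math

def nebular_alpha(normal, view, k=2.0):
    """Opacity from the angle between a unit surface normal and the sight vector.

    Returns 1.0 where the surface faces the viewer (center of the sphere)
    and falls to 0.0 toward the silhouette edge.
    """
    view_length = math.sqrt(sum(c * c for c in view))
    cos_angle = sum(n * v for n, v in zip(normal, view)) / view_length
    return max(0.0, cos_angle) ** k  # clamp is an assumption for back faces
```

Perturbing `normal` with a small random vector before the call gives the position- and time-dependent transparency that produces the misty look.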

Slide 24

Hack 7: Hair!
- Secret formula to model the hair
  - O = center of the head, P = each pixel on the player’s border near and above O
  - Render a narrow triangle from P in the direction of OP with a length of n|OP|
    - where n is a simple linear saw-wave function of r
    - where r is the angle of OP against the horizon
  - Add some repulsion against the energy ball
  - Randomly blend graded yellow (for a “goldish shine” effect)
- Everything is calculated/rendered in 2D on the projective plane
- Easy and unrealistic, but cartoonish and funny
(Figures: O, P, and the hair length n|OP| on the player’s border; plot of n against r for r from 0 to π/2)
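The hair formula can be sketched in 2D Python as below. The slides only state that n is a simple linear saw-wave function of the angle r between 0 and π/2, so the number of teeth, the minimum and maximum of n, and the function names are hypothetical.

```python
import math

def saw_wave(r, teeth=4, n_min=0.2, n_max=1.0):
    """Linear saw wave over r in [0, pi/2]; teeth, n_min, n_max are assumed."""
    period = (math.pi / 2) / teeth
    phase = (r % period) / period  # ramps from 0 to 1 within each tooth
    return n_min + (n_max - n_min) * phase

def hair_tip(o, p, teeth=4):
    """Tip of the hair triangle: start at P, extend along OP by n * |OP|."""
    dx, dy = p[0] - o[0], p[1] - o[1]
    r = math.atan2(abs(dy), abs(dx))  # angle of OP against the horizon
    n = saw_wave(r, teeth)
    return (p[0] + n * dx, p[1] + n * dy)
```

Each border pixel P then gets a narrow triangle with its base at P and its apex at `hair_tip(O, P)`, so the spikes grow and reset as the angle sweeps around the head.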

Slide 25

Chapter 2: Biz Side
- Got an unexpectedly huge response to the uploaded video
- There may be some hints here for a geek to make a “hit”…

Slide 26

What Did I Actually Intend?
- Absolutely no intention to be “successful”, but I had other clear intentions that might have turned out to be success factors
- Desire to stand in the same line as other Kinect hackers
  - Must be differentiated: useless, nonsense, and never seen before
  - Must be done quickly
    - Before real game studios publish their serious work
    - Before someone else (as crazy as myself) shoots lasers
- Completeness of entertainment
  - First created laser shooting only (in 2 days), then added other features one by one until I was satisfied with the “completeness”
  - Motivated by “hey, this idea is too good! I couldn’t finish without it!”
    - Transformation, hidden surface, hit testing, Eye Slugger, timeout, flying out, …
- Targeted at a worldwide audience
  - Created videos in both Japanese and English, and uploaded them to both YouTube and Nicovideo (a Japanese video site)
  - Creating for only one community would mean not welcoming the other

Slide 27

Examples of Unexpected Feedback
- It’s for kids!
  - “My kid keeps the PC and never leaves.”
  - “When my kids and I play heroes and bad guys, they identify themselves with the heroes in their minds. If they can actually become the heroes of their imagination, it will be wonderful.”
- It makes my dream come true!
  - “I have wanted to do this since I was a kid.”
  - “The kid in me says ‘Look! He transforms! I wanna do it!’ and drowns out my adult words.”
- Me: “I did not mean it at all. I just tried to be silly and funny. But it is definitely a pleasure to see people get excited about the future of the technology demonstrated by my dedicated work.”

Slide 28

Video View Analysis of kinect-ultra
- Exploded within 24 hours and reached 300k views in a week
  - More discussion on the next page
- Japan heats up and cools down very quickly, while the rest of the world seems a little slower
- Forgotten while nothing happens, and remembered at occasional events
(Chart annotations: “Explosion”, “Nicovideo-award nominee”)

Slide 29

Hypothesis of the Explosion Mechanism
- Interesting to think about how the access count could grow so large so rapidly
- Hypothesis: a multistage explosive chain reaction among video, tweets, and news sites(*)
- Is it possible to make it happen intentionally? → Not sure, probably very difficult
(*) My colleague tracked the public activity and came up with this hypothesis. Great job by him. The subject of observation is mainly Japan.

Slide 30

Video View Analysis of kinect-kamehameha
- No explosion
  - Got many views at first on Nicovideo (more than ultra, in fact), but did not ignite an explosion
  - Probably insufficient impact to make viewers tweet and cross the threshold
- Sustained popularity, more from the rest of the world than from Japan
  - From DBZ fans around the world? Views mainly come from Brazil
  - Sporadic jumps: I don’t know what is happening

Slide 31

What Else Happened (1)
- Appeared in the media
  - Blog, news, and tech review sites
  - Papers and magazines (e.g. Japan Times)
  - TV shows (e.g. NHK BS1/2 in Japan)
  - Net casting (in Japan and France)
  - For more information:
- Public demos and presentations
  - 3D Vision & Kinect Hacking Meetup
  - JTPA Geek Saloon
  - Maker Faire (thanks to Matt Bell for involving me)
  - Campus Party (did not make it, though)

Slide 32

What Else Happened (2)
- Won and was nominated for awards
  - Matt Cutts’s Kinect Contest Winner
  - Maker Faire 2011 Bay Area Editor’s Choice Winner
  - Nicovideo Award 2011 Spring Nominee
- Other interesting encounters with
  - Other hackers, of course!
  - Investors
  - An artist (who wanted to use the video in his artwork)
  - 3D modelers (who kindly contributed the Eye Slugger model)

Slide 33

Conclusion
- Let’s Hack!
- Be creative and imaginative, listen carefully to your voice within, then materialize what really excites you
- You do not need to be a specialist in the technical domain, but know what you know and what you do not, and come up with your own solution that is most effective for your purpose
- Your dedication could be rewarded with a result far beyond your expectations

Summary: Technical description of my Kinect Hacks, kinect-ultra and kinect-kamehameha. Project sites: and

Tags: kinect
