We present GHand, a GPU algorithm for markerless hand pose estimation from a single depth image captured by a commodity depth camera. Our method uses a dual random forest approach: the first forest estimates the position and orientation of the hand in 3D, while the second forest determines the joint angles of the kinematic chain of our hand model. GHand runs entirely on the GPU at 64 FPS, with an average 3D joint position error of 20 mm. It can detect complex poses with interlocked and occluded fingers and hidden fingertips.
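The accuracy figure above is an average 3D joint position error. As a minimal sketch of how such a metric is commonly computed, the following assumes it is the mean Euclidean distance between predicted and ground-truth joint positions; the function name `mean_joint_error` and the sample coordinates are hypothetical and not taken from the paper.

```python
import math


def mean_joint_error(predicted, ground_truth):
    """Mean Euclidean distance between matching 3D joints.

    Both arguments are sequences of (x, y, z) tuples in the same unit
    (assumed here to be millimetres), listed in the same joint order.
    """
    if len(predicted) != len(ground_truth) or not predicted:
        raise ValueError("joint lists must be non-empty and of equal length")
    total = 0.0
    for p, g in zip(predicted, ground_truth):
        total += math.dist(p, g)  # straight-line distance for one joint
    return total / len(predicted)


# Hypothetical two-joint example, coordinates in millimetres.
pred = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
truth = [(3.0, 4.0, 0.0), (10.0, 0.0, 0.0)]
error_mm = mean_joint_error(pred, truth)  # (5.0 + 0.0) / 2
```

Averaging over joints, and then over test frames, gives a single number that can be compared across methods, although it hides which joints contribute most of the error.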
This paper tackles the challenging practical problem of efficient and accurate hand pose estimation from single depth images. A dedicated two-step regression forest pipeline is proposed. Given an input hand depth image, step one mainly estimates the 3D location and in-plane rotation of the hand using a pixel-wise regression forest. Step two uses this estimate to deliver the final hand pose estimate with a similar regression forest model that operates on the entire hand image patch. Moreover, our estimation is guided by internally executing a 3D hand kinematic chain model.
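The hand-off between the two steps can be sketched as follows. This is an illustration under stated assumptions, not the paper's method: it assumes each pixel casts one vote of the form (x, y, z, angle), that votes are combined by simple averaging (a circular mean for the angle), and that the patch is rotated to a canonical orientation before step two. The names `aggregate_votes` and `rotate_to_canonical` are hypothetical.

```python
import math


def aggregate_votes(votes):
    """Combine per-pixel votes (x, y, z, angle_rad) into one hand estimate.

    Position uses the arithmetic mean. The in-plane angle uses a circular
    mean so that votes near -pi and +pi reinforce rather than cancel.
    Plain averaging is an assumption made for this sketch.
    """
    if not votes:
        raise ValueError("need at least one vote")
    n = len(votes)
    center = (
        sum(v[0] for v in votes) / n,
        sum(v[1] for v in votes) / n,
        sum(v[2] for v in votes) / n,
    )
    s = sum(math.sin(v[3]) for v in votes)
    c = sum(math.cos(v[3]) for v in votes)
    return center, math.atan2(s, c)


def rotate_to_canonical(point_xy, center_xy, angle):
    """Undo the estimated in-plane rotation about the hand center."""
    dx = point_xy[0] - center_xy[0]
    dy = point_xy[1] - center_xy[1]
    ca, sa = math.cos(-angle), math.sin(-angle)
    return (center_xy[0] + ca * dx - sa * dy,
            center_xy[1] + sa * dx + ca * dy)


# Hypothetical votes from two pixels: positions in mm, angles in radians.
votes = [(0.0, 0.0, 500.0, math.pi - 0.1), (2.0, 0.0, 500.0, -math.pi + 0.1)]
center, angle = aggregate_votes(votes)
```

Normalizing the patch with the step-one estimate means the second model only has to cope with pose variation, not with global position and in-plane rotation.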
We propose the first algorithm to compute the 3D Delaunay triangulation (DT) on the GPU. Our algorithm uses massively parallel point insertion followed by bilateral flipping, a powerful local operation in computational geometry. Although flipping algorithms are highly amenable to parallel processing and have been employed to construct the 2D DT and the 3D convex hull on the GPU, to our knowledge there has been no successful attempt to construct the 3D DT on the GPU.
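To make the idea of flipping as a local operation concrete, here is a minimal sketch of the classic 2D edge flip. It is a 2D analogue for illustration only, not the 3D bilateral flipping of the paper: in 3D, flips exchange tetrahedra and the test becomes an in-sphere predicate. The function names are hypothetical, and the predicates use plain floating-point arithmetic, which is not robust for nearly degenerate input.

```python
def orient2d(a, b, c):
    """Positive if a, b, c are in counter-clockwise (CCW) order."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def in_circle(a, b, c, d):
    """True if d lies strictly inside the circumcircle of CCW triangle abc."""
    rows = []
    for p in (a, b, c):
        dx, dy = p[0] - d[0], p[1] - d[1]
        rows.append((dx, dy, dx * dx + dy * dy))
    (ax, ay, aw), (bx, by, bw), (cx, cy, cw) = rows
    det = (ax * (by * cw - bw * cy)
           - ay * (bx * cw - bw * cx)
           + aw * (bx * cy - by * cx))
    return det > 0


def flip_if_needed(a, b, c, d):
    """Triangles (a, b, c) and (b, a, d) share edge ab, with c and d on
    opposite sides of it. If d violates the empty-circle property of abc,
    replace edge ab by edge dc; otherwise keep the triangles unchanged.
    """
    if orient2d(a, b, c) <= 0 or orient2d(b, a, d) <= 0:
        raise ValueError("expected CCW triangles (a, b, c) and (b, a, d)")
    if in_circle(a, b, c, d):
        return [(a, d, c), (d, b, c)]  # flipped: both remain CCW
    return [(a, b, c), (b, a, d)]      # already locally Delaunay


# Flat quadrilateral: the shared edge ab is not Delaunay, so it is flipped.
flipped = flip_if_needed((0, 0), (2, 0), (1, 0.5), (1, -0.5))
# Tall quadrilateral: the shared edge is already Delaunay and is kept.
kept = flip_if_needed((0, 0), (2, 0), (1, 2), (1, -2))
```

Each flip touches only the two triangles sharing an edge, which is why flipping parallelizes well: flips on disjoint pairs of triangles can be applied independently.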