Gestural input based on hand pose estimation is a common interaction method for augmented reality (AR). This technique has gained popularity with the emergence of new AR-capable devices such as Microsoft HoloLens 2 (HL2) and with advances in the computer vision research underpinning hand-tracking and gesture recognition methods. In our work, we focus on challenging cases where the AR interface is delivered through a state-of-the-art HL2 headset for unconstrained execution of tasks requiring simultaneous hand movement and tracking. When using this headset, AR users might bimanually interact with digital and physical objects that are visible in their field of view (FoV) through the see-through visor. Because the headset's built-in hand-tracking capabilities are limited, we investigated a range of hand pose estimation solutions from different domains. To ensure a fair comparison, we asked several participants to carry out tasks requiring interactions with real-world objects and recorded the performance of the various hand-tracking solutions. Next, we evaluated the performance of these algorithms through crowdsourcing, an approach often used to provide ground truth for machine learning training. Our results provide a guideline for AR developers in selecting appropriate hand-tracking solutions for a given deployment context.
Dr Sławomir Tadeja
Dr Slawomir K. Tadeja is a Postdoctoral Associate with the Department of Mechanical Engineering at the Massachusetts Institute of Technology (MIT). Here, he works...