The starting point for research by a team headed by Karan Ahuja of the Future Interfaces Group within the Human-Computer Interaction Institute at Carnegie Mellon University (Pittsburgh, PA) is the observation that “low-cost, smartphone-powered VR/AR headsets are typically very basic devices, little more than plastic or cardboard shells. Such devices lack advanced features, such as controllers for the hands, thus limiting their interactive capability. Moreover, even high-end consumer headsets lack the ability to track the body and face. For this reason, interactive experiences like social VR are underdeveloped.” Addressing these issues was the task undertaken by the team.
A recent article by the team on this subject is entitled “MeCap: Whole-Body Digitization for Low-Cost VR/AR Headsets.” It appears in the Proceedings of the 32nd Annual ACM Symposium on User Interface Software and Technology (UIST ’19), ACM, New York, NY, USA, pp. 453–462. A copy of the article is available online and can be found here.
The approach adopted by the researchers is called MeCap. Although the accessory looks somewhat awkward, the design is actually quite clever. It uses a smartphone’s rear-facing camera to view two mirrored half-spheres positioned about 6 inches in front of the headset. No additional infrastructure or sensors are required. The accessory offers a highly distorted but otherwise full view of the wearer. The two spherical images captured by the smartphone camera are processed to produce “multiple, unwrapped, synthetic viewpoints.” This data is then used in conjunction with existing keypoint labeling algorithms to digitize the body and hands.
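To make the processing pipeline more concrete, the sketch below illustrates the general idea in Python with OpenCV: precompute a remap table that “unwraps” the circular image of one mirrored half-sphere into a synthetic perspective view, then pass that view to an existing keypoint detector. This is only an illustrative approximation under a simplified equiangular-mirror assumption; the file name, calibration values and the pose_estimator placeholder are hypothetical, and the researchers’ actual calibration and viewpoint-synthesis method is described in their article.

```python
# Minimal sketch (not the authors' code): unwrap the circular image of one
# mirrored half-sphere into a synthetic perspective view, then hand that view
# to an off-the-shelf 2D keypoint detector. The equiangular mirror model and
# the calibration numbers below are simplifying assumptions for illustration.
import cv2
import numpy as np

def build_unwrap_maps(out_size, center, radius, fov_deg=120.0):
    """Precompute remap tables that project each pixel of a synthetic pinhole
    view back onto the circular mirror image (simplified equiangular model)."""
    h, w = out_size
    f = (w / 2.0) / np.tan(np.radians(fov_deg) / 2.0)   # synthetic focal length
    xs, ys = np.meshgrid(np.arange(w) - w / 2.0, np.arange(h) - h / 2.0)
    theta = np.arctan2(np.hypot(xs, ys), f)   # angle of each ray from the optical axis
    phi = np.arctan2(ys, xs)                  # azimuth of each ray
    # Equiangular assumption: ray angle maps linearly to radial distance on the mirror.
    r = radius * (theta / (np.pi / 2.0))
    map_x = (center[0] + r * np.cos(phi)).astype(np.float32)
    map_y = (center[1] + r * np.sin(phi)).astype(np.float32)
    return map_x, map_y

# Usage: unwrap one half-sphere from a rear-camera frame, then run any generic
# keypoint detector (e.g., a body-pose model) on the synthetic view.
frame = cv2.imread("rear_camera_frame.jpg")                        # placeholder input
map_x, map_y = build_unwrap_maps((480, 480),
                                 center=(400, 540), radius=300)    # hypothetical calibration
synthetic_view = cv2.remap(frame, map_x, map_y, cv2.INTER_LINEAR)
# keypoints = pose_estimator.process(synthetic_view)               # off-the-shelf model
```

In the actual system, several such synthetic viewpoints are generated from the two spheres before keypoint labeling is applied.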
An informative video presenting and discussing the MeCap system can be found at the end of this article. The system is illustrated in the photograph below.
The MeCap system was demonstrated to provide real-time estimates of the wearer’s “3D body pose, hand pose, facial expression, physical appearance and surrounding environment.” It was also possible to capture some aspects of the wearer’s apparel.
Particularly attractive features of the MeCap approach are its simplicity and low cost; the accessory should cost only about $5 to implement.
In their article, the researchers report an evaluation of the accuracy of each tracking feature. The means of evaluation and the results are detailed in the article. The bottom line is that the researchers assess the approach as showing “imminent feasibility.”
Looking towards future work, the researchers identified other potential sensing opportunities. These include detection of other aspects of visual appearance such as beards, hairstyle or shoes; such additional information could allow for further avatar personalization. The researchers go on to comment that, although it is fairly straightforward to extract a patch of clothing and capture its pattern, it proved challenging to “tile elegantly.” Addressing this issue is another potential future task. Finally, it was reported that the wide-angle field of view provided by the MeCap system sometimes permitted keypoint tracking of other people, including people located to the sides and even partially behind the wearer. Future work could explore using this capability in applications such as social VR and spatial audio. The researchers conclude with the speculation that it might even be used to provide body/hands/face data to other users wearing headsets without MeCap capability. -Arthur Berman
Carnegie Mellon University, Karan Ahuja, [email protected]