New Motion Capture Techniques Use Only a Single Mounted Camera

A team of researchers has devised a novel technique for human motion capture that uses a single camera worn by the user.

Called "MonoEye," the system was developed by a team of scientists from the Tokyo Institute of Technology (Tokyo Tech) in Japan and Carnegie Mellon University in the US. It uses a single ultrawide fisheye camera secured to the user's chest with a harness. Despite its simple setup, it could revolutionize the fields of sports, medicine, and entertainment.

Conventional motion capture (mocap) systems usually require specialized studios that have a set of synchronized cameras attached to walls and mounts, capturing the motion of actors or subjects from different angles. Usually, the mocap actors are fitted with a bodysuit that has sensors or markers that work in conjunction with the machine vision.

The machine vision systems used in these mocap studios are also advancing rapidly, thanks to machine learning and artificial intelligence.

Photo: An employee wears a safety mask while doing motion capture at the 2:10 animation studio in the Optical Valley technopark in Wuhan, Hubei Province, China, on May 7, 2020. (Photo by Getty Images)

A Cost-Effective and Open-World Motion Capture System

Led by Hideki Koike at Tokyo Tech, the team presents a technique that overcomes space constraints, such as the need for enclosed, specially equipped studios, while also being generally cost-effective.

MonoEye can capture both the user's body motion and the user's perspective, or "viewport." Researchers report: "Our ultra-wide fisheye lens has a 280-degree field-of-view and it can capture the user's limbs, face, and the surrounding environment." To achieve robust motion capture, the researchers paired the camera with three deep neural networks that estimate the body pose, head pose, and camera pose in 3D and in real time. These networks were trained on a dataset containing 680,000 models with varying body shapes and clothing, set against different backgrounds and lighting conditions. The dataset also included some 16,000 photorealistic image frames.
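To get a feel for what a 280-degree field of view covers, here is a minimal Python sketch that projects a 3D point onto a fisheye image. It assumes the common equidistant fisheye model (image radius proportional to the angle from the optical axis); the article does not say which lens model or image resolution MonoEye uses, so the function name `project_equidistant`, the 1024-pixel image size, and the model itself are illustrative assumptions, not the researchers' method.

```python
import math

def project_equidistant(point, fov_deg=280.0, image_size=1024):
    """Project a 3D point to pixel coordinates with an equidistant fisheye model.

    Assumptions (not from the article): camera coordinates with +z along the
    optical axis, a square image, and the full field of view filling the
    image circle. Returns None if the point lies outside the field of view.
    """
    x, y, z = point
    half_fov = math.radians(fov_deg) / 2.0

    # Angle between the incoming ray and the optical axis (0 = straight ahead).
    theta = math.atan2(math.hypot(x, y), z)
    if theta > half_fov:
        return None

    # Equidistant model: r = f * theta, with f chosen so that the edge of the
    # field of view lands on the edge of the image circle.
    f = (image_size / 2.0) / half_fov
    r = f * theta

    # Direction of the point around the optical axis.
    phi = math.atan2(y, x)
    cx = cy = image_size / 2.0
    return (cx + r * math.cos(phi), cy + r * math.sin(phi))


if __name__ == "__main__":
    print(project_equidistant((0, 0, 1)))    # straight ahead: image center
    print(project_equidistant((1, 0, -1)))   # 135 degrees off-axis: still visible
    print(project_equidistant((0, 0, -1)))   # directly behind: outside the view
```

Under these assumptions, a lens with a 280-degree field of view sees rays up to 140 degrees away from its optical axis, which is how a chest-mounted camera can keep points slightly behind its own image plane, such as the wearer's face and limbs, in frame.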

However, this promising new technology still faces a challenge: the domain gap between the synthetic data used for training and real-world images. To narrow this gap, the researchers intend to add more photorealistic images to the dataset, which should improve the system's accuracy.

Researchers behind MonoEye are set to present their work at the 33rd ACM Symposium on User Interface Software and Technology.

Low-Cost Improvements in MoCap Technology

MonoEye follows a related work, also from Tokyo Tech and Carnegie Mellon, in collaboration with the University of St. Andrews in Scotland and the University of New South Wales in Australia. Together, researchers from these institutions developed a wrist-worn device for capturing 3D hand poses.

Like MonoEye, it uses a single camera, which captures images of the back of the hand, and is integrated with a neural network, DorsalNet, that is used to recognize dynamic gestures. Suited to more spatially limited applications, this wrist-worn camera could advance augmented reality and virtual reality devices, which in turn would affect the same industries MonoEye aims to serve: medicine, sports, and entertainment.

While conventional technologies for hand motion capture require substantially less equipment than full-body mocap, they still require a glove fitted with sensors and cameras. Unfortunately, such a glove hinders natural hand movement and is uncomfortable for the user. Wearing the new wrist-worn camera, by contrast, is akin to wearing a smartwatch.
