Solid and Effective Upper Limb Segmentation in Egocentric Vision
- Monica Gruosso – University of Basilicata
- Nicola Capece – University of Basilicata
- Ugo Erra – University of Basilicata
CATEGORY. Long Paper
KEYWORDS. Semantic Segmentation, Neural Networks, Computer Vision, Virtual Reality, Augmented Reality, Web MR/VR, Dataset, Egocentric Vision
ABSTRACT. Upper limb segmentation in egocentric vision is a challenging and nearly unexplored task that extends the well-known hand localization problem. It can be crucial for a realistic representation of users’ limbs in immersive and interactive environments, such as VR/MR applications designed for web browsers, which offer a general-purpose solution suitable for any device.
Existing hand and arm segmentation approaches require large amounts of well-annotated data; consequently, various annotation techniques have been designed and several datasets created. Such datasets are often limited to synthetic and semi-synthetic data that do not include the whole limb and differ significantly from real data, leading to poor performance in many realistic cases.
To overcome the limitations of previous methods and the challenges inherent in both egocentric vision and segmentation, we collected a large-scale, comprehensive dataset and used it to train several segmentation networks based on the state-of-the-art DeepLabv3+ model.
The dataset consists of 46,000 real-life, well-labeled RGB images with a great variety of skin colors, clothes, occlusions, and lighting conditions. In particular, we carefully selected the best data from existing datasets and added our EgoCam dataset, which includes new images with accurate labels.
Finally, we extensively evaluated the trained networks in unconstrained real-world environments to find the best model configuration for this task, achieving remarkable results in diverse scenarios.
The code, the collected egocentric upper limb segmentation dataset, and a video demo of our work will be available on the project page.
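The abstract does not specify the evaluation metric used for the segmentation networks; a standard choice for this task is Intersection over Union (IoU) between predicted and ground-truth masks. A minimal sketch, assuming binary (limb vs. background) masks as NumPy arrays; the function name and toy data are illustrative, not taken from the paper:

```python
import numpy as np

def iou(pred: np.ndarray, gt: np.ndarray) -> float:
    """Intersection over Union between two binary segmentation masks."""
    pred = pred.astype(bool)
    gt = gt.astype(bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        return 1.0  # both masks empty: perfect agreement by convention
    return float(np.logical_and(pred, gt).sum() / union)

# Toy example: a 4x4 predicted mask vs. a ground-truth mask.
pred = np.array([[1, 1, 0, 0],
                 [1, 1, 0, 0],
                 [0, 0, 0, 0],
                 [0, 0, 0, 0]])
gt = np.array([[1, 1, 0, 0],
               [1, 0, 0, 0],
               [0, 0, 0, 0],
               [0, 0, 0, 0]])
print(iou(pred, gt))  # intersection 3, union 4 -> 0.75
```

Averaging this score over a test set gives the mean IoU commonly reported for semantic segmentation models such as DeepLabv3+.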