EPYNET: Efficient Pyramidal Network for Clothing Segmentation

Soft biometric traits extracted from the human body, such as clothing type, hair color, and accessories, provide useful information for people tracking and identification. Semantic segmentation of these traits from images remains a challenge because of the huge variety of clothing styles, layering, shapes, and colors. To tackle these issues, we propose EPYNET, a framework for clothing segmentation based on the Single Shot MultiBox Detector (SSD) and the Feature Pyramid Network (FPN), with EfficientNet as the backbone. The framework also integrates data augmentation and noise reduction techniques to increase segmentation accuracy. We also propose a new dataset, UTFPR-SBD3, consisting of 4,500 images manually annotated into 18 object classes plus the background. Unlike available public datasets with imbalanced class distributions, UTFPR-SBD3 contains at least 100 instances per class to ease the training of deep learning models. Motivated by the difficulty of comparing datasets for clothing segmentation, we introduce a new measure of dataset imbalance that captures the influence of the background, of classes with small items, and of classes with an excessively high or low number of instances. Experimental results on UTFPR-SBD3 show the effectiveness of EPYNET, which outperforms state-of-the-art clothing segmentation methods on public datasets. Based on these results, we believe the proposed approach can be useful for many real-world applications, including soft biometrics, people surveillance, image description, and clothing recommendation.

Results from the Paper


Task                   Dataset     Model   Metric    Value  Global Rank
Semantic Segmentation  UTFPR-SBD3  EPYNET  IoU       51.02  # 1
Semantic Segmentation  UTFPR-SBD3  EPYNET  Accuracy  92.06  # 1
