Author
Keywords
Abstract

Objective: Many minority groups live in China, and the visual styles of their clothing differ. Combining clothing parsing with the clothing culture of these groups plays an important role in the digital preservation of their clothing images and the inheritance of their culture. However, a complete dataset of Chinese minority clothing images remains lacking. The clothing of minority groups has complex structures and diverse visual styles, semantic labels that distinguish the clothing of different minorities are lacking, and defining semantic labels for ethnic accessories is challenging. Existing clothing image parsing methods have difficulty describing information such as local details, styles, and ethnic characteristics of minority group clothing, and mutual interference between semantic labels leads to unsatisfactory accuracy and precision. Therefore, we proposed a clothing parsing method based on visual style and label constraints.

Method: Our method primarily parsed minority group clothing by its visual style, fusing local and global features, and then used a label constraint network to suppress redundant labels and optimize the preliminary parsing results. First, we defined the general semantic labels of minority group clothing; the distinctive semantic labels were defined according to the combination preferences of semantic labels. We set four sets of annotation pairs based on human body parts, with a total of eight label points; each pair of annotations corresponded to a set of key points on the clothing structure. The upper-body garment was marked with the left/right collar, left/right sleeves, and left/right top hem, and the lower-body garment with the left/right bottom hem. We also marked the visibility of each annotation and used these annotations to determine whether occlusion occurred in the clothing. Second, combining the training images with the annotation pairs and the self-defined semantic labels, a visual style network was added on the basis of a fully convolutional network. A branch was built on the last convolutional layer of the SegNet network and divided into three parts that respectively handled the position and visibility of the annotation pairs, the local features of the clothing, and its global features. The local and global features were output to "fc7_fusion" for fusion, and the fused style features were returned to the SegNet network through a deconvolution layer to obtain preliminary parsing results. Finally, a label mapping function converted the preliminary parsing result into a label vector according to the number of labels; each element indicates whether the corresponding label exists in the preliminary result. The label vector was then compared with the true semantic labels in the training set, and the labels were corrected to suppress redundant label probability scores. By comparing the labels of the preliminary parsing results with those of the training images, the label constraint network eliminated redundant and erroneous labels, avoided mutual interference between labels, and increased the accuracy of the parsing result. In addition, we constructed a clothing image dataset covering 55 minority groups. The primary sources were online shopping sites such as Taobao, Tmall, and JD, and the dataset was expanded with images from other platforms such as Baidu Pictures, blogs, and forums. A total of 61 710 images were collected, with at least 500 images per minority group.

Result: The proposed method was validated on the minority group clothing image dataset. Experimental results showed that the detection accuracy of clothing visual style features was higher with annotation pairs, the visual style network efficiently fused local and global features, and the label constraint network effectively solved the mutual interference problem of labels. The proposed method improved parsing accuracy on large clothing labels, particularly on skirts with considerable differences in pattern texture and color blocks, and also improved the parsing of small accessory labels such as hats and collars. Overall, the minority group clothing parsing results improved significantly, with pixel accuracy reaching 90.54%.

Conclusion: The clothing of minority groups is characterized by complicated styles and accessories, a lack of semantic labels, and complex labels that interfere with one another. Thus, we proposed a clothing parsing method that fuses visual style with label constraints. We constructed a dataset of minority group clothing images and defined generic and distinctive semantic labels for minority group clothing. We made pixel-level semantic annotations and set up annotation pairs on the training images, built a visual style network based on SegNet to obtain preliminary parsing results, and finally solved the mutual interference problem of semantic labels through a label constraint network to obtain the final parsing result. Compared with other clothing parsing methods, our method improved the accuracy of minority group clothing image parsing, which is significant for cultural inheritance and the protection of intangible cultural heritage. However, some parsing results of this method are not ideal, particularly the accuracy on small accessories, and the semantic labels of minority group clothing remain imperfect and insufficiently precise. Subsequent work will continue to improve the dataset and address these issues to further improve the accuracy of minority group clothing parsing.
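As a rough illustration of the branch described in the Method section, the following sketch (in PyTorch) shows how a branch on the last feature map of a SegNet-style encoder could predict the positions and visibility of the four annotation pairs, extract local and global clothing features, fuse them at an "fc7_fusion" layer, and return the fused style features through a deconvolution layer. All layer sizes, the broadcasting of the global feature, and every name other than fc7_fusion are assumptions for illustration, not the authors' exact configuration.

```python
import torch
import torch.nn as nn

class VisualStyleBranch(nn.Module):
    """Hypothetical sketch of the visual style branch; channel sizes are assumed."""

    def __init__(self, in_channels=512, num_pairs=4, feat_dim=256):
        super().__init__()
        # Part 1: position (x, y) and visibility of the four annotation pairs (8 points).
        self.pair_head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(in_channels, num_pairs * 2 * 3))
        # Part 2: local clothing features kept at spatial resolution.
        self.local_head = nn.Sequential(
            nn.Conv2d(in_channels, feat_dim, kernel_size=3, padding=1),
            nn.ReLU(inplace=True))
        # Part 3: global clothing features pooled over the whole image.
        self.global_head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(in_channels, feat_dim))
        # "fc7_fusion": fuse local and global features, then return the style
        # features to the decoder through a deconvolution layer.
        self.fc7_fusion = nn.Conv2d(feat_dim * 2, feat_dim, kernel_size=1)
        self.deconv = nn.ConvTranspose2d(feat_dim, in_channels, kernel_size=2, stride=2)

    def forward(self, x):
        pairs = self.pair_head(x)                       # annotation-pair positions/visibility
        local = self.local_head(x)                      # (N, feat_dim, H, W)
        glob = self.global_head(x)                      # (N, feat_dim)
        glob = glob[:, :, None, None].expand_as(local)  # broadcast to the spatial map
        fused = self.fc7_fusion(torch.cat([local, glob], dim=1))
        style = self.deconv(fused)                      # fed back into the SegNet decoder
        return pairs, style
```

In this sketch the deconvolution simply upsamples the fused features so they can be combined with the decoder's feature maps; the abstract does not specify how that combination is performed.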
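The label constraint step can likewise be sketched as post-processing on the per-pixel class scores: a label mapping function turns the preliminary parse into a binary label vector, and the probability scores of labels judged redundant are suppressed before the final per-pixel decision. Function names and the suppression factor below are illustrative assumptions; the learned label constraint network that decides which labels to keep is not reproduced here.

```python
import numpy as np

def label_vector(parse_map: np.ndarray, num_labels: int) -> np.ndarray:
    """Map a preliminary parsing result (H x W array of label ids) to a binary
    vector whose k-th element indicates whether label k appears in the result."""
    vec = np.zeros(num_labels, dtype=np.float32)
    vec[np.unique(parse_map)] = 1.0
    return vec

def suppress_redundant(scores: np.ndarray, allowed: np.ndarray, penalty: float = 1e-3) -> np.ndarray:
    """Down-weight the per-pixel probability scores of labels flagged as redundant.

    scores : (num_labels, H, W) class scores from the visual style network
    allowed: (num_labels,) binary vector of labels confirmed by the constraint step
    penalty: assumed suppression factor for labels outside the allowed set
    """
    weights = np.where(allowed > 0.5, 1.0, penalty)[:, None, None]
    return scores * weights

# Usage sketch (constraint_mask is a hypothetical output of the label constraint network):
# prelim  = scores.argmax(axis=0)
# allowed = label_vector(prelim, scores.shape[0]) * constraint_mask
# final   = suppress_redundant(scores, allowed).argmax(axis=0)
```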

Year of publication
2021
Academic journal
Journal of Image and Graphics
Volume
26
Issue
2
Pages
402-414
Publisher
Editorial and Publishing Board of JIG
Language of publication
Chinese
ISSN
1006-8961
URL
https://www.scopus.com/inward/record.uri?eid=2-s2.0-85101236315&doi=10.11834%2fjig.190655&partnerID=40&md5=d2f7a97611f8ba8362be6a68cb7b3958
DOI
10.11834/jig.190655