Author
Keywords
Abstract

[Objective] This study proposes a new term extraction model for intangible cultural heritage (traditional drama), which also supports the construction of a term database. [Methods] First, we analyzed the linguistic characteristics of drama texts in terms of term category, semantic structure, and text length. Then, we added part-of-speech and domain features to the character representations obtained by the BERT-BiLSTM-CRF model. Finally, we incorporated a graph convolutional network (GCN) into the new model to capture the constraint relationships between distant words. [Results] The F1 value of the proposed model reached 91.11%, which was 1.3 percentage points higher than that of the baseline BERT-BiLSTM-CRF model. [Limitations] The experimental data were retrieved only from Baidu Baike and the official website of Intangible Cultural Heritage; future work should include more free text from other sources, more categories of drama terms, and external features. [Conclusions] The proposed model and the term database for traditional drama will help construct a knowledge graph for traditional drama.

Volume
5
Issue
12
Pages
123-136
Publisher: Chinese Academy of Sciences
ISSN
2096-3467
URL
https://www.scopus.com/inward/record.uri?eid=2-s2.0-85130486234&doi=10.11925%2finfotech.2096-3467.2021.0359&partnerID=40&md5=1a909699326c6395a16dd94626b4de99
DOI
10.11925/infotech.2096-3467.2021.0359