Author | |
Keywords | |
Abstract |
[Objective] This study proposes a term extraction model for intangible cultural heritage texts on traditional drama, which also supports the construction of a term database. [Methods] First, we analyzed the linguistic characteristics of drama texts in terms of term category, semantic structure, and text length. Then, we added part-of-speech and domain features to the character representations obtained by the BERT-BiLSTM-CRF model. Finally, we incorporated a graph convolutional network (GCN) into the new model to capture the constraint relationships between distant words. [Results] The F1 value of the proposed model reached 91.11%, 1.3 percentage points higher than that of the baseline BERT-BiLSTM-CRF model. [Limitations] The experimental data came only from Baidu Baike and the official website of Intangible Cultural Heritage; the data should be extended with more free text from other sources, more categories of drama terms, and external features. [Conclusions] The proposed model and the database of traditional drama terms will help construct a knowledge graph for traditional drama. |
Volume |
5
|
Issue |
12
|
Pages |
123-136
|
Publisher: Chinese Academy of Sciences
|
|
ISSN |
2096-3467
|
URL |
https://www.scopus.com/inward/record.uri?eid=2-s2.0-85130486234&doi=10.11925%2finfotech.2096-3467.2021.0359&partnerID=40&md5=1a909699326c6395a16dd94626b4de99
|
DOI |
10.11925/infotech.2096-3467.2021.0359
|