
Rapid Extraction of Research Areas from Scientific and Technological Literature

Bibliographic Details
Published in: Sensors and Materials, 2020-12, Vol. 32 (12), p. 4489
Main Authors: Yin, Chuan; Liu, Wanzeng; Yin, Duoduo; Zhai, Xi; Liu, Kexin; Jing, Changfeng; Huang, He
Format: Article
Language: English
Description
Summary: With the rapid development of Internet Plus, big data, and related technologies, the construction of smart cities is promoting the transformation and upgrading of mapping geographic information models from traditional information services to intelligent services with spatial sensing. At present, however, most of the knowledge needed to provide intelligent services is implicit in unstructured text in books and journal papers in related fields, which makes it difficult to capture, use, analyze, and share. In particular, geographical feature knowledge is one of the types of knowledge that most urgently needs to be extracted. To address this problem, we propose a method for the rapid extraction of research areas from the abstracts of scientific and technological literature. First, with the help of a general named-entity recognition tool, we propose a method for rapidly annotating place-name entities in administrative divisions. Then, by combining the bidirectional long short-term memory conditional random field (BiLSTM-CRF) model with a place-name database covering five levels of administrative divisions in China, we realize the identification, disambiguation, and relationship extraction of place names in different administrative divisions. On this basis, the extraction of research areas is treated as a binary classification problem: feature vectors such as frequency and location are constructed for the extracted administrative division names, and a classification model is built with the random forest algorithm to rapidly extract research areas. The experimental results show that the recognition accuracy for place names in administrative areas is 92.61% and the recognition accuracy for research areas is 90.31%. These results are superior to those of similar algorithms; thus, the proposed method can accurately and rapidly extract research areas.
ISSN: 0914-4935
DOI: 10.18494/SAM.2020.3127
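
The summary mentions that frequency and location features are constructed for each extracted administrative division name before classification. The record does not give the exact feature definitions, so the following is only a minimal sketch under stated assumptions: the `PlaceFeatures` class, the `build_features` function, the three features chosen, and the sample abstract are all hypothetical illustrations, not the authors' implementation. The random forest classifier itself is omitted; the resulting feature vectors would be passed to it.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PlaceFeatures:
    """Hypothetical feature vector for one candidate place name."""
    name: str
    frequency: int            # number of occurrences in the abstract
    first_position: float     # relative offset of the first mention (0.0 = start)
    in_first_sentence: bool   # True if the first mention precedes the first period


def build_features(abstract: str, candidates: List[str]) -> List[PlaceFeatures]:
    """Build frequency and location features for each candidate found in the text."""
    # Assumption: the first sentence ends at the first period in the abstract.
    first_sentence_end = abstract.find(".")
    if first_sentence_end == -1:
        first_sentence_end = len(abstract)

    features = []
    for name in candidates:
        first = abstract.find(name)
        if first == -1:
            continue  # candidate does not occur in this abstract
        features.append(
            PlaceFeatures(
                name=name,
                frequency=abstract.count(name),
                first_position=first / len(abstract),
                in_first_sentence=first < first_sentence_end,
            )
        )
    return features


# Invented sample text and candidates, for illustration only.
sample_abstract = (
    "We study land use change in Wuhan. "
    "Data from Beijing are used for comparison. "
    "Results show that Wuhan expanded rapidly."
)
features = build_features(sample_abstract, ["Wuhan", "Beijing", "Shanghai"])
for f in features:
    print(f)
```

In this sketch a name that is mentioned often and early would be a stronger candidate for the research area than one mentioned once in passing; how the published method weighs such signals is determined by the trained random forest, which is not reproduced here.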