ISBN: 978-981-18-5852-9 DOI: 10.18178/wcse.2022.04.174
Effective Example Extrapolation for Long-Tail Relation Extraction
Abstract— Relation extraction (RE) aims to identify the relation between two entities in a sentence and is an important step in completing a knowledge graph (KG). In the medical field, data are often unevenly distributed: common diseases occur more often than rare ones, and differences in department size and case volume further skew the distribution of electronic medical record data. RE is more challenging when the relation categories follow a long-tailed distribution. Data augmentation is a common approach to category imbalance. We propose an effective example extrapolation (3E) data augmentation method: 3E generates new synthetic examples by simulating the example generation process of data-rich head relations and extrapolating it to tail relation categories that have too few examples. Experiments were conducted on the publicly available medical relation extraction dataset 2010 i2b2/VA, and 3E was compared with the upsampling method to further validate its advantages in handling long-tail relations.
Index Terms— relation extraction, electronic medical record, data augmentation, example extrapolation, long-tail relations.
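For readers unfamiliar with the upsampling baseline mentioned in the abstract, the following is a minimal sketch of random upsampling for imbalanced relation labels. It is not the authors' 3E method or their experimental code; the function name `upsample`, the toy sentences, and the labels `head_rel` and `tail_rel` are all hypothetical and chosen only for illustration.

```python
import random
from collections import Counter


def upsample(examples, seed=0):
    """Randomly duplicate minority-class examples until every label
    has as many examples as the most frequent label.

    `examples` is a list of (sentence, relation_label) pairs.
    This is a generic baseline sketch, not the 3E method.
    """
    if not examples:
        return []
    rng = random.Random(seed)

    # Group examples by relation label.
    by_label = {}
    for sentence, label in examples:
        by_label.setdefault(label, []).append((sentence, label))

    # Size of the largest (head) class is the target for every class.
    target = max(len(group) for group in by_label.values())

    balanced = []
    for group in by_label.values():
        balanced.extend(group)
        # Sample with replacement to fill the gap; k == 0 adds nothing.
        balanced.extend(rng.choices(group, k=target - len(group)))
    return balanced


# Hypothetical toy data: four head-relation examples, one tail-relation example.
toy = [(f"head sentence {i}", "head_rel") for i in range(4)]
toy.append(("tail sentence 0", "tail_rel"))

balanced = upsample(toy)
counts = Counter(label for _, label in balanced)
```

Upsampling only repeats existing tail examples, so it adds no new surface variation; that limitation is the usual motivation for generating synthetic examples instead.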
Ben Li
Xi’an University of Posts & Telecommunications
Jungang Han
Xi’an University of Posts & Telecommunications; Shaanxi Key Laboratory of Network Data Analysis and Intelligence Processing
Xiaoying Pan
Xi’an University of Posts & Telecommunications; Shaanxi Key Laboratory of Network Data Analysis and Intelligence Processing
Cite: Ben Li, Jungang Han, Xiaoying Pan, "Effective Example Extrapolation for Long-Tail Relation Extraction," WCSE 2022 Spring Event: 2022 9th International Conference on Industrial Engineering and Applications, pp. 1509-1514, Sanya, China, April 15-18, 2022.