Application of the EVEX resource to event extraction and network construction: Shared Task entry and result analysis

Background: Current text mining algorithms typically aim at extracting biomolecular interactions only from the context of one sentence, and no prior work exists on extending the event context to the information extracted from other documents on a large scale. The motivation for this research is thus to test whether a performance gain can be obtained by aggregating information across documents with mutually supporting evidence.

In this paper, we describe our participation in the latest BioNLP Shared Task using the large-scale text mining resource EVEX. We participated in the Genia Event Extraction (GE) and Gene Regulation Network (GRN) tasks with two separate systems. In the GE task, we implemented a re-ranking approach to improve the precision of an existing classifier, incorporating features from the EVEX resource. In the GRN task, our system relied solely on the EVEX resource and utilized a rule-based conversion algorithm between the EVEX and GRN formats.

Results: In the GE task, our re-ranking approach led to a modest performance increase and placed first in the official Shared Task results with an F-score of 50.97%. In this paper, we additionally explore and evaluate the use of continuous vector representations for this challenge.

In the GRN task, we ranked fifth in the official results, with strict and relaxed SER scores of 0.92 and 0.81, respectively. To improve upon these results, we have implemented a novel machine-learning-based conversion system and benchmarked its performance against the original rule-based system.

Conclusions: For the GRN task, we were able to produce a gene regulatory network from the EVEX data, warranting the use of such generic large-scale text mining data in network biology settings. A detailed performance and error analysis provides more insight into the relatively low recall rates. In the error analysis for the GE task, we demonstrate that the re-ranking approach provides an opportunity for a substantial increase in performance, which is only partially realized in our current implementation.



Hakala, K., Van Landeghem, S., Salakoski, T., Van de Peer, Y., Ginter, F. (2014) Application of the EVEX resource to event extraction and network construction: Shared Task entry and result analysis. BMC Bioinformatics special issue on BioNLP Shared Task 2013.









