Integrating large-scale text mining and co-expression networks: Targeting NADP(H) metabolism in E. coli with event extraction.
We present an application of EVEX, a literature-scale event extraction resource, to a concrete biological use case: the regulation of NADP(H) metabolism in Escherichia coli. We make extensive use of the EVEX event generalization, which is based on gene family definitions in Ensembl Genomes, to extract cross-species candidate regulators. We manually evaluate the resulting network to retain only correct events and to facilitate its integration with microarray-based co-expression data. When analysing the combined network obtained from text mining and co-expression, we identify 41 candidate genes that participate in triangular patterns spanning both subnetworks. Several of these candidates are of particular interest, and we discuss their biological relevance further. This study is the first to present a real-world evaluation of the EVEX resource in particular, and of the literature-scale application of systems emerging from the BioNLP Shared Task series in general. We summarize the lessons learned from this use case in order to focus future development of EVEX and similar literature-scale resources.
Kaewphan, S., Peltonen, S., Van Landeghem, S., Van de Peer, Y., Jones, P., Ginter, F. (2012) Integrating large-scale text mining and co-expression networks: Targeting NADP(H) metabolism in E. coli with event extraction. Proceedings of the LREC workshop on Building and Evaluating Resources for Biomedical Text Mining. Istanbul, Turkey.
VIB / UGent
Bioinformatics & Evolutionary Genomics