Evaluating large-scale text mining applications beyond the traditional numeric performance measures
Text mining methods for the biomedical domain have matured substantially and are currently being applied on a large scale to support a variety of applications in systems biology, pathway curation, data integration and gene summarization. Community-wide challenges in the BioNLP research field provide gold-standard datasets and rigorous evaluation criteria, allowing for a meaningful comparison between techniques as well as measuring progress within the field. However, such evaluations are typically conducted on relatively small training and test datasets. On a larger scale, systematic erratic behaviour may occur that severely influences hundreds of thousands of predictions. In this work, we perform a critical assessment of a large-scale text mining resource, identifying systematic errors and determining their underlying causes through semi-automated analyses and manual evaluations.
The supplementary data of this study is freely available from http://bioinformatics.psb.ugent.be/supplementary_data/solan/bionlp13/
Van Landeghem, S., Kaewphan, S., Ginter, F., Van de Peer, Y. (2013) Evaluating large-scale text mining applications beyond the traditional numeric performance measures. Proceedings of the BioNLP 2013 Workshop, pp. 63-71.
VIB / UGent
Bioinformatics & Evolutionary Genomics