Selecting relevant features for gene structure prediction.
Gene structure prediction is an important yet complex task in bioinformatics. A reliable gene prediction forms the basis of many applications in functional, structural and comparative genomics. Apart from using machine learning techniques merely for classification or prediction, biologists are also interested in the principles behind complex processes such as splicing. Discovering such domain-specific knowledge is a main challenge for scientists working in this field, because many issues involved in gene transcription are not yet well understood.

A well-known method to gain more insight into data is the application of feature selection techniques. By eliminating irrelevant or redundant features, a subset of relevant features can be discovered, often improving both classification performance and domain understanding.

The paper is structured as follows. We start by explaining the biological background needed to understand gene prediction. Then we introduce the main topics of the paper: gene structure prediction and feature subset selection. The next section discusses the specific problems that arise when combining gene structure prediction with feature selection techniques. We end with some concluding remarks and future perspectives.
Saeys, Y., Degroeve, S., Aeyels, D., Rouzé, P., Van de Peer, Y. (2004) Selecting relevant features for gene structure prediction. Proceedings of Benelearn 13:103-109.
VIB / UGent
Bioinformatics & Evolutionary Genomics