Gene prediction and gene classes in Arabidopsis thaliana.

Gene prediction methods for eukaryotic genomes still are not fully satisfying. One way to improve gene prediction accuracy, proven to be relevant for prokaryotes, is to consider more than one model of genes. Thus, we used our classification of Arabidopsis thaliana genes in two classes (CU(1) and CU(2)), previously delineated according to statistical features, in the GeneMark gene identification program. For each gene class, as well as for the two classes combined, a Markov model was developed (respectively, GM-CU(1), GM-CU(2) and GM-all) and then used on a test set of 168 genes to compare their respective efficiency. We concluded from this analysis that GM-CU(1) is more sensitive than GM-CU(2) which seems to be more specific to a gene type. Besides, GM-all does not give better results than GM-CU(1) and combining results from GM-CU(1) and GM-CU(2) greatly improve prediction efficiency in comparison with predictions made with GM-all only. Thus, this work confirms the necessity to consider more than one gene model for gene prediction in eukaryotic genomes, and to look for gene classes in order to build these models.

Mathé, C., Dehais, P., Pavy, N., Rombauts, S., Van Montagu, M., Rouzé, P. (2000) Gene prediction and gene classes in Arabidopsis thaliana. J. Biotechnol. 78(3):293-9.

VIB / UGent
Bioinformatics & Evolutionary Genomics
Technologiepark 927
B-9052 Gent
+32 (0) 9 33 13807 (phone)
+32 (0) 9 33 13809 (fax)

Don't hesitate to contact the in case of problems with the website!

You are visiting an outdated page of the BEG/Van de Peer Lab site.

Not all pages have been ported, so these archived pages are still available.

Redirect to the new website?