Promoter prediction

The main topic of my research is the accurate and fast prediction of gene promoters. Accurately predicting gene promoters in whole genomes remains one of the most difficult problems in bioinformatics. Ab initio promoter detection in anonymous sequences is at best in its infancy: among eukaryotes, tools that can predict promoters to some extent exist only for human and Drosophila.

In my research I will focus on applying machine learning techniques for feature extraction, classification and evaluation to the problem of annotating promoters correctly. Hitherto, most programs have used a functional description of the promoter, in which a promoter is seen as a collection of motifs that follow each other. Instead of this rather sequential approach, I will focus on the three-dimensional structure of a promoter. It has been shown before that promoters have distinct structural features, and it should be possible to use this distinct structure to predict promoters.
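
As a rough illustration of what a structural description could look like in practice, the sketch below converts a DNA sequence into a numeric profile by scoring each dinucleotide step with a physical property and smoothing the result with a moving average. The function name structural_profile, the window parameter and the property table are assumptions made for this example only; the values in TOY_TABLE are invented placeholders, not measured DNA parameters, and this is not presented as the method used in any of the tools listed on this page.

```python
from typing import Dict, List


def structural_profile(seq: str, table: Dict[str, float], window: int = 3) -> List[float]:
    """Score each dinucleotide step with a property value, then smooth the scores."""
    seq = seq.upper()
    # One raw value per overlapping dinucleotide step.
    # Steps missing from the table (e.g. containing N) score 0.0 in this sketch.
    raw = [table.get(seq[i:i + 2], 0.0) for i in range(len(seq) - 1)]
    if window < 1 or len(raw) < window:
        return []
    # Moving average over `window` consecutive steps.
    return [sum(raw[i:i + window]) / window for i in range(len(raw) - window + 1)]


# Hypothetical table: made-up numbers, NOT measured physical parameters.
TOY_TABLE = {"AA": 1.0, "AT": 2.0, "TA": 3.0, "TT": 1.0}

profile = structural_profile("AATAT", TOY_TABLE, window=2)
print(profile)
```

A profile like this, computed with real property values, could then be passed to a classifier as a feature vector in place of motif matches.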

Promoter structure

While all promoter sequences have a distinct structure compared to genes or intergenic regions, it is very interesting to study the structure of different classes of promoters and how they have evolved.


(20) Verkest, A., Abeel, T., Heyndrickx, KS., Van Leene, J., Lanz, C., Van De Slijke, E., De Winne, N., Eeckhout, D., Persiau, G., Van Breusegem, F., Inzé, D., Vandepoele, K., De Jaeger, G. (2014) A Generic Tool for Transcription Factor Target Gene Discovery in Arabidopsis Cell Suspension Cultures Based on Tandem Chromatin Affinity Purification. Plant Physiol. 164(3):1122-33.

(19) Vercruyssen, L., Verkest, A., Gonzalez, A., Heyndrickx, KS., Eeckhout, D., Han, B., Jégu, T., Archacki, R., Van Leene, J., Andriankaja, M., De Bodt, S., Abeel, T., Coppens, F., Dhondt, S., De Milde, L., Vermeersch, M., Maleux, K., Gevaert, O., Jerzmanowski, A., Benhamed, M., Wagner, D., Vandepoele, K., De Jaeger, G., Inzé, D. (2014) ANGUSTIFOLIA3 Binds to SWI/SNF Chromatin Remodeling Complexes to Regulate Transcription during Arabidopsis Leaf Development. The Plant Cell 26(1):210-29.

(18) Galagan, J.E., Minch, K., Peterson, M., Lyubetskaya, A., Azizi, E., Sweet, L., Gomes, A., Rustad, T., Dolganov, G., Glotova, I., Abeel, T., Mahwinney, C., Kennedy, A.D., Allard, R., Brabant, W., Krueger, A., Jaini, S., Honda, B., Yu, W., Hickey, M.J., Zucker, J., Garay, C., Weiner, J., Sisk, P., Stolte, C., Winkler, J.K., Van de Peer, Y., Iazzetti, P., Camacho, D.M., Dreyfuss, J., Liu, Y., Dorhoi, A., Mollenkopf, H., Drogaris, P., Lamontagne, J., Zhou, K., Piquenot, J., Park, S.T., Raman, S., Kaufmann, S.H.E., Mohney, R.P., Chelsky, D., Moody, D.B., Sherman, D.R., Schoolnik, G.K. (2013) The Mycobacterium tuberculosis regulatory network and hypoxia. Nature 499(7457):178-83.

(17) Sterck, L., Billiau, K., Abeel, T., Rouzé, P., Van de Peer, Y. (2012) ORCAE: online resource for community annotation of eukaryotes. Nat. Methods 9(11):1041.

(16) Van Landeghem, S., Björne, J., Abeel, T., De Baets, B., Salakoski, T., Van de Peer, Y. (2012) Semantically linking molecular entities in literature through entity relationships. BMC Bioinformatics 13 Suppl 11:S6.

(15) Abeel, T., Van Parys, T., Saeys, Y., Galagan, J., Van de Peer, Y. (2012) GenomeView: a next-generation genome browser. Nucleic Acids Res. 40(2):e12.

(14) Holder, J.W., Ulrich, J.C., DeBono, A.C., Godfrey, P.A., Desjardins, C.A., Zucker, J., Zeng, Q., Leach, A.L.B., Ghiviriga, I., Dancel, C., Abeel, T., Gevers, D., Kodira, C., Desany, B., Affourtit, J., Birren, B.W., Sinskey, A.J. (2011) Comparative and Functional Genomics of Rhodococcus opacus PD630 for Biofuels Development. PLoS Genet. 7(9):e1002219.

(13) * Abeel, T., * Van Landeghem, S., Morante, R., Van Asch, V., Van de Peer, Y., Daelemans, W., Saeys, Y. (2010) Highlights of the BioTM 2010 workshop on advances in bio text mining. BMC Bioinformatics 11, I1. *contributed equally

(12) * Van Landeghem, S., * Abeel, T., Saeys, Y., Van de Peer, Y. (2010) Discriminative and informative features for biomolecular text mining with ensemble feature selection. Bioinformatics 26(18):i554-60. *contributed equally

(11) Abeel, T., Helleputte, T., Van de Peer, Y., Dupont, C., Saeys, Y. (2010) Robust biomarker identification for cancer diagnosis with ensemble feature selection methods. Bioinformatics 26(3):392-8.

(10) Klijn, C., Michaut, M., Abeel, T. (2010) Highlights from the 6th International Society for Computational Biology Student Council Symposium at the 18th Annual International Conference on Intelligent Systems for Molecular Biology. BMC Bioinformatics 11(Suppl 10), I1. Boston, MA, USA.

(9) Abeel, T., De Ridder, J., Peixoto, L. (2009) Highlights from the 5th International Society for Computational Biology Student Council Symposium at the 17th Annual International Conference on Intelligent Systems for Molecular Biology and the 8th European Conference on Computational Biology. BMC Bioinformatics 10 Suppl 13:I1.

(8) Abeel, T., Van de Peer, Y., Saeys, Y. (2009) Towards a gold standard for promoter prediction evaluation. Bioinformatics 25(12):i313-20.

(7) Abeel, T., Van de Peer, Y., Saeys, Y. (2009) Java-ML: a machine learning library. J. Mach. Learn. Res. 10:931-4.

(6) Saeys, Y., Abeel, T., Van de Peer, Y. (2008) Robust Feature Selection using Ensemble Feature Selection Techniques. Proceedings of ECML/PKDD 5212:313-25.

(5) Abeel, T., Saeys, Y., Rouzé, P., Van de Peer, Y. (2008) ProSOM: Core promoter prediction based on unsupervised clustering of DNA physical profiles. Bioinformatics 24(13):i24-31.

(4) Saeys, Y., Abeel, T., Van de Peer, Y. (2008) Towards robust feature selection techniques. Proceedings of Benelearn 45-46.

(3) Abeel, T., Saeys, Y., Bonnet, E., Rouzé, P., Van de Peer, Y. (2008) Generic eukaryotic core promoter prediction using structural features of DNA. Genome Res. 18(2):310-23.

(2) Abeel, T., Saeys, Y., Van de Peer, Y. (2008) ProSOM: Core promoter identification in the human genome. Proceedings of Benelearn 77-78.

(1) Saeys, Y., Abeel, T., Degroeve, S., Van de Peer, Y. (2007) Translation initiation site prediction on a genomic scale: beauty in simplicity. Bioinformatics 23(13):i418-23.

