Research
Unlike gene prediction, in silico prediction of promoters is still in its infancy. A promoter is usually located upstream of the transcription start site (TSS), but regulatory elements can also be located downstream, for example in the first intron of the gene itself. The promoter can roughly be divided into two parts: a proximal part, referred to as the core, and a distal part. The proximal part is believed to be responsible for correctly assembling the RNA polymerase II complex at the right position and for directing a basal level of transcription. This is mediated by elements such as the TATA and Initiator boxes, through the binding of the TATA box-binding protein and other general transcription factors (TFs) specific for RNA polymerase II. The distal part of the promoter is believed to contain the elements that regulate spatio-temporal expression. The description above presents a rather linear view of the promoter, and this is one of the reasons why prediction programs do not perform better. The promoter is defined functionally and not structurally, which strongly limits the means to model it. Many programs focus on search by signal using only one or two given features, such as the presence of a TATA box or Initiator element, but disregard structural and more general sequence-based features characteristic of promoter elements. In reality, a supplementary layer of complexity is added when the TFs are brought together on a promoter, which adopts a three-dimensional configuration that enables interaction with other parts to activate the basal transcription machinery. Newer approaches do take more features into account: they consider the higher-order structure of a promoter DNA sequence important for transcriptional regulation and are based on the concept that promoters share common content features, although polymerase II promoters are quite different in terms of individual organization.
Promoter regions might be distinguished from non-promoter regions on the basis of specific structural properties. These features are either directly or indirectly correlated with the three-dimensional structure a promoter region should adopt for gene expression. The three-dimensional structure can depend on characteristic physico-chemical profiles of Z-DNA associated with scaffold and matrix attachment regions, stability of duplex DNA, DNA curvature, bending and curvature in B-DNA, DNA bending/stiffness, bendability, propeller twist, B-DNA twist, and protein-induced deformability. If eukaryotic promoters have such general structural features independently of the genes they control, looking for these should help in identifying promoters in general.
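As an illustration of what such a structural profile might look like in practice, the following minimal sketch converts a DNA sequence into a numerical profile by looking up a value for each dinucleotide step and smoothing with a sliding window. The names `TOY_SCALE`, `dinucleotide_values` and `smoothed_profile` are hypothetical, and the scores in `TOY_SCALE` are arbitrary placeholders, not measured physico-chemical values; a real analysis would substitute a published scale for a property such as bendability or duplex stability.

```python
from itertools import product

# Hypothetical placeholder scores for the 16 dinucleotide steps
# (AA=0, AC=1, ..., TT=15). These are NOT measured physico-chemical
# values; replace them with a published scale before any real use.
TOY_SCALE = {a + b: float(i) for i, (a, b) in enumerate(product("ACGT", repeat=2))}


def dinucleotide_values(seq, scale):
    """Return one value per overlapping dinucleotide step in seq.

    Raises KeyError if a step (e.g. one containing 'N') is not in scale.
    """
    seq = seq.upper()
    return [scale[seq[i:i + 2]] for i in range(len(seq) - 1)]


def smoothed_profile(seq, scale, window=3):
    """Average the per-step values over a sliding window of `window` steps."""
    values = dinucleotide_values(seq, scale)
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and the number of steps")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


if __name__ == "__main__":
    # Toy input only; the resulting numbers carry no biological meaning.
    print(smoothed_profile("TATAAA", TOY_SCALE, window=3))
```

Profiles computed this way for several properties could then be compared between candidate promoter and non-promoter regions; how the profiles are combined into a prediction is left open here.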
Papers
(5) Florquin, K., Degroeve, S., Saeys, Y., Van de Peer, Y. (2005) Large-scale structural analysis of the core promoter in mammalian and plant genomes. Nucleic Acids Res. 33(13):4255-64.
(4) Vandepoele, K., Vlieghe, K., Florquin, K., Hennig, L., Beemster, G.T.S., Gruissem, W., Van de Peer, Y., Inzé, D., De Veylder, L. (2005) Genome-wide identification of potential plant E2F target genes. Plant Physiol. 139(1):316-28.
(3) Vlieghe, K., Florquin, K., Vuylsteke, M., Rombauts, S., Van Hummelen, P., Van de Peer, Y., Inzé, D., De Veylder, L. (2003) Microarray analysis of E2Fa-DPa-overexpressing plants uncovers a cross-talking genetic network between DNA replication and nitrogen assimilation. J. Cell Sci. 116(Pt 20):4249-59.
(2) * Rombauts, S., * Florquin, K., Lescot, M., Marchal, K., Rouzé, P., Van de Peer, Y. (2003) Computational approaches to identify promoters and cis-regulatory elements in plant genomes. Plant Physiol. 132(3):1162-76. *contributed equally
(1) De Bodt, S., Raes, J., Florquin, K., Rombauts, S., Rouzé, P., Theissen, G., Van de Peer, Y. (2003) Genomewide structural annotation and evolutionary analysis of the type I MADS-box genes in plants. J. Mol. Evol. 56(5):573-86.
VIB / UGent
Bioinformatics & Evolutionary Genomics
+32 (0) 9 33 13807 (phone)
+32 (0) 9 33 13809 (fax)