Publications

We publish new findings in genomics journals as well as in journals focusing on plant biology, bioinformatics, and tool and resource development. A more detailed overview, including citations, is available via ORCID, ResearcherID, or Google Scholar.

  1. Vandepoele, K. (2017). A guide to the PLAZA 3.0 plant comparative genomic database. In A. D. van Dijk (Ed.), Plant genomics databases : methods and protocols (Vol. 1533, pp. 183–200). New York, NY, USA: Springer.
    PLAZA 3.0 is an online resource for comparative genomics and offers a versatile platform to study gene functions and gene families or to analyze genome organization and evolution in the green plant lineage. Starting from genome sequence information for over 35 plant species, precomputed comparative genomic data sets cover homologous gene families, multiple sequence alignments, phylogenetic trees, and genomic colinearity information within and between species. Complementary functional data sets, a Workbench, and interactive visualization tools are available through a user-friendly web interface, making PLAZA an excellent starting point to translate sequence or omics data sets into biological knowledge. PLAZA is available at http://bioinformatics.psb.ugent.be/plaza/ .
  2. De Schutter, K., Tsaneva, M., Kulkarni, S. R., Rougé, P., Vandepoele, K., & Van Damme, E. (2017). Evolutionary relationships and expression analysis of EUL domain proteins in rice (Oryza sativa). RICE, 10.
    Background: Lectins, defined as 'Proteins that can recognize and bind specific carbohydrate structures', are widespread among all kingdoms of life and play an important role in various biological processes in the cell. Most plant lectins are involved in stress signaling and/or defense. The family of Euonymus-related lectins (EULs) represents a group of stress-related lectins composed of one or two EUL domains. The latter protein domain is unique in that it is ubiquitous in land plants, suggesting an important role for these proteins. Results: Despite the availability of multiple completely sequenced rice genomes, little is known on the occurrence of lectins in rice. We identified 329 putative lectin genes in the genome of Oryza sativa subsp. japonica belonging to nine out of 12 plant lectin families. In this paper, an in-depth molecular characterization of the EUL family of rice was performed. In addition, analyses of the promoter sequences and investigation of the transcript levels for these EUL genes enabled retrieval of important information related to the function and stress responsiveness of these lectins. Finally, a comparative analysis between rice cultivars and several monocot and dicot species revealed a high degree of sequence conservation within the EUL domain as well as in the domain organization of these lectins. Conclusions: The presence of EULs throughout the plant kingdom and the high degree of sequence conservation in the EUL domain suggest that these proteins serve an important function in the plant cell. Analysis of the promoter region of the rice EUL genes revealed a diversity of stress-responsive elements. Furthermore, analysis of the expression profiles of the EUL genes confirmed that they are differentially regulated in response to several types of stress. These data suggest a potential role for the EULs in plant stress signaling and defense.
  3. Babiychuk, E., Trinh, H. K., Vandepoele, K., Van De Slijke, E., Geelen, D., De Jaeger, G., Obokata, J., et al. (2017). The mutation nrpb1-A325V in the largest subunit of RNA polymerase II suppresses compromised growth of Arabidopsis plants deficient in a function of the general transcription factor IIF. PLANT JOURNAL, 89(4), 730–745.
    The evolutionarily conserved 12-subunit RNA polymerase II (Pol II) is a central catalytic component that drives RNA synthesis during the transcription cycle that consists of transcription initiation, elongation, and termination. A diverse set of general transcription factors, including a multifunctional TFIIF, govern Pol II selectivity, kinetic properties, and transcription coupling with posttranscriptional processes. Here, we show that TFIIF of Arabidopsis (Arabidopsis thaliana) resembles the metazoan complex that is composed of the TFIIFα and TFIIFβ polypeptides. Arabidopsis has two TFIIFβ subunits, of which TFIIFβ1/MAN1 is essential and TFIIFβ2/MAN2 is not. In the partial loss-of-function mutant allele man1-1, the winged helix domain of Arabidopsis TFIIFβ1/MAN1 was dispensable for plant viability, whereas the cellular organization of the shoot and root apical meristems was abnormal. Forward genetic screening identified an epistatic interaction between the largest Pol II subunit nrpb1-A325V variant and the man1-1 mutation. The suppression of the man1-1 mutant developmental defects by a mutation in Pol II suggests a link between TFIIF functions in the Arabidopsis transcription cycle and the maintenance of cellular organization in the shoot and root apical meristems.
  4. Ruprecht, C., Proost, S., Hernandez-Coronado, M., Ortiz-Ramirez, C., Lang, D., Rensing, S. A., Becker, J. D., et al. (2017). Phylogenomic analysis of gene co-expression networks reveals the evolution of functional modules. PLANT JOURNAL, 90(3), 447–465.
    Molecular evolutionary studies correlate genomic and phylogenetic information with the emergence of new traits of organisms. These traits are, however, the consequence of dynamic gene networks composed of functional modules, which might not be captured by genomic analyses. Here, we established a method that combines large-scale genomic and phylogenetic data with gene co-expression networks to extensively study the evolutionary make-up of modules in the moss Physcomitrella patens, and in the angiosperms Arabidopsis thaliana and Oryza sativa (rice). We first show that younger genes are less annotated than older genes. By mapping genomic data onto the co-expression networks, we found that genes from the same evolutionary period tend to be connected, whereas old and young genes tend to be disconnected. Consequently, the analysis revealed modules that emerged at a specific time in plant evolution. To uncover the evolutionary relationships of the modules that are conserved across the plant kingdom, we added phylogenetic information that revealed duplication and speciation events on the module level. This combined analysis revealed an independent duplication of cell wall modules in bryophytes and angiosperms, suggesting a parallel evolution of cell wall pathways in land plants.
  5. Ritter Traub, A., Iñigo, S., Fernandez Calvo, P., Heyndrickx, K., Dhondt, S., Shi, H., De Milde, L., et al. (2017). The transcriptional repressor complex FRS7-FRS12 regulates flowering time and growth in Arabidopsis. NATURE COMMUNICATIONS, 8.
    Most living organisms developed systems to efficiently time environmental changes. The plant-clock acts in coordination with external signals to generate output responses determining seasonal growth and flowering time. Here, we show that two Arabidopsis thaliana transcription factors, FAR1 RELATED SEQUENCE 7 (FRS7) and FRS12, act as negative regulators of these processes. These proteins accumulate particularly in short-day conditions and interact to form a complex. Loss-of-function of FRS7 and FRS12 results in early flowering plants with overly elongated hypocotyls mainly in short days. We demonstrate by molecular analysis that FRS7 and FRS12 affect these developmental processes in part by binding to the promoters and repressing the expression of GIGANTEA and PHYTOCHROME INTERACTING FACTOR 4 as well as several of their downstream signalling targets. Our data reveal a molecular machinery that controls the photoperiodic regulation of flowering and growth and offer insight into how plants adapt to seasonal changes.
  6. Kreft, L., Botzki, A., Coppens, F., Vandepoele, K., & Van Bel, M. (2017). PhyD3: a phylogenetic tree viewer with extended phyloXML support for functional genomics data visualization. BIOINFORMATICS, 33(18), 2946–2947.
    Motivation: Comparative and evolutionary studies utilize phylogenetic trees to analyze and visualize biological data. Recently, several web-based tools for the display, manipulation and annotation of phylogenetic trees, such as iTOL and Evolview, have released updates to be compatible with the latest web technologies. While those web tools operate an open server access model with a multitude of registered users, a feature-rich open source solution using current web technologies is not available. Results: Here, we present an extension of the widely used PhyloXML standard with several new options to accommodate functional genomics or annotation datasets for advanced visualization. Furthermore, PhyD3 has been developed as a lightweight tool using the JavaScript library D3.js to achieve a state-of-the-art phylogenetic tree visualization in the web browser, with support for advanced annotations. The current implementation is open source, easily adaptable and easy to implement in third parties' web sites. Availability and implementation: More information about PhyD3 itself, installation procedures and implementation links are available at http://phyd3.bits.vib.be and at http://github.com/vibbits/phyd3/. Contact: klaas.vandepoele@ugent.vib.be or michiel.vanbel@ugent.vib.be Supplementary information: Supplementary data are available at Bioinformatics online.
  7. Van de Velde, J., Van Bel, M., Vaneechoutte, D., & Vandepoele, K. (2016). A collection of conserved noncoding sequences to study gene regulation in flowering plants. PLANT PHYSIOLOGY, 171(4), 2586–2598.
    Transcription factors (TFs) regulate gene expression by binding cis-regulatory elements, of which the identification remains an ongoing challenge owing to the prevalence of large numbers of nonfunctional TF binding sites. Powerful comparative genomics methods, such as phylogenetic footprinting, can be used for the detection of conserved noncoding sequences (CNSs), which are functionally constrained and can greatly help in reducing the number of false-positive elements. In this study, we applied a phylogenetic footprinting approach for the identification of CNSs in 10 dicot plants, yielding 1,032,291 CNSs associated with 243,187 genes. To annotate CNSs with TF binding sites, we made use of binding site information for 642 TFs originating from 35 TF families in Arabidopsis (Arabidopsis thaliana). In three species, the identified CNSs were evaluated using TF chromatin immunoprecipitation sequencing data, resulting in significant overlap for the majority of data sets. To identify ultraconserved CNSs, we included genomes of additional plant families and identified 715 binding sites for 501 genes conserved in dicots, monocots, mosses, and green algae. Additionally, we found that genes that are part of conserved mini-regulons have a higher coherence in their expression profile than other divergent gene pairs. All identified CNSs were integrated in the PLAZA 3.0 Dicots comparative genomics platform (http://bioinformatics.psb.ugent.be/plaza/versions/plaza_v3_dicots/) together with new functionalities facilitating the exploration of conserved cis-regulatory elements and their associated genes. The availability of this data set in a user-friendly platform enables the exploration of functional noncoding DNA to study gene regulation in a variety of plant species, including crops.
  8. Veeckman, E., Vandepoele, K., Asp, T., Roldàn-Ruiz, I., & Ruttink, T. (2016). Genomic variation in the FT gene family of perennial ryegrass (Lolium perenne). In I. Roldàn-Ruiz, J. Baert, & D. Reheul (Eds.), Breeding in a world of scarcity : proceedings of the 2015 meeting of the section “Forage Crops and Amenity Grasses” of Eucarpia (pp. 121–126). Presented at the 31st Symposium of Eucarpia’s “Forage Crops and Amenity Grasses” Section, Cham, Switzerland: Springer.
    The timing of flowering is of prime importance for several agronomic traits, and its genetic control is therefore of great interest to breeders. Several signaling pathways converge on FLOWERING LOCUS T (FT) gene family members, which act as central regulators of flowering, branching and seed dormancy. We identified the complete FT gene family in the Lolium perenne genome and performed phylogenetic analysis to delineate functional clades and to identify putative functionally redundant paralogs. Five FT genes of L. perenne were selected for targeted resequencing in a genepool of 746 accessions to describe genetic diversity in wild accessions, commercial cultivars and breeding material.
  9. Van Leene, J., Blomme, J., Kulkarni, S. R., Cannoot, B., De Winne, N., Eeckhout, D., Persiau, G., et al. (2016). Functional characterization of the Arabidopsis transcription factor bZIP29 reveals its role in leaf and root development. JOURNAL OF EXPERIMENTAL BOTANY, 67(19), 5825–5840.
    Plant bZIP group I transcription factors have been reported mainly for their role during vascular development and osmosensory responses. Interestingly, bZIP29 has been identified in a cell cycle interactome, indicating additional functions of bZIP29 in plant development. Here, bZIP29 was functionally characterized to study its role during plant development. It is not present in vascular tissue but is specifically expressed in proliferative tissues. Genome-wide mapping of bZIP29 target genes confirmed its role in stress and osmosensory responses, but also identified specific binding to several core cell cycle genes and to genes involved in cell wall organization. bZIP29 protein complex analyses validated interaction with other bZIP group I members and provided insight into regulatory mechanisms acting on bZIP dimers. In agreement with bZIP29 expression in proliferative tissues and with its binding to promoters of cell cycle regulators, dominant-negative repression of bZIP29 altered the cell number in leaves and in the root meristem. A transcriptome analysis on the root meristem, however, indicated that bZIP29 might regulate cell number through control of cell wall organization. Finally, ectopic dominant-negative repression of bZIP29 and redundant factors led to a seedling-lethal phenotype, pointing to essential roles for bZIP group I factors early in plant development.
  10. Tzfadia, O., Diels, T., De Meyer, S., Vandepoele, K., Aharoni, A., & Van de Peer, Y. (2016). CoExpNetViz: comparative co-expression networks construction and visualization tool. FRONTIERS IN PLANT SCIENCE, 6.
    Motivation: Comparative transcriptomics is a common approach in functional gene discovery efforts. It allows for finding conserved co-expression patterns between orthologous genes in closely related plant species, suggesting that these genes potentially share similar function and regulation. Several efficient co-expression-based tools have been commonly used in plant research but most of these pipelines are limited to data from model systems, which greatly limits their utility. Moreover, none of the existing pipelines allow plant researchers to make use of their own unpublished gene expression data for performing a comparative co-expression analysis and generating multi-species co-expression networks. Results: We introduce CoExpNetViz, a computational tool that uses a set of query or "bait" genes as an input (chosen by the user) and a minimum of one pre-processed gene expression dataset. The CoExpNetViz algorithm proceeds in three main steps: (i) for every bait gene submitted, co-expression values are calculated using mutual information and Pearson correlation coefficients, (ii) non-bait (or target) genes are grouped based on cross-species orthology, and (iii) output files are generated and results can be visualized as network graphs in Cytoscape. Availability: The CoExpNetViz tool is freely available both as a PHP web server (link: http://bioinformatics.psb.ugent.be/webtools/coexpr/) (implemented in C++) and as a Cytoscape plugin (implemented in Java). Both versions of the CoExpNetViz tool support LINUX and Windows platforms.
  11. Veeckman, E., Ruttink, T., & Vandepoele, K. (2016). Are we there yet? : reliably estimating the completeness of plant genome sequences. PLANT CELL.
    Genome sequencing is becoming cheaper and faster thanks to the introduction of next-generation sequencing techniques. Dozens of new plant genome sequences have been released in recent years, ranging from small to gigantic repeat-rich or polyploid genomes. Most genome projects have a dual purpose: delivering a contiguous, complete genome assembly and creating a full catalog of correctly predicted genes. Frequently, the completeness of a species' gene catalog is measured using a set of marker genes that are expected to be present. This expectation can be defined along an evolutionary gradient, ranging from highly conserved genes to species-specific genes. Large-scale population resequencing studies have revealed that gene space is fairly variable even between closely related individuals, which limits the definition of the expected gene space, and, consequently, the accuracy of estimates used to assess genome and gene space completeness. We argue that, based on the desired applications of a genome sequencing project, different completeness scores for the genome assembly and/or gene space should be determined. Using examples from several dicot and monocot genomes, we outline some pitfalls and recommendations regarding methods to estimate completeness during different steps of genome assembly and annotation.
  12. Proost, S., Van Bel, M., Vaneechoutte, D., Van de Peer, Y., Inzé, D., Mueller-Roeber, B., & Vandepoele, K. (2015). PLAZA 3.0 : an access point for plant comparative genomics. NUCLEIC ACIDS RESEARCH, 43(D1), D974–D981.
    Comparative sequence analysis has significantly altered our view on the complexity of genome organization and gene functions in different kingdoms. PLAZA 3.0 is designed to make comparative genomics data for plants available through a user-friendly web interface. Structural and functional annotation, gene families, protein domains, phylogenetic trees and detailed information about genome organization can easily be queried and visualized. Compared with the first version released in 2009, which featured nine organisms, the number of integrated genomes is more than four times higher, and now covers 37 plant species. The new species provide a wider phylogenetic range as well as a more in-depth sampling of specific clades, and genomes of additional crop species are present. The functional annotation has been expanded and now comprises data from Gene Ontology, MapMan, UniProtKB/Swiss-Prot, PlnTFDB and PlantTFDB. Furthermore, we improved the algorithms to transfer functional annotation from well-characterized plant genomes to other species. The additional data and new features make PLAZA 3.0 (http://bioinformatics.psb.ugent.be/plaza/) a versatile and comprehensible resource for users wanting to explore genome information to study different aspects of plant biology, both in model and non-model organisms.
  13. Van Leene, J., Eeckhout, D., Cannoot, B., De Winne, N., Persiau, G., Van De Slijke, E., Vercruysse, L., et al. (2015). An improved toolbox to unravel the plant cellular machinery by tandem affinity purification of Arabidopsis protein complexes. NATURE PROTOCOLS, 10(1), 169–187.
    Tandem affinity purification coupled to mass spectrometry (TAP-MS) is one of the most advanced methods to characterize protein complexes in plants, giving a comprehensive view on the protein-protein interactions (PPIs) of a certain protein of interest (bait). The bait protein is fused to a double affinity tag, which consists of a protein G tag and a streptavidin-binding peptide separated by a very specific protease cleavage site, allowing highly specific protein complex isolation under near-physiological conditions. Implementation of this optimized TAP tag, combined with ultrasensitive MS, means that these experiments can be performed on small amounts (25 mg of total protein) of protein extracts from Arabidopsis cell suspension cultures. It is also possible to use this approach to isolate low abundant protein complexes from Arabidopsis seedlings, thus opening perspectives for the exploration of protein complexes in a plant developmental context. Next to protocols for efficient biomass generation of seedlings (~7.5 months), we provide detailed protocols for TAP (1 d), and for sample preparation and liquid chromatography-tandem MS (LC-MS/MS; ~5 d), either from Arabidopsis seedlings or from cell cultures. For the identification of specific co-purifying proteins, we use an extended protein database and filter against a list of nonspecific proteins on the basis of the occurrence of a co-purified protein among 543 TAP experiments. The value of the provided protocols is illustrated through numerous applications described in recent literature.
  14. Verkest, A., Byzova, M., Martens, C., Willems, P., Verwulgen, T., Slabbinck, B., Rombaut, D., et al. (2015). Selection for improved energy use efficiency and drought tolerance in canola results in distinct transcriptome and epigenome changes. PLANT PHYSIOLOGY, 168(4), 1338–1350.
    To increase both the yield potential and stability of crops, integrated breeding strategies are used that have mostly a direct genetic basis, but the utility of epigenetics to improve complex traits is unclear. A better understanding of the status of the epigenome and its contribution to agronomic performance would help in developing approaches to incorporate the epigenetic component of complex traits into breeding programs. Starting from isogenic canola (Brassica napus) lines, epilines were generated by selecting, repeatedly for three generations, for increased energy use efficiency and drought tolerance. These epilines had an enhanced energy use efficiency, drought tolerance, and nitrogen use efficiency. Transcriptome analysis of the epilines and a line selected for its energy use efficiency solely revealed common differentially expressed genes related to the onset of stress tolerance-regulating signaling events. Genes related to responses to salt, osmotic, abscisic acid, and drought treatments were specifically differentially expressed in the drought-tolerant epilines. The status of the epigenome, scored as differential trimethylation of lysine-4 of histone 3, further supported the phenotype by targeting drought-responsive genes and facilitating the transcription of the differentially expressed genes. From these results, we conclude that the canola epigenome can be shaped by selection to increase energy use efficiency and stress tolerance. Hence, these findings warrant the further development of strategies to incorporate epigenetics into breeding.
  15. Nelissen, H., Eeckhout, D., Demuynck, K., Persiau, G., Walton, A., Van Bel, M., Vervoort, M., et al. (2015). Dynamic changes in ANGUSTIFOLIA3 complex composition reveal a growth regulatory mechanism in the maize leaf. PLANT CELL, 27(6), 1605–1619.
    Most molecular processes during plant development occur with a particular spatio-temporal specificity. Thus far, it has remained technically challenging to capture dynamic protein-protein interactions within a growing organ, where the interplay between cell division and cell expansion is instrumental. Here, we combined high-resolution sampling of the growing maize (Zea mays) leaf with tandem affinity purification followed by mass spectrometry. Our results indicate that the growth-regulating SWI/SNF chromatin remodeling complex associated with ANGUSTIFOLIA3 (AN3) was conserved within growing organs and between dicots and monocots. Moreover, we were able to demonstrate the dynamics of the AN3-interacting proteins within the growing leaf, since copurified GROWTH-REGULATING FACTORs (GRFs) varied throughout the growing leaf. Indeed, GRF1, GRF6, GRF7, GRF12, GRF15, and GRF17 were significantly enriched in the division zone of the growing leaf, while GRF4 and GRF10 levels were comparable between division zone and expansion zone in the growing leaf. These dynamics were also reflected at the mRNA and protein levels, indicating tight developmental regulation of the AN3-associated chromatin remodeling complex. In addition, the phenotypes of maize plants overexpressing miRNA396a-resistant GRF1 support a model proposing that distinct associations of the chromatin remodeling complex with specific GRFs tightly regulate the transition between cell division and cell expansion. Together, our data demonstrate that advancing from static to dynamic protein-protein interaction analysis in a growing organ adds insights in how developmental switches are regulated.
  16. Wang, F., Muto, A., Van de Velde, J., Neyt, P., Himanen, K., Vandepoele, K., & Van Lijsebettens, M. (2015). Functional analysis of the Arabidopsis TETRASPANIN gene family in plant growth and development. PLANT PHYSIOLOGY, 169(3), 2200–2214.
    TETRASPANIN (TET) genes encode conserved integral membrane proteins that are known in animals to function in cellular communication during gamete fusion, immunity reaction and pathogen recognition. In plants, functional information is limited to one of the 17 members of the Arabidopsis TET gene family and to expression data in reproductive stages. Here, the promoter activity of all 17 Arabidopsis TET genes was investigated by pAtTET::NLS-GFP/GUS reporter lines throughout the life cycle, which predicted functional divergence in the paralogous genes per clade. However, partial overlap was observed for many TET genes across the clades, correlating with few phenotypes in single mutants and therefore requiring double mutant combinations for functional investigation. Mutational analysis showed a role for TET13 in primary root growth and lateral root development, and redundant roles for TET5 and TET6 in leaf and root growth through negative regulation of cell proliferation. Strikingly, a number of TET genes were expressed in embryonic and seedling progenitor cells and remained expressed until the differentiation state in the mature plant, suggesting a dynamic function over developmental stages. cis-regulatory elements together with transcription factor binding data provided molecular insight into the site, conditions and perturbations that affect TET gene expression, and positioned the TET genes in different molecular pathways; the data represent a hypothesis-generating resource for further functional analyses.
  17. Glover, N. M., Daron, J., Pingault, L., Vandepoele, K., Paux, E., Feuillet, C., & Choulet, F. (2015). Small-scale gene duplications played a major role in the recent evolution of wheat chromosome 3B. GENOME BIOLOGY, 16.
    Background: Bread wheat is not only an important crop, but its large (17 Gb), highly repetitive, and hexaploid genome makes it a good model to study the organization and evolution of complex genomes. Recently, we produced a high quality reference sequence of wheat chromosome 3B (774 Mb), which provides an excellent opportunity to study the evolutionary dynamics of a large and polyploid genome, specifically the impact of single gene duplications. Results: We find that 27 % of the 3B predicted genes are non-syntenic with the orthologous chromosomes of Brachypodium distachyon, Oryza sativa, and Sorghum bicolor, whereas, by applying the same criteria, non-syntenic genes represent on average only 10 % of the predicted genes in these three model grasses. These non-syntenic genes on 3B have high sequence similarity to at least one other gene in the wheat genome, indicating that hexaploid wheat has undergone massive small-scale interchromosomal gene duplications compared to other grasses. Insertions of non-syntenic genes occurred at a similar rate along the chromosome, but these genes tend to be retained at a higher frequency in the distal, recombinogenic regions. The ratio of non-synonymous to synonymous substitution rates showed a more relaxed selection pressure for non-syntenic genes compared to syntenic genes, and gene ontology analysis indicated that non-syntenic genes may be enriched in functions involved in disease resistance. Conclusion: Our results highlight the major impact of single gene duplications on the wheat gene complement and confirm the accelerated evolution of the Triticeae lineage among grasses.
  18. Vriet, C., Lemmens, K., Vandepoele, K., Reuzeau, C., & Russinova, E. (2015). Evolutionary trails of plant steroid genes. TRENDS IN PLANT SCIENCE, 20(5), 301–308.
    Plant steroids (brassinosteroids [BRs] and their precursors, phytosterols) play a major role in plant growth, development, and stress tolerance, and have high potential for agricultural applications. Currently, this prospect is limited by a lack of information about their evolution and expression dynamics (spatial and temporal) across plant species. The increasing number of sequenced genomes offers an opportunity for evolutionary studies that might help to prioritize functional analyses with the aim to improve crop yield and stress tolerance. In this review we provide a glimpse of the origin, evolution, and functional conservation of phytosterol and BR genes in the green plant lineage using comparative sequence and expression analyses of publicly available datasets.
  19. Gonzalez Sanchez, N., Pauwels, L., Baekelandt, A., De Milde, L., Van Leene, J., Besbrugge, N., Heyndrickx, K., et al. (2015). A repressor protein complex regulates leaf growth in Arabidopsis. PLANT CELL, 27(8), 2273–2287.
    Cell number is an important determinant of final organ size. In the leaf, a large proportion of cells are derived from the stomatal lineage. Meristemoids, which are stem cell-like precursor cells, undergo asymmetric divisions, generating several pavement cells adjacent to the two guard cells. However, the mechanism controlling the asymmetric divisions of these stem cells prior to differentiation is not well understood. Here, we characterized PEAPOD (PPD) proteins, the only transcriptional regulators known to negatively regulate meristemoid division. PPD proteins interact with KIX8 and KIX9, which act as adaptor proteins for the corepressor TOPLESS. D3-type cyclin encoding genes were identified among direct targets of PPD2, being negatively regulated by PPDs and KIX8/9. Accordingly, kix8 kix9 mutants phenocopied PPD loss-of-function producing larger leaves resulting from increased meristemoid amplifying divisions. The identified conserved complex might be specific for leaf growth in the second dimension, since it is not present in Poaceae (grasses), which also lack the developmental program it controls.
  20. Volders, P.-J., Verheggen, K., Menschaert, G., Vandepoele, K., Martens, L., Vandesompele, J., & Mestdagh, P. (2015). An update on LNCipedia : a database for annotated human lncRNA sequences. NUCLEIC ACIDS RESEARCH, 43(D1), D174–D180.
    The human genome is pervasively transcribed, producing thousands of non-coding RNA transcripts. The majority of these transcripts are long non-coding RNAs (lncRNAs) and novel lncRNA genes are being identified at a rapid pace. To streamline these efforts, we created LNCipedia, an online repository of lncRNA transcripts and annotation. Here, we present LNCipedia 3.0 (http://www.lncipedia.org), the latest version of the publicly available human lncRNA database. Compared to the previous version of LNCipedia, the database grew over five times in size, gaining over 90 000 new lncRNA transcripts. Assessment of the protein-coding potential of LNCipedia entries is improved with state-of-the-art methods that include large-scale reprocessing of publicly available proteomics data. As a result, a high-confidence set of lncRNA transcripts with low coding potential is defined and made available for download. In addition, a tool to assess lncRNA gene conservation between human, mouse and zebrafish has been implemented.
  21. De Witte, D., Van de Velde, J., Decap, D., Van Bel, M., Audenaert, P., Demeester, P., Dhoedt, B., et al. (2015). BLSSpeller: exhaustive comparative discovery of conserved cis-regulatory elements. BIOINFORMATICS, 31(23), 3758–3766.
    Motivation: The accurate discovery and annotation of regulatory elements remains a challenging problem. The growing number of sequenced genomes creates new opportunities for comparative approaches to motif discovery. Putative binding sites are then considered to be functional if they are conserved in orthologous promoter sequences of multiple related species. Existing methods for comparative motif discovery usually rely on pregenerated multiple sequence alignments, which are difficult to obtain for more diverged species such as plants. As a consequence, misaligned regulatory elements often remain undetected. Results: We present a novel algorithm that supports both alignment-free and alignment-based motif discovery in the promoter sequences of related species. Putative motifs are exhaustively enumerated as words over the IUPAC alphabet and screened for conservation using the branch length score. Additionally, a confidence score is established in a genome-wide fashion. In order to take advantage of a cloud computing infrastructure, the MapReduce programming model is adopted. The method is applied to four monocotyledon plant species and it is shown that high-scoring motifs are significantly enriched for open chromatin regions in Oryza sativa and for transcription factor binding sites inferred through protein-binding microarrays in O. sativa and Zea mays. Furthermore, the method is shown to recover experimentally profiled ga2ox1-like KN1 binding sites in Z. mays.
  22. Zamariola, L., De Storme, N., Vannerum, K., Vandepoele, K., Armstrong, S. J., Franklin, F. C. H., & Geelen, D. (2014). SHUGOSHINs and PATRONUS protect meiotic centromere cohesion in Arabidopsis thaliana. PLANT JOURNAL, 77(5), 782–794.
    In meiosis, chromosome cohesion is maintained by the cohesin complex, which is released in a two-step manner. At meiosis I, the meiosis-specific cohesin subunit Rec8 is cleaved by the protease Separase along chromosome arms, allowing homologous chromosome segregation. Next, in meiosis II, cleavage of the remaining centromere cohesin results in separation of the sister chromatids. In eukaryotes, protection of centromeric cohesion in meiosis I is mediated by SHUGOSHINs (SGOs). The Arabidopsis genome contains two SGO homologs. Here we demonstrate that Atsgo1 mutants show a premature loss of cohesion of sister chromatid centromeres at anaphase I and that AtSGO2 partially rescues this loss of cohesion. In addition to SGOs, we characterize PATRONUS which is specifically required for the maintenance of cohesion of sister chromatid centromeres in meiosis II. In contrast to the Atsgo1 Atsgo2 double mutant, patronus T-DNA insertion mutants only display loss of sister chromatid cohesion after meiosis I, and additionally show disorganized spindles, resulting in defects in chromosome segregation in meiosis. This leads to reduced fertility and aneuploid offspring. Furthermore, we detect aneuploidy in sporophytic tissue, indicating a role for PATRONUS in chromosome segregation in somatic cells. Thus, ploidy stability is preserved in Arabidopsis by PATRONUS during both meiosis and mitosis.
  23. Vargas, L., Santa Brigida, A. B., Mota Filho, J. P., de Carvalho, T. G., Rojas, C. A., Vaneechoutte, D., Van Bel, M., et al. (2014). Drought tolerance conferred to sugarcane by association with Gluconacetobacter diazotrophicus: a transcriptomic view of hormone pathways. PLOS ONE, 9(12).
    Sugarcane interacts with particular types of beneficial nitrogen-fixing bacteria that provide fixed-nitrogen and plant growth hormones to host plants, promoting an increase in plant biomass. Other benefits, as enhanced tolerance to abiotic stresses have been reported to some diazotrophs. Here we aim to study the effects of the association between the diazotroph Gluconacetobacter diazotrophicus PAL5 and sugarcane cv. SP70-1143 during water depletion by characterizing differential transcriptome profiles of sugarcane. RNA-seq libraries were generated from roots and shoots of sugarcane plants free of endophytes that were inoculated with G. diazotrophicus and subjected to water depletion for 3 days. A sugarcane reference transcriptome was constructed and used for the identification of differentially expressed transcripts. The differential profile of non-inoculated SP70-1143 suggests that it responds to water deficit stress by the activation of drought-responsive markers and hormone pathways, as ABA and Ethylene. qRT-PCR revealed that root samples had higher levels of G. diazotrophicus 3 days after water deficit, compared to roots of inoculated plants watered normally. With prolonged drought only inoculated plants survived, indicating that SP70-1143 plants colonized with G. diazotrophicus become more tolerant to drought stress than non-inoculated plants. Strengthening this hypothesis, several gene expression responses to drought were inactivated or regulated in an opposite manner, especially in roots, when plants were colonized by the bacteria. The data suggests that colonized roots would not be suffering from stress in the same way as non-inoculated plants. On the other hand, shoots specifically activate ABA-dependent signaling genes, which could act as key elements in the drought resistance conferred by G. diazotrophicus to SP70-1143. This work reports for the first time the involvement of G. diazotrophicus in the promotion of drought-tolerance to sugarcane cv. SP70-1143, and it describes the initial molecular events that may trigger the increased drought tolerance in the host plant.
  24. Fu, Q., Fierro Gutierrez, A. C. E., Meysman, P., Sanchez Rodriguez, A., Vandepoele, K., Marchal, K., & Engelen, K. (2014). MAGIC: access portal to a cross-platform gene expression compendium for maize. BIOINFORMATICS, 30(9), 1316–1318.
    To facilitate the exploration of publicly available Zea mays expression data, we constructed a maize expression compendium, making use of an integration methodology and a consistent probe to gene mapping based on the 5b.60 sequence release of Z. mays. The compendium is made available through a web portal MAGIC that hosts a variety of analysis tools to easily browse and analyze the data. Our compendium is different from previous initiatives in combining expression values across different experiments by providing a consistent gene annotation across different platforms.
  25. Sonnhammer, E. L., Gabaldón, T., da Silva, A. W. S., Martin, M., Robinson-Rechavi, M., Boeckmann, B., Thomas, P. D., et al. (2014). Big data and other challenges in the quest for orthologs. BIOINFORMATICS.
    Given the rapid increase of species with a sequenced genome, the need to identify orthologous genes between them has emerged as a central bioinformatics task. Many different methods exist for orthology detection, which makes it difficult to decide which one to choose for a particular application. Here, we review the latest developments and issues in the orthology field, and summarize the most recent results reported at the third 'Quest for Orthologs' meeting. We focus on community efforts such as the adoption of reference proteomes, standard file formats and benchmarking. Progress in these areas is good, and they are already beneficial to both orthology consumers and providers. However, a major current issue is that the massive increase in complete proteomes poses computational challenges to many of the ortholog database providers, as most orthology inference algorithms scale at least quadratically with the number of proteomes. The Quest for Orthologs consortium is an open community with a number of working groups that join efforts to enhance various aspects of orthology analysis, such as defining standard formats and datasets, documenting community resources and benchmarking.
  26. Van de Velde, Jan, Heyndrickx, K., & Vandepoele, K. (2014). Inference of transcriptional networks in Arabidopsis through conserved noncoding sequence analysis. PLANT CELL, 26(7), 2729–2745.
    Transcriptional regulation plays an important role in establishing gene expression profiles during development or in response to (a)biotic stimuli. Transcription factor binding sites (TFBSs) are the functional elements that determine transcriptional activity, and the identification of individual TFBS in genome sequences is a major goal to inferring regulatory networks. We have developed a phylogenetic footprinting approach for the identification of conserved noncoding sequences (CNSs) across 12 dicot plants. Whereas both alignment and non-alignment-based techniques were applied to identify functional motifs in a multispecies context, our method accounts for incomplete motif conservation as well as high sequence divergence between related species. We identified 69,361 footprints associated with 17,895 genes. Through the integration of known TFBS obtained from the literature and experimental studies, we used the CNSs to compile a gene regulatory network in Arabidopsis thaliana containing 40,758 interactions, of which two-thirds act through binding events located in DNase I hypersensitive sites. This network shows significant enrichment toward in vivo targets of known regulators, and its overall quality was confirmed using five different biological validation metrics. Finally, through the integration of detailed expression and function information, we demonstrate how static CNSs can be converted into condition-dependent regulatory networks, offering opportunities for regulatory gene annotation.
  27. Lindemose, S., Jensen, M. K., Van de Velde, J., O’Shea, C., Heyndrickx, K., Workman, C. T., Vandepoele, K., et al. (2014). A DNA-binding-site landscape and regulatory network analysis for NAC transcription factors in Arabidopsis thaliana. NUCLEIC ACIDS RESEARCH, 42(12), 7681–7693.
    Target gene identification for transcription factors is a prerequisite for the systems wide understanding of organismal behaviour. NAM-ATAF1/2-CUC2 (NAC) transcription factors are amongst the largest transcription factor families in plants, yet limited data exist from unbiased approaches to resolve the DNA-binding preferences of individual members. Here, we present a TF-target gene identification workflow based on the integration of novel protein binding microarray data with gene expression and multi-species promoter sequence conservation to identify the DNA-binding specificities and the gene regulatory networks of 12 NAC transcription factors. Our data offer specific single-base resolution fingerprints for most TFs studied and indicate that NAC DNA-binding specificities might be predicted from their DNA-binding domain's sequence. The developed methodology, including the application of complementary functional genomics filters, makes it possible to translate, for each TF, protein binding microarray data into a set of high-quality target genes. With this approach, we confirm NAC target genes reported from independent in vivo analyses. We emphasize that candidate target gene sets together with the workflow associated with functional modules offer a strong resource to unravel the regulatory potential of NAC genes and that this workflow could be used to study other families of transcription factors.
  28. De Witte, D., Van Bel, M., Audenaert, P., Demeester, P., Dhoedt, B., Vandepoele, K., & Fostier, J. (2014). A parallel, distributed-memory framework for comparative motif discovery. In R. Wyrzykowski, J. Dongarra, K. Karczewski, & J. Wasniewski (Eds.), Lecture Notes in Computer Science (Vol. 8385, pp. 268–277). Presented at the 10th International Conference on Parallel Processing and Applied Mathematics (PPAM), Springer.
    The increasing number of sequenced organisms has opened new possibilities for the computational discovery of cis-regulatory elements ('motifs') based on phylogenetic footprinting. Word-based, exhaustive approaches are among the best performing algorithms, however, they pose significant computational challenges as the number of candidate motifs to evaluate is very high. In this contribution, we describe a parallel, distributed-memory framework for de novo comparative motif discovery. Within this framework, two approaches for phylogenetic footprinting are implemented: an alignment-based and an alignment-free method. The framework is able to statistically evaluate the conservation of motifs in a search space containing over 160 million candidate motifs using a distributed-memory cluster with 200 CPU cores in a few hours. Software available from http://bioinformatics.intec.ugent.be/blsspeller/
  29. Verkest, A., Abeel, T., Heyndrickx, K., Van Leene, J., Lanz, C., Van De Slijke, E., De Winne, N., et al. (2014). A generic tool for transcription factor target gene discovery in Arabidopsis cell suspension cultures based on tandem chromatin affinity purification. PLANT PHYSIOLOGY, 164(3), 1122–1133.
    Genome-wide identification of transcription factor (TF) binding sites is pivotal to our understanding of gene expression regulation. Although much progress has been made in the determination of potential binding regions of proteins by chromatin immunoprecipitation, this method has some inherent limitations regarding DNA enrichment efficiency and antibody necessity. Here, we report an alternative strategy for assaying in vivo TF-DNA binding in Arabidopsis (Arabidopsis thaliana) cells by tandem chromatin affinity purification (TChAP). Evaluation of TChAP using the E2Fa TF and comparison with traditional chromatin immunoprecipitation and single chromatin affinity purification illustrates the suitability of TChAP and provides a resource for exploring the E2Fa transcriptional network. Integration with transcriptome, cis-regulatory element, functional enrichment, and coexpression network analyses demonstrates the quality of the E2Fa TChAP sequencing data and validates the identification of new direct E2Fa targets. TChAP enhances both TF target mapping throughput, by circumventing issues related to antibody availability, and output, by improving DNA enrichment efficiency.
  30. Heyndrickx, K., Van de Velde, J., Wang, C., Weigel, D., & Vandepoele, K. (2014). A functional and evolutionary perspective on transcription factor binding in Arabidopsis thaliana. PLANT CELL, 26(10), 3894–3910.
    Understanding the mechanisms underlying gene regulation is paramount to comprehend the translation from genotype to phenotype. The two are connected by gene expression, and it is generally thought that variation in transcription factor (TF) function is an important determinant of phenotypic evolution. We analyzed publicly available genome-wide chromatin immunoprecipitation experiments for 27 TFs in Arabidopsis thaliana and constructed an experimental network containing 46,619 regulatory interactions and 15,188 target genes. We identified hub targets and highly occupied target (HOT) regions, which are enriched for genes involved in development, stimulus responses, signaling, and gene regulatory processes in the currently profiled network. We provide several lines of evidence that TF binding at plant HOT regions is functional, in contrast to that in animals, and not merely the result of accessible chromatin. HOT regions harbor specific DNA motifs, are enriched for differentially expressed genes, and are often conserved across crucifers and dicots, even though they are not under higher levels of purifying selection than non-HOT regions. Distal bound regions are under purifying selection as well and are enriched for a chromatin state showing regulation by the Polycomb repressive complex. Gene expression complexity is positively correlated with the total number of bound TFs, revealing insights in the regulatory code for genes with different expression breadths. The integration of noncanonical and canonical DNA motif information yields new hypotheses on cobinding and tethering between specific TFs involved in flowering and light regulation.
  31. Choulet, F., Alberti, A., Theil, S., Glover, N., Barbe, V., Daron, J., Pingault, L., et al. (2014). Structural and functional partitioning of bread wheat chromosome 3B. SCIENCE, 345(6194).
    We produced a reference sequence of the 1-gigabase chromosome 3B of hexaploid bread wheat. By sequencing 8452 bacterial artificial chromosomes in pools, we assembled a sequence of 774 megabases carrying 5326 protein-coding genes, 1938 pseudogenes, and 85% of transposable elements. The distribution of structural and functional features along the chromosome revealed partitioning correlated with meiotic recombination. Comparative analyses indicated high wheat-specific inter- and intrachromosomal gene duplication activities that are potential sources of variability for adaption. In addition to providing a better understanding of the organization, function, and evolution of a large and polyploid genome, the availability of a high-quality sequence anchored to genetic maps will accelerate the identification of genes underlying important agronomic traits.
  32. Vercruyssen, L., Verkest, A., Gonzalez Sanchez, N., Heyndrickx, K., Eeckhout, D., Han, S.-K., Jégu, T., et al. (2014). ANGUSTIFOLIA3 binds to SWI/SNF chromatin remodeling complexes to regulate transcription during Arabidopsis leaf development. PLANT CELL, 26(1), 210–229.
    The transcriptional coactivator ANGUSTIFOLIA3 (AN3) stimulates cell proliferation during Arabidopsis thaliana leaf development, but the molecular mechanism is largely unknown. Here, we show that inducible nuclear localization of AN3 during initial leaf growth results in differential expression of important transcriptional regulators, including GROWTH REGULATING FACTORs (GRFs). Chromatin purification further revealed the presence of AN3 at the loci of GRF5, GRF6, CYTOKININ RESPONSE FACTOR2, CONSTANS-LIKE5 (COL5), HECATE1 (HEC1), and ARABIDOPSIS RESPONSE REGULATOR4 (ARR4). Tandem affinity purification of protein complexes using AN3 as bait identified plant SWITCH/SUCROSE NONFERMENTING (SWI/SNF) chromatin remodeling complexes formed around the ATPases BRAHMA (BRM) or SPLAYED. Moreover, SWI/SNF ASSOCIATED PROTEIN 73B (SWP73B) is recruited by AN3 to the promoters of GRF5, GRF3, COL5, and ARR4, and both SWP73B and BRM occupy the HEC1 promoter. Furthermore, we show that AN3 and BRM genetically interact. The data indicate that AN3 associates with chromatin remodelers to regulate transcription. In addition, modification of SWI3C expression levels increases leaf size, underlining the importance of chromatin dynamics for growth regulation. Our results place the SWI/SNF-AN3 module as a major player at the transition from cell proliferation to cell differentiation in a developing leaf.
  33. Heyman, J., Cools, T., Vandenbussche, F., Heyndrickx, K., Van Leene, J., Vercauteren, I., Vanderauwera, S., et al. (2013). ERF115 controls root quiescent center cell division and stem cell replenishment. SCIENCE, 342(6160), 860–863.
    The quiescent center (QC) plays an essential role during root development by creating a microenvironment that preserves the stem cell fate of its surrounding cells. Despite being surrounded by highly mitotic active cells, QC cells self-renew at a low proliferation rate. Here, we identified the ERF115 transcription factor as a rate-limiting factor of QC cell division, acting as a transcriptional activator of the phytosulfokine PSK5 peptide hormone. ERF115 marks QC cell division but is restrained through proteolysis by the APC/C-CCS52A2 ubiquitin ligase, whereas QC proliferation is driven by brassinosteroid-dependent ERF115 expression. Together, these two antagonistic mechanisms delimit ERF115 activity, which is called upon when surrounding stem cells are damaged, revealing a cell cycle regulatory mechanism accounting for stem cell niche longevity.
  34. Verelst, W., Bertolini, E., De Bodt, S., Vandepoele, K., Demeulenaere, M., Pé, M. E., & Inzé, D. (2013). Molecular and physiological analysis of growth-limiting drought stress in Brachypodium distachyon leaves. MOLECULAR PLANT, 6(2), 311–322.
    The drought-tolerant grass Brachypodium distachyon is an emerging model species for temperate grasses and cereal crops. To explore the usefulness of this species for drought studies, a reproducible in vivo drought assay was developed. Spontaneous soil drying led to a 45% reduction in leaf size, and this was mostly due to a decrease in cell expansion, whereas cell division remained largely unaffected by drought. To investigate the molecular basis of the observed leaf growth reduction, the third Brachypodium leaf was dissected in three zones, namely proliferation, expansion, and mature zones, and subjected to transcriptome analysis, based on a whole-genome tiling array. This approach allowed us to highlight that transcriptome profiles of different developmental leaf zones respond differently to drought. Several genes and functional processes involved in drought tolerance were identified. The transcriptome data suggest an increased energy availability in the proliferation zones, along with an up-regulation of sterol synthesis that may influence membrane fluidity. This information may be used to improve the tolerance of temperate cereals to drought, which is undoubtedly one of the major environmental challenges faced by agriculture today and in the near future.
  35. Van Bel, M., Proost, S., Van Neste, C., Deforce, D., Van de Peer, Y., & Vandepoele, K. (2013). TRAPID: an efficient online tool for the functional and comparative analysis of de novo RNA-Seq transcriptomes. GENOME BIOLOGY, 14(12).
    Transcriptome analysis through next-generation sequencing technologies allows the generation of detailed gene catalogs for non-model species, at the cost of new challenges with regards to computational requirements and bioinformatics expertise. Here, we present TRAPID, an online tool for the fast and efficient processing of assembled RNA-Seq transcriptome data, developed to mitigate these challenges. TRAPID offers high-throughput open reading frame detection, frameshift correction and includes a functional, comparative and phylogenetic toolbox, making use of 175 reference proteomes. Benchmarking and comparison against state-of-the-art transcript analysis tools reveals the efficiency and unique features of the TRAPID system.
  36. Vandepoele, K., Van Bel, M., Richard, G., Van Landeghem, S., Verhelst, B., Moreau, H., Van de Peer, Y., et al. (2013). pico-PLAZA, a genome database of microbial photosynthetic eukaryotes. ENVIRONMENTAL MICROBIOLOGY, 15(8), 2147–2153.
    With the advent of next generation genome sequencing, the number of sequenced algal genomes and transcriptomes is rapidly growing. Although a few genome portals exist to browse individual genome sequences, exploring complete genome information from multiple species for the analysis of user-defined sequences or gene lists remains a major challenge. pico-PLAZA is a web-based resource (http://bioinformatics.psb.ugent.be/pico-plaza/) for algal genomics that combines different data types with intuitive tools to explore genomic diversity, perform integrative evolutionary sequence analysis and study gene functions. Apart from homologous gene families, multiple sequence alignments, phylogenetic trees, Gene Ontology, InterPro and text-mining functional annotations, different interactive viewers are available to study genome organization using gene collinearity and synteny information. Different search functions, documentation pages, export functions and an extensive glossary are available to guide non-expert scientists. PLAZA can be used to functionally characterize large-scale EST/RNA-Seq data sets and to perform environmental genomics. Functional enrichment analysis of 16 Phaeodactylum tricornutum transcriptome libraries offers a molecular view on diatom adaptation to different environments of ecological relevance. Furthermore, we show how complementary genomic data sources can easily be combined to identify marker genes to study the diversity and distribution of algal species, for example in metagenomes, or to quantify intraspecific diversity from environmental strains.
  37. De Clercq, I., Vermeirssen, V., Van Aken, O., Vandepoele, K., Murcha, M. W., Law, S. R., Inzé, A., et al. (2013). The membrane-bound NAC transcription factor ANAC013 functions in mitochondrial retrograde regulation of the oxidative stress response in Arabidopsis. PLANT CELL, 25(9), 3472–3490.
    Upon disturbance of their function by stress, mitochondria can signal to the nucleus to steer the expression of responsive genes. This mitochondria-to-nucleus communication is often referred to as mitochondrial retrograde regulation (MRR). Although reactive oxygen species and calcium are likely candidate signaling molecules for MRR, the protein signaling components in plants remain largely unknown. Through meta-analysis of transcriptome data, we detected a set of genes that are common and robust targets of MRR and used them as a bait to identify its transcriptional regulators. In the upstream regions of these mitochondrial dysfunction stimulon (MDS) genes, we found a cis-regulatory element, the mitochondrial dysfunction motif (MDM), which is necessary and sufficient for gene expression under various mitochondrial perturbation conditions. Yeast one-hybrid analysis and electrophoretic mobility shift assays revealed that the transmembrane domain-containing NO APICAL MERISTEM/ARABIDOPSIS TRANSCRIPTION ACTIVATION FACTOR/CUP-SHAPED COTYLEDON transcription factors (ANAC013, ANAC016, ANAC017, ANAC053, and ANAC078) bound to the MDM cis-regulatory element. We demonstrate that ANAC013 mediates MRR-induced expression of the MDS genes by direct interaction with the MDM cis-regulatory element and triggers increased oxidative stress tolerance. In conclusion, we characterized ANAC013 as a regulator of MRR upon stress in Arabidopsis thaliana.
  38. De Smet, Riet, Adams, K. L., Vandepoele, K., Van Montagu, M., Maere, S., & Van de Peer, Y. (2013). Convergent gene loss following gene and genome duplications creates single-copy families in flowering plants. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 110(8), 2898–2903.
    The importance of gene gain through duplication has long been appreciated. In contrast, the importance of gene loss has only recently attracted attention. Indeed, studies in organisms ranging from plants to worms and humans suggest that duplication of some genes might be better tolerated than that of others. Here we have undertaken a large-scale study to investigate the existence of duplication-resistant genes in the sequenced genomes of 20 flowering plants. We demonstrate that there is a large set of genes that is convergently restored to single-copy status following multiple genome-wide and smaller scale duplication events. We rule out the possibility that such a pattern could be explained by random gene loss only and therefore propose that there is selection pressure to preserve such genes as singletons. This is further substantiated by the observation that angiosperm single-copy genes do not comprise a random fraction of the genome, but instead are often involved in essential housekeeping functions that are highly conserved across all eukaryotes. Furthermore, single-copy genes are generally expressed more highly and in more tissues than non-single-copy genes, and they exhibit higher sequence conservation. Finally, we propose different hypotheses to explain their resistance against duplication.
  39. De Witte, D., Van de Velde, J., Van Bel, M., Audenaert, P., Demeester, P., Dhoedt, B., Vandepoele, K., et al. (2013). Comparative motif discovery in the cloud. Benelux Bioinformatics Conference 2013, Abstracts. Presented at the Benelux Bioinformatics Conference 2013.
  40. Wang, F., Vandepoele, K., & Van Lijsebettens, M. (2012). Tetraspanin genes in plants. PLANT SCIENCE, 190, 9–15.
  41. Dessimoz, C., Gabaldón, T., Roos, D. S., Sonnhammer, E. L., Herrero, J., Quest Orthologs Consortium, the, Vandepoele, K., et al. (2012). Toward community standards in the quest for orthologs. BIOINFORMATICS, 28(6), 900–904.
  42. Van Bel, M., Proost, S., Wischnitzki, E., Movahedi, S., Scheerlinck, C., Van de Peer, Y., & Vandepoele, K. (2012). Dissecting plant genomes with the PLAZA comparative genomics platform. PLANT PHYSIOLOGY, 158(2), 590–600.
    With the arrival of low-cost, next-generation sequencing, a multitude of new plant genomes are being publicly released, providing unseen opportunities and challenges for comparative genomics studies. Here, we present PLAZA 2.5, a user-friendly online research environment to explore genomic information from different plants. This new release features updates to previous genome annotations and a substantial number of newly available plant genomes as well as various new interactive tools and visualizations. Currently, PLAZA hosts 25 organisms covering a broad taxonomic range, including 13 eudicots, five monocots, one lycopod, one moss, and five algae. The available data consist of structural and functional gene annotations, homologous gene families, multiple sequence alignments, phylogenetic trees, and colinear regions within and between species. A new Integrative Orthology Viewer, combining information from different orthology prediction methodologies, was developed to efficiently investigate complex orthology relationships. Cross-species expression analysis revealed that the integration of complementary data types extended the scope of complex orthology relationships, especially between more distantly related species. Finally, based on phylogenetic profiling, we propose a set of core gene families within the green plant lineage that will be instrumental to assess the gene space of draft or newly sequenced plant genomes during the assembly or annotation phase.
  43. Quimbaya Gomez, M. A., Vandepoele, K., Raspé, E., Matthijs, M., Dhondt, S., Beemster, G., Berx, G., et al. (2012). Identification of putative cancer genes through data integration and comparative genomics between plants and humans. CELLULAR AND MOLECULAR LIFE SCIENCES, 69(12), 2041–2055.
    Coordination of cell division with growth and development is essential for the survival of organisms. Mistakes made during replication of genetic material can result in cell death, growth defects, or cancer. Because of the essential role of the molecular machinery that controls DNA replication and mitosis during development, its high degree of conservation among organisms is not surprising. Mammalian cell cycle genes have orthologues in plants, and vice versa. However, besides the many known and characterized proliferation genes, still undiscovered regulatory genes are expected to exist with conserved functions in plants and humans. Starting from genome-wide Arabidopsis thaliana microarray data, an integrative strategy based on coexpression, functional enrichment analysis, and cis-regulatory element annotation was combined with a comparative genomics approach between plants and humans to detect conserved cell cycle genes involved in DNA replication and/or DNA repair. With this systemic strategy, a set of 339 genes was identified as potentially conserved proliferation genes. Experimental analysis confirmed that 20 out of 40 selected genes had an impact on plant cell proliferation; likewise, an evolutionarily conserved role in cell division was corroborated for two human orthologues. Moreover, association analysis integrating Homo sapiens gene expression data with clinical information revealed that, for 45 genes, altered transcript levels and relapse risk clearly correlated. Our results illustrate how a systematic exploration of the A. thaliana genome can contribute to the experimental identification of new cell cycle regulators that might represent novel oncogenes or/and tumor suppressors.
  44. Movahedi, S., Van Bel, M., Heyndrickx, K., & Vandepoele, K. (2012). Comparative co-expression analysis in plant biology. PLANT CELL AND ENVIRONMENT, 35(10), 1787–1798.
    The analysis of gene expression data generated by high-throughput microarray transcript profiling experiments has shown that transcriptionally coordinated genes are often functionally related. Based on large-scale expression compendia grouping multiple experiments, this guilt-by-association principle has been applied to study modular gene programmes, identify cis-regulatory elements or predict functions for unknown genes in different model plants. Recently, several studies have demonstrated how, through the integration of gene homology and expression information, correlated gene expression patterns can be compared between species. The incorporation of detailed functional annotations as well as experimental data describing protein-protein interactions, phenotypes or tissue-specific expression, provides an invaluable source of information to identify conserved gene modules and translate biological knowledge from model organisms to crops. In this review, we describe the different steps required to systematically compare expression data across species. Apart from the technical challenges to compute and display expression networks from multiple species, some future applications of plant comparative transcriptomics are highlighted.
  45. Moreau, H., Verhelst, B., Couloux, A., Derelle, E., Rombauts, S., Grimsley, N., Van Bel, M., et al. (2012). Gene functionalities and genome structure in Bathycoccus prasinos reflect cellular specializations at the base of the green lineage. GENOME BIOLOGY, 13(8).
    Background: Bathycoccus prasinos is an extremely small cosmopolitan marine green alga whose cells are covered with intricate spider's web patterned scales that develop within the Golgi cisternae before their transport to the cell surface. The objective of this work is to sequence and analyze its genome, and to present a comparative analysis with other known genomes of the green lineage. Results: Its small genome of 15 Mb consists of 19 chromosomes and lacks transposons. Although 70% of all B. prasinos genes share similarities with other Viridiplantae genes, up to 428 genes were probably acquired by horizontal gene transfer, mainly from other eukaryotes. Two chromosomes, one big and one small, are atypical, an unusual synapomorphic feature within the Mamiellales. Genes on these atypical outlier chromosomes show lower GC content and a significant fraction of putative horizontal gene transfer genes. Whereas the small outlier chromosome lacks colinearity with other Mamiellales and contains many unknown genes without homologs in other species, the big outlier shows a higher intron content, increased expression levels and a unique clustering pattern of housekeeping functionalities. Four gene families are highly expanded in B. prasinos, including sialyltransferases, sialidases, ankyrin repeats and zinc ion-binding genes, and we hypothesize that these genes are associated with the process of scale biogenesis. Conclusion: The minimal genomes of the Mamiellophyceae provide a baseline for evolutionary and functional analyses of metabolic processes in green plants.
  46. De Witte, D., Van Bel, M., Demeester, P., Dhoedt, B., Vandepoele, K., & Fostier, J. (2012). A high performance computing approach to the discovery of conserved motifs. 20th Annual Conference on Intelligent Systems for Molecular Biology, Abstracts (pp. 1–1). Presented at the 20th Annual Conference on Intelligent Systems for Molecular Biology (ISMB - 2012).
  47. De Witte, D., Van Bel, M., Demeester, P., Dhoedt, B., Vandepoele, K., & Fostier, J. (2012). Alignment-free genome-wide comparative motif discovery in 4 Monocot species. 11th European Conference on Computational Biology, Abstracts (pp. 1–1). Presented at the 11th European Conference on Computational Biology (ECCB - 2012).
  48. Heyndrickx, K., & Vandepoele, K. (2012). Systematic identification of functional plant modules through the integration of complementary data sources. PLANT PHYSIOLOGY, 159(3), 884–901.
    A major challenge is to unravel how genes interact and are regulated to exert specific biological functions. The integration of genome-wide functional genomics data, followed by the construction of gene networks, provides a powerful approach to identify functional gene modules. Large-scale expression data, functional gene annotations, experimental protein-protein interactions, and transcription factor-target interactions were integrated to delineate modules in Arabidopsis (Arabidopsis thaliana). The different experimental input data sets showed little overlap, demonstrating the advantage of combining multiple data types to study gene function and regulation. In the set of 1,563 modules covering 13,142 genes, most modules displayed strong coexpression, but functional and cis-regulatory coherence was less prevalent. Highly connected hub genes showed a significant enrichment toward embryo lethality and evidence for cross talk between different biological processes. Comparative analysis revealed that 58% of the modules showed conserved coexpression across multiple plants. Using module-based functional predictions, 5,562 genes were annotated, and an evaluation experiment disclosed that, based on 197 recently experimentally characterized genes, 38.1% of these functions could be inferred through the module context. Examples of confirmed genes of unknown function related to cell wall biogenesis, xylem and phloem pattern formation, cell cycle, hormone stimulus, and circadian rhythm highlight the potential to identify new gene functions. The module-based predictions offer new biological hypotheses for functionally unknown genes in Arabidopsis (1,701 genes) and six other plant species (43,621 genes). Furthermore, the inferred modules provide new insights into the conservation of coexpression and coregulation as well as a starting point for comparative functional annotation.
  49. Petrov, Veselin, Vermeirssen, V., De Clercq, I., Van Breusegem, F., Minkov, I., Vandepoele, K., & Gechev, T. S. (2012). Identification of cis-regulatory elements specific for different types of reactive oxygen species in Arabidopsis thaliana. GENE, 499(1), 52–60.
  50. Proost, Sebastian, Fostier, J., De Witte, D., Dhoedt, B., Demeester, P., Van de Peer, Y., & Vandepoele, K. (2012). i-ADHoRe 3.0 : fast and sensitive detection of genomic homology in extremely large data sets. NUCLEIC ACIDS RESEARCH, 40(2).
  51. Vaulot, D., Lepere, C., Toulza, E., De la Iglesia, R., Poulain, J., Gaboyer, F., Moreau, H., et al. (2012). Metagenomes of the picoalga Bathycoccus from the Chile coastal upwelling. PLOS ONE, 7(6).
    Among small photosynthetic eukaryotes that play a key role in oceanic food webs, picoplanktonic Mamiellophyceae such as Bathycoccus, Micromonas, and Ostreococcus are particularly important in coastal regions. By using a combination of cell sorting by flow cytometry, whole genome amplification (WGA), and 454 pyrosequencing, we obtained metagenomic data for two natural picophytoplankton populations from the coastal upwelling waters off central Chile. About 60% of the reads of each sample could be mapped to the genome of Bathycoccus strain from the Mediterranean Sea (RCC1105), representing a total of 9 Mbp (sample T142) and 13 Mbp (sample T149) of non-redundant Bathycoccus genome sequences. WGA did not amplify all regions uniformly, resulting in unequal coverage along a given chromosome and between chromosomes. The identity at the DNA level between the metagenomes and the cultured genome was very high (96.3% identical bases for the three larger chromosomes over a 360 kbp alignment). At least two to three different genotypes seemed to be present in each natural sample based on read mapping to Bathycoccus RCC1105 genome.
  52. Fostier, J., Proost, S., Dhoedt, B., Saeys, Y., Demeester, P., Van de Peer, Y., & Vandepoele, K. (2011). A greedy, graph-based algorithm for the alignment of multiple homologous gene lists. BIOINFORMATICS, 27(6), 749–756.
  53. Babiychuk, E., Vandepoele, K., Wissing, J., Garcia-Diaz, M., De Rycke, R., Akbari, H., Joubès, J., et al. (2011). Plastid gene expression and plant development require a plastidic protein of the mitochondrial transcription termination factor family. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 108(16), 6674–6679.
  54. Mittler, R., Vanderauwera, S., Suzuki, N., Miller, G., Tognetti, V., Vandepoele, K., Gollery, M., et al. (2011). ROS signaling: the new wave? TRENDS IN PLANT SCIENCE, 16(6), 300–309.
    Reactive oxygen species (ROS) play a multitude of signaling roles in different organisms from bacteria to mammalian cells. They were initially thought to be toxic byproducts of aerobic metabolism, but have now been acknowledged as central players in the complex signaling network of cells. In this review, we will attempt to address several key questions related to the use of ROS as signaling molecules in cells, including the dynamics and specificity of ROS signaling, networking of ROS with other signaling pathways, ROS signaling within and across different cells, ROS waves and the evolution of the ROS gene network.
  55. Movahedi, S., Van de Peer, Y., & Vandepoele, K. (2011). Comparative network analysis reveals that tissue specificity and gene function are important factors influencing the mode of expression evolution in Arabidopsis and rice. PLANT PHYSIOLOGY, 156(3), 1316–1330.
    Microarray experiments have yielded massive amounts of expression information measured under various conditions for the model species Arabidopsis (Arabidopsis thaliana) and rice (Oryza sativa). Expression compendia grouping multiple experiments make it possible to define correlated gene expression patterns within one species and to study how expression has evolved between species. We developed a robust framework to measure expression context conservation (ECC) and found, by analyzing 4,630 pairs of orthologous Arabidopsis and rice genes, that 77% showed conserved coexpression. Examples of nonconserved ECC categories suggested a link between regulatory evolution and environmental adaptations and included genes involved in signal transduction, response to different abiotic stresses, and hormone stimuli. To identify genomic features that influence expression evolution, we analyzed the relationship between ECC, tissue specificity, and protein evolution. Tissue-specific genes showed higher expression conservation compared with broadly expressed genes but were fast evolving at the protein level. No significant correlation was found between protein and expression evolution, implying that both modes of gene evolution are not strongly coupled in plants. By integration of cis-regulatory elements, many ECC conserved genes were significantly enriched for shared DNA motifs, hinting at the conservation of ancestral regulatory interactions in both model species. Surprisingly, for several tissue-specific genes, patterns of concerted network evolution were observed, unveiling conserved coexpression in the absence of conservation of tissue specificity. These findings demonstrate that orthologs inferred through sequence similarity in many cases do not share similar biological functions and highlight the importance of incorporating expression information when comparing genes across species.
  56. Huysman, M., Martens, C., Vandepoele, K., Gillard, J., Rayko, E., Heijde, M., Bowler, C., et al. (2010). Genome-wide analysis of the diatom cell cycle unveils a novel type of cyclins involved in environmental signaling. GENOME BIOLOGY, 11(2).
    Background: Despite the enormous importance of diatoms in aquatic ecosystems and their broad industrial potential, little is known about their life cycle control. Diatoms typically inhabit rapidly changing and unstable environments, suggesting that cell cycle regulation in diatoms must have evolved to adequately integrate various environmental signals. The recent genome sequencing of Thalassiosira pseudonana and Phaeodactylum tricornutum allows us to explore the molecular conservation of cell cycle regulation in diatoms. Results: By profile-based annotation of cell cycle genes, counterparts of conserved as well as new regulators were identified in T. pseudonana and P. tricornutum. In particular, the cyclin gene family was found to be expanded extensively compared to that of other eukaryotes and a novel type of cyclins was discovered, the diatom-specific cyclins. We established a synchronization method for P. tricornutum that enabled assignment of the different annotated genes to specific cell cycle phase transitions. The diatom-specific cyclins are predominantly expressed at the G1-to-S transition and some respond to phosphate availability, hinting at a role in connecting cell division to environmental stimuli. Conclusion: The discovery of highly conserved and new cell cycle regulators suggests the evolution of unique control mechanisms for diatom cell division, probably contributing to their ability to adapt and survive under highly fluctuating environmental conditions.
  57. Takahashi, Naoki, Quimbaya Gomez, M. A., Schubert, V., Lammens, T., Vandepoele, K., Schubert, I., Matsui, M., et al. (2010). The MCM-Binding Protein ETG1 Aids Sister Chromatid Cohesion Required for Postreplicative Homologous Recombination Repair. PLOS GENETICS, 6(1).
    The DNA replication process represents a source of DNA stress that causes potentially spontaneous genome damage. This effect might be strengthened by mutations in crucial replication factors, requiring the activation of DNA damage checkpoints to enable DNA repair before anaphase onset. Here, we demonstrate that depletion of the evolutionarily conserved minichromosome maintenance helicase-binding protein ETG1 of Arabidopsis thaliana resulted in a stringent late G2 cell cycle arrest. This arrest correlated with a partial loss of sister chromatid cohesion. The lack-of-cohesion phenotype was intensified in plants without functional CTF18, a replication fork factor needed for cohesion establishment. The synergistic effect of the etg1 and ctf18 mutants on sister chromatid cohesion strengthened the impact on plant growth of the replication stress caused by ETG1 deficiency because of inefficient DNA repair. We conclude that the ETG1 replication factor is required for efficient cohesion and that cohesion establishment is essential for proper development of plants suffering from endogenous DNA stress. Cohesion defects observed upon knockdown of its human counterpart suggest an equally important developmental role for the orthologous mammalian ETG1 protein.
  58. De Bodt, Stefanie, Proost, S., Vandepoele, K., Rouzé, P., & Van de Peer, Y. (2009). Predicting protein-protein interactions in Arabidopsis thaliana through integration of orthology, gene ontology and co-expression. BMC Genomics, 10(288), 1–15.
    Background: Large-scale identification of the interrelationships between different components of the cell, such as the interactions between proteins, has recently gained great interest. However, unraveling large-scale protein-protein interaction maps is laborious and expensive. Moreover, assessing the reliability of the interactions can be cumbersome. Results: In this study, we have developed a computational method that exploits the existing knowledge on protein-protein interactions in diverse species through orthologous relations on the one hand, and functional association data on the other hand to predict and filter protein-protein interactions in Arabidopsis thaliana. A highly reliable set of protein-protein interactions is predicted through this integrative approach making use of existing protein-protein interaction data from yeast, human, C. elegans and D. melanogaster. Localization, biological process, and co-expression data are used as powerful indicators for protein-protein interactions. The functional repertoire of the identified interactome reveals interactions between proteins functioning in well-conserved as well as plant-specific biological processes. We observe that although common mechanisms (e.g. actin polymerization) and components (e.g. ARPs, actin-related proteins) exist between different lineages, they are active in specific processes such as growth, cancer metastasis and trichome development in yeast, human and Arabidopsis, respectively. Conclusion: We conclude that the integration of orthology with functional association data is adequate to predict protein-protein interactions. Through this approach, a high number of novel protein-protein interactions with diverse biological roles is discovered. Overall, we have predicted a reliable set of protein-protein interactions suitable for further computational as well as experimental analyses.
  59. Piganeau, Gwenael, Vandepoele, K., Gourbière, S., Van de Peer, Y., & Moreau, H. (2009). Unravelling cis-Regulatory Elements in the Genome of the Smallest Photosynthetic Eukaryote: Phylogenetic Footprinting in Ostreococcus. Journal of Molecular Evolution, 69(3), 249–259.
    We used a phylogenetic footprinting approach, adapted to high levels of divergence, to estimate the level of constraint in intergenic regions of the extremely gene dense Ostreococcus algae genomes (Chlorophyta, Prasinophyceae). We first benchmarked our method against the Saccharomyces sensu stricto genome data and found that the proportion of conserved non-coding sites was consistent with those obtained with methods using calibration by the neutral substitution rate. We then applied our method to the complete genomes of Ostreococcus tauri and O. lucimarinus, which are the most divergent species from the same genus sequenced so far. We found that 77% of intergenic regions in Ostreococcus still contain some phylogenetic footprints, as compared to 88% for Saccharomyces, corresponding to an average rate of constraint on intergenic region of 17% and 30%, respectively. A comparison with some known functional cis-regulatory elements enabled us to investigate whether some transcriptional regulatory pathways were conserved throughout the green lineage. Strikingly, the size of the phylogenetic footprints depends on gene orientation of neighboring genes, and appears to be genus-specific. In Ostreococcus, 5' intergenic regions contain four times more conserved sites than 3' intergenic regions, whereas in yeast a higher frequency of constrained sites in intergenic regions between genes on the same DNA strand suggests a higher frequency of bidirectional regulatory elements. The phylogenetic footprinting approach can be used despite high levels of divergence in the ultrasmall Ostreococcus algae, to decipher structure of constrained regulatory motifs, and identify putative regulatory pathways conserved within the green lineage.
  60. Van de Peer, Y., Fawcett, J., Proost, S., Sterck, L., & Vandepoele, K. (2009). The flowering world: a tale of duplications. TRENDS IN PLANT SCIENCE, 14(12), 680–688.
    Flowering plants contain many genes, most of which were created during the past 200 or so million years through small- and large-scale duplications. Paleo-polyploidy events, in particular, have been the subject of much recent research. There is a growing consensus that one or more genome doubling or merging events occurred early during the evolution of the flowering plants, and that many lineages have since undergone additional, independent and more recent duplication events. Here, we review the difficulties in determining the number of genome duplications and discuss how the completion of some additional genome sequences of species occupying key phylogenetic positions has led to a better understanding of the timing of certain duplication events. This is important if we want to demonstrate the significance of genome duplications for the evolution and radiation of (different groups of) flowering plants.
  61. Proost, Sebastian, Van Bel, M., Sterck, L., Billiau, K., Van Parys, T., Van de Peer, Y., & Vandepoele, K. (2009). PLAZA: a comparative genomics resource to study gene and genome evolution in plants. PLANT CELL, 21(12), 3718–3731.
    The number of sequenced genomes of representatives within the green lineage is rapidly increasing. Consequently, comparative sequence analysis has significantly altered our view on the complexity of genome organization, gene function, and regulatory pathways. To explore all this genome information, a centralized infrastructure is required where all data generated by different sequencing initiatives is integrated and combined with advanced methods for data mining. Here, we describe PLAZA, an online platform for plant comparative genomics (http://bioinformatics.psb.ugent.be/plaza/). This resource integrates structural and functional annotation of published plant genomes together with a large set of interactive tools to study gene function and gene and genome evolution. Precomputed data sets cover homologous gene families, multiple sequence alignments, phylogenetic trees, intraspecies whole-genome dot plots, and genomic colinearity between species. Through the integration of high confidence Gene Ontology annotations and tree-based orthology between related species, thousands of genes lacking any functional description are functionally annotated. Advanced query systems, as well as multiple interactive visualization tools, are available through a user-friendly and intuitive Web interface. In addition, detailed documentation and tutorials introduce the different tools, while the workbench provides an efficient means to analyze user-defined gene sets through PLAZA's interface. In conclusion, PLAZA provides a comprehensible and up-to-date research environment to aid researchers in the exploration of genome information within the green plant lineage.
  62. Vandepoele, K., Quimbaya Gomez, M. A., Casneuf, T., De Veylder, L., & Van de Peer, Y. (2009). Unraveling Transcriptional Control in Arabidopsis Using cis-Regulatory Elements and Coexpression Networks. Plant Physiology, 150(2), 535–546.
    Analysis of gene expression data generated by high-throughput microarray transcript profiling experiments has demonstrated that genes with an overall similar expression pattern are often enriched for similar functions. This guilt-by-association principle can be applied to define modular gene programs, identify cis-regulatory elements, or predict gene functions for unknown genes based on their coexpression neighborhood. We evaluated the potential to use Gene Ontology (GO) enrichment of a gene's coexpression neighborhood as a tool to predict its function but found overall low sensitivity scores (13%-34%). This indicates that for many functional categories, coexpression alone performs poorly to infer known biological gene functions. However, integration of cis-regulatory elements shows that 46% of the gene coexpression neighborhoods are enriched for one or more motifs, providing a valuable complementary source to functionally annotate genes. Through the integration of coexpression data, GO annotations, and a set of known cis-regulatory elements combined with a novel set of evolutionarily conserved plant motifs, we could link many genes and motifs to specific biological functions. Application of our coexpression framework extended with cis-regulatory element analysis on transcriptome data from the cell cycle-related transcription factor OBP1 yielded several coexpressed modules associated with specific cis-regulatory elements. Moreover, our analysis strongly suggests a feed-forward regulatory interaction between OBP1 and the E2F pathway. The ATCOECIS resource (http://bioinformatics.psb.ugent.be/ATCOECIS/) makes it possible to query coexpression data and GO and cis-regulatory element annotations and to submit user-defined gene sets for motif analysis, providing an access point to unravel the regulatory code underlying transcriptional control in Arabidopsis (Arabidopsis thaliana).
  63. Naouar, N., Vandepoele, K., Lammens, T., Casneuf, T., Zeller, G., Van Hummelen, P., Weigel, D., et al. (2009). Quantitative RNA expression analysis with Affymetrix Tiling 1.0R arrays identifies new E2F target genes. Plant Journal, 57(1), 184–194.
    The Affymetrix ATH1 array provides a robust standard tool for transcriptome analysis, but unfortunately does not represent all of the transcribed genes in Arabidopsis thaliana. Recently, Affymetrix has introduced its Arabidopsis Tiling 1.0R array, which offers whole-genome coverage of the sequenced Col-0 reference strain. Here, we present an approach to exploit this platform for quantitative mRNA expression analysis, and compare the results with those obtained using ATH1 arrays. We also propose a method for selecting unique tiling probes for each annotated gene or transcript in the most current genome annotation, TAIR7, generating Chip Definition Files for the Tiling 1.0R array. As a test case, we compared the transcriptome of wild-type plants with that of transgenic plants overproducing the heterodimeric E2Fa-DPa transcription factor. We show that with the appropriate data pre-processing, the estimated changes per gene for those with significantly different expression levels is very similar for the two array types. With the tiling arrays we could identify 368 new E2F-regulated genes, with a large fraction including an E2F motif in the promoter. The latter groups increase the number of excellent candidates for new, direct E2F targets by almost twofold, from 181 to 334.
  64. Dhaese, Stien, Vandepoele, K., Waterschoot, D., Vanloo, B., Vandekerckhove, J., Ampe, C., & Van Troys, M. (2009). The Mouse Thymosin Beta15 Gene Family Displays Unique Complexity and Encodes A Functional Thymosin Repeat. JOURNAL OF MOLECULAR BIOLOGY, 387(4), 809–825.
    We showed earlier that human beta-thymosin 15 (Tb15) is up-regulated in prostate cancer, confirming studies from others that propagated Tb15 as a prostate cancer biomarker. In this first report on mouse Tb15, we show that, unlike in humans, four Tb15-like isoforms are present in mouse. We used phylogenetic analysis of deuterostome beta-thymosins to show that these four new isoforms cluster within the vertebrate Tb15-clade. Intriguingly, one of these mouse beta-thymosins, Tb15r, consists of two beta-thymosin domains. The existence of such a repeat beta-thymosin is so far unique in vertebrates, though common in lower eukaryotes. Biochemical data indicate that Tb15r potently sequesters actin. In a cellular context, Tb15r behaves as a bona fide beta-thymosin, lowering central stress fibre content. We reveal that a complex genomic organization underlies Tb15r expression: Tb15r results from read-through transcription and alternative splicing of two tandem duplicated mouse Tb15 genes. Transcript profiling of all mouse beta-thymosin isoforms (Tb15s, Tb4 and Tb10) reveals that two isoform switches occur between embryonic and adult tissues, and indicates Tb15r as the major mouse Tb15 isoform in adult cells. Tb15r is present also in mouse prostate cancer cell lines. This insight into the mouse Tb15 family is fundamental for future studies on Tb15 in mouse (prostate) cancer models.
  65. Vandenbroucke, Korneel, Robbens, S., Vandepoele, K., Inzé, D., Van de Peer, Y., & Van Breusegem, F. (2008). Hydrogen peroxide-induced gene expression across kingdoms: a comparative analysis. MOLECULAR BIOLOGY AND EVOLUTION, 25(3), 507–516.
    Cells react to oxidative stress conditions by launching a defense response through the induction of nuclear gene expression. The advent of microarray technologies allowed monitoring of oxidative stress-dependent changes of transcript levels at a comprehensive and genome-wide scale, resulting in a series of inventories of differentially expressed genes in different organisms. We performed a meta-analysis on hydrogen peroxide (H2O2)-induced gene expression in the cyanobacterium Synechocystis PCC 6803, the yeast Saccharomyces cerevisiae and Schizosaccharomyces pombe, the land plant Arabidopsis thaliana, and the human HeLa cell line. The H2O2-induced gene expression in both yeast species was highly conserved and more similar to the A. thaliana response than that of the human cell line. Based on the expression characteristics of genuine antioxidant genes, we show that the antioxidant capacity of microorganisms and higher eukaryotes is differentially regulated. Four families of evolutionarily conserved eukaryotic proteins could be identified that were H2O2 responsive across kingdoms: DNAJ domain-containing heat shock proteins, small guanine triphosphate-binding proteins, Ca2+-dependent protein kinases, and ubiquitin-conjugating enzymes.
  66. Bowler, Chris, Allen, A. E., Badger, J. H., Grimwood, J., Jabbari, K., Kuo, A., Maheswari, U., et al. (2008). The Phaeodactylum genome reveals the evolutionary history of diatom genomes. NATURE, 456(7219), 239–244.
    Diatoms are photosynthetic secondary endosymbionts found throughout marine and freshwater environments, and are believed to be responsible for around one-fifth of the primary productivity on Earth(1,2). The genome sequence of the marine centric diatom Thalassiosira pseudonana was recently reported, revealing a wealth of information about diatom biology(3-5). Here we report the complete genome sequence of the pennate diatom Phaeodactylum tricornutum and compare it with that of T. pseudonana to clarify evolutionary origins, functional significance and ubiquity of these features throughout diatoms. In spite of the fact that the pennate and centric lineages have only been diverging for 90 million years, their genome structures are dramatically different and a substantial fraction of genes (~40%) are not shared by these representatives of the two lineages. Analysis of molecular divergence compared with yeasts and metazoans reveals rapid rates of gene diversification in diatoms. Contributing factors include selective gene family expansions, differential losses and gains of genes and introns, and differential mobilization of transposable elements. Most significantly, we document the presence of hundreds of genes from bacteria. More than 300 of these gene transfers are found in both diatoms, attesting to their ancient origins, and many are likely to provide novel possibilities for metabolite management and for perception of environmental signals. These findings go a long way towards explaining the incredible diversity and success of the diatoms in contemporary oceans.
  67. Martens, Cindy, Vandepoele, K., & Van de Peer, Y. (2008). Whole-genome analysis reveals molecular innovations and evolutionary transitions in chromalveolate species. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 105(9), 3427–3432.
    The chromalveolates form a highly diverse and fascinating assemblage of organisms, ranging from obligatory parasites such as Plasmodium to free-living ciliates and algae such as kelps, diatoms, and dinoflagellates. Many of the species in this monophyletic grouping are of major medical, ecological, and economical importance. Nevertheless, their genome evolution is much less well studied than that of higher plants, animals, or fungi. In the current study, we have analyzed and compared 12 chromalveolate species for which whole-sequence information is available and provide a detailed picture on gene loss and gene gain in the different lineages. As expected, many gene loss and gain events can be directly correlated with the lifestyle and specific adaptations of the organisms studied. For instance, in the obligate intracellular Apicomplexa we observed massive loss of genes that play a role in general basic processes such as amino acid, carbohydrate, and lipid metabolism, reflecting the transition of a free-living to an obligate intracellular lifestyle. In contrast, many gene families show species-specific expansions, such as those in the plant pathogen oomycete Phytophthora that are involved in degrading the plant cell wall polysaccharides to facilitate the pathogen invasion process. In general, chromalveolates show a tremendous difference in genome structure and evolution and in the number of genes they have lost or gained either through duplication or horizontal gene transfer.
  68. Lessa Alvim Kamei, C., Boruc, J., Vandepoele, K., Van Den Daele, H., Maes, S., Russinova, E., Inzé, D., et al. (2008). The PRA1 gene family in Arabidopsis. PLANT PHYSIOLOGY, 147(4), 1735–1749.
    Prenylated Rab acceptor 1 (PRA1) domain proteins are small transmembrane proteins that regulate vesicle trafficking as receptors of Rab GTPases and the vacuolar soluble N-ethylmaleimide-sensitive factor attachment receptor protein VAMP2. However, little is known about PRA1 family members in plants. Sequence analysis revealed that higher plants, compared with animals and primitive plants, possess an expanded family of PRA1 domain-containing proteins. The Arabidopsis (Arabidopsis thaliana) PRA1 (AtPRA1) proteins were found to homodimerize and heterodimerize in a manner corresponding to their phylogenetic distribution. Different AtPRA1 family members displayed distinct expression patterns, with a preference for vascular cells and expanding or developing tissues. AtPRA1 genes were significantly coexpressed with Rab GTPases and genes encoding vesicle transport proteins, suggesting an involvement in the vesicle trafficking process similar to that of their animal counterparts. Correspondingly, AtPRA1 proteins were localized in the endoplasmic reticulum, Golgi apparatus, and endosomes/prevacuolar compartments, hinting at a function in both secretory and endocytic intracellular trafficking pathways. Taken together, our data reveal a high functional diversity of AtPRA1 proteins, probably dealing with the various demands of the complex trafficking system.
  69. Van Roy, Frans, Vandepoele, K., Van Roy, N., Andries, V., Staes, K., Vandesompele, J., Laureys, G., et al. (2008). A constitutional translocation t(1;17)(p36.2;q11.2) in a neuroblastoma patient disrupts the human NBPF1 and ACCN1 genes. EJC SUPPLEMENTS (Vol. 6, pp. 14–14). Presented at the 20th Meeting of the European Association for Cancer Research.
  70. Sterck, L., Rombauts, S., Vandepoele, K., Rouzé, P., & Van de Peer, Y. (2007). How many genes are there in plants (... and why are they there)? CURRENT OPINION IN PLANT BIOLOGY, 10(2), 199–203.
    Annotation of the first few complete plant genomes has revealed that plants have many genes. For Arabidopsis, over 26 500 gene loci have been predicted, whereas for rice, the number adds up to 41 000. Recent analysis of the poplar genome suggests more than 45 000 genes, and partial sequence data from Medicago and Lotus also suggest that these plants contain more than 40 000 genes. Nevertheless, estimations suggest that ancestral angiosperms had no more than 12 000-14 000 genes. One explanation for the large increase in gene number during angiosperm evolution is gene duplication. It has been shown previously that the retention of duplicates following small- and large-scale duplication events in plants is substantial. Taking into account the function of genes that have been duplicated, we are now beginning to understand why many plant genes might have been retained, and how their retention might be linked to the typical lifestyle of plants.
  71. Rymen, B., Fiorani, F., Kartal, F., Vandepoele, K., Inzé, D., & Beemster, G. (2007). Cold nights impair leaf growth and cell cycle progression in maize through transcriptional changes of cell cycle genes. PLANT PHYSIOLOGY, 143(3), 1429–1438.
    Low temperature inhibits the growth of maize (Zea mays) seedlings and limits yield under field conditions. To study the mechanism of cold-induced growth retardation, we exposed maize B73 seedlings to low night temperature (25°C/4°C, day/night) from germination until the completion of leaf 4 expansion. This treatment resulted in a 20% reduction in final leaf size compared to control conditions (25°C/18°C, day/night). A kinematic analysis of leaf growth rates in control and cold-treated leaves during daytime showed that cold nights affected both cell cycle time (+65%) and cell production (-22%). In contrast, the size of mature epidermal cells was unaffected. To analyze the effect on cell cycle progression at the molecular level, we identified through a bioinformatics approach a set of 43 cell cycle genes and analyzed their expression in proliferating, expanding, and mature cells of leaves exposed to either control or cold nights. This analysis showed that: (1) the majority of cell cycle genes had a consistent proliferation-specific expression pattern; and (2) the increased cell cycle time in the basal meristem of leaves exposed to cold nights was associated with differential expression of cell cycle inhibitors and with the concomitant down-regulation of positive regulators of cell division.
  72. Velasco, R., Zharkikh, A., Troggio, M., Cartwright, D. A., Cestaro, A., Pruss, D., Pindo, M., et al. (2007). A high quality draft consensus sequence of the genome of a heterozygous grapevine variety. PLOS ONE, 2(12).
    Background. Worldwide, grapes and their derived products have a large market. The cultivated grape species Vitis vinifera has potential to become a model for fruit tree genetics. Like many plant species, it is highly heterozygous, which is an additional challenge to modern whole genome shotgun sequencing. In this paper a high quality draft genome sequence of a cultivated clone of V. vinifera Pinot Noir is presented. Principal Findings. We estimate the genome size of V. vinifera to be 504.6 Mb. Genomic sequences corresponding to 477.1 Mb were assembled in 2,093 metacontigs and 435.1 Mb were anchored to the 19 linkage groups (LGs). The number of predicted genes is 29,585, of which 96.1% were assigned to LGs. This assembly of the grape genome provides candidate genes implicated in traits relevant to grapevine cultivation, such as those influencing wine quality, via secondary metabolites, and those connected with the extreme susceptibility of grape to pathogens. Single nucleotide polymorphism (SNP) distribution was consistent with a diffuse haplotype structure across the genome. Of around 2,000,000 SNPs, 1,751,176 were mapped to chromosomes and one or more of them were identified in 86.7% of anchored genes. The relative age of grape duplicated genes was estimated, and this made it possible to reveal a relatively recent Vitis-specific large scale duplication event concerning at least 10 chromosomes (duplication not reported before). Conclusions. Sanger shotgun sequencing and highly efficient sequencing by synthesis (SBS), together with dedicated assembly programs, resolved a complex heterozygous genome. A consensus sequence of the genome and a set of mapped marker loci were generated. Homologous chromosomes of Pinot Noir differ by 11.2% of their DNA (hemizygous DNA plus chromosomal gaps). SNP markers are offered as a tool with the potential of introducing a new era in the molecular breeding of grape.
  73. Polet, D., Lambrechts, A., Vandepoele, K., Vandekerckhove, J., & Ampe, C. (2007). On the origin and evolution of vertebrate and viral profilins. FEBS LETTERS, 581(2), 211–217.
  74. Peres, A., Churchman, M. L., Hariharan, S., Himanen, K., Verkest, A., Vandepoele, K., Magyar, Z., et al. (2007). Novel plant-specific cyclin-dependent kinase inhibitors induced by biotic and abiotic stresses. JOURNAL OF BIOLOGICAL CHEMISTRY, 282(35), 25588–25596.
    The EL2 gene of rice (Oryza sativa), previously classified as early response gene against the potent biotic elicitor N-acetylchitoheptaose and encoding a short polypeptide with unknown function, was identified as a novel cell cycle regulatory gene related to the recently reported SIAMESE (SIM) gene of Arabidopsis thaliana. Iterative two-hybrid screens, in vitro pull-down assays, and fluorescence resonance energy transfer analyses showed that Orysa;EL2 binds the cyclin-dependent kinase (CDK) CDKA1;1 and D-type cyclins. No interaction was observed with the plant-specific B-type CDKs. The amino acid motif ELERFL was identified to be essential for cyclin, but not for CDK binding. Orysa;EL2 impaired the ability of Orysa;CYCD5;3 to complement a budding yeast (Saccharomyces cerevisiae) triple CLN mutant, whereas recombinant protein inhibited CDK activity in vitro. Moreover, Orysa;EL2 was able to rescue the multicellular trichome phenotype of sim mutants of Arabidopsis, unequivocally demonstrating that Orysa;EL2 operates as a cell cycle inhibitor. Orysa;EL2 mRNA levels were induced by cold, drought, and propionic acid. Our data suggest that Orysa;EL2 encodes a new type of plant CDK inhibitor that links cell cycle progression with biotic and abiotic stress responses.
  75. Blomme, T., Vandepoele, K., De Bodt, S., Simillion, C., Maere, S., & Van de Peer, Y. (2006). The gain and loss of genes during 600 million years of vertebrate evolution. GENOME BIOLOGY, 7(5).
    Background: Gene duplication is assumed to have played a crucial role in the evolution of vertebrate organisms. Apart from a continuous mode of duplication, two or three whole genome duplication events have been proposed during the evolution of vertebrates, one or two at the dawn of vertebrate evolution, and an additional one in the fish lineage, not shared with land vertebrates. Here, we have studied gene gain and loss in seven different vertebrate genomes, spanning an evolutionary period of about 600 million years. Results: We show that: first, the majority of duplicated genes in extant vertebrate genomes are ancient and were created at times that coincide with proposed whole genome duplication events; second, there exist significant differences in gene retention for different functional categories of genes between fishes and land vertebrates; third, there seems to be a considerable bias in gene retention of regulatory genes towards the mode of gene duplication (whole genome duplication events compared to smaller-scale events), which is in accordance with the so-called gene balance hypothesis; and fourth, that ancient duplicates that have survived for many hundreds of millions of years can still be lost. Conclusion: Based on phylogenetic analyses, we show that both the mode of duplication and the functional class the duplicated genes belong to have been of major importance for the evolution of the vertebrates. In particular, we provide evidence that massive gene duplication (probably as a consequence of entire genome duplications) at the dawn of vertebrate evolution might have been particularly important for the evolution of complex vertebrates.
  76. Vandepoele, K., Casneuf, T., & Van de Peer, Y. (2006). Identification of novel regulatory modules in dicotyledonous plants using expression data and comparative genomics. GENOME BIOLOGY, 7(11).
    Background: Transcriptional regulation plays an important role in the control of many biological processes. Transcription factor binding sites (TFBSs) are the functional elements that determine transcriptional activity and are organized into separable cis-regulatory modules, each defining the cooperation of several transcription factors required for a specific spatio-temporal expression pattern. Consequently, the discovery of novel TFBSs in promoter sequences is an important step to improve our understanding of gene regulation. Results: Here, we applied a detection strategy that combines features of classic motif overrepresentation approaches in co-regulated genes with general comparative footprinting principles for the identification of biologically relevant regulatory elements and modules in Arabidopsis thaliana, a model system for plant biology. In total, we identified 80 TFBSs and 139 regulatory modules, most of which are novel, and primarily consist of two or three regulatory elements that could be linked to different important biological processes, such as protein biosynthesis, cell cycle control, photosynthesis and embryonic development. Moreover, studying the physical properties of some specific regulatory modules revealed that Arabidopsis promoters have a compact nature, with cooperative TFBSs located in close proximity of each other. Conclusion: These results create a starting point to unravel regulatory networks in plants and to study the regulation of biological processes from a systems biology point of view.
  77. Vandepoele, K. (2005). Mode and tempo of gene and genome evolution in plants. Ghent University. Faculty of Sciences, Ghent, Belgium.
  78. Vandepoele, K., Vlieghe, K., Florquin, K., Hennig, L., Beemster, G., Gruissem, W., Van de Peer, Y., et al. (2005). Genome-wide identification of potential plant E2F target genes. PLANT PHYSIOLOGY, 139(1), 316–328.
    Entry into the S phase of the cell cycle is controlled by E2F transcription factors that induce the transcription of genes required for cell cycle progression and DNA replication. Although the E2F pathway is highly conserved in higher eukaryotes, only a few E2F target genes have been experimentally validated in plants. We have combined microarray analysis and bioinformatics tools to identify plant E2F-responsive genes. Promoter regions of genes that were induced at the transcriptional level in Arabidopsis (Arabidopsis thaliana) seedlings ectopically expressing genes for the E2Fa and DPa transcription factors were searched for the presence of E2F-binding sites, resulting in the identification of 181 putative E2F target genes. In most cases, the E2F-binding element was located close to the transcription start site, but occasionally could also be localized in the 5' untranslated region. Comparison of our results with available microarray data sets from synchronized cell suspensions revealed that the E2F target genes were expressed almost exclusively during G1 and S phases and activated upon reentry of quiescent cells into the cell cycle. To test the robustness of the data for the Arabidopsis E2F target genes, we also searched for the presence of E2F cis-acting elements in the promoters of the putative orthologous rice (Oryza sativa) genes. Using this approach, we identified 70 potential conserved plant E2F target genes. These genes encode proteins involved in cell cycle regulation, DNA replication, and chromatin dynamics. In addition, we identified several genes for potentially novel S phase regulatory proteins.
  79. Vandepoele, K., & Van de Peer, Y. (2005). Exploring the plant transcriptome through phylogenetic profiling. PLANT PHYSIOLOGY, 137(1), 31–42.
    Publicly available protein sequences represent only a small fraction of the full catalog of genes encoded by the genomes of different plants, such as green algae, mosses, gymnosperms, and angiosperms. By contrast, an enormous amount of expressed sequence tags (ESTs) exists for a wide variety of plant species, representing a substantial part of all transcribed plant genes. Integrating protein and EST sequences in comparative and evolutionary analyses is not straightforward because of the heterogeneous nature of both types of sequence data. By combining information from publicly available EST and protein sequences for 32 different plant species, we identified more than 250,000 plant proteins organized in more than 12,000 gene families. Approximately 60% of the proteins are absent from current sequence databases but provide important new information about plant gene families. Analysis of the distribution of gene families over different plant species through phylogenetic profiling reveals interesting insights into plant gene evolution, and identifies species- and lineage-specific gene families, orphan genes, and conserved core genes across the green plant lineage. We counted a similar number of approximately 9,500 gene families in monocotyledonous and eudicotyledonous plants and found strong evidence for the existence of at least 33,700 genes in rice (Oryza sativa). Interestingly, the larger number of genes in rice compared to Arabidopsis (Arabidopsis thaliana) can partially be explained by a larger amount of species-specific single-copy genes and species-specific gene families. In addition, a majority of large gene families, typically containing more than 50 genes, are bigger in rice than Arabidopsis, whereas the opposite seems true for small gene families.
  80. Simillion, C., Vandepoele, K., & Van de Peer, Y. (2004). Recent developments in computational approaches for uncovering genomic homology. BIOESSAYS, 26(11), 1225–1235.
  81. Simillion, C., Vandepoele, K., Saeys, Y., & Van de Peer, Y. (2004). Building genomic profiles for uncovering segmental homology in the twilight zone. Belgian Bioinformatics Conference, 4th, Abstracts. Presented at the 4th Belgian Bioinformatics Conference (BBC 2004).
  82. Landrieu, I., da Costa, M., De Veylder, L., Dewitte, F., Vandepoele, K., Hassan, S., Wieruszeski, J.-M., et al. (2004). A small CDC25 dual-specificity tyrosine-phosphatase isoform in Arabidopsis thaliana. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 101(36), 13380–13385.
    The dual-specificity CDC25 phosphatases are critical positive regulators of cyclin-dependent kinases (CDKs). Even though an antagonistic Arabidopsis thaliana WEE1 kinase has been cloned and tyrosine phosphorylation of its CDKs has been demonstrated, no valid candidate for a CDC25 protein has been reported in higher plants. We identify a CDC25-related protein (Arath;CDC25) of A. thaliana, constituted by a sole catalytic domain. The protein has a tyrosine-phosphatase activity and stimulates the kinase activity of Arabidopsis CDKs. Its tertiary structure was obtained by NMR spectroscopy and confirms that Arath;CDC25 belongs structurally to the classical CDC25 superfamily with a central five-stranded beta-sheet surrounded by helices. A particular feature of the protein, however, is the presence of an additional zinc-binding loop in the C-terminal part. NMR mapping studies revealed the interaction with phosphorylated peptidic models derived from the conserved CDK loop containing the phosphothreonine-14 and phosphotyrosine-15. We conclude that despite sequence divergence, Arath;CDC25 is structurally and functionally an isoform of the CDC25 superfamily, which is conserved in yeast and in plants, including Arabidopsis and rice.
  83. Gevers, D., Vandepoele, K., Simillion, C., & Van de Peer, Y. (2004). Gene duplication and biased functional retention of paralogs in bacterial genomes. TRENDS IN MICROBIOLOGY, 12(4), 148–154.
    Gene duplication is considered an important prerequisite for gene innovation that can facilitate adaptation to changing environments. The analysis of 106 bacterial genome sequences has revealed the existence of a significant number of paralogs. Analysis of the functional classification of these paralogs reveals a preferential enrichment in functional classes that are involved in transcription, metabolism and defense mechanisms. From the organization of paralogs in the genome we can conclude that duplicated genes in bacteria appear to have been mainly created by small-scale duplication events, such as tandem and operon duplications.
  84. Vercammen, D., Van De Cotte, B., De Jaeger, G., Eeckhout, D., Casteels, P., Vandepoele, K., Vandenberghe, I., et al. (2004). Type II metacaspases Atmc4 and Atmc9 of Arabidopsis thaliana cleave substrates after arginine and lysine. JOURNAL OF BIOLOGICAL CHEMISTRY, 279(44), 45329–45336.
    Nine potential caspase counterparts, designated metacaspases, were identified in the Arabidopsis thaliana genome. Sequence analysis revealed two types of metacaspases, one with (type I) and one without (type II) a proline- or glutamine-rich N-terminal extension, possibly representing a prodomain. Production of recombinant Arabidopsis type II metacaspases in Escherichia coli resulted in cysteine-dependent autocatalytic processing of the proform into large and small subunits, in analogy to animal caspases. A detailed biochemical characterization with a broad range of synthetic oligopeptides and several protease inhibitors of purified recombinant proteins of both metacaspase 4 and 9 showed that both metacaspases are arginine/lysine-specific cysteine proteases and did not cleave caspase-specific synthetic substrates. These findings suggest that type II metacaspases are not directly responsible for earlier reported caspase-like activities in plants.
  85. Simillion, C., Vandepoele, K., Saeys, Y., & Van de Peer, Y. (2004). Building genomic profiles for uncovering segmental homology in the twilight zone. GENOME RESEARCH, 14(6), 1095–1106.
    The identification of homologous regions within and between genomes is an essential prerequisite for studying genome structure and evolution. Different methods already exist that allow detecting homologous regions in an automated manner. These methods are based either on finding sequence similarities at the DNA level or on identifying chromosomal regions showing conservation of gene order and content. Especially the latter approach has proven useful for detecting homology between highly divergent chromosomal regions. However, until now, such map-based approaches required that candidate homologous regions show significant collinearity with other segments to be considered as being homologous. Here, we present a novel method that creates profiles combining the gene order and content information of multiple mutually homologous genomic segments. These profiles can be used to scan one or more genomes to detect segments that show significant collinearity with the entire profile but not necessarily with individual segments. When applying this new method to the combined genomes of Arabidopsis and rice, we find additional evidence for ancient duplication events in the rice genome.
  86. Vandepoele, K., Simillion, C., & Van de Peer, Y. (2004). The quest for genomic homology. CURRENT GENOMICS, 5(4), 299–308.
  87. Vandepoele, K., De Vos, W., Taylor, J. S., Meyer, A., & Van de Peer, Y. (2004). Major events in the genome evolution of vertebrates: paranome age and size differ considerably between ray-finned fishes and land vertebrates. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 101(6), 1638–1643.
    It has been suggested that fish have more genes than humans. Whether most of these additional genes originated through a complete (fish-specific) genome duplication or through many lineage-specific tandem gene or smaller block duplications and family expansions continues to be debated. We analyzed the complete genome of the pufferfish Takifugu rubripes (Fugu) and compared it with the paranome of humans. We show that most paralogous genes of Fugu are the result of three complete genome duplications. Both relative and absolute dating of the complete predicted set of protein-coding genes suggest that initial genome duplications, estimated to have occurred at least 600 million years ago, shaped the genome of all vertebrates. In addition, analysis of >150 block duplications in the Fugu genome clearly supports a fish-specific genome duplication (~320 million years ago) that coincided with the vast radiation of most modern ray-finned fishes. Unlike the human genome, Fugu contains very few recently duplicated genes; hence, many human genes are much younger than fish genes. This lack of recent gene duplication, or, alternatively, the accelerated rate of gene loss, is possibly one reason for the drastic reduction of the genome size of Fugu observed during the past 100 million years or so, subsequent to the additional genome duplication that ray-finned fishes but not land vertebrates experienced.
  88. Breyne, Peter, Dreesen, R., Cannoot, B., Rombaut, D., Vandepoele, K., Rombauts, S., Vanderhaeghen, R., et al. (2003). Quantitative cDNA-AFLP analysis for genome-wide expression studies. MOLECULAR GENETICS AND GENOMICS, 269(2), 173–179.
    An improved cDNA-AFLP method for genome-wide expression analysis has been developed. We demonstrate that this method is an efficient tool for quantitative transcript profiling and a valid alternative to microarrays. Unique transcript tags, generated from reverse-transcribed messenger RNA by restriction enzymes, were screened through a series of selective PCR amplifications. Based on in silico analysis, an enzyme combination was chosen that ensures that at least 60% of all the mRNAs were represented by an informative sequence tag. The sensitivity and specificity of the method allow one to detect poorly expressed genes and distinguish between homologous sequences. Accurate gene expression profiles were determined by quantitative analysis of band intensities, and subtle differences in transcriptional activity were revealed. A detailed screen for cell cycle-modulated genes in tobacco demonstrates the usefulness of the technology for genome-wide expression analysis.
  89. Raes, J., Vandepoele, K., Simillion, C., Saeys, Y., & Van de Peer, Y. (2003). Investigating ancient duplication events in the Arabidopsis genome. JOURNAL OF STRUCTURAL AND FUNCTIONAL GENOMICS, 3(1-4), 117–129.
  90. Raes, J., Vandepoele, K., Simillion, C., Saeys, Y., & Van de Peer, Y. (2003). Investigating ancient duplication events in the Arabidopsis genome. In Axel Meyer & Y. Van de Peer (Eds.), Genome evolution : gene and genome duplications and the origin of novel gene functions (pp. 117–129). Dordrecht, The Netherlands: Kluwer Academic.
  91. Vandepoele, K., Simillion, C., & Van de Peer, Y. (2003). Evidence that rice and other cereals are ancient aneuploids. PLANT CELL, 15(9), 2192–2202.
    Detailed analyses of the genomes of several model organisms revealed that large-scale gene or even entire-genome duplications have played prominent roles in the evolutionary history of many eukaryotes. Recently, strong evidence has been presented that the genomic structure of the dicotyledonous model plant species Arabidopsis is the result of multiple rounds of entire-genome duplications. Here, we analyze the genome of the monocotyledonous model plant species rice, for which a draft of the genomic sequence was published recently. We show that a substantial fraction of all rice genes (~15%) are found in duplicated segments. Dating of these block duplications, their nonuniform distribution over the different rice chromosomes, and comparison with the duplication history of Arabidopsis suggest that rice is not an ancient polyploid, as suggested previously, but an ancient aneuploid that has experienced the duplication of one (or a large part of one) chromosome in its evolutionary past, ~70 million years ago. This date predates the divergence of most of the cereals, and relative dating by phylogenetic analysis shows that this duplication event is shared by most if not all of them.
  92. Breyne, Peter, Dreesen, R., Vandepoele, K., De Veylder, L., Van Breusegem, F., Callewaert, L., Rombauts, S., et al. (2002). Transcriptome analysis during cell division in plants. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 99(23), 14825–14830.
    Using synchronized tobacco Bright Yellow-2 cells and cDNA-amplified fragment length polymorphism-based genomewide expression analysis, we built a comprehensive collection of plant cell cycle-modulated genes. Approximately 1,340 periodically expressed genes were identified, including known cell cycle control genes as well as numerous unique candidate regulatory genes. A number of plant-specific genes were found to be cell cycle modulated. Other transcript tags were derived from unknown plant genes showing homology to cell cycle-regulatory genes of other organisms. Many of the genes encode novel or uncharacterized proteins, indicating that several processes underlying cell division are still largely unknown.
  93. Vandepoele, K., Saeys, Y., Simillion, C., Raes, J., & Van de Peer, Y. (2002). The automatic detection of homologous regions (ADHoRe) and its application to microcolinearity between Arabidopsis and rice. GENOME RESEARCH, 12(11), 1792–1801.
    It is expected that one of the merits of comparative genomics lies in the transfer of structural and functional information from one genome to another. This is based on the observation that, although the number of chromosomal rearrangements that occur in genomes is extensive, different species still exhibit a certain degree of conservation regarding gene content and gene order. It is in this respect that we have developed a new software tool for the Automatic Detection of Homologous Regions (ADHoRe). ADHoRe was primarily developed to find large regions of microcolinearity, taking into account different types of microrearrangements such as tandem duplications, gene loss and translocations, and inversions. Such rearrangements often complicate the detection of colinearity, in particular when comparing more anciently diverged species. Application of ADHoRe to the complete genome of Arabidopsis and a large collection of concatenated rice BACs yields more than 20 regions showing statistically significant microcolinearity between both plant species. These regions comprise from 4 up to 11 conserved homologous gene pairs. We predict the number of homologous regions and the extent of microcolinearity to increase significantly once better annotations of the rice genome become available.
  94. Vandepoele, K., Raes, J., De Veylder, L., Rouzé, P., Rombauts, S., & Inzé, D. (2002). Genome-wide analysis of core cell cycle genes in Arabidopsis. PLANT CELL, 14(4), 903–916.
    Cyclin-dependent kinases and cyclins regulate, with the help of different interacting proteins, the progression through the eukaryotic cell cycle. A high-quality, homology-based annotation protocol was applied to determine the core cell cycle genes in the recently completed Arabidopsis genome sequence. In total, 61 genes were identified belonging to seven selected families of cell cycle regulators, of which 30 are new or corrections of the existing annotation. A new class of putative cell cycle regulators was found that probably are competitors of E2F/DP transcription factors, which mediate the G1-to-S progression. In addition, the existing nomenclature for cell cycle genes of Arabidopsis was updated, and the physical positions of all genes were compared with segmentally duplicated blocks in the genome, showing that 22 core cell cycle genes emerged through block duplications. This genome-wide analysis illustrates the complexity of the plant cell cycle machinery and provides a tool for elucidating the function of new family members in the future.
  95. Simillion, C., Vandepoele, K., Van Montagu, M., Zabeau, M., & Van de Peer, Y. (2002). The hidden duplication past of Arabidopsis thaliana. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 99(21), 13627–13632.
    Analysis of the genome sequence of Arabidopsis thaliana shows that this genome, like that of many other eukaryotic organisms, has undergone large-scale gene duplications or even duplications of the entire genome. However, the high frequency of gene loss after duplication events reduces colinearity and therefore the chance of finding duplicated regions that, at the extreme, no longer share homologous genes. In this study we show that heavily degenerated block duplications that can no longer be recognized by directly comparing two segments because of differential gene loss, can still be detected through indirect comparison with other segments. When these so-called hidden duplications in Arabidopsis are taken into account, many homologous genomic regions can be found in five to eight copies. This finding strongly implies that Arabidopsis has undergone three, but probably no more, rounds of genome duplications. Therefore, adding such hidden blocks to the duplication landscape of Arabidopsis sheds light on the number of polyploidy events that this model plant genome has undergone in its evolutionary past.
  96. Vandepoele, K., Simillion, C., & Van de Peer, Y. (2002). Detecting the undetectable: uncovering duplicated segments in Arabidopsis by comparison with rice. TRENDS IN GENETICS, 18(12), 606–608.
  97. Vandepoele, K., Saeys, Y., Simillion, C., Raes, J., & Van de Peer, Y. (2002). Detecting microcolinearity between Arabidopsis and Rice. Proceedings of the 6th Gatersleben Research Conference (2002), “Plant Genetic Resources in the Genomic Era: Genetic Diversity, Genome Evolution and New Applications”.