Nonrandom divergence of gene expression following gene and genome duplications in the flowering plant Arabidopsis thaliana.

Casneuf, T., De Bodt, S., Raes, J., Maere, S., Van de Peer, Y.


Genome analyses have revealed that gene duplication in plants is rampant. Furthermore, many of the duplicated genes seem to have been created through ancient genome-wide duplication events. Recently, we have shown that gene loss is strikingly different for large- and small-scale duplication events and highly biased towards the functional class to which a gene belongs. Here, we study the expression divergence of genes that were created during large- and small-scale gene duplication events by means of microarray data and investigate both the influence of the origin (mode of duplication) and the function of the duplicated genes on expression divergence.

Duplicates that were created by large-scale duplication events and that can still be found in duplicated segments have expression patterns that are more correlated than those of duplicates created by small-scale duplications or of duplicates that no longer lie in duplicated segments. Moreover, the former tend to have highly redundant or overlapping expression patterns and are mostly expressed in the same tissues, while the latter show asymmetric divergence. In addition, the divergence of gene expression was strongly biased by gene function and by the biological process in which the genes are involved.

By using microarray expression data for Arabidopsis thaliana, we show that the mode of duplication, the function of the genes involved, and the time since duplication play important roles in the divergence of gene expression and, therefore, in the functional divergence of genes after duplication.

Supplementary Data

  • Additional File 1: Description of dataset 1
    This file contains the names of the microarrays that were included in the first dataset, together with a description of the experimental conditions (i.e. the series of experiments to which the microarrays belong, the type of plant from which the samples were taken, and the wild type to which the slide should be compared).
  • Additional File 2: Description of dataset 2
    This file contains the names of the microarrays that were included in the second dataset, together with a description of the tissue from which the samples were taken and the conditions in which the plant was grown.
  • Additional File 3: Scatterplots of genes belonging to different functional classes
    This file contains the scatterplots of the Spearman correlation coefficient as a function of the Ks value for all genes in the 67 different functional classes. The loess smoother that was fitted to the data is shown as a solid black line, together with its 95% confidence interval.
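
The quantity plotted in Additional File 3, the Spearman correlation between the expression profiles of two duplicates, can be illustrated with a minimal sketch. This is not the authors' code: the function names, the gene pair labels, the expression values, and the Ks values below are all hypothetical, and the loess smoothing step is not reproduced. The sketch only assumes that each gene has one expression value per microarray and that each duplicate pair has a Ks estimate.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    if len(x) != len(y):
        raise ValueError("profiles must cover the same microarrays")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical duplicate pairs: (label, Ks, profile of copy A, profile of copy B).
# Each profile holds one made-up expression value per microarray.
pairs = [
    ("pair_1", 0.8, [5.1, 7.3, 6.0, 9.2, 4.4], [4.9, 7.0, 6.5, 8.8, 4.0]),
    ("pair_2", 2.1, [5.1, 7.3, 6.0, 9.2, 4.4], [8.0, 4.2, 6.1, 5.0, 7.7]),
]

# One (Ks, correlation) point per pair, i.e. one dot in such a scatterplot.
points = [(ks, spearman(a, b)) for _, ks, a, b in pairs]
for (label, _, _, _), (ks, rho) in zip(pairs, points):
    print(f"{label}: Ks={ks}, Spearman rho={rho:.2f}")
```

Because the correlation is computed on ranks rather than raw intensities, it is insensitive to monotonic differences in scale between the two copies, which is one common reason to prefer it for microarray data.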

VIB / UGent
Bioinformatics & Evolutionary Genomics
Technologiepark 927
B-9052 Gent
+32 (0) 9 33 13807 (phone)
+32 (0) 9 33 13809 (fax)
