Major events in the genome evolution of vertebrates: paranome age and size differ considerably between fishes and land vertebrates.

Vandepoele, K., De Vos, W., Taylor, J.S., Meyer, A., Van de Peer, Y.

Corresponding authors: ,

Abstract

It has been suggested that fish have more genes than humans. Whether most of these additional genes originated through a complete (fish-specific) genome duplication or through many lineage-specific tandem gene or smaller block duplications and family expansions continues to be debated. We analyzed the complete genome of the pufferfish Takifugu rubripes (Fugu) and compared it with the paranome of humans. We show that most paralogous genes of Fugu are the result of three complete genome duplications. Both relative and absolute dating of the complete predicted set of protein-coding genes suggest that initial genome duplications, estimated to have occurred at least 600 million years ago, shaped the genome of all vertebrates. In addition, analysis of >150 block duplications in the Fugu genome clearly supports a fish-specific genome duplication (approximately 320 million years ago) that coincided with the vast radiation of most modern ray-finned fishes. Unlike the human genome, Fugu contains very few recently duplicated genes; hence, many human genes are much younger than fish genes. This lack of recent gene duplication, or, alternatively, the accelerated rate of gene loss, is possibly one reason for the drastic reduction of the genome size of Fugu observed during the past 100 million years or so, subsequent to the additional genome duplication that ray-finned fishes but not land vertebrates experienced.

Supplementary Data

Alignments and trees

To date all duplication events in the genomes of Fugu and Human, phylogenetic trees were constructed for all gene families containing 2 to 10 members in both organisms. Neighbor-joining trees were used for relative dating: their topologies allow us to distinguish duplications that occurred before the split of ray-finned fishes (represented by Fugu) and land vertebrates (represented by Human) from those that occurred after it. Absolute dates were then inferred from linearized trees, in which branch length is directly proportional to divergence time. To create these linearized trees, sequences evolving faster or slower than the average were removed from the gene family datasets. Combining relative and absolute dating allowed us to build a distribution of all duplication events in Fugu and Human (Figure 3 in the manuscript).
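As an illustration of the relative dating idea, the following minimal sketch classifies the internal nodes of a small gene-family tree with a simple species-overlap rule. It is not the procedure used in the study: the nested-tuple tree format, the `Species_geneid` leaf naming and the function names are all assumptions made for this example.

```python
# Illustrative sketch only; tree format and names are hypothetical.
# A tree is a nested 2-tuple; a leaf is a string "<Species>_<gene id>".

def species_in(tree):
    """Return the set of species found at or below this node."""
    if isinstance(tree, str):
        return {tree.split("_")[0]}
    left, right = tree
    return species_in(left) | species_in(right)


def classify_duplications(tree, events=None):
    """Label every internal node (pre-order) as speciation or duplication."""
    if events is None:
        events = []
    if isinstance(tree, str):
        return events
    left, right = tree
    left_sp, right_sp = species_in(left), species_in(right)
    if left_sp & right_sp:
        # The same species occurs on both sides: treat the node as a duplication.
        if len(left_sp | right_sp) == 1:
            # Only one species below the node: duplication after the split.
            events.append(("after split", next(iter(left_sp))))
        else:
            # Both species below the node: duplication before the split.
            events.append(("before split", "shared"))
    else:
        events.append(("speciation", "-"))
    classify_duplications(left, events)
    classify_duplications(right, events)
    return events


# Two paralogs, each with a Fugu and a Human copy: the duplication is shared.
old_family = (("Fugu_a1", "Human_A1"), ("Fugu_a2", "Human_A2"))
# Two Fugu copies grouping together: a Fugu-specific duplication.
young_family = (("Fugu_b1", "Fugu_b2"), "Human_B")

print(classify_duplications(old_family))
print(classify_duplications(young_family))
```

A species-overlap rule ignores branch support and gene loss, so in practice it would be combined with checks such as the bootstrap threshold described further down this page.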

For all gene families containing Fugu genes, multi-fasta files, initial and stripped sequence alignments, neighbor-joining trees, and linearized trees (where constructed) are available for browsing in the Data processing table.

All linearized trees used for absolute dating of the Fugu duplication events that have a bootstrap value of at least 700 (out of 1000) and are congruent with the results of relative dating can be found at Phylogenetic trees Fugu.

All linearized trees used for absolute dating of the Human duplication events that have a bootstrap value of at least 700 (out of 1000) and are congruent with the results of relative dating can be found at Phylogenetic trees Human.
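The selection criteria and the principle behind absolute dating on a linearized tree can be sketched as follows. Because branch length in a linearized tree is proportional to time, a node of known age can be used to scale the height of any other node. The function names, the example heights and the calibration age of 450 million years for the Fugu-Human split are assumptions made for illustration, not values taken from this page.

```python
# Illustrative sketch only; names and numbers are hypothetical.

def passes_filter(bootstrap, congruent, min_bootstrap=700):
    """Keep a tree if its bootstrap value (out of 1000) reaches the threshold
    and its topology agrees with the relative dating."""
    return bootstrap >= min_bootstrap and congruent


def node_age(node_height, calib_height, calib_age_mya):
    """Scale a node height to an age using one calibration node.

    In a linearized tree, height is proportional to time, so
    age = node_height / calib_height * calib_age_mya.
    """
    if calib_height <= 0:
        raise ValueError("calibration height must be positive")
    return node_height / calib_height * calib_age_mya


# Assumed calibration: Fugu-Human split at height 0.5, dated to 450 Mya.
if passes_filter(bootstrap=812, congruent=True):
    print(node_age(node_height=0.25, calib_height=0.5, calib_age_mya=450.0))
```

A duplication node sitting halfway between the tips and the calibration node is dated to half the calibration age, which is the proportionality that a linearized tree provides.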



Block duplication data

An overview of the Fugu data set (scaffolds, genes, positions, longest transcript, etc.) used for the detection of duplicated blocks can be found here (tab-delimited text file). For sequence data, please visit the Ensembl Fugu website.

Duplicated blocks in the Fugu genome: All duplicated blocks detected with the ADHoRe software tool (Vandepoele et al., 2002) can be browsed here. Note that, as mentioned in the manuscript, only scaffolds with five or more genes were retained for this part of the analysis.
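The scaffold filter mentioned above (five or more genes per scaffold) can be sketched on a tab-delimited gene table like this. The column names `scaffold` and `gene` and the sample rows are assumptions for this example; the actual layout of the overview file may differ.

```python
# Illustrative sketch only; column names and sample data are hypothetical.
import csv
import io
from collections import Counter


def scaffolds_with_min_genes(handle, min_genes=5):
    """Return the scaffolds that carry at least `min_genes` genes."""
    reader = csv.DictReader(handle, delimiter="\t")
    counts = Counter(row["scaffold"] for row in reader)
    return {scaffold for scaffold, n in counts.items() if n >= min_genes}


# Small in-memory table standing in for the tab-delimited overview file.
sample = "scaffold\tgene\n"
sample += "".join(f"scaffold_1\tgene_{i}\n" for i in range(5))
sample += "".join(f"scaffold_2\tgene_{i}\n" for i in range(5, 7))

print(scaffolds_with_min_genes(io.StringIO(sample)))
```

To apply the same filter to a real file, an open file handle can be passed in place of the `io.StringIO` object.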



Sequence similarity search

Sequences can be compared with those used in this study through a sequence similarity search (BLASTP or BLASTX).


We welcome suggestions for collaboration on this topic and requests for additional functionality on this supplementary data website.










Contact:
VIB / UGent
Bioinformatics & Evolutionary Genomics
Technologiepark 927
B-9052 Gent
BELGIUM
+32 (0) 9 33 13807 (phone)
+32 (0) 9 33 13809 (fax)

Don't hesitate to contact us in case of problems with the website!