i-ADHoRe 3.0 - Fast and Sensitive Detection of Genomic Homology in Extremely Large Data Sets

Comparative genomics is a powerful means to gain insight into the evolutionary processes that shape the genomes of related species. As the number of sequenced genomes increases rapidly, the development of efficient software applications to perform accurate cross-species analyses becomes indispensable. However, many implementations that can compare multiple genomes exhibit unfavorable computational and memory requirements, effectively limiting the number of genomes that can be analyzed in a single run. Here, we present i-ADHoRe 3.0, a software package that detects genomic homology by identifying conserved gene content and gene order (collinearity), and its application to eukaryotic genomes. The use of efficient algorithms and support for a parallel computing environment enable the analysis of large-scale data sets. Unlike other tools, i-ADHoRe is capable of processing the complete Ensembl data set, containing 49 species, in less than one hour. Furthermore, the profile search is more sensitive in detecting degenerate genomic homology than chaining pairwise collinearity information based on transitive homology. By integrating coexpression information and experimental protein-protein interactions with ultra-conserved collinear regions between mammals and birds, we identified more than 400 regions in the human genome showing significant functional coherence. Finally, the impact of low coverage and fragmented genomes on the detection of collinearity was examined. The different algorithmic improvements ensure that i-ADHoRe 3.0 will continue to be a powerful tool to study genome evolution.

Proost, S.*, Fostier, J.*, De Witte, D., Dhoedt, B., Demeester, P., Van de Peer, Y., Vandepoele, K. (2012) i-ADHoRe 3.0 - Fast and Sensitive Detection of Genomic Homology in Extremely Large Data Sets. Nucleic Acids Res. 40(2):e11. (*contributed equally)

